Project Overview
End-to-end sales analytics platform combining a Python ETL pipeline, SQL Server data warehouse, and interactive Power BI dashboards. Transforms raw transactional data from multiple sources into real-time business intelligence KPIs.
5Data Sources
AutoDaily ETL
12KPI Dashboards
Real-timeRefresh Rate
ETL Pipeline
Data flows from 5 sources (ERP, CRM, eCommerce, returns log, inventory) through a Python-based ETL pipeline on a daily schedule:
Python
import pandas as pd
from sqlalchemy import create_engine
import schedule, time
engine = create_engine('mssql+pyodbc://user:pass@server/SalesDB?driver=ODBC+Driver+17')
def extract_sales():
erp = pd.read_csv('erp_export.csv', parse_dates=['order_date'])
crm = pd.read_json('crm_export.json')
ecom = pd.read_excel('ecommerce_orders.xlsx')
return pd.concat([erp, crm, ecom], ignore_index=True)
def transform(df):
df['order_date'] = pd.to_datetime(df['order_date'])
df['revenue'] = df['quantity'] * df['unit_price'] * (1 - df['discount'])
df['month'] = df['order_date'].dt.to_period('M').astype(str)
df['region'] = df['region'].str.strip().str.title()
df.dropna(subset=['customer_id','product_id','revenue'], inplace=True)
return df
def load(df):
df.to_sql('fact_sales', engine, if_exists='replace', index=False, chunksize=1000)
print(f"Loaded {len(df):,} rows to SQL Server")
def run_etl():
df = extract_sales()
df = transform(df)
load(df)
schedule.every().day.at("02:00").do(run_etl)
while True: schedule.run_pending(); time.sleep(60)Power BI DAX Measures
DAX
-- Month-over-Month Revenue Growth
MoM Growth % =
VAR CurrentMonth = [Total Revenue]
VAR PrevMonth = CALCULATE([Total Revenue],
DATEADD(fact_sales[order_date], -1, MONTH))
RETURN DIVIDE(CurrentMonth - PrevMonth, PrevMonth, 0) * 100
-- Top 10 Products by Revenue
Top10 Revenue =
CALCULATE([Total Revenue],
TOPN(10, ALL(dim_product[product_name]), [Total Revenue]))Dashboard Views
- Executive Overview: Revenue, orders, AOV, MoM growth in one glance
- Regional Performance: Map visual + bar chart by region/territory
- Product Analysis: Top sellers, slow movers, category breakdown
- Customer Segmentation: RFM analysis — Recency, Frequency, Monetary
Power BISQL ServerPython ETL
DAXPandasschedule