Back to Projects

Project Overview

End-to-end sales analytics platform combining a Python ETL pipeline, SQL Server data warehouse, and interactive Power BI dashboards. Transforms raw transactional data from multiple sources into real-time business intelligence KPIs.

5Data Sources
AutoDaily ETL
12KPI Dashboards
Real-timeRefresh Rate

ETL Pipeline

Data flows from 5 sources (ERP, CRM, eCommerce, returns log, inventory) through a Python-based ETL pipeline on a daily schedule:

Python
import pandas as pd
from sqlalchemy import create_engine
import schedule, time

engine = create_engine('mssql+pyodbc://user:pass@server/SalesDB?driver=ODBC+Driver+17')

def extract_sales():
    erp  = pd.read_csv('erp_export.csv', parse_dates=['order_date'])
    crm  = pd.read_json('crm_export.json')
    ecom = pd.read_excel('ecommerce_orders.xlsx')
    return pd.concat([erp, crm, ecom], ignore_index=True)

def transform(df):
    df['order_date'] = pd.to_datetime(df['order_date'])
    df['revenue']    = df['quantity'] * df['unit_price'] * (1 - df['discount'])
    df['month']      = df['order_date'].dt.to_period('M').astype(str)
    df['region']     = df['region'].str.strip().str.title()
    df.dropna(subset=['customer_id','product_id','revenue'], inplace=True)
    return df

def load(df):
    df.to_sql('fact_sales', engine, if_exists='replace', index=False, chunksize=1000)
    print(f"Loaded {len(df):,} rows to SQL Server")

def run_etl():
    df = extract_sales()
    df = transform(df)
    load(df)

schedule.every().day.at("02:00").do(run_etl)
while True: schedule.run_pending(); time.sleep(60)

Power BI DAX Measures

DAX
-- Month-over-Month Revenue Growth
MoM Growth % =
VAR CurrentMonth = [Total Revenue]
VAR PrevMonth = CALCULATE([Total Revenue],
    DATEADD(fact_sales[order_date], -1, MONTH))
RETURN DIVIDE(CurrentMonth - PrevMonth, PrevMonth, 0) * 100

-- Top 10 Products by Revenue
Top10 Revenue =
CALCULATE([Total Revenue],
    TOPN(10, ALL(dim_product[product_name]), [Total Revenue]))

Dashboard Views

  • Executive Overview: Revenue, orders, AOV, MoM growth in one glance
  • Regional Performance: Map visual + bar chart by region/territory
  • Product Analysis: Top sellers, slow movers, category breakdown
  • Customer Segmentation: RFM analysis — Recency, Frequency, Monetary
Power BISQL ServerPython ETL DAXPandasschedule