
From Raw Data to Revenue Decisions: Building an End-to-End Analytics Platform


TL;DR: 12-week engagement. Global manufacturer. Petabytes of untouched e-commerce and marketing data. I built a real-time analytics platform covering 14 markets — Databricks medallion architecture, Streamlit multi-page app, GPT-4o natural language querying, z-score anomaly detection, and ML models for churn and LTV. Two production pipeline failures found and fixed. Teams now have answers before the meeting starts.


Every large company has data. Very few actually use it.

When I started this engagement, the client had petabytes of e-commerce, marketing, and customer data sitting in a data lakehouse — largely untouched by the people who needed it most. Marketing managers were making budget decisions based on gut feel. No one could answer basic questions like “which channel actually drives revenue when you account for the full customer journey?” or “what’s happening to conversions in Germany right now?” without waiting days for a data analyst to pull a report.

Twelve weeks later, they had a real-time analytics platform their teams use every day. Here’s what I built and how.


The Stack

  • Data warehouse: Databricks (Delta Lake, Unity Catalog, SQL Warehouses)
  • Orchestration: Databricks Workflows — multi-task DAGs with explicit dependency graphs
  • Frontend: Streamlit — multi-page app, fragment-based tab isolation, session state management
  • LLM integration: Azure OpenAI GPT-4o via LiteLLM, with Databricks AI/BI Genie as fallback
  • ML: Databricks-hosted models for churn scoring, LTV cohort modeling, product affinity
  • CI/CD: GitHub Actions, containerized deployment
  • Monitoring: Custom z-score anomaly detection, live pipeline health checks

What I Built

10 production dashboards across two product lines

The platform covers two distinct business units — direct-to-consumer e-commerce and B2B dealer/distributor sales — each with its own data models, KPIs, and user personas. Rather than one monolithic dashboard, I built a navigation-driven multi-page app where each page is purpose-built for a specific decision type.

B2C dashboards:

  • Paid media ROI — ROAS by channel, dayparting heatmaps, demographic bid optimisation
  • Revenue analysis — 14 markets, WoW/YoY comparisons, anomaly highlighting
  • LTV & cohort analysis — retention curves, LTV/CAC ratio, acquisition channel quality
  • Out-of-stock impact — lost revenue estimation, supply chain alerts
  • Multi-touch attribution — linear vs last-click, identifying under/overvalued channels
  • ML Intelligence — churn risk scoring, product affinity, session feature analysis
  • Live notification centre — anomaly alerts, ROAS alerts, funnel collapse detection

AI-powered natural language querying

One of the flagship features is a Chat with Data interface — users ask questions in plain language and get SQL-backed answers with narrative explanations.

The pipeline:

  1. GPT-4o generates SQL from the user’s question using a dynamically built system prompt with live schema context, actual column sample values, and few-shot examples
  2. A safeguard layer validates the SQL — read-only, allowlisted tables, LIMIT wrapper
  3. Results are executed against Databricks and cached by MD5 hash
  4. A streaming narrative plus follow-up suggestions are generated in one pass (saves 2–3s vs two separate calls)
  5. If the LLM produces no valid SQL, Databricks Genie handles the fallback
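The safeguard layer in step 2 is the part most worth getting right — never execute model output unvalidated. A minimal sketch of that layer plus the MD5 cache key from step 3 (the table allowlist and regexes here are illustrative, not the production rules):

```python
import hashlib
import re

# Hypothetical allowlist; in production this would come from catalog metadata.
ALLOWED_TABLES = {"gold_funnel_daily", "gold_paid_roas_daily", "gold_daily_sales"}

def guard_sql(sql: str, max_rows: int = 1000) -> str:
    """Validate LLM-generated SQL: read-only, allowlisted tables, LIMIT wrapper."""
    stripped = sql.strip().rstrip(";")
    # Read-only: reject anything that is not a plain SELECT.
    if not re.match(r"(?is)^\s*select\b", stripped) or re.search(
        r"(?i)\b(insert|update|delete|drop|alter|create|merge|truncate)\b", stripped
    ):
        raise ValueError("Only read-only SELECT statements are allowed")
    # Allowlist: every referenced table must be known.
    for table in re.findall(r"(?i)\b(?:from|join)\s+([a-zA-Z0-9_.]+)", stripped):
        if table.split(".")[-1] not in ALLOWED_TABLES:
            raise ValueError(f"Table not allowlisted: {table}")
    # LIMIT wrapper: cap result size regardless of what the model wrote.
    return f"SELECT * FROM ({stripped}) AS q LIMIT {max_rows}"

def cache_key(sql: str) -> str:
    """MD5 hash of the normalised SQL, used as the result-cache key."""
    return hashlib.md5(sql.strip().lower().encode()).hexdigest()
```

Wrapping the query as a subselect means the LIMIT applies even when the model emits its own (larger) limit.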

The system learns over time — thumbs-up feedback saves successful query patterns as few-shot examples that improve future responses.

Anomaly detection engine

Built a z-score-based system that compares each day’s metrics against a 28-day rolling baseline:

z = (today_value − 28d_mean) / 28d_stddev

critical: |z| > 3
warning:  |z| > 2

This runs as a scheduled Databricks job and writes to a gold table that powers both the dedicated anomaly dashboard and the live notification centre. Users see cards ranked by severity the moment they open the app.
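The logic above can be sketched in a few lines of pandas — a simplified standalone version, not the production job (column names and the exclusion of the current day from the baseline are my assumptions):

```python
import pandas as pd

def flag_anomalies(daily: pd.DataFrame, metric: str = "revenue") -> pd.DataFrame:
    """Score each day against a 28-day rolling baseline of the prior days."""
    df = daily.sort_values("date").copy()
    # shift(1) keeps today out of its own baseline.
    baseline = df[metric].shift(1).rolling(28, min_periods=28)
    df["z"] = (df[metric] - baseline.mean()) / baseline.std()
    # |z| in [2, 3) -> warning, |z| >= 3 -> critical.
    df["severity"] = pd.cut(
        df["z"].abs(),
        bins=[0, 2, 3, float("inf")],
        labels=["ok", "warning", "critical"],
        right=False,
    )
    return df
```

The scheduled job writes the scored rows to a gold table; the app only ever reads that table, so the dashboard and notification centre stay cheap.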

Out-of-stock revenue impact model

Built an estimation model that flags products with high search/view intent but zero purchases, quantifies lost revenue using lost_atc × CVR × AOV, and surfaces actionable alerts by market. This has directly driven supply chain escalations — with the data attached.
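The flagging-plus-estimation step reduces to something like the sketch below (the product schema and thresholds are illustrative; the production model derives CVR and AOV from each product’s trailing baseline):

```python
def flag_out_of_stock(products: list[dict], min_views: int = 100) -> list[dict]:
    """Flag high-intent, zero-purchase products and estimate lost revenue.

    Each product dict is assumed to carry sku, market, views, purchases,
    lost_atc (add-to-carts that could not convert), cvr, and aov.
    """
    alerts = []
    for p in products:
        if p["views"] >= min_views and p["purchases"] == 0:
            alerts.append({
                "sku": p["sku"],
                "market": p["market"],
                # lost_atc x CVR x AOV, as described above.
                "lost_revenue": p["lost_atc"] * p["cvr"] * p["aov"],
            })
    # Worst offenders first, so the biggest losses surface at the top.
    return sorted(alerts, key=lambda a: -a["lost_revenue"])
```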

Pipeline monitoring and self-healing

Part of the engagement was keeping the platform reliable, not just building it. I instrumented all 29 data tables with freshness checks and built a health audit that runs against the Databricks Jobs API to validate:

  • Last successful run timestamp per table
  • Days since last write
  • Job dependency graph integrity
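The core of the freshness check is simple once the last-write timestamps are in hand. A minimal sketch, assuming per-table SLAs and timestamps already fetched (in the real audit these come from the Databricks Jobs API and table history, which I don’t reproduce here):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical SLAs: maximum tolerated days since last write, per table.
FRESHNESS_SLA_DAYS = {"gold_daily_sales": 1, "gold_market_scorecard": 2}

def audit_freshness(last_writes: dict[str, datetime]) -> list[dict]:
    """Return one finding per table that has breached its freshness SLA."""
    now = datetime.now(timezone.utc)
    findings = []
    for table, sla_days in FRESHNESS_SLA_DAYS.items():
        last = last_writes.get(table)
        age = (now - last).days if last else None
        if last is None or age > sla_days:
            findings.append(
                {"table": table, "days_stale": age, "sla_days": sla_days}
            )
    return findings
```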

During this I identified and fixed two production pipeline failures:

  1. Race condition — a market scorecard notebook tried to read a table 6 minutes before a parallel task finished creating it. Fixed by adding an explicit task dependency in the Databricks job DAG.
  2. Quota exhaustion death spiral — a third-party API ingest notebook kept retrying after hitting rate limits, burning through the daily quota. Fixed by patching the notebook to cap daily API calls within quota limits, then redeploying via the Databricks Workspace API.

Both fixes applied programmatically via REST API. No manual UI intervention.


The Data Architecture

The platform follows a medallion architecture:

Source               Bronze               Silver               Gold
GA4 / BigQuery       bronze_ga4_events    silver_ga4_session   gold_funnel_daily
Google Ads API       bronze_gads_raw      silver_gads_hourly   gold_paid_roas_daily
Salesforce / SFCC    bronze_orders        silver_order_lines   gold_daily_sales
Microsoft Clarity    bronze_clarity       —                    gold_cwv_page_daily
3rd-party feeds      bronze_feeds         silver_*             gold_market_scorecard

One non-obvious challenge: several tables use STRING dates in yyyyMMdd format (GA4’s native format) while others use proper DATE types. Getting this wrong silently returns zero rows — no error, just missing data. I documented a canonical filter model covering every table’s quirks, including micro-currency encoding in the Google Ads cost column (divide by 1,000,000 for USD) and malformed market codes that need filtering before any aggregation.
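Two of those quirks are small enough to show directly — a sketch of the kind of helpers the canonical filter model standardises on (function names are mine, not the codebase’s):

```python
from datetime import date

def ga4_date_filter(d: date) -> str:
    """GA4 bronze tables store dates as yyyyMMdd STRINGs; filtering them
    with a DATE literal silently matches zero rows, so the filter value
    must use the same string encoding."""
    return d.strftime("%Y%m%d")

def micros_to_currency(cost_micros: int) -> float:
    """Google Ads reports cost in micro-units: divide by 1,000,000."""
    return cost_micros / 1_000_000
```

Centralising these in one module means a new dashboard can’t quietly reintroduce the zero-rows bug by comparing types directly.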


Results

The platform went from zero to covering 14 markets across Europe:

  • 29 live Databricks tables powering the dashboards
  • 150M+ rows of session and user feature data queryable in under 3 seconds
  • 7 scheduled pipelines running daily, with dependency validation
  • 2 production failures identified and fixed during the audit phase
  • Complete stakeholder documentation covering methodology, data lineage, and chart interpretation

More importantly: the teams using it now have answers before the meeting starts — not after.


What I Learned

Schema assumptions will hurt you. I found column names that differed between what the documentation said and what was actually in production. Live schema discovery on every deploy — not hardcoded column names — is non-negotiable.

LLM accuracy depends more on context quality than model quality. Injecting actual sample values, date ranges, and known-bad query patterns as few-shot examples improved SQL accuracy far more than any prompt engineering trick.

Data pipelines fail silently. A notebook that “succeeded” but wrote zero rows to a gold table looks identical to a successful write unless you add explicit row-count checks. Build data quality assertions into the pipeline itself, not just the dashboard.
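The assertion itself is tiny — the point is that it runs inside the pipeline task, so a zero-row write fails the job instead of quietly shipping an empty gold table. A minimal sketch (names are illustrative):

```python
def assert_gold_write(table: str, rows_written: int, min_rows: int = 1) -> None:
    """Fail the pipeline task loudly when a 'successful' write produced no data."""
    if rows_written < min_rows:
        raise RuntimeError(
            f"{table}: wrote {rows_written} rows (expected >= {min_rows}); "
            "failing the task so a zero-row write is not mistaken for success"
        )
```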


Interested in building something like this for your team? I work with companies that have the data but aren’t using it. Get in touch.
