Lambda Architecture
This pipeline demonstrates the Lambda Architecture pattern: a hybrid approach that pairs a real-time streaming path (the speed layer) for low latency with batch processing for accuracy and completeness.
Pipeline Topology
How Lambda Architecture Works
1. Speed Layer
Processes data in real time with low latency. Uses Redis as an in-memory cache, with pub/sub for instant updates. It sacrifices some accuracy for speed: cached data expires after a 90-second TTL. Think of it as the "approximate" view.
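The speed layer's behavior (write with a TTL, notify subscribers, expire after 90 seconds) can be sketched without a Redis server. The class below is a pure-Python stand-in, not the pipeline's actual code: `SpeedLayerCache`, its method names, and the injectable `clock` are all hypothetical. In Redis itself, the write would roughly correspond to a `SETEX` followed by a `PUBLISH`.

```python
import time
from collections import defaultdict

TTL_SECONDS = 90  # the speed layer's TTL, as stated in the text


class SpeedLayerCache:
    """Hypothetical in-memory stand-in for a TTL cache with pub/sub."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock              # injectable so expiry can be tested
        self._store = {}                 # key -> (expires_at, value)
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Register callback(key, value) for every write on the channel."""
        self._subscribers[channel].append(callback)

    def put(self, channel, key, value):
        """Store the value with a TTL, then notify subscribers immediately."""
        self._store[key] = (self._clock() + self._ttl, value)
        for callback in self._subscribers[channel]:
            callback(key, value)

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]         # lazy expiry on read
            return None
        return value
```

Expiry is checked lazily on read, which keeps the sketch short. A reader who queries after the TTL simply sees no value, which is why this layer is only the "approximate" view.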
2. Batch Layer
Runs periodically (every 5 minutes) to compute accurate historical views. Uses dbt to transform raw data into a dimensional model (star schema) and runs data quality tests against it. This is the "authoritative" view.
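To make the batch step concrete, here is a minimal sketch that uses Python's built-in `sqlite3` in place of dbt. It rebuilds one dimension table and one fact table from raw data, then runs checks modeled on dbt's `not_null`, `unique`, and `relationships` tests. The table and column names (`raw_events`, `dim_sensor`, `fact_reading`) are assumptions made for illustration, not the pipeline's real schema.

```python
import sqlite3


def run_batch(conn):
    """Rebuild the star schema from the raw table (full refresh)."""
    conn.executescript("""
        DROP TABLE IF EXISTS dim_sensor;
        DROP TABLE IF EXISTS fact_reading;
        CREATE TABLE dim_sensor AS
            SELECT DISTINCT sensor_id, location FROM raw_events;
        CREATE TABLE fact_reading AS
            SELECT sensor_id,
                   COUNT(*)   AS reading_count,
                   AVG(value) AS avg_value
            FROM raw_events
            GROUP BY sensor_id;
    """)


def quality_failures(conn):
    """Return the names of failed checks; each query counts offending rows."""
    checks = {
        "dim_sensor.sensor_id not_null":
            "SELECT COUNT(*) FROM dim_sensor WHERE sensor_id IS NULL",
        "dim_sensor.sensor_id unique":
            "SELECT COUNT(*) FROM (SELECT sensor_id FROM dim_sensor "
            "GROUP BY sensor_id HAVING COUNT(*) > 1)",
        "fact_reading.sensor_id relationships":
            "SELECT COUNT(*) FROM fact_reading f "
            "LEFT JOIN dim_sensor d ON f.sensor_id = d.sensor_id "
            "WHERE d.sensor_id IS NULL",
    }
    return [name for name, sql in checks.items()
            if conn.execute(sql).fetchone()[0] > 0]
```

A scheduler would call `run_batch` every 5 minutes and publish the new tables only when `quality_failures` returns an empty list. That gate is what makes this view the authoritative one.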
3. Serving Layer
Merges the speed and batch views: real-time data fills the most recent gaps, while the batch view provides historical completeness. The API lets you query each layer separately or the merged result.
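The text does not specify the merge rule, so the following is only one plausible sketch. It assumes the batch view is trusted up to a watermark (the time covered by the last batch run) and that speed-layer records are used only for events after it. `Reading`, `merge_views`, and `batch_watermark` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    key: str
    value: float
    event_time: float  # seconds since epoch


def merge_views(batch, speed, batch_watermark):
    """Combine both views, preferring batch data up to the watermark.

    Speed-layer records at or before the watermark are ignored because the
    batch view already covers that period authoritatively.
    """
    merged = {r.key: r for r in batch}
    for r in speed:
        if r.event_time <= batch_watermark:
            continue  # already covered by the batch view
        current = merged.get(r.key)
        if current is None or r.event_time >= current.event_time:
            merged[r.key] = r  # fresher real-time value fills the gap
    return sorted(merged.values(), key=lambda r: r.key)


batch_view = [Reading("a", 1.0, 100), Reading("b", 2.0, 100)]
speed_view = [Reading("b", 2.5, 130), Reading("c", 3.0, 140), Reading("a", 9.9, 90)]
result = merge_views(batch_view, speed_view, batch_watermark=120)
```

Here `result` keeps the batch value for `a`, takes the newer speed value for `b`, and adds `c`, which so far exists only in the speed layer.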