Lambda Architecture

This pipeline demonstrates the Lambda Architecture pattern: a hybrid approach that pairs low-latency stream processing (the speed layer) with periodic batch processing (the batch layer) for accuracy and completeness, and merges the two in a serving layer.

Pipeline Topology

How Lambda Architecture Works

1. Speed Layer

Processes data in real time with low latency. Uses Redis as an in-memory cache, with pub/sub for instant updates. Trades some accuracy and completeness for speed: entries have a 90-second TTL, so this layer only ever holds recent data. Think of it as the "approximate" view.
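
To make the TTL behavior concrete, here is a minimal sketch that uses a plain in-memory dictionary as a stand-in for Redis. The class name SpeedView, its methods, and the injectable clock are assumptions made for illustration and are not taken from the pipeline's code. With a real Redis instance the expiry would normally be set on the key itself (for example with SET ... EX or EXPIRE) rather than tracked by hand.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 90  # matches the 90-second TTL described above


class SpeedView:
    """Hypothetical in-memory stand-in for the Redis-backed speed layer."""

    def __init__(
        self,
        ttl: float = TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, float]] = {}  # key -> (value, expires_at)

    def put(self, key: str, value: float) -> None:
        """Store a value and stamp it with its expiry time."""
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[float]:
        """Return the value, or None if it is missing or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict expired entries on read
            return None
        return value
```

Once an entry is older than the TTL it simply disappears from this view, which is why the speed layer cannot serve as a historical record on its own.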

2. Batch Layer

Runs periodically (every 5 minutes) to compute accurate, historical views. Uses dbt to transform raw data into a dimensional model (star schema) with data quality tests. This is the "authoritative" view.
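
The idea behind the batch layer (recompute the view from the complete raw data on every run, and fail loudly if a quality check does not pass) can be sketched as follows. This is only an illustration in Python: the actual transformations are dbt models, and the Event type, its fields, and the recompute_batch_view function are hypothetical names, not part of the pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Event(NamedTuple):
    """Hypothetical raw event; the field names are assumptions."""
    key: str          # dimension key, e.g. a product id
    amount: float
    timestamp: float  # seconds since some epoch


def recompute_batch_view(events: Iterable[Event], cutoff: float) -> Dict[str, float]:
    """Rebuild per-key totals from all raw events up to and including `cutoff`."""
    totals: Dict[str, float] = defaultdict(float)
    for event in events:
        # Simple stand-in for a not-null data quality test.
        if not event.key:
            raise ValueError("data quality check failed: empty key")
        if event.timestamp <= cutoff:
            totals[event.key] += event.amount
    return dict(totals)
```

Because every run starts from the raw events rather than from the previous result, errors in earlier runs do not accumulate, which is what makes this view the authoritative one.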

3. Serving Layer

Merges the speed and batch views: real-time data fills in the most recent period that the batch view has not yet covered, while the batch view provides historical completeness. The API lets you query each layer separately or the merged result.
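
One possible merge rule is sketched below, assuming both views are dictionaries keyed by time bucket and that the batch value wins wherever both layers have an entry. The function name merge_views and the bucket-keyed layout are assumptions; the pipeline's actual merge logic may differ.

```python
from typing import Dict


def merge_views(
    batch_view: Dict[str, float],
    speed_view: Dict[str, float],
) -> Dict[str, float]:
    """Combine both views, preferring the batch value when a key is in both."""
    merged = dict(speed_view)   # start from the approximate, recent data
    merged.update(batch_view)   # authoritative batch values override it
    return merged


batch = {"12:00": 10.0, "12:05": 7.0}
speed = {"12:05": 6.0, "12:10": 3.0}
merged = merge_views(batch, speed)
# "12:05" comes from the batch view; "12:10" exists only in the speed view.
```

Giving the batch view precedence means an approximate real-time figure is replaced as soon as the next batch run covers that bucket.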