When ‘Real‑Time’ Lies to Finance: Building Streaming Pipelines You Can Take to the Board

Your CEO doesn’t care about Kafka. They care that the number on the dashboard is right, now. Here’s how to build real‑time data pipelines that are fast, correct, and tied to business value.

“If you can’t replay to the same answer, your pipeline isn’t real‑time—it’s gambling.”

The 3 a.m. page you remember

I’ve watched a VP Finance stare at a “real‑time” revenue dashboard that read flat while orders were actually spiking 28% during a promotion. The culprit wasn’t Kafka throughput; it was incorrect joins and late-arriving events silently dropped by a streaming job after a schema change. Sales lost their nerve, throttled marketing, and left money on the table. I’ve seen this fail across stacks: bespoke Spark jobs, over‑engineered KStreams, and even Snowflake tasks pretending to be streams.

Real‑time isn’t a badge. It’s a promise that numbers are both fast and correct when a business decision is on the line. If Finance and Ops can’t bet the quarter on your pipeline, it isn’t real‑time—it’s expensive cosplay.

Non‑negotiables for real‑time truth

If you don’t write these down, you will violate them.

  • Latency SLO: P99 end‑to‑end (producer → dashboard) under 5s for ops decisions; under 60s for finance. Define the exact path and measure it.
  • Freshness: Data age on dashboard under 10s for ops; under 2m for finance rollups.
  • Completeness: 100% of events eventually processed; late data window explicitly defined (e.g., 15m watermark) with zero data loss by design.
  • Accuracy: Aggregations reconcile to source-of-truth within ±0.1% daily for finance; zero double‑counting in ops.
  • MTTR: Stream failures detected and mitigated in <10m with auto-fallbacks.

Write SLOs where the business can see them. Tie them to dashboards. Alert only on SLO burn, not every hiccup.

A reference architecture that doesn’t lie

You don’t need every CNCF logo. You need a small set that plays nicely under pressure.

  • Ingest (CDC): Debezium via Kafka Connect pulling from Postgres/MySQL/SQL Server. Avoid DIY binlog parsers.
  • Broker & Contracts: Kafka with Schema Registry using protobuf or avro and BACKWARD_TRANSITIVE compatibility.
  • Stream Processing: Apache Flink with checkpointing and exactly-once sinks; watermarking based on event time.
  • Quality & Lineage: In‑stream assertions + Great Expectations/Deequ at the sinks; lineage via OpenLineage/Marquez.
  • Storage & Serving:
    • Operational analytics: ClickHouse or Apache Pinot for sub‑second queries.
    • Finance & historical: Snowflake/BigQuery with Iceberg/Delta for versioned lake and replay.
    • Feature serving: Redis/Feast if you’re doing ML decisions inline.
  • Orchestration & Ops: Terraform for infra; ArgoCD for GitOps; Prometheus + Grafana + Alertmanager for signals.

If you can’t replay from durable history to the exact same results, you didn’t build a pipeline—you built a one‑off ETL script with a Slack channel.

Concrete bits that matter:

  • Kafka topics configured with min.insync.replicas=2, cleanup.policy=compact or delete as appropriate, and partition counts sized to consumer concurrency; producers with acks=all and idempotence enabled (producer sketch below).
  • Flink checkpoints to durable object storage (e.g., s3://company-flink/ckpts) every 5s, exactly-once sinks, and two-phase commit.
  • Idempotent keys for events (order_id, event_id) and dedupe logic at the sink.
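
For the producer side of that first bullet, here is a minimal sketch using the plain Java client; the broker address comes from the examples below, and StringSerializer is a stand‑in for the Avro/Protobuf Schema Registry serializer you would actually use:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

// Producer-side counterpart to the topic guardrails: acks=all pairs with
// min.insync.replicas=2, and idempotence removes duplicates introduced by retries.
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

KafkaProducer<String, String> producer = new KafkaProducer<>(props);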

Example: Debezium connector for the orders tables:

{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "orders-db",
    "database.port": "5432",
    "database.user": "cdc",
    "database.password": "***",
    "database.dbname": "orders",
    "table.include.list": "public.orders,public.order_items",
    "tombstones.on.delete": "false",
    "snapshot.mode": "initial_only",
    "topic.prefix": "cdc.orders",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changelog"
  }
}

Flink job sketch with exactly-once Kafka sink:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(3000);

KafkaSource<OrderEvent> source = ...; // cdc.orders topic
WatermarkStrategy<OrderEvent> wm = WatermarkStrategy
  .<OrderEvent>forBoundedOutOfOrderness(Duration.ofMinutes(15))
  .withTimestampAssigner((e, ts) -> e.eventTimeMillis);

DataStream<OrderEvent> events = env.fromSource(source, wm, "orders");

DataStream<OrderAgg> aggs = events
  .keyBy(e -> e.orderId)
  .process(new DedupAndValidate())
  .keyBy(e -> e.sku)
  .window(TumblingEventTimeWindows.of(Time.seconds(5)))
  .aggregate(new RevenueAgg());

KafkaSink<OrderAgg> sink = KafkaSink.<OrderAgg>builder()
  .setBootstrapServers("kafka:9092")
  .setRecordSerializer(...)
  .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
  .setTransactionalIdPrefix("orders-agg") // required for EXACTLY_ONCE (Kafka transactions)
  .build();

aggs.sinkTo(sink);
env.execute("orders-agg");
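
The DedupAndValidate step above carries the completeness and accuracy SLOs, so here is a minimal sketch of one way to implement it, assuming OrderEvent exposes eventId and totalAmount (the same hypothetical fields used in the guardrail later in this post) and the stream is keyed by orderId as a String:

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keyed by orderId upstream. Drops duplicate deliveries (same eventId already seen)
// and obviously invalid records before they can poison the windowed aggregate.
public class DedupAndValidate extends KeyedProcessFunction<String, OrderEvent, OrderEvent> {

  private transient MapState<String, Boolean> seenEventIds;

  @Override
  public void open(Configuration parameters) {
    seenEventIds = getRuntimeContext().getMapState(
        new MapStateDescriptor<>("seen-event-ids", String.class, Boolean.class));
  }

  @Override
  public void processElement(OrderEvent e, Context ctx, Collector<OrderEvent> out) throws Exception {
    if (e.eventId == null || e.totalAmount == null || e.totalAmount < 0) {
      return; // in the real job this goes to the DLQ, not the void
    }
    if (seenEventIds.contains(e.eventId)) {
      return; // at-least-once delivery upstream; drop the duplicate
    }
    seenEventIds.put(e.eventId, Boolean.TRUE);
    out.collect(e);
  }
}

In production, put a TTL on that keyed state (Flink's StateTtlConfig) so dedupe state doesn't grow without bound, and route rejects to the DLQ instead of dropping them silently.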

Create Kafka topics with guardrails:

kafka-topics.sh --create --topic cdc.orders --bootstrap-server kafka:9092 \
  --partitions 12 --replication-factor 3 \
  --config min.insync.replicas=2 --config cleanup.policy=delete --config retention.ms=259200000

kafka-topics.sh --create --topic curated.orders_agg_v1 --bootstrap-server kafka:9092 \
  --partitions 12 --replication-factor 3 \
  --config min.insync.replicas=2 --config cleanup.policy=compact

Quality gates, canaries, and circuit breakers

Quality is not a nightly job. It lives in the stream.

  • Data contracts in Git: Define payloads in protobuf or avro; enforce via Schema Registry with BACKWARD_TRANSITIVE compatibility. PRs that change schemas must include migration notes and consumer impact.
  • In‑stream assertions: Null‑rate, range checks, categorical domain checks. Fail closed when critical fields misbehave.
  • DLQs: Route rejects with full context to a -dlq topic; cap the rate so you don’t DDoS yourself.
  • Canary pipelines: Mirror topics (topic-canary) and run the same job against a small % of traffic. Promote only when checks pass.
  • Circuit breakers: If error budget burn rate > threshold, trip to a safe fallback (e.g., last known good aggregate); a minimal sketch follows the guardrail below.

Sample Great Expectations check for a curated table:

expectation_suite_name: orders_agg_suite
expectations:
  - expect_column_values_to_not_be_null:
      column: order_id
  - expect_column_values_to_be_between:
      column: total_amount
      min_value: 0
      max_value: 100000
  - expect_column_values_to_match_regex:
      column: currency
      regex: "^(USD|EUR|GBP)$"

Simple Flink guardrail for null‑blast:

if (event.totalAmount == null || event.totalAmount < 0) {
  metrics.counter("rejects").inc();
  dlqProducer.send(event.withReason("bad_total_amount"));
  return; // fail fast; don't poison aggregates
}
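
The circuit breaker itself doesn't need a framework; here is a minimal sketch of the trip logic, with thresholds and the wiring to your last-known-good fallback left to you:

// Trips when the reject ratio over the current evaluation window burns the quality error budget.
public class QualityCircuitBreaker {

  private final double maxRejectRatio; // e.g., 0.02 trips at a 2% reject rate
  private final long minEvents;        // don't trip on tiny samples
  private long accepted;
  private long rejected;
  private boolean open;                // open == serve the fallback, not fresh aggregates

  public QualityCircuitBreaker(double maxRejectRatio, long minEvents) {
    this.maxRejectRatio = maxRejectRatio;
    this.minEvents = minEvents;
  }

  public void record(boolean passed) {
    if (passed) accepted++; else rejected++;
    long total = accepted + rejected;
    if (!open && total >= minEvents && (double) rejected / total > maxRejectRatio) {
      open = true;
    }
  }

  public boolean isOpen() { return open; }

  // Call on each evaluation tick (e.g., every minute); closing the breaker after an
  // incident should be a deliberate decision, not an automatic reset.
  public void resetWindow() { accepted = 0; rejected = 0; }
}

Call record() next to the guardrail above; when isOpen() flips, stop publishing fresh aggregates, serve the last known good number, and page on SLO burn rather than letting a half-broken figure reach an exec dashboard.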

We’ve blocked countless 3 a.m. incidents with this trio: contracts, canaries, and circuit breakers. It’s boring in the best way.

Observability: measure what the business cares about

Infra metrics are table stakes. Tie telemetry to decisions.

  • End‑to‑end latency: Stamp ingest_ts at the producer and propagate to sinks; expose P50/P95/P99 in Grafana by stream and by dashboard (see the sketch after this list).
  • Freshness: Max event_time vs now per table; alert when staleness > SLO.
  • Consumer lag and watermark delay: Alert on sustained lag + growing watermark delay; don’t alert on momentary spikes.
  • Quality drift: Null‑rate, out‑of‑domain, schema violations. Alert on burn rate (e.g., 2x over 15m) not single blips.
  • KPI correlation: Track downstream KPI deltas (e.g., revenue vs previous hour) to spot breakage even when systems say “green.”
  • Lineage: OpenLineage/Marquez to show which dashboard depends on which topic/job/table. Crucial during incidents.
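
To make the end‑to‑end latency bullet concrete: a minimal sketch of measuring producer‑to‑job lag inside the Flink job, assuming events carry a hypothetical ingestTsMillis stamped by the producer. A gauge keeps the sketch short; in practice you would register a histogram to get P50/P95/P99.

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Gauge;

// Pass-through map that exposes the age of each event (producer stamp -> now)
// as a Flink metric, scraped by Prometheus and graphed per stream in Grafana.
public class IngestLagMeter extends RichMapFunction<OrderEvent, OrderEvent> {

  private long lastLagMs;

  @Override
  public void open(Configuration parameters) {
    getRuntimeContext().getMetricGroup().gauge("ingestLagMs", (Gauge<Long>) () -> lastLagMs);
  }

  @Override
  public OrderEvent map(OrderEvent e) {
    lastLagMs = System.currentTimeMillis() - e.ingestTsMillis; // ingestTsMillis: hypothetical producer stamp
    return e;
  }
}

Wire it in right after env.fromSource(...); the last hop to the dashboard is covered by the freshness check (max event_time vs now) in the serving store.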

A runbook that actually helps:

  1. Page on SLO burn (freshness/latency/completeness), not CPU.
  2. Check lineage to find the broken edge first.
  3. Inspect DLQ samples; if schema drift, pin producers via feature flag.
  4. Trip canary circuit breaker if quality error budget is burning.
  5. Initiate replay from Iceberg/Delta snapshot. Validate with reconciliation query.

We’ve consistently taken MTTR from hours to minutes by moving from infra‑only alerts to decision‑centric SLOs.

Governance, replay, and DR you’ll actually use

The day you need replay is not the day to discover you can’t.

  • Versioned storage: Land curated datasets to Iceberg/Delta with snapshot IDs. Tag snapshots with job commit IDs.
  • Deterministic jobs: Pure functions from input → output keyed by event IDs; no hidden wall‑clock dependencies (see the sketch after this list).
  • Replays: Store raw CDC in long‑retention Kafka + lake. Reprocess with a fixed container image/tag. Write to a new versioned table orders_agg_v2 and flip consumers.
  • Schema evolution: BACKWARD_TRANSITIVE means you can add fields, not remove or repurpose. Deprecate fields with a sunset date.
  • DR: Cross‑cluster replication with MirrorMaker 2 or Cluster Linking; periodic failovers. Replicate both topics and schemas.
  • Access controls: Kafka ACLs + IAM; only curated topics feed analytics. Producers cannot write directly to curated.
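
To make "deterministic" concrete, the cheapest habit is deriving output keys purely from event data and the event‑time window. A minimal sketch, with AggKey as a hypothetical helper and the sku/window names borrowed from the job earlier in this post:

// Deterministic upsert key: a pure function of the event data and the event-time
// window start -- never wall-clock time, UUIDs, or per-run hash seeds -- so replaying
// the same input through the same image produces the same keys and the sink
// overwrites instead of double-counting.
public final class AggKey {

  private AggKey() {}

  public static String of(String sku, long windowStartMillis) {
    return sku + "|" + windowStartMillis;
  }
}

Inside a ProcessWindowFunction the window start is available via context.window().getStart(); land rows keyed this way in an upsert‑capable table (ClickHouse ReplacingMergeTree, Iceberg MERGE) and a replay into orders_agg_v2 overwrites rather than double‑counts.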

Terraform a boring, repeatable baseline:

module "kafka" {
  source = "terraform-aws-modules/msk/aws"
  kafka_version = "3.6.0"
  number_of_broker_nodes = 6
  broker_node_client_subnets = var.private_subnets
}

resource "argocd_application" "orders_agg" {
  metadata { name = "orders-agg" }
  spec {
    source { repo_url = var.repo url = "git@github.com:company/data-pipelines.git" path = "flink/orders-agg" target_revision = "v1.4.2" }
    destination { server = "https://kubernetes.default.svc" namespace = "streams" }
    sync_policy { automated { prune = true self_heal = true } }
  }
}

Ship value in 90 days: a playbook we’ve used

Pick one decision loop. Ship it. Measure it. Then expand.

Days 0–15: Proof of correctness

  • Instrument source systems to emit stable IDs and timestamps. Stand up Debezium → Kafka with Schema Registry.
  • Define data contracts in Git; enable BACKWARD_TRANSITIVE compatibility.
  • Build a minimal Flink job with watermarking and exactly-once to produce a curated orders_agg_v1.
  • Stand up ClickHouse with a materialized view for the target dashboard.
  • Add Great Expectations checks and DLQ; wire Prometheus metrics for lag/freshness.

Days 16–45: Productionizing

  • GitOps the pipeline via ArgoCD; checkpoint Flink to S3; size partitions and consumers.
  • Add canary topics; wire circuit breakers to freeze on quality burn.
  • Build lineage with OpenLineage and a Grafana dashboard for SLOs.
  • Run a game‑day: inject schema drift and late events; practice replay.
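
For the late-event injection, a plain Kafka producer is enough. A sketch where the topic name and JSON shape are placeholders; match whatever envelope your job actually deserializes (for CDC topics that is the Debezium change-event format):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// Emits an event stamped 20 minutes in the past, beyond the 15m watermark,
// to verify it hits the late/DLQ path instead of silently vanishing.
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
  long lateTs = System.currentTimeMillis() - 20 * 60 * 1000;
  String payload = "{\"order_id\":\"gameday-1\",\"event_id\":\"gameday-1-e1\",\"total_amount\":42.0,\"event_time\":" + lateTs + "}";
  producer.send(new ProducerRecord<>("cdc.orders-canary", null, lateTs, "gameday-1", payload));
  producer.flush();
}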

Days 46–90: Business integration

  • Reconcile finance rollups daily; publish accuracy deltas.
  • Integrate with a real decision: inventory reallocation, discount throttling, or fraud review routing.
  • Set alerting to SLO burn only; define escalation policy with runbooks.
  • Document the “break glass” replay procedure and do a DR failover test.

Measured outcomes we’ve delivered with this approach:

  • End‑to‑end P99 latency from >60s to 3.2s under peak.
  • Freshness SLO adherence from 78% to 99.3%.
  • Reconciliation variance vs source from ±1.2% to ±0.08% for daily revenue.
  • MTTR from 2h+ to 11m median via lineage + DLQ triage.
  • A promotion decision loop that increased conversion +3.4% by reacting in near real‑time instead of waiting on batch.

If you want the engineering to matter, wire it to a dollar sign. Then scale the pattern to more decision loops. GitPlumbers has done this across retail, fintech, B2B SaaS, and marketplaces. The tech is repeatable; the domain logic is where you win.


Key takeaways

  • Design for truth, not just speed: define explicit data SLOs for latency, completeness, and accuracy.
  • Use CDC + schema contracts + exactly-once processing to stop silent data corruption.
  • Put quality gates and canaries in the stream so bad data can’t reach exec dashboards.
  • Instrument end-to-end with lineage and latency metrics tied to business KPIs, not just system health.
  • Make reprocessing first-class with versioned storage (Iceberg/Delta) and deterministic jobs.
  • Ship value in 90 days by scoping to one decision loop and measuring revenue, margin, or risk impact.

Implementation checklist

  • Write SLOs: P99 end-to-end latency, freshness window, completeness, and accuracy targets.
  • Adopt data contracts in Git with `protobuf`/`avro` and `BACKWARD_TRANSITIVE` compatibility.
  • Ingest with CDC (`Debezium`/`Kafka Connect`) and enforce idempotent keys.
  • Process with `Flink` using checkpoints and exactly-once sinks; checkpoint to durable storage.
  • Create DLQs and canary topics; block promotions on failed data quality tests.
  • Measure consumer lag, watermark delay, null-rate, schema violations, and downstream KPI drift.
  • Enable lineage via `OpenLineage`/`Marquez`; alert on broken dependencies.
  • Use `Iceberg`/`Delta` for versioned lake to reprocess and backfill deterministically.
  • Manage infra with `Terraform`; deploy jobs/operators via `ArgoCD` (GitOps).
  • Implement cross-cluster replication (`MirrorMaker 2`/Cluster Linking) for DR.
  • Stand up fast query stores (`ClickHouse`/`Pinot`) for operational analytics.
  • Run monthly game-days to rehearse replays, failovers, and schema evolution.

Questions we hear from teams

How do we choose between Flink, Spark Structured Streaming, and ksqlDB?
If you need low latency, exactly-once guarantees, and watermarking that scales, `Flink` is the safest bet today. `Spark Structured Streaming` is fine for higher-latency (seconds to minutes) and simpler semantics, especially if you’re already a Spark shop. `ksqlDB` works for stream-native SQL on Kafka but gets tricky with complex joins and late data. Pick the tool your team can operate at 3 a.m., not the shiniest one.
Can Snowflake/BigQuery alone support real-time?
They can do near‑real‑time with micro‑batching, but for sub‑second decisions and robust late‑data handling you still want a streaming tier (Kafka/Flink/Pinot or ClickHouse). We often land curated streams into Snowflake for finance while powering ops dashboards off Pinot/ClickHouse.
How do we handle schema evolution without breaking consumers?
Use `protobuf`/`avro` with `Schema Registry` enforcing `BACKWARD_TRANSITIVE`. Add fields; never repurpose or remove. Version topics (`orders_agg_v1` → `_v2`) for breaking changes, run both during migration, and flip consumers deliberately.
What’s the minimal set of metrics to alert on?
Alert only on SLO burn: end‑to‑end latency, freshness, and completeness. Add sustained consumer lag, watermark delay, and quality error burn rate. Everything else is dashboard‑only. This cuts noise and MTTR.
How does GitPlumbers engage on this?
We run an assessment to define your decision loops and SLOs, stand up a reference stack (Kafka/Flink/Schema Registry/ClickHouse/Snowflake), wire quality gates and observability, then ship one decision loop in 90 days. After that, we help your team own it via playbooks, game‑days, and handover.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.

Get a Real‑Time Readiness Assessment · Download the Real‑Time Pipeline Checklist
