Why use Change Data Capture | Batch Data vs Streaming Data

In this video, Michael Wang, a product manager at Cockroach Labs, explains when and why you should use change data capture, including a head-to-head comparison of batch data and streaming data. Here are some chapters so that you can jump to your areas of interest.

00:00 Two Use Cases for Change Data Capture
00:23 What is Change Data Capture?
1:43 How to Stream data from a database to a data warehouse
1:50 What is ETL (Extract Transform Load)?
2:25 The problems with batch loading data
3:38 Using Change Data Capture for data streaming
5:09 What is an Event-Driven Architecture?
6:00 Example of an event driven architecture
8:31 How to use the Outbox Pattern with Change Data Capture
10:54 Other Change Data Capture Use Cases

What is Change Data Capture (CDC)?
Change data capture is a set of technologies that let you identify and capture data that has changed in your database, so that you can act on that data at a later stage.

Use CDC For Streaming Data to Your Data Warehouse
Streaming data from your database into your data warehouse typically goes through a process called ETL or ELT. This traditional approach, unfortunately, has its flaws. If you aren't familiar with those terms, jump to that section of the video, and then keep watching to learn the shortcomings of ETL/ELT and how CDC is better.

Use CDC for Event-Driven Architectures
In event-driven architectures, one of the hardest things to accomplish is to safely and consistently deliver data across service boundaries. Typically, an individual service within an event-driven architecture needs to commit changes both to that service's local database and to a messaging queue, so that any messages or pieces of data meant for another service can be delivered. But this is challenging. What happens if your change commits to your database but the message never reaches the messaging queue? What happens if the message gets sent to the other services but the change doesn't actually commit in your database?
CDC can solve these problems.

Relevant Links:
Streaming data out of CockroachDB: https://www.cockroachlabs.com/docs/v20.2/stream-data-out-of-cockroachdb-using-changefeeds.html
Change Data Capture in CockroachDB: https://www.cockroachlabs.com/blog/change-data-capture/
Event-Driven Architecture Use Case: https://resources.cockroachlabs.com/case-study/zitadel
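
To make the outbox pattern from the chapter list more concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an illustration under assumptions, not the implementation shown in the video: the orders and outbox tables, the place_order and relay_outbox functions, and the event payload are all hypothetical names, and the polling relay only stands in for a real CDC feed (such as a CockroachDB changefeed) watching the outbox table. The idea is that the service writes its business row and its outgoing message in one local transaction, so the two can never disagree, and a separate process forwards committed outbox rows to the messaging queue.

```python
import json
import sqlite3


def place_order(conn, order_id, item):
    """Write the outgoing event and the business row in ONE local transaction."""
    with conn:  # commits both inserts together, or rolls both back on error
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "order_placed", "order_id": order_id, "item": item}),),
        )
        conn.execute(
            "INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item)
        )


def relay_outbox(conn, publish, last_seen_id=0):
    """Hypothetical stand-in for a CDC feed: forward outbox rows newer than last_seen_id."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE id > ? ORDER BY id", (last_seen_id,)
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # in a real system: send to the messaging queue
        last_seen_id = row_id
    return last_seen_id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
)

place_order(conn, 1, "book")
try:
    # Duplicate primary key: the orders insert fails, so the outbox row
    # written just before it is rolled back as well.
    place_order(conn, 1, "duplicate")
except sqlite3.IntegrityError:
    pass

delivered = []  # assumed in-memory substitute for a messaging queue
cursor = relay_outbox(conn, delivered.append)
```

Because the failed order rolls back its outbox row too, the relay only ever sees events for changes that actually committed. A real deployment would still need to handle redelivery on the consumer side, since a relay or CDC feed may emit the same row more than once after a restart.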