Skip to main content

Cockroach Labs' Cost Based Optimizer Helps You Run Faster SQL Queries

Back in 2018 Cockroach Labs released the first iteration of their built-from-scratch, cost-based SQL optimizer. Since 2018 the Cost-Based Optimizer has strengthened its ability to speed up your queries or lower query costs. In this talk, our Chief Product Officer, Nate Stewart, discusses the importance of the Cost-Based Optimizer from the stage at ESCAPE/19 - the multi-cloud conference.

How can you make your developers as productive as possible? How can you respond to changing customer demand, changing customer requirements? And how can you do this in multi-region environments, in these distributed environments? Now, if you're familiar with NOSQL databases, you know that working with eventual consistency or working with incorrect data and consistent data, that puts a lot of burden on the developer. You have to do all these workarounds, all these retries, you have to figure out when is this system actually going to tie out? This is something we solved pretty early on with CockroachDB by supporting those distributed transactions. But we were still exposing too much complexity to the application developers. We still made them think too much about where in the world their data resides. 

And so the other big thing that I wanna talk about for our newest release CockroachDB 19.2, is how we're taking a lot of the heavy lifting, and we're removing it from the application layer and bringing it down into the database. And to do that, we've had to rely on several things.

But the one I wanna talk about is our Cost-Based Optimizer. So the way an optimizer works in a database is it takes your query, it looks at thousands or maybe even millions of ways to execute it, and then tries to pick the plan that has the lowest cost or the best performance. And if you look at a database like Oracle or Postgres, they use different techniques but they're all based around the structure of the data and the table. How many rows are in the data? What are the statistics around the data in a particular table?


But they're missing something critically important for multi-cloud deployments. They're missing something critically important for distributed deployments. And that's taking into account where in the world your data is? It's taking into account where and what regions that this data reside? And so what we've done is we've re-imagined the Cost-Based Optimizer. To take into account the location of the data and use that to come up with better plans and use that to come up with better ways to execute your queries. So developers don't have to spend as much time focusing on, 'Oh is my data in this data center or that data center?' The optimizer can figure it out and the developer can focus on building great applications.


If you're interested in learning more about Cockroach Labs or if you'd like to work at Cockroach Labs you can find the appropriate link below: Careers: CockroachCloud: CockroachDB: Blog: Docs: Cockroach University: Community Slack: