What is dbt, and why is it useful?

Mats Lunde, Nov 19, 2022

What is dbt (data build tool) and why are all the data engineers talking about it?

dbt lets data teams adopt software engineering best practices, such as modular code and automated testing, in their data engineering work. That in turn helps them build reliable, trusted data pipelines. Today I’d like to highlight three areas where dbt shines.

Simplifying pipelines

dbt lets analysts and engineers build their transformations in SQL, without needing to branch out to other programming languages. It also encourages smaller, more modular code. Instead of repeating business logic in several places, where inconsistencies can creep in, you write the logic once and reuse that component wherever it is needed.
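To make the "write it once, reuse it" idea concrete, here is a minimal sketch. It is not dbt itself: in dbt the shared piece would be a model that downstream models pull in with `ref()`, while this example only imitates the pattern with a plain SQL view in an in-memory SQLite database. The table and column names (`raw_orders`, `valid_orders`, `status`) are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (
        order_id INTEGER, customer_id INTEGER, amount REAL, status TEXT
    );
    INSERT INTO raw_orders VALUES
        (1, 10, 100.0, 'completed'),
        (2, 10,  50.0, 'cancelled'),
        (3, 20,  75.0, 'completed');

    -- The business rule "only completed orders count" lives in one place.
    CREATE VIEW valid_orders AS
        SELECT order_id, customer_id, amount
        FROM raw_orders
        WHERE status = 'completed';
""")

# Two downstream queries reuse the same component instead of
# each repeating the WHERE clause.
revenue_by_customer = conn.execute(
    "SELECT customer_id, SUM(amount) FROM valid_orders "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()
total_revenue = conn.execute(
    "SELECT SUM(amount) FROM valid_orders"
).fetchone()[0]

print(revenue_by_customer)
print(total_revenue)
```

If the definition of a valid order changes, only the view needs to change, and both downstream queries pick up the new rule.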

Automated testing

Together with more modular code, dbt also brings continuous integration to the data pipeline. Instead of committing the entire pipeline for every change, you can test, push, and monitor components individually. That makes it much faster to discover potential bugs and to fix them.
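A common way to test a single component is to express each check as a query that returns the offending rows, so that a test passes when the query comes back empty. dbt's data tests follow this convention; the sketch below only imitates it in plain Python with SQLite. The `customers` table, the test names, and the `failing_rows` helper are all hypothetical.

```python
import sqlite3

def failing_rows(conn, sql):
    """Run a test query; any rows it returns are treated as failures."""
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, email TEXT);
    INSERT INTO customers VALUES
        (1, 'a@example.com'),
        (2, NULL),
        (2, 'c@example.com');
""")

# Each test selects the rows that violate an expectation.
tests = {
    "customer_id_is_unique":
        "SELECT customer_id FROM customers "
        "GROUP BY customer_id HAVING COUNT(*) > 1",
    "customer_id_is_not_null":
        "SELECT customer_id FROM customers WHERE customer_id IS NULL",
    "email_is_not_null":
        "SELECT customer_id FROM customers WHERE email IS NULL",
}

results = {name: failing_rows(conn, sql) for name, sql in tests.items()}

for name, rows in results.items():
    status = "PASS" if not rows else f"FAIL ({len(rows)} failing rows)"
    print(f"{name}: {status}")
```

Because each check targets one table, a CI job can run only the checks for the components a change touches rather than re-validating everything.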

Data alerts

dbt can also act as an orchestrator with integrated monitoring, most directly through the job scheduler in dbt Cloud. In other words, you can put your workflows on a schedule and set up alerts that notify you when data goes stale.
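A stale-data alert boils down to comparing the most recent load timestamp against one or two age thresholds. dbt's source freshness checks use a similar warn/error pattern; the sketch below is a plain-Python illustration of that logic, not dbt's implementation. The function name `freshness_status` and the threshold values are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

def freshness_status(latest_loaded_at, now, warn_after, error_after):
    """Classify data as 'pass', 'warn', or 'error' based on its age."""
    age = now - latest_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"

# Fixed "current time" so the example is reproducible.
now = datetime(2022, 11, 19, 12, 0, tzinfo=timezone.utc)
warn_after = timedelta(hours=12)   # hypothetical warning threshold
error_after = timedelta(hours=24)  # hypothetical error threshold

# Hypothetical sources with the time of their latest loaded record.
last_loads = {
    "orders": now - timedelta(hours=2),
    "payments": now - timedelta(hours=18),
    "customers": now - timedelta(days=3),
}

statuses = {
    source: freshness_status(loaded_at, now, warn_after, error_after)
    for source, loaded_at in last_loads.items()
}

for source, status in statuses.items():
    print(f"{source}: {status}")
```

A scheduler would run a check like this at a fixed interval and send a notification for every source that is not in the `pass` state.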

Conclusion

In summary, dbt simplifies data pipelines by bringing software engineering best practices to them: modular, reusable SQL, automated testing, and easy orchestration with alerts on stale data.