Developing Data Engineering Solutions with Databricks
Data Solutions Architecture (image created by the author)

The goal of a data engineering (DE) solution is to provide the right stakeholders with the data they need, in the format they need, when they need it. I emphasise “solution” over “pipeline” because data processing code is just one part of a data engineering solution. In my opinion, coding transformation and processing logic is not the same thing as developing the overarching solution that ultimately delivers the value.

There is a lot of content on the internet on how to develop pipelines with Spark and Databricks. In this article, I take a broader view on general data engineering challenges in developing solutions and how to address them in Databricks.

Why We Need Environments and Tests

Data is not inherently valuable. It’s only valuable when it’s used by people. People only use data they trust and which provides the information they need. In cons...