What Are The Qualities Of a Smart Azure Data Factory Pipeline?
This article describes my experiences and opinions about the useful qualities that should be built into an ADF pipeline.
Over many years of building data pipelines in Azure Data Factory (ADF from hereon) and improving pipelines built by ex-employees, I have noticed a set of commonalities that separate pipelines built by experienced data engineers (DEs from hereon) from those built by junior DEs.
So What Do Those Look Like?
👉 Dynamic Datasets
Dynamic datasets are source/sink datasets that can be reused across different pipelines by passing in custom values.
A smart DE identifies patterns in the data sources and the sinks they need to be written to and looks to build a dynamic dataset for those.
Above is an example of a dynamic Excel dataset (a rough JSON sketch of such a dataset follows this list). Different pipelines can now reuse the same dataset by providing their own values for
* root container
* directory on ADLSG2
* Excel sheet name to read the data from
* Excel cells range where the…
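
To make the idea concrete, below is a minimal sketch of what such a parameterized Excel dataset definition can look like in ADF's JSON. The linked service name (ls_adls_gen2), the parameter names, and the extra fileName parameter are illustrative assumptions rather than the exact dataset shown in the screenshot; each consuming pipeline supplies its own values through @dataset() expressions.

```json
{
    "name": "DynamicExcelDataset",
    "properties": {
        "linkedServiceName": {
            "referenceName": "ls_adls_gen2",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "containerName": { "type": "string" },
            "directoryPath": { "type": "string" },
            "fileName": { "type": "string" },
            "sheetName": { "type": "string" },
            "cellRange": { "type": "string" }
        },
        "type": "Excel",
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": { "value": "@dataset().containerName", "type": "Expression" },
                "folderPath": { "value": "@dataset().directoryPath", "type": "Expression" },
                "fileName": { "value": "@dataset().fileName", "type": "Expression" }
            },
            "sheetName": { "value": "@dataset().sheetName", "type": "Expression" },
            "range": { "value": "@dataset().cellRange", "type": "Expression" },
            "firstRowAsHeader": true
        },
        "schema": []
    }
}
```

A copy activity that references this dataset then only has to fill in the parameter values on its source (or sink) settings, so one dataset definition can serve every pipeline that reads Excel files from ADLS Gen2.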