Azure Data Factory — Copy Dynamic Filenames In Mapping Data Flow
This article explains how to store dynamic filenames with ADF’s Mapping Data Flow.
As a Data Engineer, I find that capturing filenames is as important as ensuring data quality. Numerous projects I have worked on required me to store the filename in a separate attribute so that I could trace which file supplied a particular set of data.
When you build a pipeline in Azure Data Factory (ADF), filenames can be captured either through (1) the Copy Activity or (2) a Mapping Data Flow.
For this article, I will use the Mapping Data Flow activity.
Task:
A set of Excel files with different names is uploaded to Azure Blob Storage. The files share the same structure but belong to different months. Establish a data pipeline that runs daily, reads data from the Excel files, and loads it into an Azure SQL database along with the respective filenames. (A minimal sketch of the equivalent logic follows the prerequisites below.)
Prerequisites:
1. Access to Azure Blob Storage
2. Access to Azure Data Factory
3. Linked Service to Azure Blob Storage as Source is established
4. Linked Service to Azure SQL as Sink is established
5. Azure SQL table has an attribute called filenm
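To make the intent of the pipeline concrete, here is a minimal Python sketch of the same logic outside ADF: list the Excel blobs, read each one, tag its rows with the source filename in the filenm column, and append them to the Azure SQL table. This is only an illustration of what the Mapping Data Flow will do, not part of the pipeline itself; the connection strings, container name, and table name (SalesData) are placeholders you would replace with your own.

```python
# Minimal sketch of the pipeline's logic (not part of ADF itself).
# All connection strings and names below are placeholders.
import io

import pandas as pd
from azure.storage.blob import BlobServiceClient
from sqlalchemy import create_engine

BLOB_CONN_STR = "<blob-storage-connection-string>"        # placeholder
CONTAINER = "<container-with-excel-files>"                # placeholder
SQL_CONN_STR = "<sqlalchemy-connection-string-to-azure-sql>"  # placeholder

container = BlobServiceClient.from_connection_string(BLOB_CONN_STR).get_container_client(CONTAINER)
engine = create_engine(SQL_CONN_STR)

for blob in container.list_blobs():
    if not blob.name.lower().endswith((".xlsx", ".xls")):
        continue  # skip anything that is not an Excel file
    data = container.download_blob(blob.name).readall()
    df = pd.read_excel(io.BytesIO(data))  # requires openpyxl for .xlsx files
    df["filenm"] = blob.name              # capture the dynamic filename per row
    # Append to the target table; the ADF sink performs the equivalent insert
    df.to_sql("SalesData", engine, if_exists="append", index=False)
```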