Issue
At several customer sites, we migrate existing BI platforms from Azure services such as Azure Synapse, Azure Data Factory, or Azure Databricks to Fabric. One key step in the migration process is transferring Azure Data Factory pipelines to Fabric Data Factory.
In some cases, we work with a large number of small source tables (e.g., from an Azure SQL Database). After the migration, I reviewed the Fabric Capacity Metrics report and was surprised to see that a single execution of the daily load process consumed nearly 30% of the available capacity on an F8 instance. The majority of this usage was attributed to pipeline operations.
Given the size of the capacity, I initially believed that an F8 instance would be more than sufficient for the customer, considering the relatively small amount of data and the complexity of the calculations. So, why was the capacity usage so high?
Test Environment Setup
To investigate, I set up a Fabric pipeline with a Copy Data activity that loads 12 tables from a test database into Parquet files. The Copy Data activity runs inside a ForEach loop. The goal was to explore ways to optimize the Capacity Unit (CU) usage of Copy Data activities.
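For reference, the test pipeline is essentially a single ForEach over the table list with one Copy Data activity inside. The sketch below mirrors an ADF-style pipeline definition as a Python dict; the names and the exact property layout are illustrative, not an export of the actual pipeline.

```python
# Abridged, illustrative sketch of the test pipeline (names are hypothetical;
# the JSON that Fabric actually exports may differ in detail).
test_pipeline = {
    "name": "PL_Load_TestTables",
    "activities": [
        {
            "name": "ForEachTable",
            "type": "ForEach",
            "typeProperties": {
                "items": {"value": "@pipeline().parameters.tableList", "type": "Expression"},
                "isSequential": False,   # run iterations in parallel
                "batchCount": 6,         # varied in the tests described below
                "activities": [
                    {
                        "name": "CopyTableToParquet",
                        "type": "Copy",
                        "typeProperties": {
                            "source": {"type": "AzureSqlSource"},
                            "sink": {"type": "ParquetSink"},
                            # intelligent throughput optimization: "Auto" vs. "Max" in the tests
                        },
                    }
                ],
            },
        }
    ],
}
```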
What does Microsoft say?
According to the pricing page for Fabric pipelines, the following statement is provided for “Data Movement” tasks (Copy Data activity):
“Data Movement service for Copy activity runs. You are charged based on the Capacity Units consumed during the Copy activity execution duration.”

In the pricing breakdown for how “Data Movement” is charged, Microsoft states:

“Metrics are based on the Copy activity run duration (in hours) and the intelligent optimization throughput resources used.”
Source: Pricing for data pipelines – Microsoft Fabric | Microsoft Learn, 18.12.2024

But what exactly is “intelligent optimization”? According to Microsoft’s “Copy Activity Performance and Scalability Guide”, several factors are involved, such as parallel copy for partitioned sources and intelligent throughput optimization.
Source: Copy activity performance and scalability guide – Microsoft Fabric | Microsoft Learn
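To make that metering description concrete, here is the formula as I read it: CU hours = run duration in hours × intelligent optimization throughput resources used × 1.5, with the Capacity Metrics App then reporting the result as CU seconds. The 1.5 CU per hour rate also appears in the forum quote further below; treating the Metrics App value as this figure converted to seconds is my own assumption.

```python
def expected_cu_seconds(duration_s: float, ito_resources: int) -> float:
    """Expected charge for one Copy activity run, under the reading that
    CU hours = duration (h) x throughput resources x 1.5, reported as CU seconds."""
    cu_hours = (duration_s / 3600) * ito_resources * 1.5
    return cu_hours * 3600  # the Metrics App shows CU (s)


# Example: a 30-second copy using 4 throughput resources
print(expected_cu_seconds(30, 4))  # 180.0 CU (s)
```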
To investigate further, I ran three tests with different settings: comparing the intelligent throughput optimization (ITO) option “Max” against “Auto”, and adjusting the batch count in the ForEach loop to 6. The results showed that the batch count significantly affects the execution duration, while the ITO setting has little to no effect.
Now, let’s turn our attention to the Fabric Metrics App to examine the consumed CUs. What insights does it reveal about the resource usage?
At first glance, all pipelines appear to be charged the same. However, by examining the details more closely, we can see how many CUs each individual activity consumes. According to Microsoft’s pricing calculation, the duration of the operations is a key factor in determining the cost.
Source: https://learn.microsoft.com/en-us/fabric/data-factory/pricing-pipelines
This suggests that the duration should directly impact the CU calculation and costs. However, when we examine the individual operations, they all consume 360 CUs, regardless of the runtime.
This was quite unexpected.
Referring to a statement from a forum post, this is also what I assumed to be the basis for the calculation:
In my eyes:
- 1.5 CU per hour gives 0,0004166 CU per second.
- Say 30 s duration. 30 * 0,0004166 = 0,0125.
- Now how many intelligent optimization throughput resources are used? Was set to auto, so unclear.
But even assuming a maximum of 256, we only get 256 * 0,0125 = 3,2 CU (s). Far from listed 360!

Source: Solved: Minimum CU (s) billing per copy? Or am I just bad … – Microsoft Fabric Community
Let’s take a look at the real-life scenario at the customer mentioned at the start of this post. When we examine the correlation between the duration of the operation and the CUs consumed, we find that nearly all data movement operations are consuming 360 CUs!
In fact, 99% of the operations at the customer result in 360 CUs.
When I look at the duration, it’s clear that the operations with higher CUs are generally the “long-running” ones as well, but there are only a few of them.
Here, we observe another interesting pattern: It appears that the CUs are calculated in 360-unit increments. This could potentially be linked to a time calculation in seconds, perhaps something like ((60 * 60) / 10)?
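Working backwards with the formula sketched earlier gives another candidate explanation (purely my assumption, not confirmed by Microsoft): if each copy is billed for at least one minute and “Auto” assigns a minimum of 4 throughput resources, the floor comes out at exactly 360 CU (s).

```python
# 360 CU (s) = 0.1 CU hours. Reversing the assumed formula
# (CU hours = duration_h * throughput_resources * 1.5):
billed_hours = (360 / 3600) / (4 * 1.5)   # 0.01666... h
print(billed_hours * 3600)                # 60.0 seconds, i.e. a one-minute floor

# Forward check: 1.5 * 4 resources * 60 s = 360 CU (s)
```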
Conclusion
Based on the findings, it appears that Microsoft’s pricing for data copy activities within Fabric pipelines may not accurately reflect the true consumption based on task duration.
It seems that very small copy tasks are rounded up to at least one minute of usage, leading to an inflated cost. This rounding effect significantly impacts customers with a large number of small objects, as even though the data is minimal, the usage calculation results in high consumption of Capacity Units (CUs).
The implication is that optimizing individual tasks may not have as much impact on billing as expected, while reducing the number of tasks to be processed could have a more substantial effect on overall costs. This discrepancy in how CUs are calculated warrants further clarification from Microsoft, particularly for scenarios involving many small data movements.
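A back-of-envelope comparison illustrates the point. All numbers here are hypothetical and assume the 360 CU (s) floor per copy observed above, plus the formula sketched earlier for longer-running consolidated copies.

```python
# Hypothetical scenario: 100 small tables loaded individually vs. consolidated jobs.
small_copies = 100
floor_per_copy = 360                                      # CU (s) observed per copy run
many_small = small_copies * floor_per_copy                # 36,000 CU (s)

consolidated_runs = 10                                    # e.g. 10 tables per copy
run_seconds = 120                                         # assumed runtime per consolidated run
consolidated = consolidated_runs * 1.5 * 4 * run_seconds  # 7,200 CU (s)

print(many_small, consolidated)                           # 36000 7200
```

Under these assumptions, consolidating the loads cuts the data movement consumption by a factor of five, whereas shaving a few seconds off each small copy changes nothing as long as the floor applies.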
Be cautious when working with Data Factory copy activities, especially during the migration of pipelines from Azure Data Factory to Fabric Data Factory. The way usage and costs are calculated differs significantly between the two platforms!
PS: This is a re-written post based on my contribution in Microsoft’s Data Factory forums: Is the pricing of Fabric pipeline data copy activi… – Microsoft Fabric Community