Why Not to Use Sort and Aggregate Transformations in SSIS

I often come across questions about SSIS package optimization techniques. There are many, and the right one depends on the scenario. However, there are a few fundamental principles for architecting SSIS packages that help us design packages which run faster. In this post I am going to talk about two transformations that I always avoid: the Sort transformation and the Aggregate transformation.

You can group all the transformations into two broad categories: Synchronous and Asynchronous. Or you may divide them into three types: Non-Blocking, Semi-Blocking and Blocking. This is not the right place to discuss every category and type, so I will cover only Blocking transformations here. Just for your reference, Non-Blocking transformations belong to the Synchronous category, while Semi-Blocking and Blocking transformations belong to the Asynchronous category.

A Blocking transformation first accepts its entire input, i.e. all rows from the source, then performs its operation, and only then produces output. Nothing downstream receives a single row until the transformation has collected every row it needs, and all of those rows have to be held in the meantime. This is where it degrades Data Flow performance and, in turn, slows down the package.
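To make the difference concrete, here is a small Python sketch. It is only an analogy built on generators, not SSIS code, and the function names (`source`, `add_one`, `blocking_sort`) are made up for illustration. The log records the order in which rows are read and emitted.

```python
from typing import Iterable, Iterator, List


def source(rows: Iterable[int], log: List[str]) -> Iterator[int]:
    """Hypothetical source: hands out one row at a time and logs each read."""
    for row in rows:
        log.append(f"read {row}")
        yield row


def add_one(rows: Iterable[int], log: List[str]) -> Iterator[int]:
    """Non-blocking analogy: each row is emitted as soon as it arrives."""
    for row in rows:
        log.append(f"emit {row + 1}")
        yield row + 1


def blocking_sort(rows: Iterable[int], log: List[str]) -> Iterator[int]:
    """Blocking analogy: sorted() must consume the whole input first."""
    buffered = sorted(rows)  # every row is read and held in memory here
    for row in buffered:
        log.append(f"emit {row}")
        yield row


def run_non_blocking(data: List[int]) -> List[str]:
    log: List[str] = []
    list(add_one(source(data, log), log))  # drain the pipeline
    return log


def run_blocking(data: List[int]) -> List[str]:
    log: List[str] = []
    list(blocking_sort(source(data, log), log))  # drain the pipeline
    return log
```

In the non-blocking run the reads and emits interleave, so downstream work starts right away. In the blocking run every read happens before the first emit, which is the waiting described above.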

Both the Sort and Aggregate transformations are of the Blocking type. This is the reason I always recommend avoiding them altogether in your pipeline. Then the question arises: how do you sort the rows and, if required, get the aggregates?

There are ways! If you are using an OLE DB Source with SQL Server as the source, set the data access mode to a SQL command instead of picking the table name from the drop-down list, and perform these operations in the query with ORDER BY and GROUP BY. The operations then run on the database engine, which can use its indexes and usually returns the result much faster than the pipeline could compute it. In the case of a flat file or raw file, ask the team that produces the file to calculate these values and put them either in another file or in the first one or two lines of the source file. When you only need the data rows, the Flat File Connection Manager has a "Header rows to skip" setting that lets you skip those initial lines. This is how you can avoid Blocking transformations like Sort and Aggregate and, in turn, achieve better package performance.
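As a rough sketch of the first approach, the Python snippet below runs the sort and the aggregate inside the query. It uses an in-memory SQLite database purely as a stand-in for SQL Server, and the `orders` table and its columns are hypothetical. In a real package, a SELECT statement of this shape is what you would place in the OLE DB Source as the SQL command.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory table (hypothetical schema, SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "zeta", 10), (2, "alpha", 5), (3, "zeta", 7), (4, "alpha", 20)],
    )
    return conn


def load_sorted_and_aggregated(conn: sqlite3.Connection):
    """Let the database engine do the sorting and the aggregation."""
    # Replaces a Sort transformation: rows arrive already ordered.
    sorted_rows = conn.execute(
        "SELECT order_id, customer, amount "
        "FROM orders ORDER BY customer, order_id"
    ).fetchall()
    # Replaces an Aggregate transformation: one row per customer.
    totals = conn.execute(
        "SELECT customer, SUM(amount), COUNT(*) "
        "FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()
    return sorted_rows, totals
```

Calling `load_sorted_and_aggregated(make_demo_db())` returns the rows already ordered by customer, plus one total per customer, so nothing in the data flow has to buffer the full input.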

I will be blogging about other techniques to optimize SSIS packages. So, stay tuned!

 
