Advertisment

SQL Server Integration Services

author-image
PCQ Bureau
New Update

Last month we took a look at the new SQL Server Integration Services (SSIS) component that is part of SQL Server 2005 and replaces the old DTS system. In this we saw that SSIS packages can be created in the Business Intelligence Development Studio (BIDS) and can be done using a design view where you can design the entire process. Each package consists of a Control Flow – a workflow like component that decides which steps are to be executed using tasks; and Data Flows – which do the actual data transformations. We take a deeper look into the latter component in this article.

Advertisment

Direct Hit!

Applies To: Database managers



USP: Build high performance data integration solutions


Primary Link:



http://msdn2.microsoft.com/enus/library/ms141026.aspx




Google Keywords: SSIS

One of the tasks in a Control Flow is the Data Flow Task. When you add this task onto the design surface of the Control Flow, you can change the properties of the new task by right-clicking it and selecting ‘Edit.’ When you do that, instead of the normal dialog box that comes up for all other tasks, you are instead taken to another design surface called the Data Flow Design area. This is where you can design what happens in that particular data flow.

In fact, the design surface also gives rise to a completely different set of tools in the toolbox. The tools you can use on this surface are divided into three distinct areas—Data Sources, Data Transformations and Data Destinations. If you notice, these three categories correspond to the Extract, Transform and Load processes of the ETL.

Advertisment

Each Data Flow task must have at least one of each connected to each other by Service Paths. That is, in every data flow there needs to be a Data Source from which data is Extracted, a data transformation which performs the Transformation and a data destination into which the transformed data is Loaded.

In the Flat File

Connection Manager Editor, you can configure the flat file source to read a

comma delimited text file
In the Sort

Transformation Editor, you can configure the sort transformation to sort on

a few columns

So let’s quickly create a simple data flow task that picks up raw data in a comma delimited text file and converts it to an Excel format.

Advertisment

The text file has the file row as headers and is comma delimited. So we can drop a Flat file data source onto the design view and configure it to read from the text file. Make sure the settings for the header and the delimiter are correct and look at a preview to see if the data is being picked up correctly.

Once this is done drop in a Sort transformation onto the page and drag the green service path arrow from the Flat file source to this. Edit the properties of the sort transformation and check the columns on which you wish to sort.

You can then also add a derived column transformation and drag the green arrow from the Sort to this. In the derived column you can derive a new column or replace an existing one using any valid data expression. For instance you can combine two columns into one such as FirstName and LastName into a Full Name column.

Advertisment
Adding a new derived

column called Full Name, that uses First Name and Last Name
In the Excel Destination

Editor, you can add an Excel destination and name the sheet of that file

Finally drop in an Excel file destination and drag the green arrow from the derived column to this. Open the properties of this component and enter the name of the Excel file, the sheet into which the transformed data should be sent to and the column mappings to the sheet.

Once all this is done, save the package and run it. You should now end up with a new Excel file in which the data from the text file is loaded after sorting and creating a new column.

Advertisment

This was of course a very simple example. The Data transformation tasks allow you to perform many different activities. For instance, you can have two different sources – say, one from a text file and another from an Oracle database. You can perform a merge join on the sorted data from them and load them into a SQL Server database. You can perform a number of complex transformations on the data before sending it off to the final destination.

Mapping columns to send

to Excel. Note that First Name and Last Name are ignored. Instead new Full

Name column is used
The completed Data Flow

Task. Notice the items in the toolbox that are used for ETL

All in all, SSIS is a tool that allows you perform extremely powerful data transformations from almost any data format to any other data format. It is an enterprise class system that you can use to consolidate data into a data warehouse after ensuring that this data has been thoroughly cleaned by your data transformation business rules that you have set up using SSIS.

Advertisment