Friday, February 24, 2012

Design Question- Dynamic Configuration of Metadata in Dataflow

I have a design question that I'd like some input on. I am trying to archive data from an extremely large production database. The tables to be archived changes quite often. It is currently along the lines of 80-100 tables with the possibility (likelihood) to grow from there.

If at all possible, I'd like to avoid writing an individual dataflow transformation for each table. I know that SSIS does not offer the same capabilities to change the metadata at runtime as DTS. I am currently exploring the option of programmatically creating/modifying the packages through the .Net framework. (using this link as a guide: http://msdn2.microsoft.com/en-us/library/ms345167.aspx).

I have concerns about the performance of this approach and was wondering if anyone had any feedback, or has implemented something similar, or has any other ideas on a different way to accomplish the same thing.

Thanks so much for your help,

Jessica

JessicaElise wrote:

I have a design question that I'd like some input on. I am trying to archive data from an extremely large production database. The tables to be archived changes quite often. It is currently along the lines of 80-100 tables with the possibility (likelihood) to grow from there.

If at all possible, I'd like to avoid writing an individual dataflow transformation for each table. I know that SSIS does not offer the same capabilities to change the metadata at runtime as DTS. I am currently exploring the option of programmatically creating/modifying the packages through the .Net framework. (using this link as a guide: http://msdn2.microsoft.com/en-us/library/ms345167.aspx).

I have concerns about the performance of this approach and was wondering if anyone had any feedback, or has implemented something similar, or has any other ideas on a different way to accomplish the same thing.

Thanks so much for your help,

Jessica

Assuming that the package you build programatically is the same as the package that you build in the SSIS Designer then performance will be exactly the same. Generating the package using the API shouldn't take too long - depends how good a coder you are

To be honest alot of people have mentioned they are attempting this but nobody has reported that they have achieved it. But then again why would they? People come on here to report problems, not successes.

Does that help at all? Probbaly not!!

-Jamie

|||Programmatically creating packages for each table is the way to go unfortunately. Have a look at this link: http://www.ivolva.com/ssis_code_generator.html I haven't used it myself, but if it does what it says, it should be a big help in writing the code to dynamically generate packages.
|||

Well I'm reporting a success, Jamie. I got it working in a very base, proof-of-concept form. I've only tested it for very small datasets though. We'll see what happens when I increase the load (and run it multi-threaded). But, in case this can be of any use to anyone else:

I created a custom component that will create the a dynamic package based on an inputted source table, destination table and package name (source and destination connection strings are currently hardcoded, but I'm going to change that as I develop the component). I've been successful in using this component to create the package and then using the "Execute Package Task" to execute the package that was just created. Chuck that into a ForEach Loop that goes through a recordset containing a DestinationTable, SourceTable, and PackageName and modifies the expressions of the component and it works. Smile

Here's the meat-and-potatoes of the Custom Component I created.

Code Snippet

using System;

using System.Collections.Generic;

using System.Text;

using Microsoft.SqlServer.Dts.Runtime;

using System.IO;

using System.Data;

using Microsoft.SqlServer.Dts.Pipeline;

using Microsoft.SqlServer.Dts.Pipeline.Wrapper ;

namespace DynamicDataTransfer

{

[

DtsTask(

DisplayName = "Dynamic Data Transfer Task",

Description = "Dynamic Data Transfer.",

UITypeName = "DynamicDataTransfer.DynamicDataTransferTaskUI, DynamicDataTransfer, Version=1.0.0.0, Culture=Neutral, PublicKeyToken=xxxxxxxxxxxxxxxx"

)

]

public class DynamicDataTransfer : Task

{

private string SourceTableNameInternal = "";

public string SourceTableName

{

get

{

return this.SourceTableNameInternal;

}

set

{

this.SourceTableNameInternal = value;

}

}

private string PackageNameInternal = "";

public string PackageName

{

get

{

return this.PackageNameInternal;

}

set

{

this.PackageNameInternal = value;

}

}

private string DestinationTableNameInternal = "";

public string DestinationTableName

{

get

{

return this.DestinationTableNameInternal;

}

set

{

this.DestinationTableNameInternal = value;

}

}

/// <summary>

/// Executes the task

/// </summary>

public override DTSExecResult Execute(Connections connections, VariableDispenser variableDispenser, IDTSComponentEvents componentEvents, IDTSLogging log, object transaction)

{

try

{

Microsoft.SqlServer.Dts.Runtime.Application a = new Microsoft.SqlServer.Dts.Runtime.Application();

String SSISPackageFilePath;

SSISPackageFilePath = PackageNameInternal;

if (File.Exists(SSISPackageFilePath))

File.Delete(SSISPackageFilePath);

Package pkg = new Package();

MainPipe dataFlow;

ConnectionManager conMgrSource = pkg.Connections.Add("OLEDB");

ConnectionManager conMgrDestination = pkg.Connections.Add("OLEDB");

conMgrSource.Name = "OLEDBConnectionSource";

conMgrSource.ConnectionString = "Data Source=SOURCESERVER;Initial Catalog=SOURCEDB;Provider=SQLNCLI.1;Integrated Security=SSPI;Auto Translate=False;";

conMgrDestination.Name = "OLEDBConnectionDestination";

conMgrDestination.ConnectionString = "Data Source=DESTINATIONSERVER;Initial Catalog=DESTINATIONDB;Provider=SQLNCLI.1;Integrated Security=SSPI;Auto Translate=False;";

Executable exe = pkg.Executables.Add("DTS.Pipeline.1");

TaskHost th = exe as TaskHost;

th.Name = "DynamicDataFlowTask";

dataFlow = th.InnerObject as MainPipe;

IDTSComponentMetaDataCollection90 metadataCollection = dataFlow.ComponentMetaDataCollection;

IDTSComponentMetaData90 OLEDBSource = dataFlow.ComponentMetaDataCollection.New();

OLEDBSource.Name = "OLEDBSource";

OLEDBSource.ComponentClassID = "DTSAdapter.OLEDBSource.1";

IDTSComponentMetaData90 OLEDBDestination = dataFlow.ComponentMetaDataCollection.New();

OLEDBDestination.Name = "OLEDBDestination";

OLEDBDestination.ComponentClassID = "DTSAdapter.OLEDBDestination.1";

// Get the design time instance of the component.

CManagedComponentWrapper InstanceSource = OLEDBSource.Instantiate();

// Initialize the component

InstanceSource.ProvideComponentProperties();

// Specify the connection manager.

if (OLEDBSource.RuntimeConnectionCollection.Count > 0)

{

OLEDBSource.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.ToConnectionManager90(pkg.Connections["OLEDBConnectionSource"]);

OLEDBSource.RuntimeConnectionCollection[0].ConnectionManagerID = pkg.Connections["OLEDBConnectionSource"].ID;

}

InstanceSource.SetComponentProperty("OpenRowset", SourceTableNameInternal);

InstanceSource.SetComponentProperty("AccessMode", 0);

//reinitialize the component

InstanceSource.AcquireConnections(null);

InstanceSource.ReinitializeMetaData();

InstanceSource.ReleaseConnections();

// Get the design time instance of the component.

CManagedComponentWrapper InstanceDestination = OLEDBDestination.Instantiate();

// Initialize the component

InstanceDestination.ProvideComponentProperties();

// Specify the connection manager.

if (OLEDBDestination.RuntimeConnectionCollection.Count > 0)

{

OLEDBDestination.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.ToConnectionManager90(pkg.Connections["OLEDBConnectionDestination"]);

OLEDBDestination.RuntimeConnectionCollection[0].ConnectionManagerID = pkg.Connections["OLEDBConnectionDestination"].ID;

}

InstanceDestination.SetComponentProperty("OpenRowset", DestinationTableNameInternal);

InstanceDestination.SetComponentProperty("AccessMode", 0);

//reinitialize the component

InstanceDestination.AcquireConnections(null);

InstanceDestination.ReinitializeMetaData();

InstanceDestination.ReleaseConnections();

//map the columns

IDTSPath90 path = dataFlow.PathCollection.New();

path.AttachPathAndPropagateNotifications(OLEDBSource.OutputCollection[0], OLEDBDestination.InputCollection[0]);

IDTSInput90 input = OLEDBDestination.InputCollection[0];

IDTSVirtualInput90 vInput = input.GetVirtualInput();

foreach (IDTSVirtualInputColumn90 vColumn in vInput.VirtualInputColumnCollection)

{

IDTSInputColumn90 vCol = InstanceDestination.SetUsageType(input.ID, vInput, vColumn.LineageID, DTSUsageType.UT_READWRITE);

InstanceDestination.MapInputColumn(input.ID, vCol.ID, input.ExternalMetadataColumnCollection[vColumn.Name].ID);

}

a.SaveToXml(SSISPackageFilePath, pkg, null);

th = null;

pkg = null;

a = null;

OLEDBSource = null;

OLEDBDestination = null;

InstanceSource = null;

InstanceDestination = null;

return DTSExecResult.Success;

}

catch (Exception exc)

{

componentEvents.FireError(0, "DynamicDataTransfer.Execute", "Task Errored: " + exc.ToString(), "", -1);

return DTSExecResult.Failure;

}

}

}

}

(Just as a side-note, your blog has helped me a ton since I've started with SSIS. Thanks for posting all of your findings.)

Jess

|||

Hey Jess,

I'm impressed. This is the first time I've seen a working bit of code that does this. Do you mind if I put a link from my blog to this? I think this needs to be shared.

One thing, the correct nomenclature is a custom task, not a custom component. My god, I must be in a critical mood

Regards

-Jamie

|||Custom Task, got it. Sure thing, link to your blog, I'll be proud. Smile|||

JessicaElise wrote:

Custom Task, got it. Sure thing, link to your blog, I'll be proud.

Cool. Done!

Building Packages Programatically

http://blogs.conchango.com/jamiethomson/archive/2007/03/28/SSIS_3A00_-Building-Packages-Programatically.aspx

Thanks again Jess.

-Jamie

|||

Hi Jamie,

Since the Script task only support VB.NET, I'd like to know how the above code can be deployed in our SSIS Package

Are there any conversion tool available?

|||

Subhash512525 wrote:

Hi Jamie,

Since the Script task only support VB.NET, I'd like to know how the above code can be deployed in our SSIS Package

Are there any conversion tool available?

Yes, I'm sure there is. Not being a .Net expert myself I don't know where to find them but I'm sure they exist.

-Jamie

|||

Thanks for your reply Jamie,

I was successful implementing the above code in VB.NET code with slight modifications and additions required for my package using online code converter available here . I must thank you both a lot for the code.

Subhash Subramanyam

|||

Jess,

I was directed to your post by John Welch. I have a need very similar to yours, however, I am not a developer and I am having difficulties implementing your code snippet. I have VS Express for C# loaded and when I create a class library project I get the following error:

The type or namespace name 'Dts' does not exist in the namespace 'Microsoft.SqlServer' (are you missing an assembly reference?)

Can someone point me in the right direction? What other code elements are needed to make this "meat and potatoes" portion work?

Thanks in advance,

Keith

|||

Hi ,

I am also facing this problem, but I didn't understand where I have to write this code. If it is in script task then where is the main() .Can you please tell me or reply on this email address.(ramanandap@.gmail.com).

Thanks in advance

|||

KC wrote:

Jess,

I was directed to your post by John Welch. I have a need very similar to yours, however, I am not a developer and I am having difficulties implementing your code snippet. I have VS Express for C# loaded and when I create a class library project I get the following error:

The type or namespace name 'Dts' does not exist in the namespace 'Microsoft.SqlServer' (are you missing an assembly reference?)

Can someone point me in the right direction? What other code elements are needed to make this "meat and potatoes" portion work?

Thanks in advance,

Keith

You need to add a reference to the Microsoft.SQLServer.DTSRuntimeWrap assembly to your project. Use Project..Add Reference.

|||

Mahi12345 wrote:

Hi ,

I am also facing this problem, but I didn't understand where I have to write this code. If it is in script task then where is the main() .Can you please tell me or reply on this email address.(ramanandap@.gmail.com).

Thanks in advance

This was created as a custom task in VB.NET. That gets compiled to a .NET assembly (DLL), registered in the GAC, and then can be used from your SSIS packages. If you look under "Developing Custom Tasks" in Books Online, you can see more information about this.

No comments:

Post a Comment