I am transferring data from one database to another directly with some minor changes.
But in one step I need to duplicate a row (if a column value satisfies a condition). How can I achieve this without using a Multicast followed by a Conditional Split (which duplicates all rows first and only then applies the condition)? By the way, the reason I don't want to use Multicast is that the table I am processing has about 20 columns and 20 M rows :(
Thanks in advance!
You could use an asynchronous Script Transform. In an asynchronous transform you are responsible for reading the input buffer rows and adding them to the output buffer, so you can do this conditionally, one or more times for each input row. I find them a bit painful, as you have to define the output columns in the output buffer by hand, which can be tedious.
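To make the row-level logic concrete, here is a rough sketch in plain Python. It is not SSIS code, only a model of what the per-row handler in an asynchronous Script Component would do: emit each input row once, and emit it again when the condition holds. The function name, the `status` column and the condition are all made up for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]


def duplicate_matching_rows(
    rows: Iterable[Row],
    should_duplicate: Callable[[Row], bool],
    copies: int = 2,
) -> Iterator[Row]:
    """Yield every input row once; rows that satisfy the condition
    are yielded `copies` times in total (hypothetical helper)."""
    for row in rows:
        # Always pass the row through to the "output buffer".
        yield dict(row)
        if should_duplicate(row):
            # Add the extra copies only for rows that match.
            for _ in range(copies - 1):
                yield dict(row)


# Example input: only the row with status "B" should be duplicated.
source = [
    {"id": 1, "status": "A"},
    {"id": 2, "status": "B"},
    {"id": 3, "status": "A"},
]
result = list(duplicate_matching_rows(source, lambda r: r["status"] == "B"))
```

Because rows are handled one at a time, only the matching rows are ever copied, which is the point of avoiding a blanket duplication of all 20 M rows. In the actual component you would do the same thing by adding a row to the output buffer once or twice inside the per-row input handler.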
Perhaps you should reconsider your reluctance to use the Multicast. It does not immediately copy all the data and consume twice the memory. Data is only duplicated as and when required; up until that point it uses pointer-like behaviour to reference the existing data. The cost only comes when you force a change in the buffer, so I would think it is not as costly as you might first expect. Why not try both methods on a smaller (narrower) dataset? That would make it faster to develop a test case.
Creating an Asynchronous Transformation with the Script Component
(http://msdn2.microsoft.com/en-us/library/0d814404-21e4-4a68-894c-96fa47ab25ae.aspx)