Thursday, March 22, 2012

Flat file Import performance question

Hi there,

I have a question regarding two approaches to importing data from a flat text file. I'm taking the approach suggested by Phil Brammer of using a checksum to see whether a row already exists, and if it does, whether it has changed.

http://www.ssistalk.com/2007/03/09/ssis-using-a-checksum-to-determine-if-a-row-has-changed
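The checksum idea from that link can be sketched roughly as follows (a minimal Python sketch, not the SSIS implementation itself; the helper names, the CRC32 choice, and the key scheme are all assumptions for illustration — in SSIS this would typically be a script component or the T-SQL CHECKSUM function):

```python
import zlib

def row_checksum(row):
    """Checksum over a row's column values (hypothetical helper)."""
    joined = "|".join(str(v) for v in row)
    return zlib.crc32(joined.encode("utf-8"))

# Stored checksums keyed by business key, standing in for the lookup table.
existing = {1: row_checksum(("Mike", "Oslo"))}

def classify(key, row):
    """Decide what to do with an incoming row: insert, update, or skip."""
    cs = row_checksum(row)
    if key not in existing:
        return "insert"
    return "unchanged" if existing[key] == cs else "update"
```

New keys come back as inserts, known keys with a matching checksum are skipped, and known keys with a differing checksum become updates.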

I would like to hear some opinions on two approaches:

1) Importing the data into a temp table first, then doing the data processing from the temp table.
Or
2) Importing the data by doing the data processing directly from the text file, without placing it in a temp table first.

A hunch tells me that approach 1 will be faster, but I would like to hear some experienced opinions first. :)

Best regards
Mike

I would think #2 would be *slightly* faster just because there are fewer components at work.

With that said, I always prefer #1, because it quickly loads the data into a table from which I can then use SQL to get the data out. This can be advantageous for sorting the data, filtering it, joining it to another table, etc. Basically, I like to push as much work off to the database engine as I can. Sure, SSIS can merge (join) data sets together, but the database engine will do a much better job. Also, having the data staged helps me restart the process should it fail, work in batches (perhaps) by loading more than one file into the table if needed, and guarantee that the data will be there when I need it, in an environment that might not guarantee the file will always be around for use.
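The stage-then-merge pattern described above can be sketched like this (a minimal stand-in using Python with an in-memory SQLite database; the table and column names are made up, and on SQL Server the merge step would typically be a `MERGE` statement or an `UPDATE`/`INSERT` pair instead of SQLite's upsert syntax):

```python
import sqlite3

# In-memory database standing in for the real staging/target tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (id INTEGER, name TEXT);
    CREATE TABLE target  (id INTEGER PRIMARY KEY, name TEXT);
""")
conn.execute("INSERT INTO target VALUES (1, 'old-name')")

# Step 1: bulk-load the flat file rows into the staging table.
flat_file_rows = [(1, "new-name"), (2, "second-row")]
conn.executemany("INSERT INTO staging VALUES (?, ?)", flat_file_rows)

# Step 2: let the database engine do the set-based merge in one statement.
# (The WHERE true is required by SQLite's INSERT...SELECT upsert grammar.)
conn.execute("""
    INSERT INTO target (id, name)
    SELECT id, name FROM staging WHERE true
    ON CONFLICT(id) DO UPDATE SET name = excluded.name
""")

print(sorted(conn.execute("SELECT * FROM target")))
# -> [(1, 'new-name'), (2, 'second-row')]
```

Because the staging table persists independently of the file, the merge step can be re-run after a failure without re-reading the source file.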

Thanks, Phil,

I really appreciate your valuable input.

Best regards
Mike

If you can afford it, I would suggest testing both approaches in your own environment, and then basing your decision on the (real) data collected that way.

Thanks,

Bob
