2012年3月26日星期一

FlatFile Parsing Problem - Urgently Need Assistance!

We have a BIG problem that has been occurring for quite sometime. We have a RAGGED-RIGHT FFS (FlatFileSource) component which receives a FF and then transforms to XML. We define 10 FIELDS with the last field as {CRLF} per ragged right style. Simple right? The source file is sent to us from multiple separate groups, some of which don’t use the last two fields and have early line termination, so they basically are not there. The problem is that when we define X amount of fields in SSIS, it expects X amount of fields to be there. So when we receive the source file that doesn’t contain last two fields, SSIS tries to read in the next line, and in most cases successfully places the fields from the next line into the missing fields. Now what we end up with is a source file with 10 rows, but only half successfully transformed, with every other row fitted into the last two XML fields. SSIS completely ignores the {CRLF} until those specific numbers of spaces/fields have been met – even if it has to go onto the next line to get them. This seems horribly wrong to me. I would think that at the very least there should be some type of validation that prevents it from grabbing the next line. I thought that is what fixed-length SSIS style is for, and ragged right for allowing to parse multiple rows. We could enforce all clients to produce X char spaces (and we do), but there can be cases in any aspect where a file could get accidentally sent over with a shortened row. I have heard from certain individuals that SSIS adapter wasn’t designed with this in mind. Is this true, I am to believe that this would still get parsed without throwing any errors and potentially damage the ERP system it is going into? If this is the case then I feel extremely disheartened about SSIS – especially with all of the great advancements it has made.

We desperately need a workaround or at the very least some way to validate that a new row {CRLF} isn’t getting picked up in the first row of the flat file connection. Please can someone help ASAP!?

You could read the rows and then process in a script component based on your logic for handling these errors.

This is a common issue with processing flat files, and producing generic solutions. Biztalk was a real pain. If I remember from my Biztalk days a lot of the flat file parsing capabilities was driven by EDI. What you may think is logical may not suit some applications.

|||

Unfortunatelly Simon that seems to be the answer all around the board - too bad since the product was already designed (150+ DF transforms have to be modified). So do you recommend to do away with the FF Conn Mgr, and use a script source component to dynamically read in and calc the file rows? Can you perhaps provide an example of how to do what you mention?

Thanks,

|||

Haven't heard a response back, just keeping this on top - as this is a hot issue for us!

Simon (or anyone for that matter), I am interested in hearing how you would approach this, would you use a connection mgr, or use a custom src component and dynamically get in code, etc...

Thanks for helping!

|||

Have a look at the package I/ve created.

http://sqlblogcasts.com/files/3/transforms/entry409.aspx

I haven't done any type casting, or validation of field lengths just used the default 50 character column size.

|||Interesting, thanks Simon. One issue, we still need the ability to filter error rows, as the ff source allows. Can we do this using an intermediate custom transform, or must we manually create a new output for the bad rows? If the latter is the only way then do you just create a new output with columns x,m,z... and then similarly have an if/catch that would contain:

Row.Column(X,M,Z...) = extra_split_columns(col)

or, is there another function/method used to do this new output redirection?

|||

If you create an output and assoicate with the input (set the synchronous input ID to the ID of the input. That way you can redirect the row. Or you can have a seperate output where you have to copy the values from the input row to the error row.

I've updated the package http://sqlblogcasts.com/files/4/integration_services/entry412.aspx

|||Very good Simon, I think this hits the nail on the head... Thanks.

没有评论:

发表评论