Thursday, 22 March 2012

Flat file CSV problem

Hi all,

I hope someone can help with a problem I'm having.
I want to process a large number of CSV files into various tables in an SQL database.
Each CSV file contains entries on a row-by-row basis relating to specific events (indicated by an eventID in column 0).
Eventually I think I want to use a conditional split to process each row separately depending on the eventID, but before I get that far I am having a problem with the source data.

Each event can have a varying number of columns filled in in the CSV file, and each CSV file can contain multiple event types.

The Flat File Connection Manager seems to merge a number of different rows into one in the preview pane; it appears to ignore the CrLf end-of-row delimiter.

Can anyone please help me sort this out so that each row stays on its own line and I can pass the structured data set to a conditional split transformation?

Many thanks in advance,

Grant|||

Ok,

I seem to have managed to get all the data for each row into one column, and I have a script component that extracts the EventID to one output column and the remaining parameters to a second column. I need to perform a check on the EventID by passing it and another variable into a stored procedure. How do I go about doing this in the data flow? Is it possible, or do I have to look at using the control flow for this?

Thanks,

Grant|||

Grant,

What kind of check does that procedure perform? Keep in mind that any operation you define in the data flow is performed on a row-by-row basis, so a call to a procedure in a data flow will be executed once for every row in the file. Since you have already succeeded in separating the EventID from the rest, I would try a conditional split transformation based on the EventID value and then perform the specific transformations/checks on each resulting pipeline.

Rafael Salas

|||Hi Rafael,

Thanks for the response. The stored procedure I was talking about returns a value based on the event ID. Having thought about what you said, I can set this value manually after the conditional split has been carried out and I know what the event ID is. Does that sound more plausible?

I am trying to rewrite a Windows application that processes these files; initially this used a stored procedure to process each row of data. The main stored procedure calls other sub-procedures within it. The problem was that the stored procedure itself was getting very unwieldy, with a large number of if and nested if statements, which meant that adding new events was complex and time consuming. Using an SSIS package I believe I can make this a much easier process to manage.

Is it still possible to call a stored procedure in the data flow task once the conditional split has been performed? I realise that there may be other stored procedures required once I have my data row.

I have just attempted to run some SQL code on a per-row basis and have discovered that I cannot seem to use variables or parameters in the OLE DB Command task. How would I go about either returning a value dependent on whether the current eventID exists in another table in the database, or indeed inserting data into a specific table if it doesn't exist?
Once I have done that I can get on to processing the event data into the table where it is stored. I hope that makes sense.
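For context, the kind of per-row check I have in mind is something like the following (the table and column names here are just examples), where ideally the result would come back into the pipeline somehow:

-- Example only: return 1 if the incoming eventID is already known, otherwise 0
SELECT CASE WHEN EXISTS (SELECT 1 FROM dbo.Events WHERE EventID = ?) THEN 1 ELSE 0 END AS EventKnown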

Many thanks,

Grant|||

Wow, a lot of questions!

First of all, let me clarify that there is nothing wrong with having OLE DB Command tasks in your data flow; it is just that, as a personal practice, I try to use bulk operations against the DB when possible.

If what you are trying to use the stored procedures for is to check whether a row exists in a table, you can use the lookup transformation in your data flow, then use the error output as your insert pipeline and the regular output as the 'already exists' pipeline (or just not use it if you want to ignore those rows). The lookup can also return other columns from your lookup table if that is what you need.
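As a rough sketch (the table and column names here are invented), the lookup's reference query could be something like this; you join the pipeline's EventID column against it, and any extra columns you select can be added to the pipeline:

-- Example reference query for the lookup transformation
SELECT EventID, EventDescription
FROM dbo.EventTypes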

Rafael Salas

|||I tend to ask a lot of them, yes :)

I looked into the lookup transformation, and I can see how I would use the error output etc. I was then using an OLE DB destination to insert rows into the table. The problem with the OLE DB destination is that I cannot loop back to the lookup. With an OLE DB command, would I script the insert command and then be able to loop back to the lookup transformation?
E.g.

Should this be OK to enter as a SQL command:

IF (SELECT COUNT(*) FROM SerialPartRev WHERE SerialPartRev.SerialNo = ? AND SerialPartRev.PartNo = ?) = 0
BEGIN
    INSERT INTO SerialPartRev (SerialNo, PartNo, RevisionNo)
    VALUES (?, ?, ?)
END
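Or, assuming the same table and parameter mappings, would something along these lines be better? It is just the same check rewritten with IF NOT EXISTS rather than counting rows:

-- Same check as above, using IF NOT EXISTS instead of COUNT(*)
IF NOT EXISTS (SELECT 1 FROM SerialPartRev WHERE SerialNo = ? AND PartNo = ?)
BEGIN
    INSERT INTO SerialPartRev (SerialNo, PartNo, RevisionNo)
    VALUES (?, ?, ?)
END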

The reason I want to do this is so that the SerialPartRev table is updated automatically. I need to do the same thing with a Username table. The big problem is how to check again for the entry before processing further, as I have found that looping back to the lookup isn't possible because it only accepts one input path.

I'm quite happy to accept that I am doing this wrong, and maybe someone could suggest another process for implementing this.

Cheers,

Grant|||Rafael,

I have been reading your previous post again, and whilst I understand what you are saying about using the output and error output in the pipeline depending on whether the row exists, one question still remains. Firstly, I have to say that there will be a couple of instances where I have to check whether data from the columns in the flat file exists in SQL tables. Regardless of whether I have to insert the row manually or it already exists, after both of these operations the output still has to go to the one conditional split task. Effectively the path splits in two and then rejoins after carrying out the necessary functionality. Is this possible without the use of scripting? I believe I can achieve this via a script, although it will mean exposing a password in plain text so that the database connection string will work properly.

Thank you,

Grant|||Actually, forget it. I have managed to call the stored procedures from the OLE DB commands. I have no idea what I was doing wrong previously, but it seems to be working now.
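In case it helps anyone else, the command text is just a parameterised call along these lines (the procedure name here is only an example; each ? is mapped to a pipeline column in the OLE DB Command's column mappings):

-- Illustrative only: parameters map to input columns in the OLE DB Command
EXEC dbo.usp_ProcessEvent ?, ?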

Thanks for your help on this matter, it was most appreciated. I'm sure I'll have more questions in time.

Cheers,

Grant
