Please leave feedback for Microsoft regarding this problem at https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=126493
Ok I'm sure it just me, SSIS has been great so far....but how can you import a straight CSV file with and uneven column count.
For example: (assume CR LF row delimiter)
The,Quick,Brown,Fox,Jumps
Hello,World
This,is,a,test
"Normally" I'd expect this
Ok but what we get is the row delimiter is ignored in preference for the column delimiter and the row delimiter gets sucked into the column and the next row starts to get layed down.
So we get
I'm I not seeing a tick box somewhere that says "over here if you want to terminate a row on the row delimiter even if all columns aren't full and we'll pad NULLs in rest of the row columns which you can fix in the flow transformations"
I'm sure it's there.....help!
(By the way SSIS team, great job on the package love using it)
Try using ragged right instead of delimited.
Thanks,
Matt
|||Ah see if only it was that simple Matt. "Ragged right files are files in which every column has a fixed width, except for the last column, which is delimited by the row delimiter."
So the only thing which can be Ragged in fact is the last column. In my example I've purposely put varying column widths so the old "Ragged Right" doesn’t apply (this is more the norm for ragged files).
So any ideas how this is going to work in SSIS...I must be missing something simple here this is a very common problem that the old DTS, Excel, Access, and other programs deal with quite well.
help!
|||Unfortunately we only support ragged right for fixed width so for this scenario you would have to read it as a single column and then split it using a script component or you could write a custom component that split it.
Matt
|||but this is a bug or changed behavior from the old sql server 2000 dts which when you had a comma or tab deliminited file some of the rows had extra columns after the last valid one.. it ignored those extra columns.
the new functionality is to instead append the commas/delimiters into the data pulled in from the last column..
This is causing us major grief becuase now none of our dts's work in the new sql server 2005 so it is not backwards compatible..
|||This is similar to an issure raised the other day. I produced a sample package on how to handle this.
http://sqlblogcasts.com/files/4/integration_services/entry412.aspx
|||I'm with you Igkahn. It's a bug. No way is this a feature! This is killing us too. If I have to script this as set out below then something fundimentally is wrong with SSIS. I cut one of our large SQL 2000 servers over to SQL 2005 and we are stopping there until fixes are supplied for errors like this. I'm now having to use the SQL 2000 DTS instead of the nice looking however functionality poor SSIS.
Please get this fixed.
Garry Swan
Information Systems Manager
CSIRO
Australia
If you feel strongly about it then you should raise it through the Feedback centre and through your MS representative if you have one.
|||Calling it a bug implies that is supposed to do something else and I'm not sure that is the case. I'm not denying that it MIGHT be a bug - hopefully Matt will reply again and tell us (By the way, just because some behaviour was present in DTS, you shouldn't assume the same would be true of SSIS. This is a replacement, not an upgrade).
Regardless, I can understand why this is causing headache. Have you logged it at the feedback center (http://lab.msdn.microsoft.com/productfeedback/default.aspx)? If not, then you shouldn't expect that the feature will make it into the next version.
I don't understand why you call SSIS functionally poor. As Simon explained this can still be achieved quite easily. Surely the fact that there is "more than one way to skin a cat" so to speak means the product is functionally rich as opposed to poor? Is the fact that something cannot be achieved using your "preferred" method really prohibiting you from moving to a superior product?
-Jamie
|||From my research there has been a lot of posts about this issue in the last couple of months. But im still looking for a proper solution, is there a fix available? Is there a code sample available for a SSIS source component or whatnot, that threats uneven columns properly?
I've looked through the sample for the source component and could not get it to run (something about no compatible component in the dll)
Since people do need to import data from delimited files very often, SSIS should be expending the functionality of what there was in the DTS world... but in this case it was decided to limit it. Whoever came up with the delimiter for each column must have been so proud of himself that he generlized and removed the need for a row delimiter....
Well you haven't, and we want it back. :) It's causing many people, a lot of greef.
Or at least release the source for a flat file handler which people can customize to their liking.
There is so many flat file formats out there.... Why be so restrictive when dealing with them?
|||Whats wrong with the sample I have provided.
SSIS focus was on building a framework that was performant, scalable and extensible. For this reason the components out of the box don't satisfy everyones requirements. But I have shown how it is very easy to extend SSIS with a script component to achieve your goal.
If you want a more polished solution you can develop your own custom component but that is another level of complexity that really isn't needed due to the power of the script component.
|||Someone mentioned that the code provided by Sabin didn't work for him/her. I have not tried that code.
Here is a version (another way to skin the cat) that can be tried. Please note that your first row should contain all the column names for this to work. This script assumes that the input is configured in a way that all data on one row constitute a column. So basically, there is just one column in each row. The output of this script is also one column in each row. However, the output created by this script can be "understood" by SSIS file connection manager. Finally, assumption is that "tab" is column delimiter. You can change that below, if that is not the case.
Private dataRow As Boolean = False
Public Function AppendMissingTabs(ByVal Row As InputBuffer) As String
Dim columns As String() = Nothing
Dim outputRow As String = Nothing
Dim outputString As String = Nothing
Dim buffer As StringBuilder = New StringBuilder()
buffer.Append(Row.BigSingleColumn)
columns = Row.BigSingleColumn.Split(New [Char]() {Chr(9)})
If Not dataRow Then
columnCount = columns.Length
End If
Dim N As Integer
If columns.Length < columnCount Then
For N = 1 To columnCount - columns.Length Step 1
buffer.Append(Chr(9))
Next N
End If
dataRow = True
Return buffer.ToString()
End Function
|||It comes down to if I tell a system that my row finishes with a crlf I don't want it to override my decision and pull the crlf into the data. Why would I want the row delimiter in the data?!? Is this a feature people needed and the SSIS team decided the former was so last week that this new functionality was, in fact, a better solution path...I just doubt it. The required functionality is the way Excel, DTS, Access and many other apps handle this data import. Why is SSIS pulling crlf into my data when I've said this is the end of the row?!?. Just stop and move onto the next row. Put that functionality back in and you've got a great delimited file importer.
Garry Swan
MCDBA
|||What you have actually said is that your column finishes with your delimiter. The flat file source then looks for that delimiter to end the column. The CRLF is not defined as the record delimiter but the delimiter of the last column.
I know that doesn't solve the problem but should explain the reason for the situation.
I still stand by the fact that if you want this functionality using the script component is a valid solution.
|||Last night I uploaded a custom source adapter to SourceForge that would let you parse this file with regular expressions.
http://sourceforge.net/projects/textregexsource
Just connect a file connection manager to it and set a regular expression, and it could produce what you want.
I think a regex like (?'Col1'\w+),*(?'Col2'\w*),*(?'Col3'\w*),*(?'Col4'\w*),*(?'Col5'\w*)\n
might do the trick, though it might need tweaking.
Geof
没有评论:
发表评论