Re: Read cvs File Fails

I'd disagree based on
http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
and
http://en.wikipedia.org/wiki/Comma-separated_values

Where are you getting that this is correct? It also looks like there's a space between the comma and the next quote. This looks like the typical bad exports you encounter in the wild, where people just drop all contents between quotes.

Bruce Dunwiddie
http://www.csvreader.com

Michael D. Ober wrote:
> The first line "this "item" is bad", "this item is ok" is correct CSV
> format. There is no field delimiter (comma) between the two quotes around
> "item". Either your ODBC handler has a bug or you are misconfiguring it.
> Please post your code that configures your ODBC reader. Your other option
> is to write a CSV line parser yourself, which isn't too difficult to do.
>
> Mike Ober.
>
> "rob" <rmdiv2***@yahoo.com> wrote in message
> news:1146296378.467061.100080@j33g2000cwa.googlegroups.com...
> > I am using ODBC to read CSV files. Unfortunately, some CSV files are
> > not formatted correctly (out of my control). One particular problem is
> > that a quote within an item is not doubled, i.e. the file says
> >
> > "this "item" is bad", "this item is ok"
> >
> > rather than
> >
> > "this ""item"" is bad", "this item is ok"
> >
> > ODBC now thinks 'this ' is the first item rather than 'this "item" is
> > bad'. Excel reads the file just fine, though. Is there some workaround,
> > short of fixing the file myself, to make the driver more error
> > tolerant? If it helps, below is how I read the file.
> >
> > Thanks
> >
> > connectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=" + Path.GetDirectoryName(filename);
> > connection = new OdbcConnection(connectionString);
> > connection.Open();
> > command = new OdbcCommand("Select * FROM " + Path.GetFileName(filename), connection);
> > reader = command.ExecuteReader();

In my first post I actually said that it is not in a correct format.
Michael responded and said it IS a correct CSV format. Since I am no expert in CSV I did not want to claim otherwise, but it was hard to believe. The reason is that, if it were correct, "this ""item"" is bad" should show up in Excel as 'this ""item"" is bad', which it does not; it shows up as 'this "item" is bad'.

The space is an error on my side. I just wrote down a representative example and accidentally put the space in there.

shriop wrote:
> I'd disagree based on
> http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
> and
> http://en.wikipedia.org/wiki/Comma-separated_values
>
> Where are you getting that this is correct? It also looks like there's
> a space between the comma and the next quote. This looks like typical
> bad exports you encounter in the wild where people just drop all
> contents between quotes.
>
> Bruce Dunwiddie
> http://www.csvreader.com

In the definition for CSV format, quotes are included in the text by using
back-to-back quotes. So

"Hi ""George"" Mason"

should be interpreted by a good CSV parser as

Hi "George" Mason

The problem is that a lot of CSV parsers are poorly written, and a lot of CSV exporters are poorly written. Personally, I like to go with tab-delimited format or some other non-CSV format when I can. It makes life a lot easier.

You can also Google for a CSV parser regular expression.

"rob" wrote:
> In my first post I actually said that it is not in a correct format.
> Michael responded and said it IS a correct csv format. Since I am no
> expert in csv I did not want to claim otherwise but it was hard to
> believe. The reason is that then "this ""item"" is bad" should show up
> in Excel as 'this ""item"" is bad' which it does not, i.e. it shows up
> as 'this "item" is bad'.
>
> The space is an error on my side. I just wrote down a representative
> example and accidentally put the space in there.
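
For anyone who ends up writing the line parser suggested earlier in the thread, here is a minimal sketch of one possible approach. It is not what the Microsoft Text Driver or Excel does internally. `LenientCsv`, `ParseLine` and `ClosesField` are made-up names for illustration. The assumed heuristic: inside a quoted field, a doubled quote is an escaped quote (the standard rule), and a lone quote only closes the field when it is followed by optional spaces and then a comma or the end of the line; any other lone quote is kept as literal text. It handles one line at a time, so quoted fields containing line breaks are out of scope.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LenientCsv
{
    // Splits one CSV line into fields, tolerating unescaped quotes
    // inside quoted fields (hypothetical helper, not a library API).
    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;     // currently inside a quoted field
        bool afterQuoted = false;  // quoted field closed, waiting for the comma

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    field.Append(c);
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    field.Append('"'); // "" is an escaped quote
                    i++;               // skip the second quote of the pair
                }
                else if (ClosesField(line, i + 1))
                {
                    inQuotes = false;  // real closing quote
                    afterQuoted = true;
                }
                else
                {
                    field.Append('"'); // stray quote: keep it as literal text
                }
            }
            else if (c == ',')
            {
                fields.Add(field.ToString());
                field.Clear();
                afterQuoted = false;
            }
            else if (afterQuoted)
            {
                // Ignore padding between a closing quote and the comma.
            }
            else if (c == '"' && field.ToString().Trim().Length == 0)
            {
                field.Clear();         // drop spaces before the opening quote
                inQuotes = true;
            }
            else
            {
                field.Append(c);
            }
        }

        fields.Add(field.ToString());
        return fields;
    }

    // True when only spaces separate position pos from a comma or end of line.
    private static bool ClosesField(string line, int pos)
    {
        while (pos < line.Length && line[pos] == ' ') pos++;
        return pos == line.Length || line[pos] == ',';
    }
}
```

Called on the malformed line from the original post, `LenientCsv.ParseLine("\"this \"item\" is bad\", \"this item is ok\"")` yields two fields, `this "item" is bad` and `this item is ok`, and the correctly escaped form produces the same result. The heuristic is a guess, not a fix: a stray quote that happens to be followed by a comma inside a field will still be read as the end of that field, so a file like that cannot be repaired by the parser alone.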