GNU bug report logs -
#6277
cut: Please add CSV parsing
Reported by: sandy bas <basic207 <at> gmail.com>
Date: Thu, 27 May 2010 03:03:02 UTC
Severity: wishlist
Tags: wontfix
Done: Bob Proulx <bob <at> proulx.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 6277 in the body.
You can then email your comments to 6277 AT debbugs.gnu.org in the normal way.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#6277; Package coreutils. (Thu, 27 May 2010 03:03:02 GMT) Full text and rfc822 format available.
Acknowledgement sent to sandy bas <basic207 <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 27 May 2010 03:03:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Dear People:
Comma delimited files often have fields of the form "big,black,bear" where the
commas within the quotes are not delimiters. A useful option in cut would
be to ignore the commas (delimiters) within the quotation marks.
I would be glad to put it in if you would like the option.
Thank you for all of the work that you do.
sandy
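To make the request concrete, here is a minimal sketch contrasting a plain comma split (what cut -d, effectively does) with a quote-aware split. It uses Python's standard csv module purely for illustration; the sample line and the function names are hypothetical and not part of the report.

```python
import csv

def naive_fields(line):
    # A plain delimiter split: every comma counts, quoted or not.
    return line.split(",")

def quoted_fields(line):
    # Quote-aware split: commas inside double quotes stay in the field.
    return next(csv.reader([line]))

sample = '1,"big,black,bear",zoo'  # hypothetical input line

print(naive_fields(sample))   # the quoted field is broken into three pieces
print(quoted_fields(sample))  # the quoted field survives as one value
```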
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#6277; Package coreutils. (Thu, 27 May 2010 15:12:01 GMT)
Message #8 received at 6277 <at> debbugs.gnu.org (full text, mbox):
On 27/05/10 04:00, sandy bas wrote:
> Dear People:
>
> Comma delimited files often have fields of the form "big,black,bear" where the
> commas within the quotes are not delimiters. A useful option in cut would
> be to ignore the commas (delimiters) within the quotation marks.
>
> I would be glad to put it in if you would like the option.
Hmm, the CSV format is a bit more complicated than that
so as to support " and \n within fields also.
It would be more general I think to have a separate tool
to parse CSV to a format more easily usable on the shell,
and that in turn could be passed to cut -d, column -s, ...
Aha, the csvutils command from here seems to do this:
http://freshmeat.net/projects/csvutils
cheers,
Pádraig.
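The "separate tool" idea above could look roughly like the following sketch, which rewrites CSV as tab-separated lines so that cut -f and similar tools can take over. This is not csvutils and not anything proposed for coreutils; the function names and the backslash-escaping convention are assumptions chosen for illustration.

```python
import csv
import io

def escape(field):
    # Make a field safe for line- and tab-oriented tools.
    # Backslash is escaped first so the other escapes stay unambiguous.
    return (field.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def csv_to_tsv(infile, outfile):
    # csv.reader handles quoted commas, doubled quotes and embedded newlines.
    for row in csv.reader(infile):
        outfile.write("\t".join(escape(f) for f in row) + "\n")

# Hypothetical sample: a quoted comma, a doubled quote and an embedded newline.
sample = io.StringIO(
    'id,desc\r\n1,"big,black,bear"\r\n2,"say ""hi""\nthere"\r\n', newline="")
out = io.StringIO()
csv_to_tsv(sample, out)
print(out.getvalue(), end="")
```

Each output line then holds exactly one record, so something like cut -f2 selects the second field without being confused by commas or newlines inside it.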
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#6277; Package coreutils. (Thu, 27 May 2010 15:28:01 GMT)
Message #11 received at 6277 <at> debbugs.gnu.org (full text, mbox):
retitle 6277 cut: Please add CSV parsing
tags 6277 + wishlist wontfix
thanks
sandy bas wrote:
> Comma delimited files often have fields of the form "big,black,bear"
> where the commas within the quotes are not delimiters. A useful
> option in cut would be to ignore the commas (delimiters) within the
> quotation marks.
>
> I would be glad to put it in if you would like the option.
Parsing CSV files is deceptively more complicated than it looks.
Using the Perl Text::CSV module as a guide shows that the result would
add several thousand lines of code. This would fall under the
category of creeping featurism and code bloat because it would
significantly enlarge the code base of the 'cut' program well beyond
its traditional role as a simple cut by field program.
And if CSV parsing is allowed in then wouldn't by comparison other
file format parsing be allowed in as well? Plus the coreutils are the
core utilities that belong on every machine in the universe. Does my
toaster need this capability? Large items like this really should go
into a differently named program. It isn't just the use of the
program on a fully loaded desktop but also the use of the program
across the entire universe of machines.
I am sorry but full CSV parsing really doesn't belong in cut.
I suggest that you use Perl, Python or Ruby for CSV processing. They
include full libraries for dealing with the many varied details of CSV
handling.
Something like the following is a simple example perl script to print
only the second field of a CSV file.
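#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV;
# binary => 1 permits embedded newlines and non-ASCII bytes in fields
my $csv = Text::CSV->new({ binary => 1 });
foreach my $filename (@ARGV) {
  open(my $fh, '<', $filename) or die "Error opening $filename: $!\n";
  # getline reads one CSV record, which may span several physical lines
  while (my $row = $csv->getline($fh)) {
    print $row->[1], "\n";  # print second field
  }
  # getline also returns false on a parse error, so check that we hit EOF
  $csv->eof or die "Error parsing $filename: " . $csv->error_diag . "\n";
  close($fh);
}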
Bob
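Since Python is also suggested above, a rough equivalent of the Perl example using Python's standard csv module might look like this. It is a sketch only: the function names are hypothetical, and skipping records with fewer than two fields is an assumption about the desired behaviour, not something stated in the thread.

```python
import csv
import sys

def second_fields(stream):
    # Yield the second field of every record that has at least two fields.
    for row in csv.reader(stream):
        if len(row) > 1:
            yield row[1]

def main(paths):
    for path in paths:
        # newline="" lets the csv module handle embedded newlines itself.
        with open(path, newline="") as f:
            for field in second_fields(f):
                print(field)

if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a name of your choosing and run with one or more CSV files as arguments, it prints one second field per record.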
Changed bug title to 'cut: Please add CSV parsing' from 'cut'. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Thu, 27 May 2010 15:28:02 GMT)
Added tag(s) wontfix. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Thu, 27 May 2010 15:28:02 GMT)
Severity set to 'wishlist' from 'normal'. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Thu, 27 May 2010 16:08:02 GMT)
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#6277; Package coreutils. (Fri, 28 May 2010 07:03:02 GMT)
Message #20 received at 6277 <at> debbugs.gnu.org (full text, mbox):
Bob Proulx wrote:
> sandy bas wrote:
>> Comma delimited files often have fields of the form "big,black,bear"
>> where the commas within the quotes are not delimiters. A useful
>> option in cut would be to ignore the commas (delimiters) within the
>> quotation marks.
>>
>> I would be glad to put it in if you would like the option.
> I suggest that you use Perl, Python or Ruby for CSV processing. They
> include full libraries for dealing with the many varied details of CSV
> handling.
just to mention another classic UNIX tool: awk
awk -F, '$1 ~ /^big$/ { print $2,$3 }' csv.txt
Have a nice day,
Berny
Reply sent to Bob Proulx <bob <at> proulx.com>: You have taken responsibility. (Mon, 07 Jun 2010 23:09:02 GMT)
Notification sent to sandy bas <basic207 <at> gmail.com>: bug acknowledged by developer. (Mon, 07 Jun 2010 23:09:02 GMT)
Message #25 received at 6277-close <at> debbugs.gnu.org (full text, mbox):
Hi Sandy,
I am happy that you are satisfied with the responses. I am going to
close the bug ticket in the bug tracking system with this message.
Please feel free to respond and add additional information if
desired. The group will all see it and the ticket will keep track of
it.
Bob
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 06 Jul 2010 11:24:04 GMT)
This bug report was last modified 15 years and 44 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.