From debbugs-submit-bounces@debbugs.gnu.org Wed May 26 23:02:52 2010
Date: Wed, 26 May 2010 23:00:08 -0400
Subject: cut
From: sandy bas
To: bug-coreutils@gnu.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org

Dear People:

Comma delimited files often have fields of the form "big,black,bear" where the
commas within the quotes are not delimiters.  A useful option in cut would
be to ignore the commas (delimiters) within the quotation marks.

I would be glad to put it in if you would like the option.

Thank you for all of the work that you do.

sandy

From debbugs-submit-bounces@debbugs.gnu.org Thu May 27 11:11:42 2010
Message-ID: <4BFE8AF3.4070704@draigBrady.com>
Date: Thu, 27 May 2010 16:08:35 +0100
From: Pádraig Brady
To: sandy bas
Cc: 6277@debbugs.gnu.org
Subject: Re: bug#6277: cut

On 27/05/10 04:00, sandy bas wrote:
> Dear People:
>
> Comma delimited files often have fields of the form "big,black,bear" where the
> commas within the quotes are not delimiters.  A useful option in cut would
> be to ignore the commas (delimiters) within the quotation marks.
>
> I would be glad to put it in if you would like the option.

Hmm, the CSV format is a bit more complicated than that
so as to support " and \n within fields also.
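
As an illustration of that point, here is a minimal sketch in Python using only the standard-library csv module. The sample data and the two helper names (naive_second_fields, csv_second_fields) are made up for this example and are not part of cut or coreutils. It compares a plain split on every comma, roughly what cut -d, -f2 does, with a real CSV parser on fields that contain a comma, a doubled quote and an embedded newline:

```python
import csv
import io

# Hypothetical sample: two CSV records. The second field of record 1 contains
# commas; the second field of record 2 contains an escaped quote ("") and an
# embedded newline.
SAMPLE = '1,"big,black,bear",zoo\n2,"say ""hi""\nthere",end\n'


def naive_second_fields(text):
    """Split each physical line on every comma, roughly like `cut -d, -f2`."""
    result = []
    for line in text.splitlines():
        parts = line.split(",")
        # Like cut without -s: a line with no delimiter is passed through whole.
        result.append(parts[1] if len(parts) > 1 else line)
    return result


def csv_second_fields(text):
    """Let a CSV parser decide where fields and records begin and end."""
    # newline="" leaves line endings untouched so the parser sees them as-is.
    reader = csv.reader(io.StringIO(text, newline=""))
    return [row[1] for row in reader]


if __name__ == "__main__":
    print(naive_second_fields(SAMPLE))
    print(csv_second_fields(SAMPLE))
```

On this sample the naive split yields three broken pieces ('"big', '"say ""hi""' and 'end'), because it treats the quoted commas as delimiters and the embedded newline as a record boundary. The parser yields the two intended fields: 'big,black,bear' and 'say "hi"' followed by a newline and 'there'.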
It would be more general, I think, to have a separate tool
to parse CSV to a format more easily usable on the shell,
and that in turn could be passed to cut -d, column -s, ...

Aha, the csvutils command from here seems to do this:
http://freshmeat.net/projects/csvutils

cheers,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Thu May 27 11:27:09 2010
Date: Thu, 27 May 2010 09:27:00 -0600
From: Bob Proulx
To: sandy bas
Cc: 6277@debbugs.gnu.org
Subject: Re: bug#6277: cut: Please add CSV parsing
Message-ID: <20100527152700.GA26296@dementia.proulx.com>

retitle 6277 cut: Please add CSV parsing
tags 6277 + wishlist wontfix
thanks

sandy bas wrote:
> Comma delimited files often have fields of the form "big,black,bear"
> where the commas within the quotes are not delimiters.  A useful
> option in cut would be to ignore the commas (delimiters) within the
> quotation marks.
>
> I would be glad to put it in if you would like the option.

Parsing CSV files is deceptively more complicated than it looks.
Using the Perl Text::CSV module as a guide shows that the result would
add several thousand lines of code.  This would fall under the
category of creeping featurism and code bloat because it would
significantly enlarge the code base of the 'cut' program well beyond
its traditional role as a simple cut-by-field program.  And if CSV
parsing is allowed in, then wouldn't other file format parsing, by
comparison, be allowed in as well?

Plus the coreutils are the core utilities that belong on every machine
in the universe.  Does my toaster need this capability?  Large items
like this really should go into a differently named program.  It isn't
just the use of the program on a fully loaded desktop but also the use
of the program across the entire universe of machines.

I am sorry but full CSV parsing really doesn't belong in cut.

I suggest that you use Perl, Python or Ruby for CSV processing.  They
include full libraries for dealing with the many varied details of CSV
handling.  Something like the following is a simple example Perl
script to print only the second field of a CSV file.

  #!/usr/bin/env perl
  use Text::CSV;
  use strict;
  my $csv = Text::CSV->new;
  foreach my $filename (@ARGV) {
    open(CSV,$filename) or die "Error opening $filename: $!\n";
    # read the file one line at a time
    while (defined($_ = <CSV>)) {
      if (! $csv->parse($_)) {
        die("Error parsing: " . $csv->error_input);
      }
      print(($csv->fields())[1],"\n"); # print second field
    }
  }

Bob

From debbugs-submit-bounces@debbugs.gnu.org Thu May 27 12:07:55 2010
Date: Thu, 27 May 2010 10:07:46 -0600
From: Bob Proulx
To: control@debbugs.gnu.org
Subject: Set severity wishlist
Message-ID: <20100527160746.GA11670@hysteria.proulx.com>

severity 6277 wishlist
thanks

From debbugs-submit-bounces@debbugs.gnu.org Fri May 28 03:03:01 2010
From: "Voelker, Bernhard"
To: Bob Proulx, sandy bas
Cc: "6277@debbugs.gnu.org" <6277@debbugs.gnu.org>
Date: Fri, 28 May 2010 08:46:29 +0200
Subject: RE: bug#6277: cut: Please add CSV parsing
Message-ID: <7856072A9D04C24B82DFE2B1112FE38AE0A3B1C7@MCHP058A.global-ad.net>
References: <20100527152700.GA26296@dementia.proulx.com>
In-Reply-To: <20100527152700.GA26296@dementia.proulx.com>

Bob Proulx wrote:
> sandy bas wrote:
>> Comma delimited files often have fields of the form "big,black,bear"
>> where the commas within the quotes are not delimiters.  A useful
>> option in cut would be to ignore the commas (delimiters) within the
>> quotation marks.
>>
>> I would be glad to put it in if you would like the option.

> I suggest that you use Perl, Python or Ruby for CSV processing.  They
> include full libraries for dealing with the many varied details of CSV
> handling.

just to mention another classic UNIX tool: awk

  awk -F, '$1 ~ /^big$/ { print $2,$3 }' csv.txt

Have a nice day,
Berny

From debbugs-submit-bounces@debbugs.gnu.org Mon Jun 07 19:08:40 2010
Date: Mon, 7 Jun 2010 17:08:33 -0600
From: Bob Proulx
To: sandy bas
Cc: 6277-close@debbugs.gnu.org
Subject: Re: bug#6277: cut: Please add CSV parsing
Message-ID: <20100607230833.GC585@dementia.proulx.com>
References: <20100527152700.GA26296@dementia.proulx.com>

Hi Sandy,

I am happy that you are satisfied with the responses.  I am going to
close the bug ticket in the bug tracking system with this message.
Please feel free to respond and add additional information if desired.
The group will all see it and the ticket will keep track of it.

Bob

From unknown Sat Aug 16 18:46:47 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Tue, 06 Jul 2010 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator