GNU bug report logs - #6780
Add cut multi-character/expression delimiters


Package: coreutils;

Reported by: Bill <bill3 <at> uniserve.com>

Date: Mon, 2 Aug 2010 15:56:02 UTC

Severity: wishlist



Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Davide Brini <dave_br <at> gmx.com>
To: bug-coreutils <at> gnu.org
Subject: Re: bug#6780: Problem with the cut command
Date: Mon, 2 Aug 2010 19:56:43 +0100

On Mon, 02 Aug 2010 05:56:31 -0700 Bill <bill3 <at> uniserve.com> wrote:

> I'm not sure if this is a bug, a question or a feature request,
> but there is a problem with the cut command, specifically with
> its delimiter option '-d'.
> 
> In older times disk space was scarce and every byte was 
> conserved. Fields in data files were delimited with a single
> character such as ':'. This practice continues today. But
> sometimes it does not and fields in some files are separated
> with multiple characters. Space is no longer precious.
> 
> Suppose I wish to import information about a disk partition
> into my backup script. I want to assign the type of filesystem
> to a variable. Compare the output of these two commands.
> 
> cat /etc/fstab |grep home | cut -d ' ' -f3
> yields a blank output line
> 
> cat /etc/fstab |grep opt | awk -F " " '{print $3}'
> yields the desired output - reiserfs.
> 
> The problem is that the cut command can't handle multiple 
> instances of the same delimiter. It's designed to handle
> a single character like ':', but can't cope with repeating
> characters like '::' or a series of spaces as in /etc/fstab.
> 
> So my question is shouldn't the cut delimiter handle 
> multiple instances of the same character internally or 
> failing that, shouldn't there be some way of specifying a 
> series of single delimiter characters such as -d':'+  ?

cut is required by POSIX to treat every separator character as delimiting a
field. 

"Output fields shall be separated by a single occurrence of the field
delimiter character."

However, what you suggest could be implemented as an extension that the
user would have to enable explicitly (though I wouldn't bet on the
maintainers thinking it's a good idea; I may be wrong).
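
To make the current behavior concrete, here is a small sketch. The fstab-style
line is a made-up sample (the device name and padding are assumptions, not
taken from your system); the point is only that cut counts each single
delimiter character, so a run of spaces produces empty fields:

```shell
# Hypothetical fstab-style line; columns are padded with runs of spaces.
line='/dev/sda3   /home   reiserfs   defaults   0 2'

# Every single space delimits a field, so field 3 is the empty string
# between the second and third spaces after the device name.
printf '%s\n' "$line" | cut -d ' ' -f3

# The same holds for any delimiter: 'a::b:c' has four fields,
# and the second one is empty.
printf 'a::b:c\n' | cut -d ':' -f2
printf 'a::b:c\n' | cut -d ':' -f3
```

The first two commands print an empty line; the last one prints "b".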

On a side note, you mention awk, which happens to work fine in your specific
example with space as the separator. However, a single space is special-cased
in awk; with any other single-character separator, awk works exactly like
cut:

echo 'a::b:c' | awk -F':' '{print "-"$1"--"$2"--"$3"--"$4"-"}'
-a----b--c-

Note the empty second field. But of course in awk, unlike cut, you can say
-F ':+' and get the behavior you want.
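
As a rough sketch of the options available today (the sample line is again
made up, and the tr step is a common workaround rather than anything cut
provides itself; it assumes the columns are padded with spaces only, not
tabs):

```shell
# A regular expression as field separator: one or more colons.
printf 'a::b:c\n' | awk -F ':+' '{print $2}'

# Hypothetical fstab-style line padded with runs of spaces.
line='/dev/sda3   /home   reiserfs   defaults   0 2'

# awk's default separator treats a run of blanks as one separator.
printf '%s\n' "$line" | awk '{print $3}'

# Workaround for cut: squeeze repeated spaces with tr -s first.
printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f3
```

These print "b", "reiserfs" and "reiserfs" respectively.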

-- 
D.




This bug report was last modified 15 years and 22 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.