GNU bug report logs - #20657
Traditional range expression not accepted in regex/dfa


Package: grep

Reported by: arnold <at> skeeve.com

Date: Tue, 26 May 2015 02:43:02 UTC

Severity: wishlist

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #8 received at 20657 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: arnold <at> skeeve.com, 20657 <at> debbugs.gnu.org
Subject: Re: bug#20657: Traditional range expression not accepted in regex/dfa
Date: Mon, 25 May 2015 23:53:31 -0700
arnold <at> skeeve.com wrote:

> The bugaboo here is the "---"; it's
> a range expression consisting of minus through minus, and apparently long
> ago was how one got a minus into a bracket expression.

Actually, long ago expressions like '[^0-9-]' worked just as they do now, and it 
wasn't ever necessary to use trailing "---".  That being said, it is true that 
in 7th Edition Unix '[^0-9---]' meant the same thing as '[^0-9-]', so in that 
sense we have an incompatibility with 7th Edition Unix here.

> 	$ ./src/grep '[^0-9---]' /dev/null
> 	./src/grep: Invalid range end
>
> The underlying regex and, I believe, dfa routines don't accept this.

Yes, that's correct.  It's not a bug, though, as the regexp is ambiguous and 
does not conform to POSIX, which says the following about RE bracket 
expressions: "To use a <hyphen> as the starting range point, it shall either 
come first in the bracket expression or be specified as a collating symbol; for 
example, "[][.-.]-0]", which matches either a <right-square-bracket> or any 
character or collating element that collates between <hyphen> and 0, inclusive." 
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05> 
In your correspondent's example, the hyphen is a starting range point but is 
neither first in the bracket expression nor is specified as a collating symbol, 
so the regexp doesn't conform to POSIX.

Even though it's not a bug I suppose it wouldn't hurt to make the GNU matchers 
compatible with 7th Edition Unix here, if someone really wants to take that task 
on; it's not urgent, though.






