GNU bug report logs - #20657
Traditional range expression not accepted in regex/dfa
Reported by: arnold <at> skeeve.com
Date: Tue, 26 May 2015 02:43:02 UTC
Severity: wishlist
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
arnold <at> skeeve.com wrote:
> The bugaboo here is the "---"; it's
> a range expression consisting of minus through minus, and apparently long
> ago was how one got a minus into a bracket expression.
Actually, long ago expressions like '[^0-9-]' worked just as they do now, and it
wasn't ever necessary to use trailing "---". That being said, it is true that
in 7th Edition Unix '[^0-9---]' meant the same thing as '[^0-9-]', so in that
sense we have an incompatibility with 7th Edition Unix here.
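
As a minimal sketch of the conforming form, the following uses '[^0-9-]' with the hyphen last in the bracket expression, where it is taken literally. The helper name has_other_char is hypothetical and only for illustration; it assumes a POSIX-style grep on the PATH.

```shell
# Hypothetical helper: succeed if the argument contains at least one
# character that is neither a digit nor a hyphen. The hyphen comes last
# in the bracket expression, so it is a literal hyphen, not a range.
has_other_char() {
    printf '%s\n' "$1" | grep -q '[^0-9-]'
}

has_other_char 'abc'   && echo "abc: match"
has_other_char '12-34' || echo "12-34: no match"
```

No trailing "---" is needed to get a literal minus into the set.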
> $ ./src/grep '[^0-9---]' /dev/null
> ./src/grep: Invalid range end
>
> The underlying regex and, I believe, dfa routines don't accept this.
Yes, that's correct. It's not a bug, though, as the regexp is ambiguous and
does not conform to POSIX, which says the following about RE bracket
expressions: "To use a <hyphen> as the starting range point, it shall either
come first in the bracket expression or be specified as a collating symbol; for
example, "[][.-.]-0]", which matches either a <right-square-bracket> or any
character or collating element that collates between <hyphen> and 0, inclusive."
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05>
In your correspondent's example, the hyphen is a starting range point but is
neither first in the bracket expression nor is specified as a collating symbol,
so the regexp doesn't conform to POSIX.
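
For comparison, here is a small sketch that exercises the bracket expression from the POSIX example quoted above. It assumes a grep whose matcher supports collating symbols such as "[.-.]" (GNU grep with glibc is assumed here; other implementations may reject the pattern), and it forces the C locale so that the range covers the characters from hyphen through 0 in ASCII order. The helper name in_posix_example is hypothetical.

```shell
# Hypothetical helper: succeed if the argument contains a character matched
# by the POSIX example "[][.-.]-0]": a right square bracket, or anything
# that collates between hyphen and 0 inclusive.
in_posix_example() {
    printf '%s\n' "$1" | LC_ALL=C grep -q '[][.-.]-0]'
}

# Try a few single characters and report whether each one matches.
for s in ']' '-' '.' '/' '0' '1' 'a'; do
    if in_posix_example "$s"; then
        echo "$s: match"
    else
        echo "$s: no match"
    fi
done
```

Here the hyphen starts a range only because it is written as a collating symbol, which is one of the two forms the quoted POSIX text allows.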
Even though it's not a bug I suppose it wouldn't hurt to make the GNU matchers
compatible with 7th Edition Unix here, if someone really wants to take that task
on; it's not urgent, though.
This bug report was last modified 3 years and 33 days ago.