GNU bug report logs - #36167
[PATCH] Replace manually crafted hex regexes with [[:xdigit:]]
Message #95 received at 36167 <at> debbugs.gnu.org:
> From: Andy Moreton <andrewjmoreton <at> gmail.com>
> Date: Wed, 12 Jun 2019 12:44:12 +0100
>
> >> The [:alnum:] and [:alpha:] are based on (unspecified values of) the Unicode
> >> general-category property, but [:digit:] is not. Thus [:alnum:] includes
> >> other numeric characters that are not matched by [:digit:].
> >
> > It's true that [:alnum:] includes more numerical characters than
> > [:digit:], but what exactly needs to be clarified here? Assuming you
> > mean clarified in the manual, that is.
>
> As noted by Paul Eggert, the POSIX behaviour is different. It may be
> worth a note in the manual to draw attention to this difference.
It's strange to say in the manual what we do NOT do. The current text
says:
  ‘[:digit:]’
       This matches ‘0’ through ‘9’.  Thus, ‘[-+[:digit:]]’ matches any
       digit, as well as ‘+’ and ‘-’.
I believe you are suggesting that we add to this something like "Note
that the Posix interpretation of '[:digit:]' is different"? Given the
crystal clarity of the current text, wouldn't such an addition confuse
the reader?
Btw, where's the reference for a different interpretation by Posix? I
cannot find anything to that effect; even the Unicode UTS18
(http://www.unicode.org/reports/tr18/) says, while describing the
Posix equivalent of the Unicode regexp notations: "Non-decimal numbers
(like Roman numerals) are normally excluded". What am I missing?
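
For readers following the distinction discussed above, here is a minimal
sketch in Python. It is not Emacs code and not the Emacs implementation.
It only uses Unicode general-category data to show how a class limited to
'0' through '9' differs from a class defined by general category. The
helper names are hypothetical, and treating every L* or N* category as
"alnum" is an assumed simplification, since the thread itself notes that
the exact category values are unspecified.

```python
import unicodedata


def is_ascii_digit(ch: str) -> bool:
    """Model of a digit class that matches only '0' through '9'."""
    return "0" <= ch <= "9"


def is_category_alnum(ch: str) -> bool:
    """Hypothetical model of an alnum class: any letter (L*) or
    number (N*) according to the Unicode general category."""
    return unicodedata.category(ch)[0] in ("L", "N")


# ASCII digit, ARABIC-INDIC DIGIT THREE, ROMAN NUMERAL FOUR, a letter, a sign.
samples = ["7", "\u0663", "\u2163", "a", "+"]

for ch in samples:
    print(
        f"U+{ord(ch):04X} "
        f"category={unicodedata.category(ch)} "
        f"ascii_digit={is_ascii_digit(ch)} "
        f"category_alnum={is_category_alnum(ch)}"
    )
```

Under these assumptions, a character such as U+0663 or U+2163 counts as
numeric by general category while falling outside '0' through '9', which
is the kind of gap between [:alnum:] and [:digit:] raised in the quoted
text.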
This bug report was last modified 5 years and 321 days ago.