GNU bug report logs - #15410
[:alnum:] is not [:alpha:] AND [:digit:]... [:alnum:] is [:alpha:] OR [:digit:]

Package: grep

Reported by: Nick Aganan <thesysad <at> gmail.com>

Date: Wed, 18 Sep 2013 16:13:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.

From: Eric Blake <eblake <at> redhat.com>
Cc: Nick Aganan <thesysad <at> gmail.com>, 15410-done <at> debbugs.gnu.org
Subject: bug#15410: [:alnum:] is not [:alpha:] AND [:digit:]... [:alnum:] is [:alpha:] OR [:digit:]
Date: Wed, 18 Sep 2013 12:39:13 -0600
On 09/18/2013 12:31 PM, Eric Blake wrote:

>> [:alnum:] is defined as
>>
>> Alphanumeric characters: ‘[:alpha:]’ *and* ‘[:digit:]’; in the ‘C’ locale
>> and ASCII character encoding, this is the same as ‘[0-9A-Za-z]’.
> 
> This sense of "and" correctly means the combination, where characters
> from either class satisfy the regex.  Writing '[[:alnum:]]' is the same
> as writing '[[:alpha:][:digit:]]'
> 

> 
> Given that the problem is in your lack of shell quoting, and not in
> grep, I'm closing this as not a bug.  However, feel free to respond if
> you have more comments.
> 
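
The equivalence stated in the quoted reply can be checked directly. This
is a minimal sketch, assuming a POSIX shell and a grep that supports
POSIX character classes; the sample lines and variable names are made up
for illustration and are not taken from the original report. The
patterns are single-quoted so the shell passes them to grep unchanged
instead of treating the brackets as a glob.

```shell
# Hypothetical sample: letters only, digits only, mixed, punctuation only.
sample=$(printf '%s\n' abc 123 a1 '---')

# LC_ALL=C pins the locale so the classes cover exactly A-Z, a-z and 0-9.
with_alnum=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:alnum:]]')
with_union=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:alpha:][:digit:]]')

# Both patterns select the same lines; only '---' is left out.
printf 'alnum:\n%s\n' "$with_alnum"
printf 'alpha or digit:\n%s\n' "$with_union"
```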

Re-reading what I just wrote, I think I'd better add more, because it
may not just be a problem with shell globbing, but also a
misunderstanding on your part:

>> 
>> ### if [:alnum:] functions as ‘[:alpha:]’ *AND* ‘[:digit:]’, it should show
>> x1y1z123 only

In your sample, you specified a regex that matches exactly one byte.  It
matches all three lines, because "a" (in the "adc" line), "x" (in the
"x1y1z123" line), and "4" (in the "456" line) each fit the alnum
category.  Again, it is NOT a regex that specifies a multi-byte match
that must include at least one alpha byte and one digit byte.  It is a
bracket expression that specifies a set of possible matching bytes; the
set includes both alpha and digit bytes, but each match consumes only
one byte.
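
The behaviour described above can be reproduced with the three sample
lines named in this message. This is a hedged sketch, assuming a POSIX
shell and a grep with POSIX character classes; the variable names are
illustrative, and chaining two greps is just one way to ask for "alpha
AND digit", not something proposed in the report.

```shell
# The three sample lines named in the message above.
sample=$(printf '%s\n' adc x1y1z123 456)

# One bracket expression consumes one character, so a single letter or a
# single digit anywhere on the line is enough: all three lines match.
any_alnum=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:alnum:]]')

# Requiring a letter AND a digit on the same line takes two conditions;
# chaining two greps is one simple way to express that.
alpha_and_digit=$(printf '%s\n' "$sample" \
  | LC_ALL=C grep '[[:alpha:]]' \
  | LC_ALL=C grep '[[:digit:]]')

printf 'alnum:\n%s\n' "$any_alnum"
printf 'alpha and digit:\n%s\n' "$alpha_and_digit"
```

Only the chained form narrows the result to x1y1z123, which is the
output the report expected from '[[:alnum:]]' alone.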

In just the same way, you can say that the regex "[ab]" matches both "a"
and "b", or you can say that you have a match if either "a" or "b" is
encountered.  Both statements describe the same behavior; which
conjunction you pick is a matter of wording, whichever feels most
natural in the context where you describe the matching.
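
The two wordings for "[ab]" can be compared side by side. This is a
small sketch under the same assumptions (POSIX shell, POSIX grep); the
input lines and variable names are invented for the example. The
bracket expression and the explicit "either pattern" form, written with
two -e options, select the same lines.

```shell
# Hypothetical input: one line per letter.
sample=$(printf '%s\n' a b c)

# "[ab]" matches both "a" and "b" ...
bracket=$(printf '%s\n' "$sample" | grep '[ab]')

# ... which is the same as matching if either "a" or "b" is found.
either=$(printf '%s\n' "$sample" | grep -e 'a' -e 'b')

printf 'bracket:\n%s\n' "$bracket"
printf 'either:\n%s\n' "$either"
```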

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
