GNU bug report logs - #25336
Reported by: Zepp Lu <luzepu678 <at> gmail.com>
Date: Mon, 2 Jan 2017 17:32:03 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #10 received at 25336-done <at> debbugs.gnu.org:
Zepp Lu wrote:
> $ printf '\x53\xef' | grep -aoP '\x53\xef'
> (no output, returns 1)
> $ printf '\x53\xc3\xaf' | grep -aoP '\x53\xef'
> Sï
> $ printf '\x53\xc3\xef' | grep -aoP '\x53\xef'
> (no output, returns 1)
I don't see a bug here. PCRE patterns like \xef match code points, not bytes, so
the PCRE notation differs from the shell printf notation. If your locale uses
UTF-8, the PCRE pattern \xef matches the Unicode character U+00EF LATIN SMALL
LETTER I WITH DIAERESIS, which is represented by the byte pair C3 AF.
If you want \xef to match a single byte, run grep in a single-byte locale, e.g.,
set LC_ALL=C in the environment.
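
The distinction can be checked from the shell. The following is a minimal sketch, not part of the original report: it assumes a POSIX shell with od and tr, and a GNU grep built with PCRE support (-P). The variable name utf8_bytes is made up for illustration. Octal escapes are used with printf because not every printf implementation understands \x.

```shell
# U+00EF encoded in UTF-8 is the byte pair C3 AF (octal 303 257).
# Dump the bytes of "S" followed by that pair as hex, with spacing removed.
utf8_bytes=$(printf 'S\303\257' | od -An -tx1 | tr -d ' \n')
echo "$utf8_bytes"

# In a single-byte locale the PCRE escape \xef stands for the single byte EF
# (octal 357), so the first test case from the report can be retried like this.
# -c prints the number of matching lines instead of the raw bytes.
printf 'S\357\n' | LC_ALL=C grep -acP '\x53\xef'
```

Setting LC_ALL=C only for the grep command, as above, leaves the rest of the session in its usual locale.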
> grep (version 2.12-2) provided by Debian works just fine.
Actually, that version is buggy in this area. Sometimes it can dump core.
This bug report was last modified 8 years and 140 days ago.