GNU bug report logs - #25336

Package: grep

Reported by: Zepp Lu <luzepu678 <at> gmail.com>

Date: Mon, 2 Jan 2017 17:32:03 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#25336: closed ()
Date: Mon, 02 Jan 2017 18:31:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Mon, 2 Jan 2017 10:30:32 -0800
with message-id <5518f57d-bcd3-e2bc-7f1d-05ed2f68e8e4 <at> cs.ucla.edu>
and subject line Re: bug#25336:
has caused the debbugs.gnu.org bug report #25336,
regarding 
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
25336: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25336
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Zepp Lu <luzepu678 <at> gmail.com>
To: bug-grep <at> gnu.org
Date: Mon, 2 Jan 2017 21:22:59 +0800
[Message part 3 (text/plain, inline)]
OS: Archlinux
grep version: 2.27-1

Bug description: grep behaves weirdly when searching hex values.

How to reproduce:
$ printf '\x53\xef' | grep -aoP '\x53\xef'
(no output, returns 1)
$ printf '\x53\xc3\xaf' | grep -aoP '\x53\xef'
Sï
$ printf '\x53\xc3\xef' | grep -aoP '\x53\xef'
(no output, returns 1)

grep (version 2.12-2) provided by Debian works just fine.
[Message part 5 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Zepp Lu <luzepu678 <at> gmail.com>, 25336-done <at> debbugs.gnu.org
Subject: Re: bug#25336:
Date: Mon, 2 Jan 2017 10:30:32 -0800
Zepp Lu wrote:

> $ printf '\x53\xef' | grep -aoP '\x53\xef'
> (no output, returns 1)
> $ printf '\x53\xc3\xaf' | grep -aoP '\x53\xef'
> Sï
> $ printf '\x53\xc3\xef' | grep -aoP '\x53\xef'
> (no output, returns 1)

I don't see a bug here. PCRE patterns like \xef match code points, not bytes, so 
the PCRE notation differs from the shell printf notation. If your locale uses 
UTF-8, the PCRE pattern \xef matches the Unicode character U+00EF LATIN SMALL 
LETTER I WITH DIAERESIS, which is represented by the byte pair C3 AF.
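
As a worked illustration of where the byte pair C3 AF comes from, the two-byte UTF-8 form of a code point can be computed with shell arithmetic. This is only a sketch: utf8_2byte is a hypothetical helper name, not part of grep or PCRE, and it assumes the code point lies in the two-byte range U+0080 through U+07FF.

```shell
#!/bin/sh
# utf8_2byte is a hypothetical helper, valid only for U+0080..U+07FF.
# Two-byte UTF-8 layout: 110xxxxx 10xxxxxx
#   lead byte  = 0xC0 | (top 5 bits of the code point)
#   trail byte = 0x80 | (low 6 bits of the code point)
utf8_2byte() {
    printf '%02x %02x\n' "$(( 0xC0 | ($1 >> 6) ))" "$(( 0x80 | ($1 & 0x3F) ))"
}

# U+00EF LATIN SMALL LETTER I WITH DIAERESIS
utf8_2byte 0xEF   # prints: c3 af
```

So in a UTF-8 locale the pattern \xef corresponds to two input bytes, not to the single byte EF.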

If you want \xef to match a single byte, run grep in a single-byte locale, e.g., 
set LC_ALL=C in the environment.
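
A minimal sketch of that suggestion follows, assuming a grep built with -P (PCRE) support. match_bytes is a hypothetical wrapper name used only for illustration. The input is written with octal escapes (\123 is 0x53, \357 is 0xEF) because POSIX printf does not guarantee \x escapes, and od is used so the matched bytes are visible in hex.

```shell
#!/bin/sh
# match_bytes is a hypothetical wrapper, not part of grep.
# LC_ALL=C is set for this one grep invocation only, so the pattern
# \x53\xef is matched against single bytes rather than code points.
match_bytes() {
    LC_ALL=C grep -aoP '\x53\xef'
}

# Input is the two bytes 53 EF; dump whatever grep prints as hex.
printf '\123\357' | match_bytes | od -An -tx1
```

Setting the variable on the command line rather than exporting it keeps the rest of the session in its original locale.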

> grep (version 2.12-2) provided by Debian works just fine.

Actually, it's buggy in this area. Sometimes it can dump core.


This bug report was last modified 8 years and 140 days ago.
