GNU bug report logs - #25336

Package: grep

Reported by: Zepp Lu <luzepu678 <at> gmail.com>

Date: Mon, 2 Jan 2017 17:32:03 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 25336 in the body.
You can then email your comments to 25336 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#25336; Package grep. (Mon, 02 Jan 2017 17:32:03 GMT)

Acknowledgement sent to Zepp Lu <luzepu678 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 02 Jan 2017 17:32:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Zepp Lu <luzepu678 <at> gmail.com>
To: bug-grep <at> gnu.org
Date: Mon, 2 Jan 2017 21:22:59 +0800
OS: Arch Linux
grep version: 2.27-1

Bug description: grep behaves weirdly when searching hex values.

How to reproduce:
$ printf '\x53\xef' | grep -aoP '\x53\xef'
(no output, returns 1)
$ printf '\x53\xc3\xaf' | grep -aoP '\x53\xef'
Sï
$ printf '\x53\xc3\xef' | grep -aoP '\x53\xef'
(no output, returns 1)

grep (version 2.12-2) provided by Debian works just fine.

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Mon, 02 Jan 2017 18:31:02 GMT)

Notification sent to Zepp Lu <luzepu678 <at> gmail.com>:
bug acknowledged by developer. (Mon, 02 Jan 2017 18:31:02 GMT)

Message #10 received at 25336-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Zepp Lu <luzepu678 <at> gmail.com>, 25336-done <at> debbugs.gnu.org
Subject: Re: bug#25336:
Date: Mon, 2 Jan 2017 10:30:32 -0800
Zepp Lu wrote:

> $ printf '\x53\xef' | grep -aoP '\x53\xef'
> (no output, returns 1)
> $ printf '\x53\xc3\xaf' | grep -aoP '\x53\xef'
> Sï
> $ printf '\x53\xc3\xef' | grep -aoP '\x53\xef'
> (no output, returns 1)

I don't see a bug here. PCRE patterns like \xef match code points, not bytes, so 
the PCRE notation differs from the shell printf notation. If your locale uses 
UTF-8, the PCRE pattern \xef matches the Unicode character U+00EF LATIN SMALL 
LETTER I WITH DIAERESIS, which is represented by the byte pair C3 AF.

If you want \xef to match a single byte, run grep in a single-byte locale, e.g., 
set LC_ALL=C in the environment.

> grep (version 2.12-2) provided by Debian works just fine.

Actually, it's buggy in this area. Sometimes it can dump core.
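
The single-byte-locale suggestion above can be sketched as follows. This is only an illustration, assuming a GNU grep built with PCRE support (-P) and a POSIX od; the function name match_bytes_c_locale is hypothetical. It feeds grep the two bytes 53 EF from the first command in the report and dumps whatever grep matched as hex.

```shell
#!/bin/sh
# The input is the two bytes 0x53 0xEF, written as octal 123 357 because
# POSIX printf defines octal escapes but not \x escapes.

# Hypothetical helper: run the reporter's pattern in the C locale.
# There every byte is one character, so the PCRE escape \xef should match
# the single byte 0xEF. od then shows the matched bytes in hex
# (grep -o appends a newline, 0a, after the match).
match_bytes_c_locale() {
    printf '\123\357' | LC_ALL=C grep -aoP '\x53\xef' | od -An -tx1
}

match_bytes_c_locale
```

In a UTF-8 locale the same pattern instead looks for S followed by U+00EF, that is, the bytes 53 C3 AF, which is why the first command in the report prints nothing and the second one matches.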




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 31 Jan 2017 12:24:03 GMT)
