GNU bug report logs - #18266
grep -P and invalid exits with error

Package: grep;

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Santiago <santiago <at> debian.org>
Subject: bug#18266: closed (Re: bug#18266: Bug#758105: bug#18266: grep -P
 and invalid exits with error)
Date: Thu, 11 Sep 2014 17:08:02 +0000

[Message part 1 (text/plain, inline)]

Your bug report

#18266: grep -P and invalid exits with error 

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 18266 <at> debbugs.gnu.org.

-- 
18266: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18266
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems

[Message part 2 (message/rfc822, inline)]

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Vincent Lefevre <vincent <at> vinc17.net>, Santiago <santiago <at> debian.org>
Cc: 18266-done <at> debbugs.gnu.org, 758105 <at> bugs.debian.org, 761157 <at> bugs.debian.org
Subject: Re: bug#18266: Bug#758105: bug#18266: grep -P and invalid exits with
 error
Date: Thu, 11 Sep 2014 10:07:49 -0700

[Message part 3 (text/plain, inline)]

Vincent Lefevre wrote:

> I've just reported a new Debian bug concerning the performance problem.

It's not clear from http://bugs.debian.org/761157 that the performance 
problem occurs only with -P, but I assume that's what is meant.

Since this is a performance bug with PCRE, I suggest moving the Debian 
bug report to the Debian libpcre3 package.  Grep cannot go back to the 
old way, which could cause grep to crash, and the bug cannot be fixed in 
grep because libpcre3 does not provide a fast way to search arbitrary 
data that may include encoding errors.  It really is a problem that 
requires changes to libpcre3 to fix; grep cannot fix it.

In the meantime, in order to use 'grep' to search for strings in 
arbitrary data, I suggest omitting the '-P'.  Also, I suggest using the 
C locale.
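
The workaround above can be sketched as follows. This is only an illustration: the helper name count_j_lines is hypothetical, the input reuses the bytes from the original report, and the -a option is added here as an assumption so that grep treats the data as text regardless of version.

```shell
# Sketch of the suggested workaround: no -P, and the C locale.
# count_j_lines is a hypothetical helper name, not part of grep.
count_j_lines() {
    # \202 is the octal form of byte 0x82, which is not valid UTF-8 on its own.
    # -a treats the input as text; -c prints the number of matching lines.
    printf 'j\202\nj\n' | LC_ALL=C grep -a -c j
}

count_j_lines   # prints 2: both lines contain "j"
```

In the C locale grep matches byte by byte, so the stray 0x82 byte does not trigger any UTF-8 validity check.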

As the GNU bug 18266 "grep -P and invalid exits with error" has been 
fixed, I'm closing that bug report.  Please feel free to open a separate 
GNU bug report for the performance issue.

PS.  While composing this email I noticed another bug in grep -P and 
encoding errors, which I fixed by installing the attached patch.

[0001-grep-fix-false-matches-with-P-.-and-invalid-UTF-8.patch (text/plain, attachment)]

[Message part 5 (message/rfc822, inline)]

From: Santiago <santiago <at> debian.org>
To: bug-grep <at> gnu.org
Cc: 758105 <at> bugs.debian.org
Subject: grep -P and invalid exits with error 
Date: Thu, 14 Aug 2014 17:42:57 +0200

Hi,

Please revert ca7868cc27db3d9deafaa2e0ac5a2bb0aa8ef373

That commit (re)introduced a regression bug (see http://debbugs.gnu.org/15758).
pcresearch again checks whether the input is valid UTF-8. The problem is that
binary files are not valid UTF-8, so grep -P, in Unicode locales, exits
with an error:

LANG=en_US.UTF-8 grep -P -r x /usr/bin/
grep: invalid UTF-8 byte sequence in input



printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?
grep: invalid UTF-8 byte sequence in input
0

should be:
printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 src/grep -P j|cat -A; echo $?
jM-^B$
j$
0
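
Note that in the commands above, `echo $?` reports the status of the last command in the pipeline (cat), not that of grep. A minimal sketch of the difference, using hypothetical helper names and a pattern that deliberately does not match:

```shell
# Sketch: $? after a pipeline is the exit status of its LAST command.
# pipeline_status and grep_only_status are hypothetical helper names.
pipeline_status() {
    # grep finds no match (status 1), but cat succeeds, so this prints 0.
    printf 'j\n' | grep z | cat
    echo $?
}

grep_only_status() {
    # grep is the last command here, so its own status (1) is captured.
    status=0
    printf 'j\n' | grep z >/dev/null || status=$?
    echo "$status"
}

pipeline_status    # prints 0
grep_only_status   # prints 1
```

To observe grep's own exit status in the examples from the report, grep would have to be the last command in the pipeline.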

Tested on Debian and Archlinux with pcre 8.35.

Thanks,

Santiago

This bug report was last modified 10 years and 347 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.