GNU bug report logs - #18266
grep -P and invalid exits with error


Package: grep;

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #49 received at 18266 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Santiago <santiago <at> debian.org>, 758105 <at> bugs.debian.org
Cc: 18266 <at> debbugs.gnu.org, Vincent Lefevre <vincent <at> vinc17.net>
Subject: Re: grep -P and invalid exits with error
Date: Fri, 29 Aug 2014 06:43:45 -0700
Thanks, but that patch seems to depend on libpcre internals, in that it 
"knows" that pcre_exec cannot possibly succeed without first checking 
its entire input buffer for invalid UTF-8 bytes.  Even if that's true 
now, it reflects a performance bug that might be fixed in a future 
libpcre version.

Also, I don't see why grep needs to copy the buffer when there's an 
encoding error.  Why not simply rerun the matcher on the initial prefix 
that doesn't have an encoding-error byte, and then (if that doesn't find 
a match), try matching the suffix after the encoding-error byte?  This 
approach would not only avoid the buffer copy, it would avoid knowledge 
of libpcre internals.
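
The prefix/suffix idea above can be sketched in C roughly as follows. This is only an illustration, not grep's actual code: toy_match is a hypothetical stand-in for a matcher such as pcre_exec, the byte 0xFF stands in for any encoding-error byte, and the return codes are made up for the example. The wrapper relies only on the matcher reporting the offset of an offending byte, not on when or how the matcher checks its input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, invented for this sketch. */
enum { MATCH_FOUND = 0, MATCH_NONE = -1, MATCH_BAD_ENCODING = -2 };

/* Hypothetical stand-in for the real matcher.  It searches buf[0..len)
   for the literal NEEDLE.  If the buffer contains the byte 0xFF (used
   here as a stand-in for an encoding error), it stores the offset of
   the first such byte in *bad_off and returns MATCH_BAD_ENCODING.  */
static int
toy_match (const char *buf, size_t len, const char *needle, size_t *bad_off)
{
  size_t nlen = strlen (needle);

  for (size_t i = 0; i < len; i++)
    if ((unsigned char) buf[i] == 0xFF)
      {
        *bad_off = i;
        return MATCH_BAD_ENCODING;
      }

  for (size_t i = 0; nlen <= len && i <= len - nlen; i++)
    if (memcmp (buf + i, needle, nlen) == 0)
      return MATCH_FOUND;

  return MATCH_NONE;
}

/* Match in place, without copying the buffer.  On an encoding error,
   rerun the matcher on the prefix before the bad byte; if that finds
   nothing, continue with the suffix after the bad byte.  A match
   never spans an encoding-error byte.  */
static int
match_skipping_errors (const char *buf, size_t len, const char *needle)
{
  for (;;)
    {
      size_t bad = 0;
      int r = toy_match (buf, len, needle, &bad);
      if (r != MATCH_BAD_ENCODING)
        return r;

      /* The matcher must report an offset inside the buffer.  */
      assert (bad < len);

      /* Prefix: strictly shorter than the buffer, so this terminates.  */
      if (match_skipping_errors (buf, bad, needle) == MATCH_FOUND)
        return MATCH_FOUND;

      /* Suffix: skip past the bad byte and try again.  */
      buf += bad + 1;
      len -= bad + 1;
    }
}
```

For example, match_skipping_errors ("abc\xFFneedle", 10, "needle") finds the match in the suffix, while a buffer in which the bad byte splits the word "needle" in two yields no match.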




This bug report was last modified 10 years and 249 days ago.


