GNU bug report logs - #18266
grep -P and invalid exits with error


Package: grep;

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #37 received at 18266 <at> debbugs.gnu.org (full text, mbox):

From: Santiago <santiago <at> debian.org>
To: Vincent Lefevre <vincent <at> vinc17.net>
Cc: 18266 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>,
 758105 <at> bugs.debian.org
Subject: Re: Bug#758105: bug#18266: Bug#758105: bug#18266: grep -P and
 invalid exits with error
Date: Sat, 16 Aug 2014 19:56:37 +0200
On 16/08/14 at 18:26, Vincent Lefevre wrote:
> On 2014-08-16 16:01:27 +0200, Santiago wrote:
> > Workaround attached. It's too slow against binary files, but I haven't
> > found a simpler solution.
> 
> To avoid the slowness, I think that it would be better to detect
> (directly, not via PCRE) invalid UTF-8 sequences and replace them
> by null bytes *in-place*.
> 
> It might slow down the general case, though. However I'm not sure,
> because if the UTF8 validity check (via the replacement of invalid
> sequences) is done in grep, it doesn't need to be done in PCRE.
> 

I think that would require similar work to replace the "invalid" content
of binary files.
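
For illustration, the in-place replacement suggested in the quoted text
could look roughly like the sketch below. It is not code from grep or
from the attached workaround; the names utf8_seq_len and
scrub_invalid_utf8 are hypothetical, and the validity rules are assumed
to be those of RFC 3629 (no overlong forms, no surrogates, nothing above
U+10FFFF). Each byte that does not start a valid sequence is overwritten
with a null byte, so the buffer keeps its length.

```c
#include <stddef.h>

/* Hypothetical helper: return the length (1 to 4) of the valid UTF-8
   sequence starting at p, or 0 if the bytes at p do not start one.
   The caller guarantees avail >= 1. */
static size_t utf8_seq_len(const unsigned char *p, size_t avail)
{
    unsigned char c = p[0];
    unsigned char lo = 0x80, hi = 0xBF;  /* allowed range of 2nd byte */
    size_t need;

    if (c < 0x80)
        return 1;                        /* ASCII */
    else if (c >= 0xC2 && c <= 0xDF)
        need = 2;                        /* 0xC0 and 0xC1 are overlong */
    else if (c >= 0xE0 && c <= 0xEF) {
        need = 3;
        if (c == 0xE0) lo = 0xA0;        /* reject overlong forms */
        if (c == 0xED) hi = 0x9F;        /* reject UTF-16 surrogates */
    } else if (c >= 0xF0 && c <= 0xF4) {
        need = 4;
        if (c == 0xF0) lo = 0x90;        /* reject overlong forms */
        if (c == 0xF4) hi = 0x8F;        /* reject values > U+10FFFF */
    } else
        return 0;                        /* stray continuation or 0xF5+ */

    if (avail < need)
        return 0;                        /* truncated at end of buffer */
    if (p[1] < lo || p[1] > hi)
        return 0;
    for (size_t i = 2; i < need; i++)
        if (p[i] < 0x80 || p[i] > 0xBF)
            return 0;
    return need;
}

/* Hypothetical entry point: overwrite every byte that does not belong
   to a valid UTF-8 sequence with '\0', in place.  Returns the number
   of bytes replaced. */
size_t scrub_invalid_utf8(unsigned char *buf, size_t len)
{
    size_t replaced = 0;
    size_t i = 0;

    while (i < len) {
        size_t n = utf8_seq_len(buf + i, len - i);
        if (n == 0) {
            buf[i++] = 0;
            replaced++;
        } else {
            i += n;
        }
    }
    return replaced;
}
```

A caller would run scrub_invalid_utf8 over each buffer before handing it
to the matcher. Whether this is cheaper overall than letting PCRE
validate the input is the open question raised above; the sketch makes a
single pass over the data and allocates nothing.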

Another solution would be to skip the UTF-8 validity check on binary
files (by passing PCRE_NO_UTF8_CHECK to pcre_exec), but I don't know
whether that would avoid security holes, and I don't know how to do it
either.

Regards,

Santiago




This bug report was last modified 10 years and 249 days ago.
