GNU bug report logs - #18266
grep -P and invalid exits with error


Package: grep;

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #34 received at 18266 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Santiago <santiago <at> debian.org>
Cc: 18266 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>,
 758105 <at> bugs.debian.org
Subject: Re: Bug#758105: bug#18266: Bug#758105: bug#18266: grep -P and
 invalid exits with error
Date: Sat, 16 Aug 2014 18:26:21 +0200
On 2014-08-16 16:01:27 +0200, Santiago wrote:
> Workaround attached. It's too slow against binary files, but I haven't
> found a simpler solution.

To avoid the slowness, I think it would be better to detect invalid
UTF-8 sequences directly (not via PCRE) and replace them with null
bytes *in-place*.

It might slow down the general case, though. However, I'm not sure,
because if the UTF-8 validity check (via the replacement of invalid
sequences) is done in grep, it doesn't need to be done again in PCRE.
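
As a rough illustration of the idea above, here is a minimal sketch in C
of an in-place scrub. It is not grep's actual code: the names
utf8_seq_len and scrub_invalid_utf8 are hypothetical, and the policy of
zeroing one byte at a time and then rescanning is an assumption. It
also treats a sequence cut off at the end of the buffer as invalid,
which a real implementation reading a file in chunks would have to
handle differently.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if b is a UTF-8 continuation byte (10xxxxxx). */
static int is_cont(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence
   starting at p, or 0 if no well-formed sequence starts there.
   avail is the number of readable bytes at p and must be >= 1. */
static size_t utf8_seq_len(const unsigned char *p, size_t avail)
{
    unsigned char c = p[0];

    if (c < 0x80)
        return 1;                       /* ASCII */
    if (c < 0xC2)
        return 0;                       /* stray continuation or overlong lead */
    if (c < 0xE0)
        return (avail >= 2 && is_cont(p[1])) ? 2 : 0;
    if (c < 0xF0) {
        if (avail < 3 || !is_cont(p[1]) || !is_cont(p[2]))
            return 0;
        if (c == 0xE0 && p[1] < 0xA0)
            return 0;                   /* overlong 3-byte form */
        if (c == 0xED && p[1] >= 0xA0)
            return 0;                   /* UTF-16 surrogate range */
        return 3;
    }
    if (c < 0xF5) {
        if (avail < 4 || !is_cont(p[1]) || !is_cont(p[2]) || !is_cont(p[3]))
            return 0;
        if (c == 0xF0 && p[1] < 0x90)
            return 0;                   /* overlong 4-byte form */
        if (c == 0xF4 && p[1] >= 0x90)
            return 0;                   /* above U+10FFFF */
        return 4;
    }
    return 0;                           /* 0xF5..0xFF never appear in UTF-8 */
}

/* Overwrite every byte that does not begin a well-formed UTF-8 sequence
   with a null byte, in place. Returns the number of bytes replaced. */
size_t scrub_invalid_utf8(unsigned char *buf, size_t len)
{
    size_t i = 0, replaced = 0;

    while (i < len) {
        size_t n = utf8_seq_len(buf + i, len - i);
        if (n == 0) {
            buf[i] = 0;                 /* invalid: zero this byte, rescan from the next */
            replaced++;
            i++;
        } else {
            i += n;                     /* valid sequence: skip it untouched */
        }
    }
    return replaced;
}
```

After such a pass the buffer contains only well-formed UTF-8 (plus null
bytes), which is the condition under which the validity check would not
need to be repeated in PCRE. The buffer length is unchanged, so offsets
into it remain meaningful.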

-- 
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






