GNU bug report logs - #18266
grep -P and invalid exits with error

Package: grep

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Message #143 received at 18266 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 758105 <at> bugs.debian.org, Vincent Lefevre <vincent <at> vinc17.net>,
 18266 <at> debbugs.gnu.org
Subject: Re: bug#18266: handling bytes not part of the charset,
 and other garbage
Date: Fri, 12 Sep 2014 15:23:08 -0700
On Fri, Sep 12, 2014 at 2:39 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 09/12/2014 02:29 PM, Vincent Lefevre wrote:
>
>> an option to control what happens on encoding errors would be better and
>> sufficient.
>
>
> It might suffice for your use cases, but it's more complicated and less
> flexible than being able to match bytes within the regular expression.
> (Plus, someone would have to implement it, which is perhaps the biggest
> objection to either approach ....)  But I take your point that \C is best
> avoided.  This whole area is pretty hairy, I'm afraid.
>
> Speaking of hairy, why doesn't grep use PCRE_MULTILINE?  Using
> PCRE_MULTILINE shouldn't be that hard, and should boost performance quite a
> bit in typical usage.  Or am I being too optimistic here?

When I first saw that implementation, I assumed it was just a first-cut one.
I see no reason not to use PCRE_MULTILINE, but haven't tried it, either.
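
To make the trade-off discussed above concrete, here is a minimal sketch that is not grep's code. It uses Python's re module, whose re.MULTILINE flag is only an analog of PCRE_MULTILINE, and the helper names match_per_line and match_multiline are hypothetical. The first helper calls the matcher once per line; the second searches the whole buffer and then recovers the enclosing line of each hit.

```python
import re

def match_per_line(pattern, text):
    """Run the matcher once per line (one regex call for every line)."""
    rx = re.compile(pattern)
    return [line for line in text.split("\n") if rx.search(line)]

def match_multiline(pattern, text):
    """Search the whole buffer with MULTILINE, then recover each matching line."""
    rx = re.compile(pattern, re.MULTILINE)  # ^ and $ now match at line boundaries
    lines = []
    pos = 0
    while pos <= len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        # Expand the match position to the boundaries of its enclosing line.
        start = text.rfind("\n", 0, m.start()) + 1
        end = text.find("\n", m.start())
        if end == -1:
            end = len(text)
        lines.append(text[start:end])
        pos = end + 1  # resume at the start of the next line
    return lines

sample = "foo\nbar foo\nfoo"
print(match_per_line(r"^foo$", sample))
print(match_multiline(r"^foo$", sample))
```

For anchored patterns like the one above, both helpers select the same lines, but the second makes far fewer matcher calls when most lines do not match. One caveat of this sketch: a pattern that can itself match a newline (for example via \s) may match across lines in the whole-buffer version, so the two approaches are not equivalent for every pattern.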





