GNU bug report logs - #18266
grep -P and invalid exits with error


Package: grep

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Vincent Lefevre <vincent <at> vinc17.net>
Cc: 18266 <at> debbugs.gnu.org, 758105 <at> bugs.debian.org
Subject: bug#18266: handling bytes not part of the charset, and other garbage
Date: Thu, 11 Sep 2014 20:26:12 -0700

Vincent Lefevre wrote:

> ypig% LC_ALL=C locale charmap
> ANSI_X3.4-1968

That may be what the 'locale' command says, but bytes with the top bit 
on are considered to be valid single-byte characters.  There are no 
encoding errors.  So, in that sense it's not strict ASCII.

> the current behavior breaks the sometimes used "grep ." solution
> to match non-empty lines.

"grep ." matches lines containing one or more characters.  Encoding 
errors are not characters, at least, not as far as plain grep is concerned.

Perhaps PCRE is different, and if libpcre worked with encoding errors we 
could simply use its way of matching them.  But there doesn't seem to be 
a safe way to do that.
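
The distinction drawn in the message between a byte that is a character and a byte that is an encoding error can be tried from a shell. The following is only a sketch: the test byte 0xE9 and the locale name en_US.UTF-8 are assumptions chosen for illustration, and the UTF-8 result depends on the grep version and on whether that locale is installed.

```shell
# A line whose only content is the byte 0xE9 (octal 351).
# On its own this byte is not a valid UTF-8 sequence.

# In the C locale, bytes with the top bit on are treated as valid
# single-byte characters, so "." matches and the line is counted.
c_count=$(printf '\351\n' | LC_ALL=C grep -c .)
echo "C locale: $c_count matching line(s)"

# In a UTF-8 locale (assumed here to be en_US.UTF-8) the same byte is
# an encoding error rather than a character, so "." may not match it.
# "|| true" keeps the script going when grep exits 1 for zero matches.
u_count=$(printf '\351\n' | LC_ALL=en_US.UTF-8 grep -c . || true)
echo "UTF-8 locale: $u_count matching line(s)"
```

With a recent GNU grep the first command should count the line. The second count is left unstated because it varies with the environment.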




This bug report was last modified 10 years and 248 days ago.


