GNU bug report logs - #18266
grep -P and invalid exits with error


Package: grep

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Vincent Lefevre <vincent <at> vinc17.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 18266 <at> debbugs.gnu.org, 758105 <at> bugs.debian.org
Subject: bug#18266: handling bytes not part of the charset, and other garbage
Date: Sat, 13 Sep 2014 03:17:41 +0200

On 2014-09-12 17:57:39 -0700, Paul Eggert wrote:
> Currently, for example, the tz package <http://www.iana.org/time-zones> has
> a Make rule 'check_character_set' that verifies that the source files are
> all properly encoded.  It executes this shell command:
> 
> ! grep -nv '^.*$' file names
> 
> This relies on GNU grep's behavior that "." does not match an encoding
> error.  But it's a command that is not obvious.  It'd be simpler and clearer
> to write this:
> 
> ! grep -n '[[:error:]]' file names
> 
> if such a feature were available.

But both of these solutions have the drawback of working only in
UTF-8 locales. One may wonder whether grep is the right tool, as
"iconv -f UTF-8 -t UTF-8" can do such a check in any locale.
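
A minimal sketch of the iconv-based check mentioned above, assuming an
iconv implementation (such as the glibc or libiconv one) that exits with a
non-zero status on an invalid input sequence. The function name check_utf8
is hypothetical and is not part of grep, iconv, or the tz package.

```shell
#!/bin/sh
# Report every file that is not valid UTF-8, regardless of the current locale.
# check_utf8 is a hypothetical helper name used only for this illustration.
check_utf8() {
    status=0
    for f in "$@"; do
        # Converting UTF-8 to UTF-8 changes nothing, but iconv still
        # validates the input and fails on an invalid byte sequence.
        if ! iconv -f UTF-8 -t UTF-8 < "$f" > /dev/null 2>&1; then
            printf '%s: invalid UTF-8\n' "$f"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_utf8 file names` returns 0 when every file decodes cleanly
and 1 otherwise. Unlike the grep commands quoted above, it does not print
the offending line numbers.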

-- 
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






