GNU bug report logs -
#18266
grep -P and invalid exits with error
Reported by: Santiago <santiago <at> debian.org>
Date: Thu, 14 Aug 2014 15:43:02 UTC
Severity: wishlist
Merged with 18455
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Vincent Lefevre wrote:
> I wonder whether anyone is interested in matching individual bytes
> in a file regarded as UTF-8 encoded. This seems weird.
It's not weird at all. For example, suppose we invent the notation
[[:error:]] to match encoding errors. Then the pattern '[[:error:]]'
would match all encoding errors in a file, which could well be a useful
thing.
Currently, for example, the tz package <http://www.iana.org/time-zones>
has a Make rule 'check_character_set' that verifies that the source
files are all properly encoded. It executes this shell command:
    ! grep -nv '^.*$' file names
This relies on GNU grep's behavior that "." does not match an encoding
error. But it's a command that is not obvious. It'd be simpler and
clearer to write this:
    ! grep -n '[[:error:]]' file names
if such a feature were available.
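To make the existing idiom concrete, here is a minimal sketch of the current check wrapped in a shell function. It assumes GNU grep and at least one installed UTF-8 locale. The function name check_character_set is borrowed from the Make rule mentioned above, but the function body, the locale lookup and the two sample files are illustrative assumptions, not code from the tz package.

```shell
# Sketch only: assumes GNU grep and an installed UTF-8 locale.
# Pick any UTF-8 locale that `locale -a` reports (empty if none is found).
utf8_locale=$(locale -a 2>/dev/null | grep -i -m1 'utf-\{0,1\}8$' || true)

# Hypothetical helper: succeed only if no line in the named files
# contains an encoding error.  '^.*$' fails to match such a line because
# "." does not match an encoding error, so -v selects it and grep exits 0.
# Note that "!" also turns a grep failure (exit status 2) into success.
check_character_set() {
  ! LC_ALL="$utf8_locale" grep -nv '^.*$' "$@" >/dev/null 2>&1
}

# Two throwaway sample files (illustrative data).
tmpdir=$(mktemp -d)
printf 'plain ASCII line\n' > "$tmpdir/good.txt"
printf 'bad byte: \377\n' > "$tmpdir/bad.txt"   # 0xFF never occurs in valid UTF-8

if check_character_set "$tmpdir/good.txt"; then good_status=0; else good_status=1; fi
if check_character_set "$tmpdir/bad.txt"; then bad_status=0; else bad_status=1; fi

echo "good.txt status: $good_status"
echo "bad.txt status: $bad_status"
rm -rf "$tmpdir"
```

With GNU grep running in a UTF-8 locale, the first check should succeed and the second should fail. In a locale where every byte is a valid character, such as a plain C locale, both checks may succeed, which is part of why the idiom is not obvious.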
This bug report was last modified 10 years and 249 days ago.