GNU bug report logs - #18266
grep -P and invalid exits with error
Reported by: Santiago <santiago <at> debian.org>
Date: Thu, 14 Aug 2014 15:43:02 UTC
Severity: wishlist
Merged with 18455
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
On 2014-09-11 18:16:29 -0700, Paul Eggert wrote:
> Vincent Lefevre wrote:
> >the C locale corresponds to ANSI_X3.4-1968,
>
> No it doesn't, at least not on any current platform I'm aware of.
It does on Debian:
ypig% LC_ALL=C locale charmap
ANSI_X3.4-1968
> >I would say that this should be the same for invalid
> >byte sequences in a UTF-8 locale.
>
> One *could* design an encoding with that property, but it wouldn't be UTF-8;
> it would be something else. I don't know of any C library that does that to
> UTF-8. There are good arguments against doing it, e.g., one loses the
> property that one can concatenate character strings by concatenating their
> byte representations.
I'm talking only about grep here.
BTW, the current behavior breaks the sometimes-used "grep ." idiom
for matching non-empty lines.
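
To make the two positions above concrete, here is a small Python sketch.
It is only an analogy and does not reproduce grep's implementation: the
function names are hypothetical, and Python's "surrogateescape" error
handler merely stands in for "treat each invalid byte as a character".
It shows why a strictly character-based "." can miss a non-empty line
that contains an invalid byte, and why counting invalid bytes as
characters loses the concatenation property Paul mentions.

```python
import re


def dot_matches_bytes(line: bytes) -> bool:
    """Byte-oriented view: '.' matches any single byte except newline."""
    return re.search(rb".", line) is not None


def dot_matches_strict_utf8(line: bytes) -> bool:
    """Strict character view (an assumption for illustration): a line that
    does not decode as UTF-8 contains no characters for '.' to match."""
    try:
        text = line.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return re.search(r".", text) is not None


def lenient_char_count(data: bytes) -> int:
    """Count characters when every invalid byte is kept as one character."""
    return len(data.decode("utf-8", errors="surrogateescape"))


if __name__ == "__main__":
    # A non-empty line holding only an invalid byte: the two views disagree.
    for line in (b"abc", b"", b"\xff"):
        print(line, dot_matches_bytes(line), dot_matches_strict_utf8(line))

    # Each half of a two-byte UTF-8 sequence is invalid on its own, so the
    # lenient count of the parts differs from the count of the whole.
    a, b = b"\xc3", b"\xa9"
    print(lenient_char_count(a) + lenient_char_count(b), lenient_char_count(a + b))
```

In this sketch the lenient view keeps "." matching every non-empty line,
while the strict view does not; conversely, only the strict view keeps
character counts additive under byte concatenation.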
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
This bug report was last modified 10 years and 248 days ago.