GNU bug report logs - #18266
grep -P and invalid exits with error


Package: grep

Reported by: Santiago <santiago <at> debian.org>

Date: Thu, 14 Aug 2014 15:43:02 UTC

Severity: wishlist

Merged with 18455

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #108 received at 18266 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 18266 <at> debbugs.gnu.org, Santiago <santiago <at> debian.org>,
 758105 <at> bugs.debian.org
Subject: Re: handling bytes not part of the charset, and other garbage
Date: Fri, 12 Sep 2014 02:36:59 +0200
On 2014-09-11 09:22:49 -0700, Paul Eggert wrote:
> Vincent Lefevre wrote:
> 
> >There's no reason that '.' matches something that doesn't belong to
> >the charset in C locale, but doesn't match in a UTF-8 locale.
> 
> In the C locale on GNU/Linux, all byte values are members of the charset.

I don't see any valid reason for that, except that it could be seen
as more practical: the C locale corresponds to ANSI_X3.4-1968, which
is 7-bit only, so treating every byte value as a member of the charset
is inconsistent. But if that is the rule, I would say that the same
should apply to invalid byte sequences in a UTF-8 locale.
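
To make the inconsistency concrete, here is a minimal sketch in Python. It
is not grep's implementation and says nothing about how grep or glibc
classify bytes; it only checks which byte strings are valid under the two
encodings named above. The helper name is_valid and the sample bytes are
chosen for illustration, and Python's "ascii" codec stands in for
ANSI_X3.4-1968.

```python
def is_valid(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes cleanly under `encoding` (hypothetical helper)."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Illustrative samples only; none of these come from the bug report.
samples = {
    "ASCII letter": b"a",
    "lone high byte 0x80": b"\x80",
    "valid UTF-8 for U+00E9": b"\xc3\xa9",
    "truncated UTF-8 sequence": b"\xc3",
}

for label, data in samples.items():
    # "ascii" stands in for ANSI_X3.4-1968, the 7-bit charset of the C locale.
    print(f"{label}: ascii={is_valid(data, 'ascii')}, utf-8={is_valid(data, 'utf-8')}")
```

Under a strict 7-bit reading, 0x80 is no more a member of the C locale's
charset than it is a valid sequence in UTF-8, which is the asymmetry being
questioned here.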

-- 
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)





