GNU bug report logs - #78276
grep on file with 0xF3 byte in utf-8 locale


Package: grep;

Reported by: Arkadiusz Miśkiewicz <arekm <at> maven.pl>

Date: Tue, 6 May 2025 07:39:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #10 received at 78276-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Arkadiusz Miśkiewicz <arekm <at> maven.pl>
Cc: 78276-done <at> debbugs.gnu.org
Subject: Re: bug#78276: grep on file with 0xF3 byte in utf-8 locale
Date: Tue, 6 May 2025 02:12:25 -0700

On 2025-05-06 00:37, Arkadiusz Miśkiewicz via Bug reports for GNU grep wrote:
> Is that expected behavior, no binary file warning and no matching with 
> utf-8 locale, even with -a?

It's allowed behavior, as '.' need not match encoding errors.[1] Also, 
'grep' need not diagnose encoding errors that don't harm the output.[2]

As you mentioned in your email, using LC_ALL=C should let '.' match any 
byte, so that should let you do what you want.
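
To illustrate the distinction outside of grep itself, here is a minimal Python sketch. It is only an analogy and not how grep is implemented: the sample bytes and the helper name is_valid_utf8 are hypothetical, and Python's bytes-mode regex stands in for the byte-oriented matching you get under LC_ALL=C. A lone 0xF3 byte opens a four-byte UTF-8 sequence, so without its continuation bytes it is an encoding error rather than a character.

```python
import re

# Hypothetical sample line: 0xF3 followed by a newline is an incomplete
# UTF-8 sequence, i.e. an encoding error in a UTF-8 locale.
data = b"caf\xf3\n"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Character-oriented view: the line contains an encoding error, so a
# character-level '.' has no valid character to match at that position.
has_encoding_error = not is_valid_utf8(data)

# Byte-oriented view (analogous to LC_ALL=C): in a bytes pattern, '.'
# matches any single byte other than newline, including 0xF3.
byte_match = re.search(rb"caf.", data)

print("encoding error present:", has_encoding_error)
print("byte-level match:", byte_match.group() if byte_match else None)
```

The same bytes are therefore unmatched by '.' when interpreted as UTF-8 characters, yet matched when every byte is treated as its own character, which is the effect LC_ALL=C has on grep.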

[1]: https://www.gnu.org/software/grep/manual/html_node/Fundamental-Structure.html
[2]: https://www.gnu.org/software/grep/manual/html_node/File-and-Directory-Selection.html






