GNU bug report logs - #22838
New 'Binary file' detection considered harmful


Package: grep

Reported by: Marcello Perathoner <marcello <at> perathoner.de>

Date: Sun, 28 Feb 2016 18:13:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 22838 <at> debbugs.gnu.org, Marcello Perathoner <marcello <at> perathoner.de>
Subject: bug#22838: New 'Binary file' detection considered harmful
Date: Mon, 29 Feb 2016 17:23:45 -0800

On Mon, Feb 29, 2016 at 3:35 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 02/29/2016 12:34 PM, Marcello Perathoner wrote:
...
>> Since 2.21 I will now have to always specify -a or LC_ALL=C when
>> grepping my files.
>
> I suggest using -a. LC_ALL=C won't work the way that you want on platforms
> where the C locale is UTF-8, or is pure ASCII. For example, on Fedora 23 or
> RHEL 7 with grep 2.23 we have:
>
> $ printf '\200\n' | LC_ALL=C grep .
> Binary file (standard input) matches
>
> This is because the C locale is pure ASCII on these platforms, i.e., '\200'
> is not a valid character the way it is with traditional Unix.  I don't know
> why Red Hat made that change.

Wow. I hadn't noticed that using LC_ALL=C is inadequate.
Disturbing...
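
For reference, here is a minimal sketch of the -a workaround suggested in the quoted text. It assumes GNU grep and a POSIX shell; the helper name count_text_matches and the sample input are illustrative only and do not come from the report. It uses -a to force text handling instead of relying on LC_ALL=C, so it should not depend on how the platform's C locale treats the byte '\200'.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): count the lines on stdin
# that match a pattern, always treating the input as text via -a (--text).
count_text_matches() {
  grep -a -c -- "$1"
}

# '\200' is not a valid character in UTF-8 or in a pure-ASCII C locale,
# so without -a grep 2.21 and later may report "Binary file ... matches".
matches=$(printf 'abc\200def\n' | count_text_matches 'abc')
printf '%s\n' "$matches"
```

Dropping -c prints the matching line itself, raw byte included, rather than the "Binary file (standard input) matches" notice.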




This bug report was last modified 8 years and 256 days ago.


