GNU bug report logs - #20526
BUG: text file is detected as binary

Package: grep;

Reported by: Sebastian Poehn <sebastian.poehn <at> gmail.com>

Date: Thu, 7 May 2015 15:41:03 UTC

Severity: normal

Merged with 19230, 19985, 21558

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Kamil Dudka <kdudka <at> redhat.com>
To: Eric Blake <eblake <at> redhat.com>
Cc: 20526 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, sebastian.poehn <at> gmail.com
Subject: bug#20526: BUG: text file is detected as binary
Date: Mon, 11 May 2015 13:05:23 +0200

On Thursday 07 May 2015 13:11:49 Eric Blake wrote:
> On 05/07/2015 11:47 AM, Sebastian Pöhn wrote:
> > Thanks for this fast feedback. Your explanation sounds very reasonable. As
> > you may have noticed, this is a makefile out of openwrt which is mainlined
> > there.
> > 
> > 1) I downgraded to grep 2.20. Issue is gone with the same environment. So
> > this is in my eyes a regression.
> 
> No, it is a bug fix, and documented in the NEWS:
> 
>   If a file contains data improperly encoded for the current locale,
>   and this is discovered before any of the file's contents are output,
>   grep now treats the file as binary.

Which bug does it fix?

The upstream commit in question (cd36abd4) does not refer to any bug report.
Also the fact that the commit had to change existing regression tests to 
prevent them from failing suggests that it can be seen as a regression.

> > 2) I will also open a report at fedora, maybe they use some strange setting
> > in building the new package.
> 
> But as the change is intentional, there is probably nothing that Fedora
> would do about it.

I already created a bug for Fedora:

https://bugzilla.redhat.com/1219141

Kamil

> > 3) I will send a short notice to openwrt asking if they think it is fine
> > to use ë or ö. I personally have a strong opinion on that ;)
> 
> It would be fine if they would recode their file to use UTF-8, as that
> is pretty much a standard encoding these days.  Latin-1 files are
> getting harder and harder to process, as more people move to multibyte
> UTF-8 locales.
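
For illustration only, here is a minimal Python sketch of the two ideas in the quoted text: data that is not valid in the current encoding, and recoding a Latin-1 file to UTF-8. It assumes a UTF-8 locale and is not how grep itself performs the check; the function names and the sample bytes are hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def recode_latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret Latin-1 bytes as text and re-encode them as UTF-8."""
    # Every byte value is a valid Latin-1 character, so decoding never fails;
    # the caller must know the input really is Latin-1.
    return data.decode("latin-1").encode("utf-8")


# Hypothetical sample: the characters mentioned above, stored as Latin-1.
sample = "ë or ö\n".encode("latin-1")

print(is_valid_utf8(sample))                         # Latin-1 bytes checked as UTF-8
print(is_valid_utf8(recode_latin1_to_utf8(sample)))  # same text after recoding
```

From the shell, the same recoding is commonly done with iconv, for example `iconv -f ISO-8859-1 -t UTF-8 old > new`, where the file names are placeholders.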



