GNU bug report logs -
#20526
BUG: text file is detected as binary
Kamil Dudka wrote:
> Which bug does it fix?
I don't recall a bug report being filed for it, but the old grep behavior had
real problems: as I remember, it sometimes dumped core, and at other times it
spat out improperly encoded data to the terminal. We've fixed the core dumps I
know about, though I think grep still outputs improperly encoded data at times
(and this should get fixed too; see below for a suggestion).
At any rate, applications could never assume a particular behavior for
improperly encoded files, so the current behavior is clearly not a bug. Users
may be able to scrape along by setting LC_ALL=C before running 'grep' -- the
problems LC_ALL=C runs into are about the same as the problems with using old
'grep' (except that the new grep doesn't dump core :-).
Perhaps we can improve the behavior of grep by changing its heuristic slightly.
Currently grep reports "Binary file FOO matches" if it finds binary data in
FOO before it finds the first match. Instead, perhaps we could change grep to
report "Binary file FOO matches" when it sees that it's about to generate binary
*output* copied from FOO, regardless of whether this output represents the first
match. That is, when grep sees that it's about to output binary data, grep
instead outputs "Binary file FOO matches" and then stops output for FOO (even if
it already output some lines for ordinary matches in FOO).
This approach would fix the problem of grep trashing the output stream, and it
should be less drastic than grep's current approach, in that it would make grep
more likely to do what Kamil Dudka is asking for (assuming grep is given mostly
valid input interspersed with small amounts of binary data).
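
To make the suggested heuristic concrete, here is a rough Python sketch. It is
not grep's actual code: the function name grep_file, the UTF-8 assumption, and
the rule "a line is binary if it contains a NUL byte or is not valid UTF-8" are
all assumptions made for illustration. It prints ordinary matches until the
first matching line that would put binary data on the output, then reports
"Binary file FOO matches" and stops.

```python
import re


def grep_file(name, data, pattern):
    """Return the lines a grep-like tool would output for `data` (bytes).

    Hypothetical sketch of the proposed heuristic, assuming a UTF-8 locale.
    """
    regex = re.compile(pattern)
    out = []
    for raw in data.splitlines():
        # Assumption: a line counts as binary if it holds a NUL byte
        # or cannot be decoded as UTF-8.
        try:
            text = raw.decode("utf-8")
            is_binary = "\x00" in text
        except UnicodeDecodeError:
            text = raw.decode("utf-8", errors="replace")
            is_binary = True

        if not regex.search(text):
            continue  # non-matching lines are never output, binary or not

        if is_binary:
            # About to output binary data: report and stop output for this file,
            # even though ordinary matches may already have been output.
            out.append(f"Binary file {name} matches")
            break
        out.append(text)
    return out


if __name__ == "__main__":
    sample = b"foo one\nbar\nfoo \xff two\nfoo three\n"
    for line in grep_file("FOO", sample, "foo"):
        print(line)
    # prints:
    # foo one
    # Binary file FOO matches
```

Note that in this sketch binary data on a non-matching line has no effect at
all, which is the case of mostly valid input with small amounts of binary data
mixed in.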
This bug report was last modified 9 years and 138 days ago.