GNU bug report logs -
#29668
grep: Fatal problem with (big) file
Reported by: pg <pasi.vitsa <at> yahoo.com>
Date: Mon, 11 Dec 2017 22:03:02 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #14 received at 29668 <at> debbugs.gnu.org:
On Tue, 12 Dec 2017 16:28:09 -0800
Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 12/11/2017 03:36 PM, Norihiro Tanaka wrote:
> > Perhaps Tieliikenne 5.0.csv and volvot.csv contain characters that
> > cannot be recognized in your locale.
>
> Yes, that's the problem. The original 'grep' output ended in "Binary file Tieliikenne5.0.csv matches" but the user didn't see that. Perhaps we should send that diagnostic to stderr as well.
I don't think that's the problem. The user passes the output of grep to
wc -l, so the `Binary file ... matches' line is also counted by `wc' as
one line.
$ env LC_ALL=C grep 'Volvo' Tieliikenne\ 5.0.csv | wc -l
266175
$ env LC_ALL=en_US.utf8 grep 'Volvo' Tieliikenne\ 5.0.csv | wc -l
241264
$ env LC_ALL=en_US.utf8 grep 'Volvo' Tieliikenne\ 5.0.csv | tail -1
Binary file Tieliikenne 5.0.csv matches
$ env LC_ALL=C grep N3 volvot.csv | wc -l
17822
$ env LC_ALL=en_US.utf8 grep N3 volvot.csv | wc -l
11741
$ env LC_ALL=en_US.utf8 grep N3 volvot.csv | tail -1
Binary file volvot.csv matches
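
The counts above can be cross-checked without the `Binary file ... matches' line getting in the way. The following is only a sketch and is not something proposed in this thread: it assumes GNU grep's -a (--text) and -c options, builds a hypothetical three-line sample file instead of using the reporter's CSV files, and assumes that an en_US.utf8 locale may or may not be installed (the count is the same either way).

```shell
# Hypothetical sample file: two lines match 'Volvo'; one of them contains the
# byte 0xE4 (Latin-1 "a with diaeresis"), which is not valid UTF-8 on its own.
sample=$(mktemp)
printf 'Volvo 1\nVolvo \344 2\nSaab 3\n' > "$sample"

# -a treats the file as text, so no "Binary file ... matches" line is printed.
# -c counts matching lines inside grep, so nothing extra is fed to wc -l.
count_c=$(env LC_ALL=C grep -a -c 'Volvo' "$sample")
count_utf8=$(env LC_ALL=en_US.utf8 grep -a -c 'Volvo' "$sample")

echo "C locale:     $count_c"
echo "UTF-8 locale: $count_utf8"

rm -f "$sample"
```

With -a, both locales should report the same number of matching lines, which makes it easier to see whether a difference in `wc -l' output comes from the data or from the diagnostic line.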