GNU bug report logs - #76291
BUG: grep sometimes spits out random garbage in the middle of results
To reply to this bug, email your comments to 76291 AT debbugs.gnu.org.
Report forwarded to bug-grep <at> gnu.org: bug#76291; Package grep. (Fri, 14 Feb 2025 17:01:03 GMT)
Acknowledgement sent to php fan <php4fan <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 14 Feb 2025 17:01:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I have no idea what triggers this, so I haven't been able to produce a
minimal reproducing example; and I can't share the actual folder with
which this happens to me all the time.
Sometimes I use "grep -R" on a folder with several repositories, and
after a few legitimate results, I get a blob of dozens of lines of
text, coming from several files, with no indication of the file they
belong to, and which often don't match the pattern.
For example:
```
$ grep -R foo
path/to/file1.txt: lorem ipsum foo bar
path/to/file2.txt: lalalalala foo lalalala
and here's a blob of totally unrelated text
that doesn't contain the string
who knows where this comes from
lalalallaal lorem ipsum dolor sit amet
```
I am NOT talking about the case where you have a file that is one very
long line that matches, so you get basically the whole file in the
output and it gets wrapped. In that case, it will still start with the
path of the file followed by a colon. Not in this case.
Information forwarded to bug-grep <at> gnu.org: bug#76291; Package grep. (Fri, 14 Feb 2025 20:54:02 GMT)
Message #8 received at 76291 <at> debbugs.gnu.org (full text, mbox):
On Fri, Feb 14, 2025 at 9:02 AM php fan <php4fan <at> gmail.com> wrote:
> I have no idea what triggers this, so I haven't been able to produce a
> minimal reproducing example; and I can't share the actual folder with
> which this happens to me all the time.
>
> Sometimes I use "grep -R" on a folder with several repositories, and
> after a few legitimate results, I get a blob of dozens of lines of
> text, coming from several files, with no indication of the file they
> belong to, and which often don't match the pattern.
>
> For example:
>
> ```
> $ grep -R foo
> path/to/file1.txt: lorem ipsum foo bar
> path/to/file2.txt: lalalalala foo lalalala
> and here's a blob of totally unrelated text
> that doen't contain the string
> who knows where this comes from
> lalalallaal lorem ipsum dolor sit amet
> ```
>
> I am NOT talking about the case where you have a file that is one very
> long line that matches, so you get basically the whole file in the
> output and it gets wrapped. In that case, it will still start with the
> path of the file followed by a colon. Not in this case.
Thanks for the report. Can you tell me how to reproduce that? I have
never seen such a failure.
We'd need system type, version of grep, and names/contents of the
files in the searched directory.
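The details requested above can be gathered in one pass. What follows is a minimal sketch, not something posted in the thread. The variable name "report" is illustrative, and the locale and "type grep" lines are extra assumptions about what might be relevant (an alias or a locale setting can change how grep behaves); only the system type and grep version were actually asked for.

```shell
# Collect the environment details requested above into one short report.
# "report" is a hypothetical variable name used only for this sketch.
report=$(
  printf 'system: %s\n' "$(uname -srm)"
  printf 'grep: %s\n' "$(grep --version | head -n 1)"
  # Not requested in the thread; included on the assumption it may matter.
  printf 'locale: LANG=%s LC_ALL=%s\n' "${LANG:-unset}" "${LC_ALL:-unset}"
  printf 'grep is: %s\n' "$(type grep)"
)
printf '%s\n' "$report"
```

Running this in the same interactive shell where the problem appears is the point of the "type grep" line, since an alias would only show up there.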
Information forwarded to bug-grep <at> gnu.org: bug#76291; Package grep. (Mon, 17 Feb 2025 16:05:02 GMT)
Message #11 received at 76291 <at> debbugs.gnu.org (full text, mbox):
On Fri, Feb 14, 2025 at 12:52 PM Jim Meyering <jim <at> meyering.net> wrote:
> On Fri, Feb 14, 2025 at 9:02 AM php fan <php4fan <at> gmail.com> wrote:
> > I have no idea what triggers this, so I haven't been able to produce a
> > minimal reproducing example; and I can't share the actual folder with
> > which this happens to me all the time.
> >
> > Sometimes I use "grep -R" on a folder with several repositories, and
> > after a few legitimate results, I get a blob of dozens of lines of
> > text, coming from several files, with no indication of the file they
> > belong to, and which often don't match the pattern.
> >
> > For example:
> >
> > ```
> > $ grep -R foo
> > path/to/file1.txt: lorem ipsum foo bar
> > path/to/file2.txt: lalalalala foo lalalala
> > and here's a blob of totally unrelated text
> > that doen't contain the string
> > who knows where this comes from
> > lalalallaal lorem ipsum dolor sit amet
> > ```
> >
> > I am NOT talking about the case where you have a file that is one very
> > long line that matches, so you get basically the whole file in the
> > output and it gets wrapped. In that case, it will still start with the
> > path of the file followed by a colon. Not in this case.
>
> Thanks for the report. Can you tell me how to reproduce that? I have
> never seen such a failure.
> We'd need system type, version of grep, and names/contents of the
> files in the searched directory.
Can you share the names of the files in that directory? I.e., run this:
  cd YOUR_DIR && find . -print | cat -A
Also, please rerun your command but pipe its output through "cat -A",
to see if there are unexpected characters in the output:
  $ grep -R foo | cat -A
If that shows nothing surprising, please consider sharing a sample of
real output where you've substituted XXX for any sensitive file names
or contents.
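To illustrate what "cat -A" adds here, the sketch below builds a throwaway directory and greps it. It is a hypothetical reproduction, not the reporter's data: the file name cr.txt, its contents, and the variable names are invented, and the carriage return is only one example of a byte that a terminal does not display literally. The thread does not establish what the actual cause is.

```shell
# Hypothetical test case: a single line (from grep's point of view) that
# contains carriage returns, which a terminal does not show as-is.
demo_dir=$(mktemp -d)
printf 'first part with foo\rsecond part\rthird part\n' > "$demo_dir/cr.txt"

# cat -A (GNU coreutils) prints CR as ^M and marks each real line end with $.
visible=$(cd "$demo_dir" && grep -R foo . | cat -A)
printf '%s\n' "$visible"

rm -rf "$demo_dir"
```

With GNU grep and GNU cat this prints a single line, "./cr.txt:first part with foo^Msecond part^Mthird part$", which shows that grep emitted one prefixed line even though a terminal would render it differently.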
Information forwarded to bug-grep <at> gnu.org: bug#76291; Package grep. (Thu, 20 Feb 2025 05:07:04 GMT)
Message #14 received at 76291 <at> debbugs.gnu.org (full text, mbox):
php fan <php4fan <at> gmail.com> writes:
> I have no idea what triggers this, so I haven't been able to produce a
> minimal reproducing example; and I can't share the actual folder with
> which this happens to me all the time.
>
> Sometimes I use "grep -R" on a folder with several repositories, and
> after a few legitimate results, I get a blob of dozens of lines of
> text, coming from several files, with no indication of the file they
> belong to, and which often don't match the pattern.
One thing to do is to see if the results are reproducible. So whenever
one of these incidents happens, immediately run the same command again
and see if you get another incident.
If the problem is reproducible, it would also help to run the command
again but "tee" its output to capture the results in a file, e.g.
  $ grep -R foo | tee /tmp/grep-save
Then you can examine the file with an editor.
Likely the actual output is a set of "lines" from grep's point of view,
but the apparent line breaks are driven by long strings of characters
that don't contain newlines, or perhaps non-ASCII bytes that cause the
display to break lines.
Are the files in these repositories likely to contain compressed text?
Dale
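Besides opening the saved file in an editor, it can be checked mechanically. The sketch below is an illustration under stated assumptions: a temporary file stands in for /tmp/grep-save, filled with a shortened copy of the example output from the report, and the variable names are invented. If the second grep prints nothing for a real capture, every real line contains the pattern, which would fit the explanation that the apparent extra lines are a display effect.

```shell
# Stand-in for /tmp/grep-save, using the example output from the report.
save=$(mktemp)
cat > "$save" <<'EOF'
path/to/file1.txt: lorem ipsum foo bar
path/to/file2.txt: lalalalala foo lalalala
and here's a blob of totally unrelated text
who knows where this comes from
EOF

# Number of newline-terminated lines that grep actually wrote.
total=$(wc -l < "$save" | tr -d ' ')

# Saved lines that do not contain the pattern, with their line numbers.
suspect=$(grep -n -v foo "$save")

printf 'total lines: %s\n' "$total"
printf '%s\n' "$suspect"
rm -f "$save"
```

Note that with "grep -R" the pattern can also match inside the file name prefix, so a line surviving this check is suspicious, but a line passing it is not proof of a genuine match.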
This bug report was last modified 115 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.