GNU bug report logs - #76291
BUG: grep sometimes spits out random garbage in the middle of results

Package: grep;

Reported by: php fan <php4fan <at> gmail.com>

Date: Fri, 14 Feb 2025 17:01:02 UTC

Severity: normal

To reply to this bug, email your comments to 76291 AT debbugs.gnu.org.


Report forwarded to bug-grep <at> gnu.org:
bug#76291; Package grep. (Fri, 14 Feb 2025 17:01:03 GMT)

Acknowledgement sent to php fan <php4fan <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 14 Feb 2025 17:01:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: php fan <php4fan <at> gmail.com>
To: bug-grep <at> gnu.org
Subject: BUG: grep sometimes spits out random garbage in the middle of results
Date: Fri, 14 Feb 2025 17:28:35 +0100
I have no idea what triggers this, so I haven't been able to produce a
minimal reproducing example; and I can't share the actual folder with
which this happens to me all the time.

Sometimes I use "grep -R" on a folder with several repositories, and
after a few legitimate results, I get a blob of dozens of lines of
text, coming from several files, with no indication of the file they
belong to, and which often don't match the pattern.

For example:

```
$ grep -R foo
path/to/file1.txt: lorem ipsum foo bar
path/to/file2.txt: lalalalala foo lalalala
   and here's a blob of totally unrelated text
   that doesn't contain the string
   who knows where this comes from
   lalalallaal lorem ipsum dolor sit amet
```

I am NOT talking about the case where you have a file that is one very
long line that matches, so you get basically the whole file in the
output and it gets wrapped. In that case, it will still start with the
path of the file followed by a colon. Not in this case.




Information forwarded to bug-grep <at> gnu.org:
bug#76291; Package grep. (Fri, 14 Feb 2025 20:54:02 GMT)

Message #8 received at 76291 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: php fan <php4fan <at> gmail.com>
Cc: 76291 <at> debbugs.gnu.org
Subject: Re: bug#76291: BUG: grep sometimes spits out random garbage in the
 middle of results
Date: Fri, 14 Feb 2025 12:52:57 -0800
Thanks for the report. Can you tell me how to reproduce that? I have
never seen such a failure.
We'd need system type, version of grep, and names/contents of the
files in the searched directory.




Information forwarded to bug-grep <at> gnu.org:
bug#76291; Package grep. (Mon, 17 Feb 2025 16:05:02 GMT)

Message #11 received at 76291 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: php fan <php4fan <at> gmail.com>
Cc: 76291 <at> debbugs.gnu.org
Subject: Re: bug#76291: BUG: grep sometimes spits out random garbage in the
 middle of results
Date: Mon, 17 Feb 2025 08:04:13 -0800
On Fri, Feb 14, 2025 at 12:52 PM Jim Meyering <jim <at> meyering.net> wrote:
> Thanks for the report. Can you tell me how to reproduce that? I have
> never seen such a failure.
> We'd need system type, version of grep, and names/contents of the
> files in the searched directory.

Can you share the names of the files in that directory? I.e., run this:

  cd YOUR_DIR && find . -print | cat -A

Also, please rerun your command but pipe its output through "cat -A",
to see if there are unexpected characters in the output:

  $ grep -R foo | cat -A

If that shows nothing surprising, please consider sharing a sample of
real output where you've substituted XXX for any sensitive file names
or contents.
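
As a rough sketch of what "cat -A" can reveal, here is one made-up
scenario: a file whose "lines" are separated by bare carriage returns.
This is only an assumed example and not a diagnosis of the report; the
directory, file name, and contents are all invented, and "cat -A" is
the GNU coreutils option.

```shell
# Hypothetical demo: a file that uses bare carriage returns (\r) where
# a reader would expect newlines. Names and contents are invented.
demo_dir=$(mktemp -d)
printf 'lorem foo bar\runrelated text\rmore unrelated text\n' > "$demo_dir/cr.txt"

# grep treats everything up to the newline as one line, so the file name
# prefix is printed only once even though a terminal may show the
# pieces as if they were separate, unprefixed lines.
plain=$(cd "$demo_dir" && grep -R foo)
line_count=$(printf '%s\n' "$plain" | wc -l)

# cat -A shows each \r as ^M and marks every real line end with $.
visible=$(cd "$demo_dir" && grep -R foo | cat -A)
printf '%s\n' "$visible"

rm -rf "$demo_dir"
```

If the real output contains ^M or other ^X sequences in the middle of a
line, that would point at control characters in the matched files
rather than at grep inventing text.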




Information forwarded to bug-grep <at> gnu.org:
bug#76291; Package grep. (Thu, 20 Feb 2025 05:07:04 GMT)

Message #14 received at 76291 <at> debbugs.gnu.org:

From: "Dale R. Worley" <Dale.Worley <at> comcast.net>
To: php fan <php4fan <at> gmail.com>
Cc: 76291 <at> debbugs.gnu.org
Subject: Re: bug#76291: BUG: grep sometimes spits out random garbage in the
 middle of results
Date: Wed, 19 Feb 2025 21:02:03 -0500
php fan <php4fan <at> gmail.com> writes:
> I have no idea what triggers this, so I haven't been able to produce a
> minimal reproducing example; and I can't share the actual folder with
> which this happens to me all the time.
>
> Sometimes I use "grep -R" on a folder with several repositories, and
> after a few legitimate results, I get a blob of dozens of lines of
> text, coming from several files, with no indication of the file they
> belong to, and which often don't match the pattern.

One thing to do is to see if the results are reproducible.  So whenever
one of these incidents happens, immediately run the same command again
and see if you get another incident.
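
A minimal sketch of that check, assuming a small invented directory in
place of the real one: save two consecutive runs outside the searched
tree and compare them byte for byte with "cmp".

```shell
# Hypothetical stand-ins: demo_dir plays the role of the real directory,
# out_dir holds the captures so that grep does not search its own output.
demo_dir=$(mktemp -d)
out_dir=$(mktemp -d)
printf 'lorem ipsum foo bar\n' > "$demo_dir/file1.txt"
printf 'nothing to see here\n' > "$demo_dir/file2.txt"

# Run the same search twice and keep both outputs.
(cd "$demo_dir" && grep -R foo) > "$out_dir/run1"
(cd "$demo_dir" && grep -R foo) > "$out_dir/run2"

# cmp -s is silent and reports the result through its exit status.
if cmp -s "$out_dir/run1" "$out_dir/run2"; then
  same_output=yes
else
  same_output=no
fi
echo "identical: $same_output"

rm -rf "$demo_dir" "$out_dir"
```

Identical captures would suggest the odd output depends on the file
contents; differing captures would point toward something timing- or
environment-related.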

Another thing that would help: if the problem is reproducible, run the
command again but pipe it through "tee" to capture the results in a
file.  E.g.

$ grep -R foo | tee /tmp/grep-save

Then you can examine the file with an editor.

Likely the actual output is a set of "lines" from grep's point of view,
but the apparent line breaks are driven by long strings of characters
that don't contain newlines, or perhaps non-ASCII bytes that cause the
display to break lines.
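
One way to test that guess on a saved capture, sketched here with an
invented file standing in for /tmp/grep-save: count the real lines,
count the lines that contain control characters, and measure the
longest line.

```shell
# Stand-in for a capture made with: grep -R foo | tee /tmp/grep-save
# The contents are invented for illustration.
capture=$(mktemp)
printf 'a.txt:short foo line\n' > "$capture"
printf 'b.txt:foo first\rsecond chunk\rthird chunk\n' >> "$capture"

# Number of real (newline-terminated) lines in the capture.
total_lines=$(wc -l < "$capture")

# Lines containing control characters such as \r or escape sequences.
# LC_ALL=C makes grep classify raw bytes; note that tabs also count.
suspect_lines=$(LC_ALL=C grep -c '[[:cntrl:]]' "$capture")

# Length in bytes of the longest line; a very large value means the
# terminal had to wrap it.
longest=$(LC_ALL=C awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$capture")

echo "lines=$total_lines suspect=$suspect_lines longest=$longest"
rm -f "$capture"
```

If every real line in the capture starts with a file name and a colon,
the "blob" is a display effect and not extra lines written by grep.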

Are the files in these repositories likely to contain compressed text?

Dale




This bug report was last modified 115 days ago.
