GNU bug report logs - #44754
Extreme performance degradation in GNU grep 3.4 / 3.6

Previous Next

Package: grep;

Reported by: bug-grep <at> gnu.org, f.heckenbach <at> fh-soft.de

Date: Fri, 20 Nov 2020 05:32:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: 44754 <at> debbugs.gnu.org
Cc: jim <at> meyering.net, f.heckenbach <at> fh-soft.de
Subject: bug#44754: Extreme performance degradation in GNU grep 3.4 / 3.6
Date: Thu, 03 Dec 2020 17:26:46 +0900
[Message part 1 (text/plain, inline)]
On Thu, 26 Nov 2020 21:41:20 -1000
Jim Meyering <jim <at> meyering.net> wrote:

> On Thu, Nov 26, 2020 at 9:03 AM Jim Meyering <jim <at> meyering.net> wrote:
> >
> > On Wed, Nov 25, 2020 at 3:12 PM Jim Meyering <jim <at> meyering.net> wrote:
> > > Thank you for the fine bug report.
> > > The grep-3.6 bug you've exposed is due to the fact that your input
> > > triggers excessive hash collisions when using the code modeled after
> > > gnulib/lib/hash-pjw.c. That made the new pattern-preprocessing phase
> > > take O(N^2) time for N patterns. In the attached, I've switched grep
> > > to use the djb2 hash function, and that resolves the problem. I'll
> > > also add a NEWS entry and a test before pushing this.
> >
> > Timings suggest that grep-3.6's preprocessing came closer to O(N^3).
> > Here's an example that would take 2-3 days with grep-3.6 and only
> > seconds with this fix:
> >
> >   : | grep -Ff <(seq 6400000 | tr 0-9 A-J)
> >
> > Here's a complete patch.
> > I'll push it later today.
> 
> Pushed along with two gnulib-related changes.

The fix has improved some performance.  However, it's still quite slow
compared to version 3.3, and that can be remedied.

It converts to grep only if the potential match does not match the word
frequently.
[0001-grep-improvement-of-grep-convertion-for-fgrep.patch (text/plain, attachment)]

This bug report was last modified 4 years and 226 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.