GNU bug report logs - #44754
Extreme performance degradation in GNU grep 3.4 / 3.6

Previous Next

Package: grep;

Reported by: bug-grep <at> gnu.org, f.heckenbach <at> fh-soft.de

Date: Fri, 20 Nov 2020 05:32:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

Full log


Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Frank Heckenbach <f.heckenbach <at> fh-soft.de>
To: bug-grep <at> gnu.org
Subject: Extreme performance degradation in GNU grep 3.4 / 3.6
Date: Fri, 20 Nov 2020 06:20:55 +0100
Hi,

I have a use case where I run grep with a large number of search
patterns on a large text file. It works well with grep-3.3, but with
grep-3.4 it quickly burned through GBs of memory and almost locked
up my system due to swapping.

To avoid attaching those large files, I could mostly reproduce the
effects like this:

ulimit -d 5000000  # avoid system lockup due to excessive swapping
export LC_ALL=C    # make sure no Unicode case conversions are needed

% time  ./grep-3.3 -Fwif <(seq 300000 | tr 0-9 A-J) <<<foo

real    0m0.054s
user    0m0.048s
sys     0m0.012s

% time  ./grep-3.4 -Fwif <(seq 30000 | tr 0-9 A-J) <<<foo
./grep-3.4: Memory exhausted
Aborted

real    0m1.291s
user    0m0.696s
sys     0m0.599s

% time  ./grep-3.6 -Fwif <(seq 300000 | tr 0-9 A-J) <<<foo

real    0m13.162s
user    0m12.955s
sys     0m0.211s

grep-3.3 behaves well, even with much larger number of patterns.
Time seems to grow linearly, and memory usage is constant.

grep-3.4 behaves the worst of these 3 versions. Even with just 30000
patterns it exceeds the ulimit of 5 GB.

grep-3.6 behaves a bit better than 3.4, but still bad. Time seems to
be quadratic in the number of patterns, and though memory usage in
this case seems to be almost constant, in my actual use case it also
runs out of memory where grep-3.3 works well with just a few 100 MB
used.

Without "-i", grep-3.4 seems to run as fast as grep-3.3, but
grep-3.6 is almost as slow as with "-i".

So there might actually be two different issues here, one that
affects 3.4 with "-i" and one that affects 3.6 with or without "-i".

Regards,
Frank




This bug report was last modified 4 years and 225 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.