GNU bug report logs - #40634
Massive pattern list handling with -E format seems very slow since 2.28.

Package: grep;

Reported by: fryasu <at> yahoo.co.jp

Date: Wed, 15 Apr 2020 02:21:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: fryasu <at> yahoo.co.jp
Subject: bug#40634: closed (Re: bug#40634: Massive pattern list handling
 with -E format seems very slow since 2.28.)
Date: Mon, 21 Sep 2020 19:23:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#40634: Massive pattern list handling with -E format seems very slow since 2.28.

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 40634 <at> debbugs.gnu.org.

-- 
40634: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=40634
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: fryasu <at> yahoo.co.jp, Gnulib bugs <bug-gnulib <at> gnu.org>,
 40634-done <at> debbugs.gnu.org, Norihiro Tanaka <noritnk <at> kcn.ne.jp>,
 GNU grep developers <grep-devel <at> gnu.org>
Subject: Re: bug#40634: Massive pattern list handling with -E format seems
 very slow since 2.28.
Date: Mon, 21 Sep 2020 12:22:50 -0700
The dust seems to have settled on this, so I'm closing the grep bug report to 
tidy things up.

[Message part 3 (message/rfc822, inline)]
From: fryasu <at> yahoo.co.jp
To: "bug-grep <at> gnu.org" <bug-grep <at> gnu.org>
Subject: Massive pattern list handling with -E format seems very slow since
 2.28.
Date: Wed, 15 Apr 2020 09:26:55 +0900 (JST)
Hi,

Handling a massive pattern list with -E seems very slow
since grep 2.28.

The conversion from -E format to -F format may have a
performance problem.

When the processing time is measured with the script below,
the result is obviously different between 2.28 and 2.27.

----
#!/bin/bash
export LC_ALL=C

rm -f grep-patterns.txt
for i in {1..2000}; do
     pat=$(echo -n "$i" | sha1sum | cut -f1 -d ' ')
     echo -e "$pat$pat(\$|$pat)" >> grep-patterns.txt
done

echo executing grep...
time grep -E -v -m1 -f grep-patterns.txt /dev/null
----

The following are the results on my PC with Fedora's RPMs.
https://koji.fedoraproject.org/koji/packageinfo?packageID=1023

- result with grep 2.28

  real 0m11.087s / user 0m11.027s / sys 0m0.037s

- result with grep 2.27

  real 0m0.144s / user 0m0.116s / sys 0m0.027s

With the recent 3.4 as well, the result is the same.
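
To see how the run time grows with the size of the pattern list, the script above can be run at several sizes. The following is only a minimal sketch, not part of the original measurement: the sizes 250, 500 and 1000, the file names patterns-N.txt, the helper make_patterns and the GREP override variable are all assumptions introduced for illustration. The patterns have the same shape as in the script above.

```shell
#!/bin/bash
# Sketch: time the same pattern shape at several list sizes.
export LC_ALL=C

# make_patterns N FILE (hypothetical helper): write N patterns of the
# form HASHHASH($|HASH) to FILE, one per line.
make_patterns() {
    local n=$1 out=$2 i pat
    : > "$out"
    for ((i = 1; i <= n; i++)); do
        pat=$(printf '%s' "$i" | sha1sum | cut -f1 -d ' ')
        printf '%s%s($|%s)\n' "$pat" "$pat" "$pat" >> "$out"
    done
}

# GREP is an assumed override for pointing at another grep binary.
GREP=${GREP:-grep}
# Make the bash "time" keyword print only the elapsed wall-clock time.
TIMEFORMAT='%R s'

for n in 250 500 1000; do
    make_patterns "$n" "patterns-$n.txt"
    printf '%5d patterns: ' "$n"
    # grep exits with status 1 here because no line of /dev/null is
    # selected; "|| true" keeps that from aborting under "set -e".
    time "$GREP" -E -v -m1 -f "patterns-$n.txt" /dev/null || true
done
```

Setting GREP to the path of a different build (for example a locally built 2.27) before running it lets the same pattern lists be timed against that binary.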


I hope you find it useful.


regards,




This bug report was last modified 4 years and 328 days ago.
