GNU bug report logs - #40634
Massive pattern list handling with -E format seems very slow since 2.28.

Package: grep

Reported by: fryasu <at> yahoo.co.jp

Date: Wed, 15 Apr 2020 02:21:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: fryasu <at> yahoo.co.jp, Gnulib bugs <bug-gnulib <at> gnu.org>, 40634 <at> debbugs.gnu.org, Norihiro Tanaka <noritnk <at> kcn.ne.jp>, GNU grep developers <grep-devel <at> gnu.org>
Subject: bug#40634: Massive pattern list handling with -E format seems very slow since 2.28.
Date: Sun, 13 Sep 2020 19:03:33 -0700
[Message part 1 (text/plain, inline)]
On 9/11/20 11:41 PM, Jim Meyering wrote:

>> https://bugs.gnu.org/40634#32
>>
>> I'll try to take a look at the later patch.
> 
> Oh! Glad you spotted that.

I took a look and the basic idea sounds good though I admit I did not check 
every detail. While looking into it I found some opportunities for improvements, 
plus I found what appear to be some longstanding bugs in the area, one of which 
causes a grep test failure on Solaris (and I suspect the bug is also on 
GNU/Linux but the grep tests don't catch it). I installed the attached patches 
into Gnulib, updated grep to point to the new Gnulib version, and added a note 
in grep's NEWS file about this.

Patch 1 is what Norihiro Tanaka proposed in Bug#40634#32, except I edited the 
commit message. Patch 2 consists of minor cleanups and performance tweaks for 
Patch 1. (Patches 3 and 4 are omitted as they were installed by others into 
Gnulib at about the same time I was installing these.) Patch 5 fixes a 
dfa-heap-overrun failure on Solaris that appears to be a longstanding bug 
exposed by Patch 1 when running on Solaris. Patch 6 merely cleans up code near 
Patch 5. Patch 7 fixes the use of an uninitialized constraint, which I 
discovered while debugging Patch 5 under Valgrind; this also appears to be a 
longstanding bug.

Coming up with test cases for all these bugs would be pretty tricky, unfortunately.
[0001-dfa-use-backward-set-in-removal-of-epsilon-closure.patch (text/x-patch, attachment)]
[0002-dfa-epsilon-closure-tweaks-Bug-40634.patch (text/x-patch, attachment)]
[0005-dfa-fix-dfa-heap-overrun-failure.patch (text/x-patch, attachment)]
[0006-dfa-assume-C99-in-reorder_tokens.patch (text/x-patch, attachment)]
[0007-dfa-avoid-use-of-uninitialized-constraint.patch (text/x-patch, attachment)]

This bug report was last modified 4 years and 328 days ago.
