GNU bug report logs - #33249
[PATCH] grep: grouping of patterns including back reference

Previous Next

Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Sat, 3 Nov 2018 11:43:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: <bug-grep <at> gnu.org>
Subject: [PATCH] grep: grouping of patterns including back reference
Date: Sat, 03 Nov 2018 20:41:46 +0900
[Message part 1 (text/plain, inline)]
Hi,

When grep uses regex, it splits a pattern with multiple lines by
newline character.  Compilation and executution run for each fragment. 
That causes slowdown.  By this change, each fragment is divided into
groups by whether the fragment includes back reference in a pattern or
not. a frgment which includes back reference constitutes group, and all
frgments which include back reference also constitute a group.

This change extremely speeds-up following case.

- before

    $ yes 00000000000000000000000000000000000000000x | head -10 >in
    $ time -p env LC_ALL=C src/grep -f pat in
    real 1.94
    user 1.70
    sys 0.24

    $ yes 00000000000000000000000000000000000000000x | head -100 >in
    $ time -p env LC_ALL=C src/grep -f pat in
    real 13.04
    user 12.78
    sys 0.25

- after

    $ yes 00000000000000000000000000000000000000000x | head -10 >in
    $ time -p env LC_ALL=C src/grep -f pat in
    real 0.75
    user 0.56
    sys 0.17

    $ yes 00000000000000000000000000000000000000000x | head -100 >in
    $ time -p env LC_ALL=C src/grep -f pat in
    real 0.74
    user 0.57
    sys 0.16

Thanks,
Norihiro
[0001-grep-grouping-of-patterns-including-back-reference.patch (text/plain, attachment)]

This bug report was last modified 5 years and 236 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.