GNU bug report logs - #24161
[PATCH 2/2] sed: speed up matching by regular expression with dfa matcher


Package: sed

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Fri, 5 Aug 2016 14:05:01 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: 24161 <at> debbugs.gnu.org
Subject: bug#24161: [PATCH 2/2] sed: speed up matching by regular expression with dfa matcher
Date: Fri, 05 Aug 2016 23:03:26 +0900
Hi,

We can speed up sed by using the dfa matcher brought over from grep.  gawk
already uses it, but sed does not use it yet.  It will speed up matching in
typical cases.

$ yes $(printf %040d 0) | head -1000000 >k
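
For readers who want to check what this input looks like, here is a minimal sketch that builds a smaller copy of the file and inspects it. The 1000-line size, the temporary file, and the variable names are assumptions made for illustration and are not part of the original report. It shows that each line is forty '0' characters and that the pattern used in the timings, which ends in 'k', matches no line, so every run has to scan the whole input without printing anything.

```shell
# Smaller copy of the test input: 1000 lines instead of 1000000 (assumed size).
small=$(mktemp)
yes "$(printf %040d 0)" | head -1000 >"$small"

# Length of one line, not counting the newline.
line_len=$(head -1 "$small" | tr -d '\n' | wc -c | tr -d ' ')

# Number of lines matched by the pattern used in the timings.
matches=$(sed -ne /000000000k/p "$small" | wc -l | tr -d ' ')

echo "line length: $line_len, matching lines: $matches"
rm -f "$small"
```
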

Before:

$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 3.04
user 2.99
sys 0.03
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 3.04
user 2.90
sys 0.06
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 7.09
user 6.77
sys 0.31

After patching:

$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 0.29
user 0.15
sys 0.10
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 0.27
user 0.25
sys 0.02
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 0.33
user 0.29
sys 0.03
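
As a rough summary of the timings above, the speedup per locale can be computed from the "real" values. This is only a sketch: the numbers are copied by hand from the runs above, and the awk one-liner and the variable name are illustrative rather than part of the patch.

```shell
# Columns: locale, real seconds before, real seconds after (copied from above).
speedups=$(awk '{ printf "%s %.1fx\n", $1, $2 / $3 }' <<'EOF'
C 3.04 0.29
en_US.utf8 3.04 0.27
ja_JP.eucjp 7.09 0.33
EOF
)
echo "$speedups"
```

On these figures the ratio is roughly 10x for the C and UTF-8 locales and about 21x for ja_JP.eucjp, the locale that was slowest before the patch.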

I believe this patch can greatly improve sed's matching performance;
however, I worry about the maintenance burden, since updates to dfa are
always made in grep.

Thanks,
Norihiro
[0002-sed-speed-up-matching-by-reguler-expression-with-dfa.patch (text/plain, attachment)]
