GNU bug report logs - #24009
[PATCH] grep: use fastmap in regex


Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Sat, 16 Jul 2016 16:59:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Jens Schleusener" <Jens.Schleusener <at> t-online.de>
To: bug-grep <at> gnu.org
Subject: Re: bug#24009: [PATCH] grep: use fastmap in regex
Date: Sat, 16 Jul 2016 22:06:53 +0200 (CEST)

Hi Norihiro.

> sed and gawk use fastmap in regex, but grep does not.  By using fastmap,
> I expect that grep speeds up for patterns as regex is used.
>
> before:
> $ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
> real 7.83
> user 7.62
> sys 0.07
>
> after:
> $ time -p env LC_ALL=ja_JP.eucjp src/grep '\([a-b]\)\1' k
> real 0.46
> user 0.38
> sys 0.07
>
> However, if grep uses fastmap, fails in case-fold-titlecase test.  It
> means that grep's behavior differ from sed and gawk, as they use fastmap,
> although it seems to be a bug in regex.

Wow, that is a spectacular speed improvement. Since I use grep with regex
patterns heavily in some of my scripts, I could not resist running some
first simple tests (including your example pattern with a back reference).
The non-representative results with grep 2.25 show a gain of a factor of
5-10 (while the unpatched, self-compiled grep 2.25 was itself already a
factor of 1.4-2.8 faster than the grep 2.16 offered by the OS, openSUSE
Leap 42.1). At least in my tests, all the grep outputs were identical.

By the way, I had to remove one of the two "=" in your patch; otherwise gcc
issued an error (but caution, I am a C layman).

Regards

Jens
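
For readers unfamiliar with the term: a fastmap is a 256-entry table,
attached to a compiled pattern, that marks which bytes can possibly begin a
match, so the search can skip start positions that cannot succeed. The
sketch below shows one way to supply such a table through the GNU regex
API in glibc. It is not the patch discussed in this thread; the helper name
search_with_fastmap and its return convention are assumptions made for
illustration only.

```c
#define _GNU_SOURCE            /* expose the GNU regex API in <regex.h> */
#include <assert.h>
#include <limits.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of grep or glibc): returns the offset of
   the first match of PATTERN in TEXT, -1 if there is no match, and -2 on
   a compile, allocation, or internal matcher error.  */
long search_with_fastmap(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    struct re_pattern_buffer buf;
    memset(&buf, 0, sizeof buf);   /* buffer, translate, etc. start as NULL */

    /* Supplying a 256-byte table is what asks the matcher for a fastmap. */
    buf.fastmap = malloc(1 << CHAR_BIT);
    if (buf.fastmap == NULL)
        return -2;

    /* Basic regular expressions, so \( \) and \1 work as in the thread. */
    re_set_syntax(RE_SYNTAX_POSIX_BASIC);
    const char *err = re_compile_pattern(pattern, strlen(pattern), &buf);
    if (err != NULL) {
        free(buf.fastmap);
        return -2;
    }

    /* Build the table explicitly before searching. */
    re_compile_fastmap(&buf);

    regoff_t len = (regoff_t) strlen(text);
    long pos = re_search(&buf, text, len, 0, len, NULL);

    regfree(&buf);                 /* also releases buf.fastmap */
    return pos;
}
```

Calling search_with_fastmap("\\([a-b]\\)\\1", "xyzbbq") searches for a
doubled "a" or "b", the same pattern used in the timings quoted above.
Leaving buf.fastmap as NULL instead makes the matcher try every start
position in turn.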





