GNU bug report logs - #16893
[PATCH] Avoid matching line-by-line for case-insensitive with grep


Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Thu, 27 Feb 2014 16:03:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: submit <at> debbugs.gnu.org
Subject: [PATCH] Avoid matching line-by-line for case-insensitive with grep
Date: Fri, 28 Feb 2014 01:02:40 +0900
[Message part 1 (text/plain, inline)]
The grep and awk matchers no longer waste the buffer in case-insensitive
matching, so I think we can avoid line-by-line matching for them.

This makes case-insensitive matching with the grep or awk matcher
without trivial_case_ignore as fast as it is with trivial_case_ignore.
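
To illustrate the idea, here is a minimal Python sketch. It is not
grep's actual C code: the function names match_line_by_line and
match_whole_buffer are hypothetical, and Python's re module stands in
for the grep/awk matcher. It assumes the pattern has no anchors and
cannot match a newline.

```python
import re


def match_line_by_line(pattern: str, buf: str) -> list[str]:
    """Call the matcher once per input line (the path to be avoided)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in buf.split("\n") if rx.search(line)]


def match_whole_buffer(pattern: str, buf: str) -> list[str]:
    """Call the matcher on the whole buffer, then recover each matching line.

    Assumption: the pattern has no anchors and never matches a newline,
    so a match always lies inside a single line.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    out = []
    pos = 0
    while pos <= len(buf):
        m = rx.search(buf, pos)
        if m is None:
            break
        # Expand the match to the boundaries of the line that contains it.
        start = buf.rfind("\n", 0, m.start()) + 1
        end = buf.find("\n", m.start())
        if end == -1:
            end = len(buf)
        out.append(buf[start:end])
        # Resume the search at the beginning of the next line.
        pos = end + 1
    return out
```

Both functions return the same lines. The second one calls the matcher
once per matching line plus one final call, rather than once per input
line, which matters most when few or no lines match.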

In bug#16232:
> The following times 2.16, 2.17 and 2.17+patch two ways:
> 
> $ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 > k
> $ for i in 16 17 18; do echo $i; env LC_ALL=en_US.UTF-8 time /p/p/grep-2.$i/bin/grep -i foobar k; done
> 16
>        15.96 real        14.57 user         0.12 sys
> 17
>         1.13 real         1.07 user         0.06 sys
> 18
>         1.96 real         1.89 user         0.06 sys
> 
> The above search takes more than 70% longer with the proposed patch.

Therefore, I think the 30% slowdown is caused by the line-by-line
matching for them.
[avoid_line_by_line.txt (application/octet-stream, attachment)]



