GNU bug report logs - #21763
poor performance since grep 2.19 when comparing files with grep

Package: grep

Reported by: "Bennett, Steve" <S.Bennett <at> lancaster.ac.uk>

Date: Mon, 26 Oct 2015 14:19:03 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 21763-done <at> debbugs.gnu.org, 22357-done <at> debbugs.gnu.org, sur-behoffski <sur_behoffski <at> grouse.com.au>, "L.A. Walsh" <law <at> tlinx.org>, JQK <jquan <at> redhat.com>, 22239 <at> debbugs.gnu.org, Trevor Cordes <gnu <at> tecnopolis.ca>, Bruno Haible <bruno <at> clisp.org>, Ondřej Cífka <ondra <at> cifka.com>
Subject: bug#21763: bug#22239: bug#22357: grep -f not only huge memory usage, but also huge time cost
Date: Fri, 23 Dec 2016 17:38:42 -0800
Norihiro Tanaka wrote:
> are you aware of extreme slowdown in the following cases after third patch?
>
>   yes $(printf %040d 0) | head -10000000 >inp
>   printf '0\n1\n' >pat
>   env LC_ALL=C src/grep -w -f pat inp

No. Thanks, I hadn't considered that possibility. I looked into the slowdown and 
installed the attached patches, which cause 'grep' to run about as fast on this 
test case as grep 2.25 (though not as fast as grep 2.26). The main fix is in 
patch 5. On my platform:

  -------grep version------
   v2.25  v2.26  v2.27 master     locale      command
    1.21   0.69  24.95   1.22     C           grep -w -f pat inp
  207.36 203.15 202.03   1.22     en_US.utf8  grep -w -f pat inp
    1.21   0.69  25.95   0.85     C           grep -w -f pat inp -F
   66.33  68.07  67.21   1.22     en_US.utf8  grep -w -f pat inp -F

All numbers are user+system CPU seconds on Fedora 24 x86-64 (AMD Phenom II X4 
910e). "master" means after the attached patches are installed.
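For anyone who wants to try a scaled-down version of this test case, here is a minimal sketch. The helper names make_case and count_matches and the chosen line count are illustrative assumptions, not part of the attached patches. Note that with -w neither pattern selects any line, because each input line is a single 40-digit word, so the expected match count is 0.

```shell
#!/bin/sh
# Sketch only: helper names and sizes are hypothetical, not from the patches.

# Write LINES copies of a 40-zero line to DIR/inp and the two
# one-character patterns to DIR/pat, as in the reported test case.
make_case() {
  dir=$1
  lines=$2
  yes "$(printf %040d 0)" | head -n "$lines" >"$dir/inp"
  printf '0\n1\n' >"$dir/pat"
}

# Print the number of matching lines for the given locale.
# Any further arguments (for example -w or -F) are passed to grep.
# grep exits with status 1 when nothing matches; treat that as success.
count_matches() {
  locale=$1
  dir=$2
  shift 2
  env LC_ALL="$locale" grep -c "$@" -f "$dir/pat" "$dir/inp" || [ "$?" -eq 1 ]
}

# Example use (uncomment to run; raise the line count to approach
# the 10000000 lines used in the report):
#   d=$(mktemp -d)
#   make_case "$d" 100000
#   time count_matches C "$d" -w
#   time count_matches en_US.utf8 "$d" -w -F
#   rm -rf "$d"
```

Timing each call with the shell's time facility and adding the user and sys figures gives numbers comparable in kind to the table above, though the absolute values depend on the machine and the input size.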

Perhaps we can fiddle with the heuristics a bit so that v2.26 is not 
significantly faster than master in the C locale.
[0001-maint-rewrite-to-avoid-some-macros.patch (text/x-diff, attachment)]
[0002-grep-remove-C-label.patch (text/x-diff, attachment)]
[0003-grep-simplify-Fexecute.patch (text/x-diff, attachment)]
[0004-grep-specialize-word-finding-functions.patch (text/x-diff, attachment)]
[0005-grep-speed-up-wf-in-C-locale.patch (text/x-diff, attachment)]
[0006-grep-standardize-on-localeinfo.multibyte.patch (text/x-diff, attachment)]
[0007-grep-improve-word-checking-with-UTF-8.patch (text/x-diff, attachment)]
[0008-grep-fix-comment-in-searchutils.c.patch (text/x-diff, attachment)]

This bug report was last modified 8 years and 148 days ago.
