GNU bug report logs - #21763
poor performance since grep 2.19 when comparing files with grep

Package: grep;

Reported by: "Bennett, Steve" <S.Bennett <at> lancaster.ac.uk>

Date: Mon, 26 Oct 2015 14:19:03 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Message #40 received at 21763-done <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21763-done <at> debbugs.gnu.org, 22357-done <at> debbugs.gnu.org,
 sur-behoffski <sur_behoffski <at> grouse.com.au>, "L.A. Walsh" <law <at> tlinx.org>,
 JQK <jquan <at> redhat.com>, 22239 <at> debbugs.gnu.org,
 Trevor Cordes <gnu <at> tecnopolis.ca>, Bruno Haible <bruno <at> clisp.org>,
 Ondřej Cifka <ondra <at> cifka.com>
Subject: Re: bug#21763: bug#22239: bug#22357: grep -f not only huge memory
 usage, but also huge time cost
Date: Wed, 28 Dec 2016 09:21:02 +0900
On Mon, 26 Dec 2016 12:07:49 -0800
Paul Eggert <eggert <at> cs.ucla.edu> wrote:

> Norihiro Tanaka wrote:
> > Hmm, how about the following test cases, although they are extreme?
> 
> I don't think we need to worry about performance for the case when -w
> is given, and a pattern matches data that contains non-word
> characters. In practice, such cases are rare. I expect that most
> users would be surprised that -w can match non-word characters, and
> that users wouldn't object to -w rejecting such matches (if this
> wouldn't hurt performance significantly).
> 
> While looking into this I did find a very small performance tweak for
> the test case, and installed the attached.

Thanks.

BTW, with multiple patterns in current master, the former of the two
commands below uses the fgrep matcher, and the latter uses the grep
matcher.  I think that is not reasonable.

  env LC_ALL=C grep -w -f pat inp

  env LC_ALL=C grep -F -w -f pat inp

So I wrote a patch to use the fgrep matcher for both.
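
For anyone wanting to try the two invocations side by side, here is a minimal sketch. The file names pat and inp follow the commands above, but their contents, the temporary directory, and the variable names are made up for illustration and are not the test data from this thread. It only checks that both commands select the same lines when the patterns contain no regular-expression metacharacters; it does not show which matcher grep picks internally.

```shell
# Hypothetical reproduction: contents of pat and inp are invented here.
tmpdir=$(mktemp -d)
seq 1 2000 > "$tmpdir/pat"        # one literal pattern per line
seq 1 3 6000 > "$tmpdir/inp"      # input lines, only some of which match

# Same pattern file, without and with -F.  The patterns are plain digits,
# so both invocations should count the same number of matching lines.
count_g=$(env LC_ALL=C grep -c -w -f "$tmpdir/pat" "$tmpdir/inp")
count_f=$(env LC_ALL=C grep -c -F -w -f "$tmpdir/pat" "$tmpdir/inp")
echo "grep -w: $count_g   grep -F -w: $count_f"

rm -rf "$tmpdir"
```

Prefixing each grep command with `time` (and using much larger files) would be one way to compare how long the two take on a given build.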
[0001-grep-imorove-performance-with-multiple-patterns.patch (text/plain, attachment)]

This bug report was last modified 8 years and 148 days ago.
