GNU bug report logs - #19306
[PATCH 1/2] dfa: avoid execution for a pattern including an unsupported expression


Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Mon, 8 Dec 2014 15:26:01 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.



Message #23 received at 19306 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Jim Meyering <jim <at> meyering.net>
Cc: 19306 <at> debbugs.gnu.org
Subject: Re: [PATCH 1/2] dfa: avoid execution for a pattern including an
 unsupported expression
Date: Tue, 21 Jul 2015 00:14:25 +0900
On Sun, 19 Jul 2015 20:14:52 -0700
Jim Meyering <jim <at> meyering.net> wrote:

> Thank you for the additional information and the test script.
> I like most of this patch, but not the fact that it causes the
> word-delim-multibyte test to fail. I do see that also applying your
> following patch makes that test pass once again.  However, it does so
> at the cost of forcing a new class of regexps (any that contain a use
> of \b, \<  or \>) from DFA into the slower regex matcher.

I think DFA already forces the regex matcher for BEGWORD, LIMWORD, and
ENDWORD, whether or not the patch is applied.  Could you look at the code
in dfassbuild() without the patch?  It seems that DFA has rejected these
tokens all along.

        case BEGWORD:
        case ENDWORD:
        case LIMWORD:
        case NOTLIMWORD:
          if (d->multibyte)
            {
              /* These constraints aren't supported in a multibyte locale.
                 Ignore them in the superset DFA, and treat them as
                 backreferences in the main DFA.  */
              sup->tokens[j++] = EMPTY;
              d->tokens[i] = BACKREF;  <<<<
              break;
            }

DFA does not handle word context correctly in multibyte locales.  If we
fix that, DFA will probably take a performance penalty.
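
To illustrate the behavior of the excerpt above, here is a minimal,
self-contained sketch.  It is not the real dfa.c code: the token enum,
the helper is_word_constraint(), and the function
rewrite_word_constraints() are hypothetical names made up for this
example, and the copy-through handling of all other tokens is a
simplification.  It only shows the idea that, in a multibyte locale,
each word constraint becomes EMPTY in the superset and BACKREF in the
main token list, so the pattern needs the regex matcher either way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced token set; the real dfa.c defines many more.  */
enum token
{
  CHAR_TOK, EMPTY, BACKREF, BEGWORD, ENDWORD, LIMWORD, NOTLIMWORD
};

/* True for the word-context constraints \<, \>, \b and \B.  */
static bool
is_word_constraint (enum token t)
{
  return t == BEGWORD || t == ENDWORD || t == LIMWORD || t == NOTLIMWORD;
}

/* Fill SUP_TOKS (the superset) from MAIN_TOKS, both of length N.
   In a multibyte locale, replace each word constraint with EMPTY in
   the superset and with BACKREF in the main list.  Every other token
   is copied unchanged (a simplification made for this sketch).
   Return true if the main list contains a BACKREF afterwards, i.e.
   if the pattern would have to fall back to the regex matcher.  */
static bool
rewrite_word_constraints (enum token *main_toks, enum token *sup_toks,
                          size_t n, bool multibyte)
{
  bool needs_regex = false;

  for (size_t i = 0; i < n; i++)
    {
      if (multibyte && is_word_constraint (main_toks[i]))
        {
          sup_toks[i] = EMPTY;
          main_toks[i] = BACKREF;
        }
      else
        sup_toks[i] = main_toks[i];

      if (main_toks[i] == BACKREF)
        needs_regex = true;
    }

  return needs_regex;
}
```

Calling rewrite_word_constraints() on { BEGWORD, CHAR_TOK, ENDWORD }
with multibyte set to true turns the main list into
{ BACKREF, CHAR_TOK, BACKREF } and returns true.  With multibyte set to
false the tokens are left alone and the function returns false.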





This bug report was last modified 9 years and 363 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.