GNU bug report logs - #17376
[PATCH] grep: fix the different behaviour for a invalid sequence between KWset and DFA


Package: grep

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Wed, 30 Apr 2014 15:03:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #25 received at 17376 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 17376 <at> debbugs.gnu.org
Subject: Re: bug#17376: [PATCH] grep: fix the different behaviour for a invalid
 sequence between KWset and DFA
Date: Mon, 05 May 2014 20:26:37 -0700
While thinking about Bug#17376 I noticed some related bugs, which appear 
to have been in 'grep' since at least grep 2.0.  For example:

$ encode() { echo "$1" | tr ABC '\357\274\241'; }
$ encode ABCABC >exp3
$ encode _____________________ABCABC___ >exp4
$ bca=$(encode BCA)
$ grep "$bca" exp3
$ grep -F "$bca" exp3
$ grep "\\(\\)\\1$bca" exp3
ＡＡ

Here the regexp code disagrees with KWset and with the DFA, which is a 
bug: KWset and DFA should affect only performance, not behavior.
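
To see why "$bca" is an invalid sequence, it can help to dump the bytes involved. The following is only a sketch: the hex helper is a hypothetical name that is not part of the report, and LC_ALL=C is added to tr here on the assumption that it keeps tr byte-oriented on every platform.

```shell
# encode as in the session above, with LC_ALL=C added (assumption) so that
# tr works on bytes: A, B, C become EF, BC, A1, the UTF-8 encoding of U+FF21.
encode() { echo "$1" | LC_ALL=C tr ABC '\357\274\241'; }

# hex is a hypothetical helper: print stdin as one run of hex digits.
hex() { od -An -tx1 | tr -d ' \n'; echo; }

exp3_hex=$(encode ABCABC | hex)                # contents of exp3, newline included
bca_hex=$(printf '%s' "$(encode BCA)" | hex)   # the pattern, newline stripped by $(...)

echo "exp3: $exp3_hex"   # exp3: efbca1efbca10a
echo "bca:  $bca_hex"    # bca:  bca1ef
```

The pattern begins with BC A1, the last two bytes of one character, and ends with EF, the first byte of the next, so it is not valid UTF-8 on its own even though the same byte string occurs inside exp3.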

$ grep "$bca" exp4
_____________________ＡＡ___
$ grep -F "$bca" exp4
_____________________ＡＡ___
$ grep "\\(\\)\\1$bca" exp4
_____________________ＡＡ___

Here they agree, but only because there's a bug in is_mb_middle!
Fixing that will cause them to disagree again.
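
One way to check where the bytes of "$bca" occur in exp4 is to search in the C locale, where each byte is matched on its own. This is a rough sketch and not part of the patch: it uses a temporary file in place of exp4, assumes a grep that supports -a, -b and -o (GNU grep does), and says nothing about how is_mb_middle itself is implemented.

```shell
# encode as in the session above, with LC_ALL=C added (assumption) so that
# tr works on bytes.
encode() { echo "$1" | LC_ALL=C tr ABC '\357\274\241'; }

pad=_____________________     # leading underscores, copied from the exp4 line
tmp=$(mktemp)                 # scratch file standing in for exp4
encode "${pad}ABCABC___" >"$tmp"
bca=$(encode BCA)

# -a: treat the file as text, -b: print the byte offset, -o: print only the match.
offset=$(LC_ALL=C grep -a -b -o -F "$bca" "$tmp" | LC_ALL=C cut -d: -f1)

echo "first multibyte character starts at byte offset ${#pad}"
echo "byte-wise match starts at byte offset $offset"
rm -f "$tmp"
```

The byte-wise match starts one byte after the start of the first multibyte character, that is, in the middle of that character.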

I installed the attached patch to fix the bugs I found, and to adjust 
the test cases accordingly.
[0001-grep-fix-encoding-error-incompatibilities-among-rege.patch (text/plain, attachment)]

This bug report was last modified 11 years and 15 days ago.
