GNU bug report logs - #21670
surprising bug in grep -e with anchors

Package: grep

Reported by: greg boyd <gboyd.ccsf <at> gmail.com>

Date: Sun, 11 Oct 2015 23:57:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Jim Meyering <jim <at> meyering.net>
To: 21670 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, gboyd.ccsf <at> gmail.com
Cc: 21670-done <at> debbugs.gnu.org
Subject: bug#21670: surprising bug in grep -e with anchors
Date: Mon, 12 Oct 2015 15:17:42 -0700

On Sun, Oct 11, 2015 at 9:34 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> greg boyd wrote:
>>
>> test case (single line)
>> abchelloabc
>>
>> grep does not find the line with grep -e '^hello'  nor with grep -e
>> 'hello$'
>> however, the line is output with
>> grep -e '^hello' -e 'hello$'
>
>
> Oooo, that's a good one.  Give your student extra credit!  As it happens,
> the bug was recently fixed by this patch by Norihiro Tanaka:
>
> http://git.savannah.gnu.org/cgit/grep.git/commit/?id=256a4b494fe1c48083ba73b4f62607234e4fefd5
>
> and the fix should appear in the next grep release.  However, since the
> patch was supposed to affect only performance, it appears that the bug fix
> was due to luck, and I'm taking the liberty of adding your student's test
> case by installing the attached further patch, to help prevent this bug from
> coming back in a future version.

Thanks for adding that test, Paul.
However, note that the bug does not require two uses of "-e" per se.
Multiple "-e"-specified regexps get translated internally to those regexps
separated by the ERE "|" alternation/"or" operator. A smaller, perhaps
more illustrative test case is to use an explicit "|":

  $ echo axa | grep -E '^x|x$'
  axa
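
To check on a given grep build that the two-"-e" form and the explicit alternation behave alike, a minimal sketch such as the following compares their exit statuses on the original test line. The variable names are illustrative only. A fixed grep should report 1 (no match) for both; the afflicted versions described in this report would be expected to print the line and report 0.

```shell
# Sketch: compare the two-"-e" form with the explicit ERE alternation.
# "line", "status_e" and "status_alt" are illustrative names, not from the report.
line=abchelloabc

# Exit status stays 0 if grep matches; otherwise capture grep's status.
status_e=0
printf '%s\n' "$line" | grep -e '^hello' -e 'hello$' || status_e=$?

status_alt=0
printf '%s\n' "$line" | grep -E '^hello|hello$' || status_alt=$?

echo "two -e: $status_e, alternation: $status_alt"
```

If the two statuses ever differ, the forms are not being treated equivalently by that build, which would itself be worth reporting.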

FYI, one can demonstrate that it was a problem in the DFA
matcher without resorting to gdb by inserting a "()" in the ERE,
since that construct cannot work in a DFA and grep resorts
to using glibc's full-blown regex matcher. With that, even the
afflicted versions of grep get the desired result (no match):

  $ echo axa | grep -E '^x()|x$'; echo $?
  1
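
The same comparison can be scripted. The sketch below runs the pattern with and without the empty "()" group and records both exit statuses; it assumes, as stated above, that "()" makes grep bypass its DFA matcher. The variable names are illustrative. On a fixed grep both statuses should be 1, whereas on an afflicted version only the plain form would be expected to match.

```shell
# Sketch: same input, with and without an empty "()" group in the ERE.
# "plain" and "grouped" are illustrative names, not from the report.
plain=0
echo axa | grep -E '^x|x$' || plain=$?

grouped=0
echo axa | grep -E '^x()|x$' || grouped=$?

echo "plain: $plain, with (): $grouped"
```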



