GNU bug report logs - #17305
[PATCH] dfa: fix bug that caused NUL to be mishandled in patterns

Package: grep

Reported by: Paul Eggert <eggert <at> CS.UCLA.EDU>

Date: Mon, 21 Apr 2014 06:24:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Message #10 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Paolo Bonzini <bonzini <at> gnu.org>
To: Paul Eggert <eggert <at> CS.UCLA.EDU>, bug-grep <at> gnu.org
Subject: Re: [PATCH] dfa: fix bug that caused NUL to be mishandled in patterns
Date: Mon, 21 Apr 2014 09:07:32 -0400
On 21/04/2014 02:18, Paul Eggert wrote:
> This bug was introduced in the early-2012 patches that fixed some
> context-handling bugs.  Bisecting found commit
> d8951d3f4e1bbd564809aa8e713d8333bda2f802 (2012-02-05 18:00:43 +0100),
> but it appears the underlying problem was introduced in commit
> 8b47c4cf6556933f59226c234b0fe984f6c77dc7 (2012-01-03 11:22:09 +0100).
> * NEWS: Mention bug fix.
> * src/dfa.c (char_context): Consider NUL to be a newline only if -z.
> * tests/Makefile.am (TESTS): Add null-byte.
> * tests/null-byte: New file.

Looks good, thanks!

Paolo

> ---
>  NEWS              |  3 +++
>  src/dfa.c         |  2 +-
>  tests/Makefile.am |  1 +
>  tests/null-byte   | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 57 insertions(+), 1 deletion(-)
>  create mode 100755 tests/null-byte
>
> diff --git a/NEWS b/NEWS
> index 92ce95e..fbb782b 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -11,6 +11,9 @@ GNU grep NEWS                                    -*- outline -*-
>    grep no longer mishandles an empty pattern at the end of a pattern list.
>    [bug introduced in grep-2.5]
>
> +  grep -f no longer mishandles patterns containing NUL bytes.
> +  [bug introduced in grep-2.11]
> +
>    grep -P now works with -w and -x and backreferences. Before,
>    echo aa|grep -Pw '(.)\1' would fail to match, yet
>    echo aa|grep -Pw '(.)\2' would match.
> diff --git a/src/dfa.c b/src/dfa.c
> index 90cf4a9..c93f451 100644
> --- a/src/dfa.c
> +++ b/src/dfa.c
> @@ -694,7 +694,7 @@ static charclass newline;
>  static int
>  char_context (unsigned char c)
>  {
> -  if (c == eolbyte || c == 0)
> +  if (c == eolbyte)
>      return CTX_NEWLINE;
>    if (IS_WORD_CONSTITUENT (c))
>      return CTX_LETTER;
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index cc79903..91775bd 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -76,6 +76,7 @@ TESTS =						\
>    max-count-vs-context				\
>    mb-non-UTF8-performance			\
>    multibyte-white-space				\
> +  null-byte					\
>    empty-line-mb					\
>    unibyte-bracket-expr				\
>    unibyte-negated-circumflex			\
> diff --git a/tests/null-byte b/tests/null-byte
> new file mode 100755
> index 0000000..c967dbc
> --- /dev/null
> +++ b/tests/null-byte
> @@ -0,0 +1,52 @@
> +#!/bin/sh
> +# Test NUL bytes in patterns and data.
> +
> +# Copyright 2014 Free Software Foundation, Inc.
> +
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +. "${srcdir=.}/init.sh"; path_prepend_ ../src
> +
> +# Add "." to PATH for the use of get-mb-cur-max.
> +path_prepend_ .
> +
> +locales=C
> +for locale in en_US.iso885915 en_US.UTF-8; do
> +  get-mb-cur-max $locale >/dev/null 2>&1 && locales="$locales $locale"
> +done
> +
> +fail=0
> +
> +for left in '' a '#' '\0'; do
> +  for right in '' b '#' '\0'; do
> +    data="$left\\0$right"
> +    printf "$data\\n" >in || framework_failure_
> +    for hat in '' '^'; do
> +      for dollar in '' '$'; do
> +        for force_regex in '' '\\(\\)\\1'; do
> +          pat="$hat$force_regex$data$dollar"
> +          printf "$pat\\n" >pat || framework_failure_
> +          for locale in $locales; do
> +            LC_ALL=$locale grep -f pat in ||
> +              fail_ "'$pat' does not match '$data'"
> +            LC_ALL=$locale grep -a -f pat in | cmp -s - in ||
> +              fail_ "-a '$pat' does not match '$data'"
> +          done
> +        done
> +      done
> +    done
> +  done
> +done
> +
> +Exit $fail
>
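
For readers skimming the src/dfa.c hunk, here is a minimal standalone sketch of the classification change, not the actual dfa.c code. It assumes that eolbyte is the line terminator (newline by default, NUL under -z), passes it as a parameter rather than using a global, uses illustrative numeric values for the CTX_* constants, substitutes a hypothetical is_word_constituent helper for the IS_WORD_CONSTITUENT macro, and assumes that any remaining byte falls through to CTX_NONE, which the hunk does not show.

```c
#include <ctype.h>
#include <stdbool.h>

/* Context classes named as in the patch; the numeric values are illustrative. */
enum { CTX_NONE = 1, CTX_LETTER = 2, CTX_NEWLINE = 4 };

/* Hypothetical stand-in for IS_WORD_CONSTITUENT:
   assumed here to mean "alphanumeric or underscore". */
bool is_word_constituent(unsigned char c)
{
  return isalnum(c) != 0 || c == '_';
}

/* Before the patch: NUL is always classified as a line terminator,
   regardless of which byte actually ends lines. */
int char_context_before(unsigned char c, unsigned char eolbyte)
{
  if (c == eolbyte || c == 0)
    return CTX_NEWLINE;
  if (is_word_constituent(c))
    return CTX_LETTER;
  return CTX_NONE; /* assumed fallthrough, not shown in the hunk */
}

/* After the patch: only the configured end-of-line byte counts, so NUL
   is a line terminator only when eolbyte itself is NUL (the -z case). */
int char_context_after(unsigned char c, unsigned char eolbyte)
{
  if (c == eolbyte)
    return CTX_NEWLINE;
  if (is_word_constituent(c))
    return CTX_LETTER;
  return CTX_NONE; /* assumed fallthrough, not shown in the hunk */
}
```

With eolbyte set to '\n', char_context_before(0, '\n') yields CTX_NEWLINE while char_context_after(0, '\n') yields CTX_NONE, so a NUL inside a pattern is no longer treated as if it ended a line. With eolbyte set to 0, both versions agree.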





This bug report was last modified 11 years and 30 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.