GNU bug report logs - #17305
[PATCH] dfa: fix bug that caused NUL to be mishandled in patterns
Reported by: Paul Eggert <eggert <at> CS.UCLA.EDU>
Date: Mon, 21 Apr 2014 06:24:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17305 in the body.
You can then email your comments to 17305 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-grep <at> gnu.org: bug#17305; Package grep. (Mon, 21 Apr 2014 06:24:02 GMT)
Acknowledgement sent to Paul Eggert <eggert <at> CS.UCLA.EDU>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 21 Apr 2014 06:24:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
This bug was introduced in the early-2012 patches that fixed some
context-handling bugs. Bisecting found commit
d8951d3f4e1bbd564809aa8e713d8333bda2f802 (2012-02-05 18:00:43 +0100),
but it appears the underlying problem was introduced in commit
8b47c4cf6556933f59226c234b0fe984f6c77dc7 (2012-01-03 11:22:09 +0100).
* NEWS: Mention bug fix.
* src/dfa.c (char_context): Consider NUL to be a newline only if -z.
* tests/Makefile.am (TESTS): Add null-byte.
* tests/null-byte: New file.
---
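A minimal reproduction sketch, for readers who want to try the scenario by hand. It assumes a POSIX shell whose printf turns \0 in the format string into a NUL byte, plus mktemp, wc and cmp; the temporary directory and the variable names are illustrative and not part of the patch. It writes the same line, containing a NUL, to both a data file and a pattern file, then runs the -a check that tests/null-byte uses. The report does not say which specific inputs fail on affected builds, so the sketch prints whichever result the installed grep gives instead of assuming one.

```shell
#!/bin/sh
# Illustrative only: scratch directory and variable names are assumptions.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# One data line and one identical pattern line, each containing a NUL byte.
printf 'a\0b\n' > "$dir/in"
printf 'a\0b\n' > "$dir/pat"

# Both files should be 4 bytes long: 'a', NUL, 'b', newline.
in_bytes=$(($(wc -c < "$dir/in")))
pat_bytes=$(($(wc -c < "$dir/pat")))

# Same check as the test script: with -a, grep should echo the line back
# unchanged when the pattern matches it.
if LC_ALL=C grep -a -f "$dir/pat" "$dir/in" | cmp -s - "$dir/in"; then
  result=match
else
  result=no-match
fi
echo "$result"
```

Per the NEWS entry in this patch, releases from grep-2.11 up to the fix are the ones that may mishandle such patterns.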
 NEWS              |  3 +++
 src/dfa.c         |  2 +-
 tests/Makefile.am |  1 +
 tests/null-byte   | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100755 tests/null-byte

diff --git a/NEWS b/NEWS
index 92ce95e..fbb782b 100644
--- a/NEWS
+++ b/NEWS
@@ -11,6 +11,9 @@ GNU grep NEWS -*- outline -*-
   grep no longer mishandles an empty pattern at the end of a pattern list.
   [bug introduced in grep-2.5]
 
+  grep -f no longer mishandles patterns containing NUL bytes.
+  [bug introduced in grep-2.11]
+
   grep -P now works with -w and -x and backreferences.  Before,
   echo aa|grep -Pw '(.)\1' would fail to match, yet
   echo aa|grep -Pw '(.)\2' would match.
diff --git a/src/dfa.c b/src/dfa.c
index 90cf4a9..c93f451 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -694,7 +694,7 @@ static charclass newline;
 static int
 char_context (unsigned char c)
 {
-  if (c == eolbyte || c == 0)
+  if (c == eolbyte)
     return CTX_NEWLINE;
   if (IS_WORD_CONSTITUENT (c))
     return CTX_LETTER;
diff --git a/tests/Makefile.am b/tests/Makefile.am
index cc79903..91775bd 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -76,6 +76,7 @@ TESTS = \
   max-count-vs-context \
   mb-non-UTF8-performance \
   multibyte-white-space \
+  null-byte \
   empty-line-mb \
   unibyte-bracket-expr \
   unibyte-negated-circumflex \
diff --git a/tests/null-byte b/tests/null-byte
new file mode 100755
index 0000000..c967dbc
--- /dev/null
+++ b/tests/null-byte
@@ -0,0 +1,52 @@
+#!/bin/sh
+# Test NUL bytes in patterns and data.
+
+# Copyright 2014 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+# Add "." to PATH for the use of get-mb-cur-max.
+path_prepend_ .
+
+locales=C
+for locale in en_US.iso885915 en_US.UTF-8; do
+  get-mb-cur-max en_US.UTF-8 >/dev/null 2>&1 && locales="$locales $locale"
+done
+
+fail=0
+
+for left in '' a '#' '\0'; do
+  for right in '' b '#' '\0'; do
+    data="$left\\0$right"
+    printf "$data\\n" >in || framework_failure_
+    for hat in '' '^'; do
+      for dollar in '' '$'; do
+        for force_regex in '' '\\(\\)\\1'; do
+          pat="$hat$force_regex$data$dollar"
+          printf "$pat\\n" >pat || framework_failure_
+          for locale in $locales; do
+            LC_ALL=$locale grep -f pat in ||
+              fail_ "'$pat' does not match '$data'"
+            LC_ALL=$locale grep -a -f pat in | cmp -s - in ||
+              fail_ "-a '$pat' does not match '$data'"
+          done
+        done
+      done
+    done
+  done
+done
+
+Exit $fail
--
1.9.0
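
To get a feel for how much ground the nested loops in tests/null-byte cover, the following sketch replays the same loop headers with the grep calls removed and counts the pattern/data combinations. The loop variables and their values are copied from the test; the count variable and the final echo are added for illustration. Treating the '\\(\\)\\1' prefix (an empty group followed by a backreference to it) as a way to steer grep toward its regex matcher is an inference from the variable name force_regex, not something the report states.

```shell
#!/bin/sh
# Count the combinations generated by the loops in tests/null-byte.
# Loop values are copied from the test; 'count' is an illustrative addition.
count=0
for left in '' a '#' '\0'; do
  for right in '' b '#' '\0'; do
    for hat in '' '^'; do
      for dollar in '' '$'; do
        for force_regex in '' '\\(\\)\\1'; do
          count=$((count + 1))
        done
      done
    done
  done
done
echo "$count combinations per locale"
```

The total is 4 x 4 x 2 x 2 x 2 = 128 combinations, and the test runs each one twice per locale (once without -a and once with it).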
bug closed, send any further explanations to 17305 <at> debbugs.gnu.org and Paul Eggert <eggert <at> CS.UCLA.EDU>. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 21 Apr 2014 06:26:03 GMT)
Information forwarded to bug-grep <at> gnu.org: bug#17305; Package grep. (Mon, 21 Apr 2014 13:08:02 GMT)
Message #10 received at submit <at> debbugs.gnu.org:
On 21/04/2014 02:18, Paul Eggert wrote:
> This bug was introduced in the early-2012 patches that fixed some
> context-handling bugs. Bisecting found commit
> d8951d3f4e1bbd564809aa8e713d8333bda2f802 (2012-02-05 18:00:43 +0100),
> but it appears the underlying problem was introduced in commit
> 8b47c4cf6556933f59226c234b0fe984f6c77dc7 (2012-01-03 11:22:09 +0100).
> * NEWS: Mention bug fix.
> * src/dfa.c (char_context): Consider NUL to be a newline only if -z.
> * tests/Makefile.am (TESTS): Add null-byte.
> * tests/null-byte: New file.
Looks good, thanks!
Paolo
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 May 2014 11:24:04 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.