GNU bug report logs - #21414
-F string with tailing newline always matches


Package: grep;

Reported by: Ian Brown - HNAS <ian.brown <at> hds.com>

Date: Fri, 4 Sep 2015 15:34:03 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21414 in the body.
You can then email your comments to 21414 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#21414; Package grep. (Fri, 04 Sep 2015 15:34:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ian Brown - HNAS <ian.brown <at> hds.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 04 Sep 2015 15:34:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ian Brown - HNAS <ian.brown <at> hds.com>
To: "bug-grep <at> gnu.org" <bug-grep <at> gnu.org>
Subject: -F string with tailing newline always matches 
Date: Fri, 4 Sep 2015 14:45:45 +0000
Grep version 2.20

When using the output of another command to pass match strings into grep using -F I was getting unexpected results as it was matching every line. If the terminating newline is removed the grep started to work again.

Easy to work around but this is different behaviour from 2.12 and may cause some scripts to fail.

Ian Brown (HDS)

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Fri, 04 Sep 2015 17:35:01 GMT) Full text and rfc822 format available.

Notification sent to Ian Brown - HNAS <ian.brown <at> hds.com>:
bug acknowledged by developer. (Fri, 04 Sep 2015 17:35:02 GMT) Full text and rfc822 format available.

Message #10 received at 21414-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ian Brown - HNAS <ian.brown <at> hds.com>, 21414-done <at> debbugs.gnu.org
Subject: Re: bug#21414: -F string with tailing newline always matches
Date: Fri, 4 Sep 2015 10:34:12 -0700
On 09/04/2015 07:45 AM, Ian Brown - HNAS wrote:
> Grep version 2.20
>
> When using the output of another command to pass match strings into grep using -F I was getting unexpected results as it was matching every line. If the terminating newline is removed the grep started to work again.
>
> Easy to work around but this is different behaviour from 2.12 and may cause some scripts to fail.
>
> Ian Brown (HDS)
>

I assume you're referring to the following sort of behavior:

$ printf 'abc\n\ndef\n' >foo
$ grep -F 'abc
' foo
abc

def

Older versions of GNU grep would ignore the newline after 'abc' in the 
pattern, and would output only 'abc' with the above example. This 
behavior was incompatible with non-GNU grep implementations and with 
POSIX, and the incompatibility seemed to be unintended and not that 
useful and was fixed at some point (sorry, don't know the GNU grep 
version).  Sorry you were relying on it.
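
A minimal sketch of the workaround the reporter alludes to, assuming a POSIX shell and a grep that follows the behavior described above (grep 2.19 or later): strip the trailing newline before the pattern reaches grep. The variable names and the scratch directory are illustrative assumptions, not part of the original report.

```shell
# Work in a scratch directory so no existing file is overwritten.
tmpdir=$(mktemp -d)
printf 'abc\n\ndef\n' > "$tmpdir/foo"

# A pattern that still carries its trailing newline, as raw command
# output captured by another program might.
pattern='abc
'

# Command substitution drops trailing newlines, leaving the single
# pattern "abc" instead of the two-pattern list "abc" plus "".
clean=$(printf '%s' "$pattern")

# With the newline removed, only lines containing "abc" are selected.
result=$(grep -F -e "$clean" "$tmpdir/foo")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Passing "$pattern" unchanged would instead hand grep an empty second pattern, which is what selects every line in the example above.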




Information forwarded to bug-grep <at> gnu.org:
bug#21414; Package grep. (Fri, 04 Sep 2015 22:18:02 GMT) Full text and rfc822 format available.

Message #13 received at 21414 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: 21414 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, ian.brown <at> hds.com
Cc: 21414-done <at> debbugs.gnu.org
Subject: Re: bug#21414: -F string with tailing newline always matches
Date: Fri, 4 Sep 2015 15:17:25 -0700
On Fri, Sep 4, 2015 at 10:34 AM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 09/04/2015 07:45 AM, Ian Brown - HNAS wrote:
>>
>> Grep version 2.20
>>
>> When using the output of another command to pass match strings into grep
>> using -F I was getting unexpected results as it was matching every line. If
>> the terminating newline is removed the grep started to work again.
>>
>> Easy to work around but this is different behaviour from 2.12 and may
>> cause some scripts to fail.
>>
>> Ian Brown (HDS)
>>
>
> I assume you're referring to the following sort of behavior:
>
> $ printf 'abc\n\ndef\n' >foo
> $ grep -F 'abc
> ' foo
> abc
>
> def
>
> Older versions of GNU grep would ignore the newline after 'abc' in the
> pattern, and would output only 'abc' with the above example. This behavior
> was incompatible with non-GNU grep implementations and with POSIX, and the
> incompatibility seemed to be unintended and not that useful and was fixed at
> some point (sorry, don't know the GNU grep version).  Sorry you were relying
> on it.

Thank you for the report.
I too find this behavior surprising:

$ seq 3|grep -F xxx$'\n'
1
2
3

This feels like a bug, since it's an artifact of how grep accumulates
multiple keys internally: it uses newline as the separator
(http://git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c#n2308).
Including a literal newline in the search string conflicts with that.
I haven't investigated feasibility, but we may be able to make it use \0
as the separator, to avoid this.




Information forwarded to bug-grep <at> gnu.org:
bug#21414; Package grep. (Sat, 05 Sep 2015 03:46:02 GMT) Full text and rfc822 format available.

Message #19 received at 21414 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>, 21414 <at> debbugs.gnu.org, ian.brown <at> hds.com
Cc: 21414-done <at> debbugs.gnu.org
Subject: Re: bug#21414: -F string with tailing newline always matches
Date: Fri, 4 Sep 2015 20:45:10 -0700
Jim Meyering wrote:
> I too find this behavior surprising:
>
> $ seq 3|grep -F xxx$'\n'
> 1
> 2
> 3
>
> This feels like a bug, since it's an artifact of how grep accumulates
> multiple keys internally: it uses newline as the separator
> (http://git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c#n2308).
> Including a literal newline in the search string conflicts with that.

It's not an artifact; it's intended behavior.  POSIX says that xxx$'\n' (which 
expands to three 'x's followed by a newline) is a pattern_list, not a pattern. 
A pattern_list is defined to be a series of patterns separated by newlines (not 
terminated by newlines), so that pattern_list has two patterns, xxx and the 
empty pattern.
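
The pattern_list reading can be checked with a short sketch, assuming a POSIX shell and a grep with the post-2.19 behavior: a single argument ending in a newline should select the same lines as the two patterns given through separate -e options. The variable names are illustrative only.

```shell
# A literal newline, used to build a pattern_list in one argument.
nl='
'

# One argument holding "xxx" followed by a newline: two patterns,
# "xxx" and the empty pattern.
as_list=$(printf '1\n2\n3\n' | grep -c -F -e "xxx${nl}" || true)

# The same two patterns given explicitly as separate -e options.
as_two=$(printf '1\n2\n3\n' | grep -c -F -e xxx -e '' || true)

# Only the non-empty pattern, for comparison. grep -c still prints a
# count when nothing matches, but exits nonzero, hence "|| true".
only_xxx=$(printf '1\n2\n3\n' | grep -c -F -e xxx || true)

printf 'list=%s two=%s single=%s\n' "$as_list" "$as_two" "$only_xxx"
```

If the first two counts agree and the third is zero, the empty pattern alone accounts for the matches.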





Information forwarded to bug-grep <at> gnu.org:
bug#21414; Package grep. (Sat, 05 Sep 2015 04:31:01 GMT) Full text and rfc822 format available.

Message #25 received at 21414 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: ian.brown <at> hds.com, 21414 <at> debbugs.gnu.org, 21414-done <at> debbugs.gnu.org
Subject: Re: bug#21414: -F string with tailing newline always matches
Date: Fri, 4 Sep 2015 21:29:58 -0700
On Fri, Sep 4, 2015 at 8:45 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Jim Meyering wrote:
>>
>> I too find this behavior surprising:
>>
>> $ seq 3|grep -F xxx$'\n'
>> 1
>> 2
>> 3
>>
>> This feels like a bug, since it's an artifact of how grep accumulates
>> multiple keys internally: it uses newline as the separator
>> (http://git.savannah.gnu.org/cgit/grep.git/tree/src/grep.c#n2308).
>> Including a literal newline in the search string conflicts with that.
>
>
> It's not an artifact; it's intended behavior.  POSIX says that xxx$'\n'
> (which expands to three 'x's followed by a newline) is a pattern_list, not a
> pattern. A pattern_list is defined to be a series of patterns separated by
> newlines (not terminated by newlines), so that pattern_list has two
> patterns, xxx and the empty pattern.

Thanks for explaining. Looks like I'd better reread that part of the spec.




Information forwarded to bug-grep <at> gnu.org:
bug#21414; Package grep. (Mon, 07 Sep 2015 15:12:01 GMT) Full text and rfc822 format available.

Message #31 received at 21414 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ian Brown - HNAS <ian.brown <at> hds.com>
Cc: 21414 <at> debbugs.gnu.org
Subject: Re: bug#21414: -F string with tailing newline always matches
Date: Mon, 7 Sep 2015 08:11:29 -0700
Ian Brown - HNAS wrote:
> the pattern list used was the output of another command, it now needs to have the terminating newline removed.

Typically the issue occurs with grep -f.  If the pattern list is generated via 
the shell, like this:

   grep "$(somecmd)" file

then the shell removes all trailing newlines from the output of SOMECMD, so it 
is not a problem in this case.  The shell even strips multiple trailing 
newlines, which in hindsight is probably a mistake but is standard behavior. 
So, for example:

   printf 'abc\n' | grep "$(printf 'xyz\n\n\n')"

does not output any matches, but:

   printf 'xyz\n\n\n' >pattern; printf 'abc\n' | grep -f pattern

outputs a match for abc, because the empty pattern matches every line.
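
For the grep -f case, one possible approach (a sketch, not something proposed in this thread) is to drop empty lines from a generated pattern file before use, so that no empty pattern reaches grep. The file names and the scratch directory below are assumptions for illustration.

```shell
# Work in a scratch directory so no existing file is overwritten.
tmpdir=$(mktemp -d)

# A generated pattern file that ends with two empty lines.
printf 'xyz\n\n\n' > "$tmpdir/pattern"

# Keep only non-empty lines. Note that this also removes any empty
# pattern that was included on purpose.
grep -v '^$' "$tmpdir/pattern" > "$tmpdir/pattern.clean"

# Count matching lines with the raw file and with the filtered file.
# grep -c exits nonzero when the count is 0, hence "|| true".
raw=$(printf 'abc\n' | grep -c -f "$tmpdir/pattern" || true)
filtered=$(printf 'abc\n' | grep -c -f "$tmpdir/pattern.clean" || true)

printf 'raw=%s filtered=%s\n' "$raw" "$filtered"

rm -rf "$tmpdir"
```

One caveat: if filtering leaves the pattern file completely empty, grep -f matches nothing, which may or may not be the desired outcome.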

> I didn't find any mention of this change of behaviour in the change logs

The change is listed in NEWS under grep-2.19 bug fixes, like this:

  grep no longer mishandles an empty pattern at the end of a pattern list.
  [bug introduced in grep-2.5]

This is due to commit 2d3832e1ff772dc1a374bfad5dcc1338350cc48b dated Fri Apr 11 
21:34:11 2014 +0900.  Here is the ChangeLog entry.

2014-04-11  Norihiro Tanaka  <noritnk <at> kcn.ne.jp>

	grep: no match for the empty string included in multiple patterns
	* src/dfasearch.c (EGAcompile): Fix it.
	* src/kwsearch.c (Fcompile): Fix it.

This fixes Bug#17240, which essentially is the negation of your bug report, 
i.e., Bug#17240 asks for the standard grep behavior which we broke in grep 2.5. 
 You can see that bug report here:

http://bugs.gnu.org/17240




Information forwarded to bug-grep <at> gnu.org:
bug#21414; Package grep. (Tue, 08 Sep 2015 12:17:02 GMT) Full text and rfc822 format available.

Message #34 received at 21414 <at> debbugs.gnu.org (full text, mbox):

From: Ian Brown - HNAS <ian.brown <at> hds.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: "21414 <at> debbugs.gnu.org" <21414 <at> debbugs.gnu.org>
Subject: RE: bug#21414: -F string with tailing newline always matches
Date: Tue, 8 Sep 2015 12:16:15 +0000
My code was in Ruby and that preserves the newlines

irb(main):001:0> `printf 'abc\n'|grep "#{`printf 'xyz\n\n\n'`}"`
=> "abc\n"
irb(main):002:0>

Didn't think to look under NEWS .. thanks.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 07 Oct 2015 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 262 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.