GNU bug report logs - #39678
'grep --ignore-case --color' does not always color the matches

Package: grep;

Reported by: Benno Schulenberg <bensberg <at> telfort.nl>

Date: Wed, 19 Feb 2020 15:28:01 UTC

Severity: normal

Merged with 51255, 51256, 51257


Message #67 received at 39678 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: 39678 <at> debbugs.gnu.org,
 Tomasz Dziendzielski <tomasz.dziendzielski <at> gmail.com>
Subject: Re: bug#39678: POSIXLY_CORRECT removal, and oddball regex doc
Date: Sun, 22 May 2022 15:22:59 -0700
On 5/21/22 11:40, Jim Meyering wrote:
> In my experience, there are many lurking uses of things like '\a', and
> would like to ease into this gently, so I much prefer your latter
> approach: warn now, and change grep's exit status later

Sounds good.

When I started looking into that, I discovered that the grep manual 
doesn't cover these lurkers well. And although I installed a patch 
yesterday about this, after looking at the POSIX spec again today I 
discovered that I'd missed quite a few lurkers. So I just now installed 
the attached documentation fix, which attempts to cover all the 
remaining problem regexps, and to give us room to add warnings for some 
of them soon.

We shouldn't warn about all these problems, not without a --pedantic 
flag or something like that (something I'm probably too busy to add). 
But I expect it'd be good to warn about areas where grep's semantics 
don't match any reasonable expectation.

We've already uncovered one area, where \a doesn't work as expected and 
where a warning diagnostic would be helpful. Here's another one, where 
an oddly-placed '*' doesn't work as one would expect:

$ printf '*\na\n*a\n' | grep '\(*\)'
*
*a
$ printf '*\na\n*a\n' | grep -E '(*)'
grep: Unmatched ( or \(
$ printf '*\na\n*a\n' | grep '\(*a\)'
*a
$ printf '*\na\n*a\n' | grep -E '(*a)'
a
*a

Although not a POSIX violation, here 'grep -E' is "wrong" for any 
reasonable definition of "wrong" that I can think of. The attached patch 
changes the doc to say that this regular expression has unspecified 
behavior (something that POSIX allows).
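
For anyone who wants to rerun the comparison against another grep build, here is a minimal sketch, assuming a POSIX sh. The helper names sample, bre_matches and ere_matches are made up for illustration and are not part of grep. The BRE results should be stable, since POSIX treats a '*' at the start of a \( \) subexpression as a literal; the ERE results are the unspecified part and may differ between grep versions and implementations.

```shell
#!/bin/sh
# Sketch: compare BRE (grep) and ERE (grep -E) handling of a '*' that
# immediately follows an opening group. Helper names are hypothetical.

# The three input lines used in the report: "*", "a", "*a".
sample() { printf '*\na\n*a\n'; }

# BRE: a '*' right after '\(' is an ordinary character.
bre_matches() { sample | grep -e "$1"; }

# ERE: a '*' right after '(' has unspecified behavior, so no particular
# output is assumed here.
ere_matches() { sample | grep -E -e "$1"; }

for re in '\(*\)' '\(*a\)'; do
    printf 'BRE %s:\n' "$re"
    bre_matches "$re" | sed 's/^/  /'
done

for re in '(*)' '(*a)'; do
    printf 'ERE %s (unspecified):\n' "$re"
    # Show diagnostics too, since some versions reject the pattern.
    ere_matches "$re" 2>&1 | sed 's/^/  /'
done
```

Only the BRE half is worth asserting on in a test suite; the ERE half is there to show what a given build happens to do.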

(Who would have thought regular expressions were so complicated? :-)
[0001-doc-document-regex-corner-cases-better.patch (text/x-patch, attachment)]

This bug report was last modified 3 years and 23 days ago.
