GNU bug report logs - #75806
Trailing spaces; pattern "\s" before "[[:cntrl:]]" faulty

Previous Next

Package: grep;

Reported by: Andreas BROCKMANN <andreas.brockmann <at> diehl.com>

Date: Fri, 24 Jan 2025 14:50:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Peter White <peter.white <at> posteo.net>
To: bug-grep <at> gnu.org
Subject: Re: bug#75806: Trailing spaces; pattern "\s" before "[[:cntrl:]]"
 faulty
Date: Fri, 24 Jan 2025 19:26:00 +0000
On Fri, Jan 24, 2025 at 01:27:13PM +0000, Andreas BROCKMANN via Bug reports for GNU grep wrote:
> Hi,
> 
> The 1st command below correctly reports trailing spaces, for Unix and Windows format files.
> The 2nd one incorrectly reports all lines.
> 
>   grep -sHn -i " [[:cntrl:]]*$" *.vhd
>   grep -sHn -i "\s[[:cntrl:]]*$" *.vhd

As someone who just today made a similar mistake I would like to point
out that the pattern does as intended because '*' matches *zero* or more
occurrences of the preceding atom. So the second pattern matches
any line that contains a *literal* 's' followed by zero or more control
chars, which is any line because of the newline at the end which is a
control char. Since you did not ask for perl regex (-P) grep uses basic
POSIX regex instead; at least I *think* you want perl syntax given that
'\s' is only valid in PCRE, IIRC.

Also [:cntrl:] is not the correct char class for white space, why not
[:space:] or [:blank:]? Your first pattern just happens to match the
literal space in it *and* any following string of zero or more control
chars.


PW




This bug report was last modified 119 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.