GNU bug report logs - #77462: "/s" instability? I think this is a bug.
On Wed, Apr 2, 2025 at 8:41 AM gnudborgonly <at> s-epost.no via
<bug-sed <at> gnu.org> wrote:
> This seems to qualify as a bug:
Thanks for the report.
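You can fix your usage by not putting "[...]" around those uses of "\s" or "\S".
Inside a bracket expression the backslash is literal, so "[^\s]" means
"any character other than backslash or s", and "[\S]" means "backslash or S".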
> The sed version included in my Linux seems to be unstable when using
> the '\s' and/or '\S' regex extensions:
>
> Example:
>
> id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's|^\[\s\+\([^\s]\+[0-9]\+\)\s\+\]\s*$|\1|p'
> id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's:^\[\s\+\([\S]\+[0-9]\+\)\s\+\]\s*$:\1:p'
Drop the square brackets and it works. I.e., change the latter to this:
$ echo '[ subCA2 ]' | sed -n 's:^\[\s\+\(\S\+[0-9]\+\)\s\+\]\s*$:\1:p'
subCA2
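To see why the bracketed forms fail, here is a small sketch, assuming GNU sed. The helper names are made up for illustration and are not part of sed or of the original report. Each helper replaces whatever the bracket expression matches with "X":

```shell
# Hypothetical helpers (illustrative names only), assuming GNU sed.

# Replace every character matched by [\s] with X.
mask_bracket_s() {
  printf '%s\n' "$1" | sed 's/[\s]/X/g'
}

# Replace every character matched by [^\s] with X.
mask_negated_bracket_s() {
  printf '%s\n' "$1" | sed 's/[^\s]/X/g'
}

# The space is left alone; only the backslash and the "s" are replaced.
mask_bracket_s 'a b\s'

# The leading "s" survives, which is why [^\s]\+ cannot start matching
# at the "s" of "subCA2" in the reported command.
mask_negated_bracket_s 'subCA2'
```

With GNU sed this should print "a bXX" and then "sXXXXX".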
Or better still, use sed's -E option to make the regular expression
more readable, eliding **six** backslashes:
echo '[ subCA2 ]' | sed -nE 's:^\[\s+(\S+[0-9]+)\s+\]\s*$:\1:p'
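If you really do need a bracket expression (say, to combine whitespace with other characters), the POSIX class "[:space:]" does work inside brackets. A minimal sketch, assuming a sed that supports -E; the function name is hypothetical and not from the report:

```shell
# Hypothetical helper (illustrative name): print the name inside
# "[ name ]" when the name ends in one or more digits.
# [[:space:]] and [^[:space:]] stand in for \s and \S.
section_name() {
  printf '%s\n' "$1" |
    sed -nE 's:^\[[[:space:]]+([^[:space:]]+[0-9]+)[[:space:]]+\][[:space:]]*$:\1:p'
}

section_name '[ subCA2 ]'   # prints the captured name
section_name '[ subCA ]'    # no trailing digits, so nothing is printed
```

The first call should print "subCA2"; the second prints nothing because the pattern requires at least one trailing digit.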
I admit this is an unpleasant irregularity about GNU sed's "\S" extension,
since it's different from how things work in PCRE.
This is one of the reasons I urge people to use Perl instead of sed
(another is because PCRE lets you use "\d" and non-greedy modifiers
like "\S+?" below):
$ echo '[ subCA2 ]' | perl -nle 'm{^\[\s+(\S+?\d+)\s+\]\s*$} and print $1'
subCA2
Searching sed's sources and docs for any mention of \S and \s inside
bracket expressions, I found nothing, but did see this 4.1 NEWS entry:
* removed documentation for \s and \S which worked incorrectly
I'll leave this bug report open, because this is a wart that needs to
be documented.
This bug report was last modified 72 days ago.