GNU bug report logs - #40242: n as delimiter alias
Reported by: Oğuz <oguzismailuysal <at> gmail.com>
Date: Thu, 26 Mar 2020 15:31:02 UTC
Severity: normal
Tags: confirmed
Merged with 40239
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your message dated Sun, 23 Oct 2022 23:25:05 -0700
with message-id <CA+8g5KFyNdT6FJUNfNfRh2OySYs7nNLpOm4OUzbtWE+Rru2TWA <at> mail.gmail.com>
and subject line Re: bug#40242: n as delimiter alias
has caused the debbugs.gnu.org bug report #40242,
regarding Bug in how \cregexpc is handled
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
40242: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40242
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
[Message part 3 (text/plain, inline)]
To whom it may concern,
From man sed, I read:
\cregexpc
Match lines matching the regular expression regexp. The c
may be any character.
On the one hand:
- sed '\cncd' <<< n correctly shows empty output, since it's the same as
sed '/n/d' <<< n based on the description above;
- sed '\c\ccd' <<< c correctly shows empty output too, but in this
case the letter needed to be escaped for obvious reasons.
On the other hand:
- sed '\n\nnd' <<< n outputs the single character n,
revealing that the backslash has a double effect:
1. it prevents the following n from closing the opening \n;
2. it interprets the n as a newline instead of the literal letter n;
this is confirmed by executing echo -e 'a\na' | sed -n 'N;\n\nnp'.
This means that using n as the delimiter in \nregexpn prevents the use of
the literal n in the regexp.
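
The two readings of \n inside \n...n can be compared outside sed. The sketch below is not sed's implementation; it only uses Python's re module to test the two candidate regexes (the literal letter n versus a newline) against the pattern spaces from the examples above. The helper name matches is hypothetical.

```python
import re


def matches(regex: str, pattern_space: str) -> bool:
    """Return True if the regex matches anywhere in the pattern space."""
    return re.search(regex, pattern_space) is not None


# Reading A: \n is an escaped delimiter, so the regex is the literal letter n.
literal_n = "n"
# Reading B: \n keeps its special meaning, so the regex is a newline.
newline = "\n"

# sed '\n\nnd' <<< n : the pattern space is "n" (no trailing newline).
print(matches(literal_n, "n"))     # True: d would delete the line
print(matches(newline, "n"))       # False: the line survives, as reported

# echo -e 'a\na' | sed -n 'N;\n\nnp' : after N the pattern space is "a\na".
print(matches(literal_n, "a\na"))  # False: there is no letter n to match
print(matches(newline, "a\na"))    # True: p would print, as reported
```

Only reading B agrees with both observations in the report, which is consistent with the claim that the backslash turns the n into a newline.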
The issue has come to light in this StackOverflow
<https://stackoverflow.com/questions/60853746/what-is-n-nnd-supposed-to-do>
question.
Kind regards,
Enrico Maria De Angelis
[Message part 5 (message/rfc822, inline)]
[Message part 6 (text/plain, inline)]
On Tue, Mar 31, 2020 at 6:36 AM Eric Blake <eblake <at> redhat.com> wrote:
> On 3/31/20 2:00 AM, Oğuz wrote:
> > Thanks for the reply. This might not be a bug though; I sent a similar mail
> > (https://www.mail-archive.com/austin-group-l <at> opengroup.org/msg05881.html)
> > to Austin Group mailing list asking what's the expected behavior in this
> > case, and I was told (
> > https://www.mail-archive.com/austin-group-l <at> opengroup.org/msg05891.html)
> > both behaviors, yielding n or an empty line, are correct and the standard should
> > *probably* be amended to explicitly state that this is unspecified. And
> > apparently (
> > https://www.mail-archive.com/austin-group-l <at> opengroup.org/msg05893.html)
> > some other UNIXes adopted the same practice as GNU sed (or vice versa, I
> > don't know which one is older).
>
> The POSIX folks will probably declare that use of a \X sequence (for
> arbitrary X; 'n', 't', '1', and probably others all fit this category)
> inside a regex delimited by X is unspecified behavior. But that still
> doesn't stop us from fixing GNU sed to at least be consistent - we
> should either blindly declare that \X represents the special meaning of
> X when such a meaning is present regardless of X also being the regex
> delimiter (our current \n behavior - no way to represent the delimiter
> as a literal match), or that use of X as a delimiter renders the special
> meaning of \X useless for that regex (our \t behavior - no way to
> represent the special behavior as part of the match). My personal
> preference is making things consistent to our \t behavior.
>
> >> In the code, the "match_slash" function [1] is used to find
> >> the delimiters of the "s" command (typically "slashes").
> >> Special handling happens if a slash is found [2],
> >> And in lines 557-8 there's this conditional:
> >>
> >>     else if (ch == 'n' && regex)
> >>       ch = '\n';
> >>
> >> Which forces any "\n" to be a new-line, regardless of whether the
> >> delimiter itself was an "n".
> >>
>
> >> Interestingly, removing these two lines does not cause
> >> any test failures, so this might be easy to fix without causing
> >> any regressions.
> >>
> >>
> >> For now I'm leaving this item open until we decide how to deal with it.
>
> I'm thus in favor of removing that special-case of 'n'.
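
The effect of the special case quoted above can be modeled with a small scanner. This is a simplified sketch in Python, loosely based on the description of match_slash in this thread and not a copy of sed's code; the function name scan_delimited and the flag newline_special_case are hypothetical, and the assumption that a later regex stage interprets a surviving backslash-n pair is mine.

```python
def scan_delimited(text: str, delim: str, newline_special_case: bool) -> str:
    """Collect characters of `text` up to the first unescaped `delim`.

    With newline_special_case=True, a backslash followed by n always
    becomes a newline, even when n is the delimiter (the quoted lines).
    With it False, an escaped delimiter is kept as a literal character.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == delim:
            break  # unescaped delimiter ends the regex
        if ch == "\\" and i + 1 < len(text):
            i += 1
            ch = text[i]
            if ch == "n" and newline_special_case:
                ch = "\n"  # backslash-n is forced to a newline
            elif ch != delim:
                out.append("\\")  # ordinary escape: keep the backslash
            # otherwise: escaped delimiter, drop the backslash
        out.append(ch)
        i += 1
    return "".join(out)


# Delimiter n, script text after the opening \n is: \nnd
print(repr(scan_delimited("\\nnd", "n", True)))   # '\n'
print(repr(scan_delimited("\\nnd", "n", False)))  # 'n'
# Delimiter t behaves the same way with or without the special case.
print(repr(scan_delimited("\\ttd", "t", False)))  # 't'
```

With the flag off, the n delimiter behaves like the t delimiter, which is the consistency argued for above. For any other delimiter the backslash-n pair is passed through unchanged in this model, so a later stage can still treat it as a newline.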
Thank you all. Sorry it's taken so long.
I expect to push the following tomorrow.
[sed-tweak.diff (application/octet-stream, attachment)]
This bug report was last modified 2 years and 294 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.