GNU bug report logs - #40242
n as delimiter alias


Package: sed

Reported by: Oğuz <oguzismailuysal <at> gmail.com>

Date: Thu, 26 Mar 2020 15:31:02 UTC

Severity: normal

Tags: confirmed

Merged with 40239

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


From: Eric Blake <eblake <at> redhat.com>
To: Oğuz <oguzismailuysal <at> gmail.com>, Assaf Gordon <assafgordon <at> gmail.com>
Cc: "40242 <at> debbugs.gnu.org" <40242 <at> debbugs.gnu.org>
Subject: bug#40242: n as delimiter alias
Date: Tue, 31 Mar 2020 08:26:01 -0500

On 3/31/20 2:00 AM, Oğuz wrote:
> Thanks for the reply. This might not be a bug though; I sent a similar mail
> (https://www.mail-archive.com/austin-group-l <at> opengroup.org/msg05881.html)
> to Austin Group mailing list asking what's the expected behavior in this
> case, and I was told (
> https://www.mail-archive.com/austin-group-l <at> opengroup.org/msg05891.html)
> both behaviors -yielding n or empty line- are correct and standard should
> *probably* be amended to explicitly state that this is unspecified. And
> apparently (
> https://www.mail-archive.com/austin-group-l <at> opengroup.org/msg05893.html)
> some other UNIXes adopted the same practice as GNU sed (or vice versa, I
> don't know which one is older).

The POSIX folks will probably declare that use of a \X sequence (for 
arbitrary X; 'n', 't', '1', and probably others all fit this category) 
inside a regex delimited by X is unspecified behavior.  But that still 
doesn't stop us from fixing GNU sed to at least be consistent - we 
should either blindly declare that \X represents the special meaning of 
X when such a meaning is present regardless of X also being the regex 
delimiter (our current \n behavior - no way to represent the delimiter 
as a literal match), or that use of X as a delimiter renders the special 
meaning of \X useless for that regex (our \t behavior - no way to 
represent the special behavior as part of the match).  My personal 
preference is making things consistent to our \t behavior.

>> In the code, the "match_slash" function [1] is used to find
>> the delimiters of the "s" command (typically "slashes").
>> Special handling happens if a backslash is found [2],
>> and in lines 557-8 there's this conditional:
>>
>>                else if (ch == 'n' && regex)
>>                  ch = '\n';
>>
>> Which forces any "\n" to be a new-line, regardless if the
>> delimiter itself was an "n".
>>

>> Interestingly, removing these two lines does not cause
>> any test failures, so this might be easy to fix without causing
>> any regressions.
>>
>>
>> For now I'm leaving this item open until we decide how to deal with it.

I'm thus in favor of removing that special-case of 'n'.
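
As a rough illustration of the two options, here is a simplified model of the delimiter scan. It is a sketch only, not the actual sed source: the function name scan_delimited and the special_case_n flag are hypothetical, and multibyte handling and the replacement-side '&' rule are left out. With the flag set, it mimics the quoted conditional; with the flag cleared, an escaped delimiter always becomes the literal delimiter character.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of delimiter scanning; NOT the actual sed source.
   Copies the text before the first unescaped `delim` from `in` into `out`
   and returns the number of input bytes consumed (closing delimiter
   included), or -1 if the delimiter is missing or `out` is too small.
   `special_case_n` is a hypothetical flag that toggles the two lines
   under discussion.  */
static long
scan_delimited (const char *in, char delim, int special_case_n,
                char *out, size_t outsz)
{
  size_t i = 0, o = 0;

  assert (in != NULL && out != NULL);

  for (; in[i] != '\0' && in[i] != '\n'; i++)
    {
      char ch = in[i];

      if (ch == delim)
        {
          if (o >= outsz)
            return -1;
          out[o] = '\0';
          return (long) (i + 1);
        }
      if (ch == '\\')
        {
          ch = in[++i];
          if (ch == '\0')
            break;                /* dangling backslash: no delimiter */
          if (ch == 'n' && special_case_n)
            ch = '\n';            /* current behavior: always a newline */
          else if (ch != '\n' && ch != delim)
            {
              /* Not the delimiter: keep the backslash so a later
                 regex-compilation step can interpret the escape.  */
              if (o + 1 >= outsz)
                return -1;
              out[o++] = '\\';
            }
          /* Otherwise: escaped delimiter (or escaped newline); the
             backslash is dropped and the character is kept literally.  */
        }
      if (o + 1 >= outsz)
        return -1;
      out[o++] = ch;
    }
  return -1;
}
```

In this model, scanning the input a\nbn with 'n' as the delimiter stores a, a newline, and b when special_case_n is 1, and stores the literal text anb when it is 0. The latter matches how \t already behaves when 't' is the delimiter.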

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





This bug report was last modified 2 years and 294 days ago.


