GNU bug report logs - #51462
sed bug: ASCII NUL not handled in simple pattern

Package: sed

Reported by: Frances Wingerter <fw <at> immunant.com>

Date: Thu, 28 Oct 2021 16:49:02 UTC

Severity: normal


From: Assaf Gordon <assafgordon <at> gmail.com>
To: Davide Brini <dave_br <at> gmx.com>, 51462 <at> debbugs.gnu.org, Frances Wingerter <fw <at> immunant.com>, Eric Blake <eblake <at> redhat.com>
Subject: bug#51462: sed bug: ASCII NUL not handled in simple pattern
Date: Sat, 30 Oct 2021 01:11:35 -0600

(Adding Eric Blake for POSIX opinion)

Hello,

On 2021-10-28 11:32 a.m., Davide Brini wrote:
> On Thu, 28 Oct 2021 15:25:42 +0000, Frances Wingerter <fw <at> immunant.com>
> wrote:
>>
>> Compare the output of these two sed invocations:
>> ```
>> $ echo -e 'a\nb\n\0\nc\n' | sed -e '/\0/,$d'
>>
> $ echo -ne 'a\nb\n\0\nc\n' | sed -e '/\d000/,$d'
> 
> (\o000, \x00 also work). All documented here:
> https://www.gnu.org/software/sed/manual/sed.html#Escapes
> 
> Whether sed maintainers want to also allow the \0 syntax, up to them of
> course.
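
For reference, a minimal sketch of the escape-based workaround quoted above. It assumes GNU sed is the `sed` on the PATH, and it uses `printf` rather than `echo -e` only so that the input does not depend on the shell's `echo`. The variable name `out` is illustrative.

```shell
# Assumption: GNU sed. \x00 is one of the documented escapes for a NUL byte
# (\d000 and \o000 are the decimal and octal equivalents).
# Delete everything from the first line containing a NUL through end of input.
out=$(printf 'a\nb\n\0\nc\n' | sed -e '/\x00/,$d')

# Print whatever survived the deletion, one line per input line.
printf '%s\n' "$out"
```

Capturing the result in a variable keeps the example easy to check by hand; piping straight to the terminal works just as well.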

Thanks Davide for the reply.

In GNU sed, "\0" in the replacement part acts identically to "&" - 
referencing the whole matched portion.

This is the implemented behavior (though undocumented?) since GNU sed
version 3, released in December 1995 - so not likely to be changed.

For comparison, in BSDs "\0" acts as literal zero (ASCII 48).
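
A small sketch of the GNU behavior described above, assuming GNU sed is the `sed` on the PATH (other implementations may print something different, as noted for the BSDs). The variable names are illustrative only.

```shell
# Assumption: GNU sed. In the replacement part of s///, "\0" and "&"
# should both expand to the whole matched text.
with_zero=$(echo 'abc' | sed -e 's/b/[\0]/')
with_amp=$(echo 'abc' | sed -e 's/b/[&]/')

# Show both results side by side for comparison.
printf '%s\n%s\n' "$with_zero" "$with_amp"
```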

Interestingly, POSIX defines a "BACKREF" as:

   [...] The character string consisting of a <backslash> character
   followed by a single-digit numeral, '1' to '9'.
   (from: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_05)

And so one could argue that this is a GNU extension that should be
disabled when used with "sed --posix".

I think we should keep "\0" undocumented to prevent proliferation of
this non-standard behavior.

regards,
 - assaf






This bug report was last modified 3 years and 228 days ago.

