GNU bug report logs - #23635
possible bug in \c escape handling


Package: sed

Reported by: Assaf Gordon <assafgordon <at> gmail.com>

Date: Sat, 28 May 2016 01:09:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

From: Jim Meyering <jim <at> meyering.net>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: 23635 <at> debbugs.gnu.org
Subject: bug#23635: possible bug in \c escape handling
Date: Sat, 28 May 2016 15:06:25 -0700

On Fri, May 27, 2016 at 6:08 PM, Assaf Gordon <assafgordon <at> gmail.com> wrote:
> Hello,
>
> There might be a small bug in processing of GNU extension escape sequence
> "\c".
>
> When the character following "\c" is a backslash, the code consumes only one
> character, leading to inconsistent and incorrect output.
> Example:
>
>   $ echo a | sed 's/./\c\\/' | od -c
>   0000000 034 \ \n
>   0000003
>   $ echo a | sed 's/./\c\d/' | od -c
>   0000000 034 d \n
>   0000003
>
> but:
>
>   $ echo a | sed 's/./\c\/' | od -c
>   sed: -e expression #1, char 8: unterminated `s' command
>   0000000
>
> Meaning there is no way to generate the character '\034' alone with "\c".
>
> This is also somewhat inconsistent because it consumes a single backslash
> character (whereas everywhere else a single backslash is the escape
> character itself).
>
> For comparison, other characters behave as expected:
>
>   $ sed 's/./\cA/' in | od -c
>   0000000 001 \n
>   0000002
>   $ sed 's/./\c[/' in | od -c
>   0000000 033 \n
>   0000002
>   $ sed 's/./\c]/' in | od -c
>   0000000 035 \n
>   0000002
>
> As a side effect, it could also be confusing if the syntax allows
> 'recursive' escapes such as "\c\x41", which might be argued to be '\c' of
> the following character, which should first be evaluated as \x41,
> resulting in "\cA".
>
> The attached patch fixes the problem with the following rules:
> 1. '\c\\' = Control-Backslash = ASCII 0x1C (octal 034).
> 2. Any other backslash combinations after "\c" are rejected, and sed aborts.
>
> Tests included. Comments are welcome.
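
For illustration only, the two proposed rules can be sketched in Python. This is not sed's C code or the attached patch. The names control_char and expand_control_escapes are hypothetical, and the mapping (upper-case the character, then flip bit 0x40) is an assumption chosen because it is the conventional control-key rule and agrees with the od output quoted above.

```python
def control_char(ch: str) -> str:
    """Return the control-key counterpart of ch, e.g. 'A' -> 0x01.

    Assumption: upper-case the character and flip bit 0x40. This agrees
    with the quoted od output (A -> 001, [ -> 033, \\ -> 034, ] -> 035).
    """
    return chr(ord(ch.upper()) ^ 0x40)


def expand_control_escapes(text: str) -> str:
    """Expand \\cX sequences using the two rules proposed in the report.

    Rule 1: \\c followed by two backslashes yields Control-Backslash.
    Rule 2: \\c followed by any other backslash sequence is rejected.
    Every other escape is copied through unchanged (a simplification).
    """
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\c", i):
            rest = text[i + 2:]
            if not rest:
                raise ValueError("\\c at end of input")
            if rest[0] == "\\":
                if not rest.startswith("\\\\"):
                    raise ValueError("only an escaped backslash may follow \\c")
                out.append(control_char("\\"))
                i += 4  # consumed: backslash, 'c', backslash, backslash
            else:
                out.append(control_char(rest[0]))
                i += 3  # consumed: backslash, 'c', one character
        elif text[i] == "\\" and i + 1 < len(text):
            out.append(text[i:i + 2])  # unrelated escape: leave as-is
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Under these assumptions, a replacement of \c\\ expands to the single byte 0x1C, while \c\d raises an error instead of silently producing the control character followed by 'd'.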

Nice catch. I like the patch.
So far, I can make only two suggestions:
  - add a NEWS entry, since this is a bug fix
  - I have a slight preference for the one-liner printf '%s\n' a a a a a a a,
rather than your 7-line here-document, to generate that same output in the
test case.

And a comment wording nit:

+# Before sed-4.3, this resulted in '\034d' .
+# now it should be rejected.

I prefer to say e.g.,

# Before sed-4.3, this resulted in '\034d'. Now, it is rejected.

Thank you!



