GNU bug report logs - #21253
sed escape bug

Package: sed

Reported by: Jacob Young <amazingjacob <at> gmail.com>

Date: Thu, 13 Aug 2015 15:48:02 UTC

Severity: normal

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


Message #10 received at 21253-done <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 21253-done <at> debbugs.gnu.org
Subject: RE: sed escape bug
Date: Thu, 16 Feb 2017 22:52:32 +0000
Hello,

Sorry for the delayed response.


Jacob Young <amazingjacob <at> gmail.com> wrote:
> The escape '\c\' fails at the end of a regex.
>
> ~$ echo a | sed -e 's/a/\c\/' | hexdump -c
> sed: -e expression #1, char 8: unterminated `s' command
> ~$ echo a | sed -e 's/a/\c\b/' | hexdump -c
> 0000000 034   b  \n
> 0000003
> ~$ sed --version
> sed (GNU sed) 4.2.2

This is an interesting case:
It is a bug in the older sed-4.2.2, but the fix is perhaps not the
behaviour you expected (allowing '/' after '\c\').

A single backslash after '\c' was ambiguous in sed-4.2.2:
Should it be parsed as the argument to '\c' (e.g. '\c\' is like '\cA'),
or should it be parsed like other backslashes, where it removes
the special meaning of the following character (e.g. 's/x/\//' replaces
an 'x' with a '/')?
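
A minimal sketch of the arithmetic behind the first reading, assuming the
rule given in the GNU sed manual for '\cX' (a lower-case letter is folded to
upper case, then bit 6, hex 40, of the character is inverted). The helper
name ctrl_octal is hypothetical and is not part of sed; it only shows why
a backslash taken as the argument to '\c' yields octal 034:

```shell
# Hypothetical helper (not part of sed): print, in octal, the character
# code that \cX would stand for under the assumed rule.
ctrl_octal() {
    # A leading quote makes printf yield the numeric code of the next character
    code=$(printf '%d' "'$1")
    # Fold lower-case ASCII letters (97..122) to upper case
    if [ "$code" -ge 97 ] && [ "$code" -le 122 ]; then
        code=$((code - 32))
    fi
    # Invert bit 6 (hex 40, decimal 64) and print as three octal digits
    printf '%03o\n' $(( $code ^ 64 ))
}

ctrl_octal A      # prints 001 (CTRL-A)
ctrl_octal '\'    # prints 034 (CTRL-\)
```

Under that rule a backslash (hex 5C) maps to hex 1C, which hexdump shows
as 034.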

The previous behaviour led to inconsistencies, exactly
as you've encountered. Compare:

   # '\t' means TAB, backslash affects the character that follows:
   $ echo x | sed-4.2.2 's/x/\t/' | hexdump -c
   0000000  \t  \n
   0000002

   # but here the '\c\' is taken as one item, and 't' is parsed by itself:
   $ echo x | sed-4.2.2  's/x/\c\t/' | hexdump -c
   0000000 034   t  \n
   0000003

   # yet '/' immediately after '\c\' was rejected:
   $ echo x | sed-4.2.2 -e 's/x/\c\/' | hexdump -c
   sed: -e expression #1, char 8: unterminated `s' command

Commit v4.2.2-99-g156e099 [1] fixed this behaviour.
To use a backslash as the control character, TWO backslashes are required,
just like using a literal backslash anywhere else:

   ## Two backslashes are needed for CTRL-\
   $ echo x | sed-4.4 's/x/\c\\/' | hexdump -c
   0000000 034  \n
   0000002

   ## A single backslash is not enough:
   $ echo x | sed-4.4 's/x/\c\/' | hexdump -c
   sed: -e expression #1, char 8: unterminated `s' command


   ## Ambiguous usage is rejected:
   $ echo x | sed-4.4 's/x/\c\t/' | hexdump -c
   sed: -e expression #1, char 9: recursive escaping after \c not allowed

This behaviour was introduced in sed-4.3.
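
For scripts that have to run under both the old and the new parsing, one
possible approach is to avoid '\c' altogether and let the shell supply the
byte. This is only a sketch under that assumption, not something taken from
the commit above; the function name replace_with_fs is hypothetical:

```shell
# Hypothetical wrapper: replace the first 'x' on each line with CTRL-\
# (octal 034) without using sed's \c escape at all.
replace_with_fs() {
    # printf, not sed, produces the control character
    fs=$(printf '\034')
    # Double quotes let the shell substitute the raw byte into the script
    sed "s/x/${fs}/"
}

# od -to1 prints each byte in octal; expect 034 followed by 012 (newline)
echo x | replace_with_fs | od -An -to1
```

Because sed only ever sees an ordinary byte in the replacement, the result
should not depend on how a given version parses '\c\'.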

As such I'm closing this bug, but discussion can continue
by replying to this thread.

regards,
- assaf


[1] https://git.savannah.gnu.org/cgit/sed.git/commit/?id=156e0998







This bug report was last modified 8 years and 174 days ago.
