GNU bug report logs - #21253
sed escape bug


Package: sed;

Reported by: Jacob Young <amazingjacob <at> gmail.com>

Date: Thu, 13 Aug 2015 15:48:02 UTC

Severity: normal

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21253 in the body.
You can then email your comments to 21253 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-sed <at> gnu.org:
bug#21253; Package sed. (Thu, 13 Aug 2015 15:48:02 GMT)

Acknowledgement sent to Jacob Young <amazingjacob <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Thu, 13 Aug 2015 15:48:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jacob Young <amazingjacob <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: sed escape bug
Date: Thu, 13 Aug 2015 08:24:33 -0400
The escape '\c\' fails at the end of a regex.

~$ echo a | sed -e 's/a/\c\/' | hexdump -c
sed: -e expression #1, char 8: unterminated `s' command
~$ echo a | sed -e 's/a/\c\b/' | hexdump -c
0000000 034   b  \n
0000003
~$ sed --version
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Jay Fenlason, Tom Lord, Ken Pizzini,
and Paolo Bonzini.
GNU sed home page: <http://www.gnu.org/software/sed/>.
General help using GNU software: <http://www.gnu.org/gethelp/>.
E-mail bug reports to: <bug-sed <at> gnu.org>.
Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.

Reply sent to Assaf Gordon <assafgordon <at> gmail.com>:
You have taken responsibility. (Thu, 16 Feb 2017 22:54:01 GMT)

Notification sent to Jacob Young <amazingjacob <at> gmail.com>:
bug acknowledged by developer. (Thu, 16 Feb 2017 22:54:02 GMT)

Message #10 received at 21253-done <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 21253-done <at> debbugs.gnu.org
Subject: RE: sed escape bug
Date: Thu, 16 Feb 2017 22:52:32 +0000
Hello,

Sorry for the delayed response.


Jacob Young <amazingjacob <at> gmail.com> wrote:
> The escape '\c\' fails at the end of a regex.
>
> ~$ echo a | sed -e 's/a/\c\/' | hexdump -c
> sed: -e expression #1, char 8: unterminated `s' command
> ~$ echo a | sed -e 's/a/\c\b/' | hexdump -c
> 0000000 034   b  \n
> 0000003
> ~$ sed --version
> sed (GNU sed) 4.2.2

This is an interesting case:
It's a bug in the older sed-4.2.2, but the fix is perhaps not the behaviour
you expected (allowing '/' directly after '\c\').

A single backslash after '\c' was ambiguous in sed-4.2.2:
Should it be parsed as the argument to '\c' (e.g. '\c\' is like '\cA'),
or should it be parsed like other backslashes, where it removes
the special meaning of the following character (e.g. 's/x/\//' replaces
an 'x' with a '/')?

The previous behaviour led to inconsistencies, exactly
as you've encountered. Compare:

   # '\t' means TAB, backslash affects the character that follows:
   $ echo x | sed-4.2.2 's/x/\t/' | hexdump -c
   0000000  \t  \n
   0000002

   # but here the '\c\' is taken as one item, and 't' is parsed by itself:
   $ echo x | sed-4.2.2  's/x/\c\t/' | hexdump -c
   0000000 034   t  \n
   0000003

   # yet '/' immediately after '\c\' was rejected:
   $ echo x | sed-4.2.2 -e 's/x/\c\/' | hexdump -c
   sed: -e expression #1, char 8: unterminated `s' command
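
The 034 in the hexdump output above is the octal code of CTRL-\. As a
rough sketch of where that value comes from, assuming the rule given in
the GNU sed manual for '\cX' (a lowercase letter is converted to uppercase,
then bit 6, hex 40, is inverted), the arithmetic can be reproduced in
plain shell. The helper name ctrl_octal is hypothetical and not part of sed:

```shell
#!/bin/sh
# Hypothetical helper (not part of sed): print the octal code that '\cX'
# would produce for a single character X, assuming the rule from the
# GNU sed manual: uppercase a lowercase letter, then invert bit 6 (0x40).
ctrl_octal() {
    # A leading quote makes printf emit the numeric value of the character.
    code=$(printf '%d' "'$1")
    # Lowercase a-z (97-122) are converted to uppercase first.
    if [ "$code" -ge 97 ] && [ "$code" -le 122 ]; then
        code=$((code - 32))
    fi
    # Invert bit 6 and print as three octal digits.
    printf '%03o\n' $((code ^ 64))
}

ctrl_octal '\'   # backslash is 0x5C; 0x5C XOR 0x40 = 0x1C = octal 034
ctrl_octal 'a'   # 'a' becomes 'A' (0x41); 0x41 XOR 0x40 = 0x01 = octal 001
```

This only illustrates the character arithmetic; it says nothing about how
sed tokenizes the backslash, which is the actual subject of this report.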

Commit v4.2.2-99-g156e099 [1] fixed this behaviour.
To use backslash as control character, TWO backslashes are required -
just like using a literal backslash anywhere else:

   ## Two backslashes are needed for CTRL-\
   $ echo x | sed-4.4 's/x/\c\\/' | hexdump -c
   0000000 034  \n
   0000002

   ## A single backslash is not enough:
   $ echo x | sed-4.4 's/x/\c\/' | hexdump -c
   sed: -e expression #1, char 8: unterminated `s' command


   ## Ambiguous usage is rejected:
   $ echo x | sed-4.4 's/x/\c\t/' | hexdump -c
   sed: -e expression #1, char 9: recursive escaping after \c not allowed

This behaviour was introduced in sed-4.3.
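
For scripts that have to run on sed versions from both sides of this
change, one possible workaround (a sketch, not something proposed in the
commit itself) is to avoid the '\c' escape altogether and let printf
generate the CTRL-\ byte. The variable name fs and the function name
x_to_ctrl_backslash are illustrative only:

```shell
#!/bin/sh
# Build the CTRL-\ byte (octal 034) with printf instead of sed's \c escape,
# so the sed script never contains a backslash in the replacement.
fs=$(printf '\034')

# Hypothetical helper: replace the first 'x' on each input line with CTRL-\.
x_to_ctrl_backslash() {
    # Double quotes let the shell expand $fs before sed parses the script.
    sed "s/x/$fs/"
}

# Show the resulting bytes; od is used here in place of hexdump.
echo x | x_to_ctrl_backslash | od -An -c
```

Because the byte is inserted literally, the sed parser sees an ordinary
character in the replacement and the ambiguity described above does not
arise.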

As such I'm closing this bug, but discussion can continue
by replying to this thread.

regards,
- assaf


[1] https://git.savannah.gnu.org/cgit/sed.git/commit/?id=156e0998







bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 17 Mar 2017 11:24:04 GMT)

This bug report was last modified 8 years and 156 days ago.


