GNU bug report logs - #39978
Weird substitution of square brackets


Package: sed

Reported by: Evangelos Tsagkas <tsagkase <at> gmail.com>

Date: Sat, 7 Mar 2020 20:41:01 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.



Message #15 received at control <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Evangelos Tsagkas <tsagkase <at> gmail.com>
Cc: 39978 <at> debbugs.gnu.org
Subject: Re: bug#39978: Weird substitution of square brackets
Date: Sat, 7 Mar 2020 15:43:43 -0700

tag 39978 notabug
close 39978
stop

Hello,

The results of the commands you list below are correct and expected.
They follow the "bracket expression" rules of POSIX basic regular
expressions.

Note the following POSIX rules:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03
    "9.3.3 - The <period>, <left-square-bracket>, and <backslash> shall be
    special except when used in a bracket expression"
and
    "9.3.5(1) - [...] The <right-square-bracket> ( ']' ) shall lose its
     special meaning and represent itself in a bracket expression if it
     occurs first in the list [...] Otherwise, it shall terminate the
     bracket expression"
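
A minimal sketch of rule 9.3.5(1), assuming any POSIX-conforming sed.
The variable names are illustrative only, and printf is used in place
of echo purely to keep the input handling shell-independent:

```shell
# <right-square-bracket> first in the list is literal
first=$(printf '%s\n' 'a]b' | sed 's/[]]/X/')
echo "$first"      # aXb

# it is still "first" when it directly follows an initial <circumflex>
negated=$(printf '%s\n' 'a]b' | sed 's/[^]]/X/g')
echo "$negated"    # X]X

# anywhere else it terminates the bracket expression, so [a]] means
# the one-character bracket expression [a] followed by a literal "]"
later=$(printf '%s\n' 'a]b' | sed 's/[a]]/X/')
echo "$later"      # Xb
```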

On Sat, Mar 07, 2020 at 10:17:28PM +0200, Evangelos Tsagkas wrote:
> $ echo "object[0])" | sed 's/[\]\[]/./g'
> object[0])
> $ echo "object[0])" | sed 's/[\]]/./g'
> object[0])
> $ echo "object[0])" | sed 's/[\[]/./g'
> object.0])
> $ echo "object[])" | sed 's/[\]\[]/./g'
> object[])
> $ echo "object[])" | sed 's/[\[\]]/./g'
> object.)

To illustrate:

    # <backslash> in a bracket-expression is literal, not special
    $ echo '\' | sed 's/[\]/X/'
    X

    # <left-square-bracket> inside a bracket-expression is literal, not special
    $ echo '[' | sed 's/[[]/X/'
    X

    # Combine the above two rules, and even though this might be
    # visually confusing, the bracket-expression only matches literal
    # <backslash> and <left-square-bracket>, regardless of how many
    # times they repeat in the bracket expression
    $ echo '\[' | sed 's/[\[\[\[[\]/X/g'
    XX

    # Outside a bracket-expression, <backslash>-<left-square-bracket>
    # causes the <left-square-bracket> to become literal, not special:
    $ echo '[' | sed 's/\[/X/'
    X

    # <right-square-bracket> appearing without a preceding *special*
    # <left-square-bracket> is also literal:
    $ echo ']' | sed 's/]/X/'
    X

    # And thus:
    $ echo '[]' | sed 's/\[]/X/'
    X
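
Applied to the first and last commands from the report, the same rules
can be sketched as follows. The second input line, which contains a
literal backslash, is made up for the demonstration, and the variable
names are illustrative only:

```shell
# [\] is a bracket-expression matching one <backslash>, \[ is a literal
# "[", and the trailing ] is a literal "]". The expression therefore
# matches the three-character sequence \[] and nothing in "object[0])".
nomatch=$(printf '%s\n' 'object[0])' | sed 's/[\]\[]/./g')
echo "$nomatch"    # object[0])

seq=$(printf '%s\n' 'object\[]' | sed 's/[\]\[]/./g')
echo "$seq"        # object.

# [\[\] is a bracket-expression matching <backslash> or "[", and the
# trailing ] is a literal "]", so the two characters "[]" become one "."
pair=$(printf '%s\n' 'object[])' | sed 's/[\[\]]/./g')
echo "$pair"       # object.)
```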

I hope these explain how you got the above results.
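
If the goal was to replace either square bracket (an assumption about
the intent of the report), one way that follows from rule 9.3.5(1) is
to put the <right-square-bracket> first in the list. A minimal sketch,
with an illustrative variable name:

```shell
# "]" comes first in the list and is literal, "[" is literal inside a
# bracket-expression, and the final "]" terminates the expression.
both=$(printf '%s\n' 'object[0])' | sed 's/[][]/./g')
echo "$both"       # object.0.)
```

No backslashes are needed. Inside a bracket-expression a backslash
would only add a literal <backslash> to the set of matched characters.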

As such, I'm closing this as "not a bug".
Discussion can continue by replying to this thread.

regards,
 - assaf





This bug report was last modified 5 years and 78 days ago.


