Hi Bob,

Thank you for the detailed explanation. That was so helpful.

Best,
John Lin

林自均 wrote on Tue, Mar 28, 2017 at 10:47 PM:

> Hi Bob,
>
> Thank you for the detailed explanation. That was so helpful.
>
> Best,
> John Lin
>
> Bob Proulx wrote on Thu, Feb 16, 2017 at 5:17 PM:
>
> 林自均 wrote:
> > I want to remove the square brackets in a string:
> >
> >   $ echo '[1,2,3]' | sed 's/\[//g' | sed 's/\]//g'
> >   1,2,3
> >
> > And it works.
>
> Yes. But the above isn't strictly correct regular expression usage.
> Let's discuss it piece by piece.
>
>   echo '[1,2,3]' |
>
> Okay. Good test pattern.
>
>   sed 's/\[//g' |
>
> Okay. Since the [ would start a character class and you want it to
> match itself, it needs to be escaped.
>
>   sed 's/\]//g'
>
> This is not strictly correct. You have escaped the ] with \]. But
> that is not needed. The ] does not do anything special in that
> context. It ends a character class started by a [ but outside of that
> it is simply a normal character. Escaping it as \] defaults to being
> just a ] character. But it is a bad habit to get into because
> escaping other characters such as \+ turns on ERE handling. Your
> expression should be the following instead.
>
>   sed 's/]//g'
>
> Those two could be combined into one sed command.
>
>   echo '[1,2,3]' | sed -e 's/\[//g' -e 's/]//g'
>   1,2,3
>
> Or by a combined string split by the ';' separator.
>
>   echo '[1,2,3]' | sed 's/\[//g;s/]//g'
>   1,2,3
>
> I tend to prefer the latter. But either is fine.
>
> > However, when I want to do it in a single sed, it does not work:
> >
> >   $ echo '[1,2,3]' | sed 's/[\[\]]//g'
> >   [1,2,3]
>
> That is incorrect usage. The above is behaving correctly. Do not
> escape characters inside of [...] character classes.
>
> You are starting a character class to match any of the enclosed
> characters. That is good. But then it is broken by escaping the
> characters inside the character class. Do not escape them.
> Inside of a character class there is nothing special about those
> characters because the class turns off special characters. Therefore
> trying to escape them is wrong. That is the problem.
>
> Please review the documentation on regular expressions here:
>
>   https://www.gnu.org/software/sed/manual/html_node/Character-Classes-and-Bracket-Expressions.html#Character-Classes-and-Bracket-Expressions
>
>   Most meta-characters lose their special meaning inside bracket
>   expressions:
>
>     ‘]’ ends the bracket expression if it’s not the first list
>     item. So, if you want to make the ‘]’ character a list item,
>     you must put it first.
>
> Therefore you must start the character class, then immediately put in
> the ] to match itself literally. It does not end the character class
> since an empty class wouldn't make sense.
>
>   [ -- start of the character class
>   ] -- match a literal ]
>   [ -- match a literal [
>   ] -- end the class
>
> Here is the working example:
>
>   echo '[1,2,3]' | sed 's/[][]//g'
>   1,2,3
>
> > I can manage to make it work by a weird regexp:
> >
> >   $ echo '[1,2,3]' | sed 's/[]\[]//g'
> >   1,2,3
>
> That is also incorrect usage. You have added an additional \ into the
> class. You thought you were escaping the [ but since it is already
> inside a bracket expression, the \ was simply a normal character and
> matched itself.
>
>   echo '[1,2,3]\1\2\3'
>   [1,2,3]\1\2\3
>   echo '[1,2,3]\1\2\3' | sed 's/[]\[]//g'
>   1,2,3123
>   echo '[1,2,3]\1\2\3' | sed 's/[][]//g'
>   1,2,3\1\2\3
>
> As you can see, including the \ also removed the \ characters,
> because \ was included as part of the character class.
>
> > Is that a bug? If it is, I would like to spend some time to fix it.
>
> It is not a bug. It is incorrect usage. I will close the ticket.
> But please let us know if this makes sense to you. Feel free to
> continue the discussion.
>
> Bob
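
For anyone who wants to reproduce the three cases from this thread side by side, here is a small sketch. The function names are made up for illustration and are not from the thread. It uses printf instead of echo as a precaution, since some shells' echo interprets backslash sequences such as \1 in the test string.

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# ']' comes first inside the bracket expression, so it is a literal
# member of the set; the set is { ']' , '[' }.
strip_brackets() {
    printf '%s\n' "$1" | sed 's/[][]//g'
}

# The "weird" variant: inside a bracket expression '\' is an ordinary
# character, so the set becomes { ']' , '\' , '[' }.
strip_brackets_and_backslashes() {
    printf '%s\n' "$1" | sed 's/[]\[]//g'
}

# The escaped attempt: the bracket expression ends at the first ']',
# giving the set { '\' , '[' } followed by a literal ']'.
escaped_attempt() {
    printf '%s\n' "$1" | sed 's/[\[\]]//g'
}

strip_brackets '[1,2,3]\1\2\3'                  # 1,2,3\1\2\3
strip_brackets_and_backslashes '[1,2,3]\1\2\3'  # 1,2,3123
escaped_attempt '[1,2,3]'                       # [1,2,3]
```

The outputs in the comments are the ones already shown in the thread. The third case leaves the input untouched because nothing in '[1,2,3]' is a '\' or '[' immediately followed by ']'.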