From unknown Wed Jun 18 00:23:56 2025
From: bug#25750 <25750@debbugs.gnu.org>
To: bug#25750 <25750@debbugs.gnu.org>
Subject: Status: [sed] Matching square brackets
Reply-To: bug#25750 <25750@debbugs.gnu.org>
Date: Wed, 18 Jun 2025 07:23:56 +0000

retitle 25750 [sed] Matching square brackets
reassign 25750 sed
submitter 25750 林自均
severity 25750 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 16 02:15:21 2017
From: 林自均
Date: Thu, 16 Feb 2017 05:11:18 +0000
Subject: [sed] Matching square brackets
To: bug-sed@gnu.org, jim@meyering.net, agn@gnu.org

Hi sed maintainers,

I want to remove the square brackets in a string:

$ echo '[1,2,3]' | sed 's/\[//g' | sed 's/\]//g'
1,2,3

And it works.

However, when I want to do it in a single sed, it does not work:

$ echo '[1,2,3]' | sed 's/[\[\]]//g'
[1,2,3]

I can manage to make it work by a weird regexp:

$ echo '[1,2,3]' | sed 's/[]\[]//g'
1,2,3

Is that a bug? If it is, I would like to spend some time to fix it.

Thanks for reading this email.

Best,
John Lin

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 16 04:17:08 2017
Date: Thu, 16 Feb 2017 02:16:58 -0700
From: Bob Proulx
To: 林自均
Cc: 25750@debbugs.gnu.org
Subject: Re: bug#25750: [sed] Matching square brackets
Message-ID: <20170216015435719986078@bob.proulx.com>

林自均 wrote:
> I want to remove the square brackets in a string:
>
> $ echo '[1,2,3]' | sed 's/\[//g' | sed 's/\]//g'
> 1,2,3
>
> And it works.

Yes.  But the above isn't strictly correct regular expression usage.
Let's discuss it piece by piece.

  echo '[1,2,3]' |

Okay.  Good test pattern.

  sed 's/\[//g' |

Okay.  Since the [ would start a character class and you want it to
match itself it needs to be escaped.

  sed 's/\]//g'

This is not strictly correct.  You have escaped the ] with \].  But
that is not needed.  The ] does not do anything special in that
context.  It ends a character class started by a [ but outside of that
it is simply a normal character.  An escaped \] is treated as just a
] character.  But it is a bad habit to get into because escaping other
characters gives them a special meaning (in GNU sed, \+ means "one or
more", like the ERE + operator).  Your expression should be the
following instead.

  sed 's/]//g'

Those two could be combined into one sed command.

  echo '[1,2,3]' | sed -e 's/\[//g' -e 's/]//g'
    1,2,3

Or by a combined string split by the ';' separator.

  echo '[1,2,3]' | sed 's/\[//g;s/]//g'
    1,2,3

I tend to prefer the latter.  But either is fine.

> However, when I want to do it in a single sed, it does not work:
>
> $ echo '[1,2,3]' | sed 's/[\[\]]//g'
> [1,2,3]

That is incorrect usage.  Do not escape characters inside of [...]
character classes.  The above is behaving correctly.

You are starting a character class to match any of the enclosed
characters.  That is good.  But then it is broken by escaping the
characters inside the character class.  Do not escape them.  Inside of
a character class there is nothing special about those characters
because the class turns off special characters.  Therefore trying to
escape them is wrong.  That is the problem.

Please review the documentation on regular expressions here:

  https://www.gnu.org/software/sed/manual/html_node/Character-Classes-and-Bracket-Expressions.html#Character-Classes-and-Bracket-Expressions

  Most meta-characters lose their special meaning inside bracket
  expressions:

  ']'  ends the bracket expression if it’s not the first list
       item. So, if you want to make the ‘]’ character a list item,
       you must put it first.
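To make the quoted rule concrete, here is a minimal sketch of how the
failing pattern is actually parsed.  Inside the brackets the \ is an
ordinary character, so [\[\] is already a complete list containing \
and [, and the final ] is a separate literal that has to follow it.
The function name and the second sample input are illustrative
assumptions, not part of the original report.

```shell
# Hypothetical helper: apply the pattern from the report to each input line.
# The regex parses as the list [\[\] (matching \ or [) followed by a literal ].
apply_escaped_class() {
  sed 's/[\[\]]//g'
}

# No "[" or "\" is directly followed by "]" here, so nothing is removed.
printf '%s\n' '[1,2,3]' | apply_escaped_class

# Here "[]" and "\]" both fit the pattern and are deleted, leaving "abc".
printf '%s\n' 'a[]b\]c' | apply_escaped_class
```

This is the quoted rule at work: the first ] after the opening [
closes the list unless it is the first list item.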
Therefore you must start the character class, then immediately put in
the ] to match itself literally.  It does not end the character class
since an empty class wouldn't make sense.

  [  -- start of the character class
  ]  -- match a literal ]
  [  -- match a literal [
  ]  -- end the class

Here is the working example:

  echo '[1,2,3]' | sed 's/[][]//g'
    1,2,3

> I can manage to make it work by a weird regexp:
>
> $ echo '[1,2,3]' | sed 's/[]\[]//g'
> 1,2,3

That is also incorrect usage.  You have added an additional \ into the
class.  You thought you were escaping the [ but since it is already
inside a bracket expression the \ was simply a normal character and
matched itself.

  echo '[1,2,3]\1\2\3'
  [1,2,3]\1\2\3
  echo '[1,2,3]\1\2\3' | sed 's/[]\[]//g'
  1,2,3123
  echo '[1,2,3]\1\2\3' | sed 's/[][]//g'
  1,2,3\1\2\3

As you can see, including the \ in the class also removed the \
characters, because \ was part of the character class.

> Is that a bug? If it is, I would like to spend some time to fix it.

It is not a bug.  It is incorrect usage.  I will close the ticket.
But please let us know if this makes sense to you.  Feel free to
continue the discussion.
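The points above can be collected into a small sketch.  The function
names strip_brackets and strip_brackets_and_backslashes are
hypothetical, chosen only for illustration.  printf is used instead of
echo on the assumption that the script may run under a shell whose
echo interprets backslashes.

```shell
# Hypothetical helper: delete every [ and ] from standard input.
# "]" comes first in the list so it is literal; "[" needs no escaping there.
strip_brackets() {
  sed 's/[][]//g'
}

# Hypothetical helper showing the side effect of the extra backslash:
# the list now holds "]", "\" and "[", so backslashes are deleted too.
strip_brackets_and_backslashes() {
  sed 's/[]\[]//g'
}

printf '%s\n' '[1,2,3]\1\2\3' | strip_brackets                  # prints 1,2,3\1\2\3
printf '%s\n' '[1,2,3]\1\2\3' | strip_brackets_and_backslashes  # prints 1,2,3123
```

The order inside the list matters: writing it as [[]] would instead
match a [ followed by a ], because the first ] would close the list.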

Bob

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 16 04:17:33 2017
Date: Thu, 16 Feb 2017 02:17:26 -0700
From: Bob Proulx
To: control@debbugs.gnu.org
Subject: close 25750
Message-ID: <20170216021549267132754@bob.proulx.com>

close 25750
thanks

Explanation already sent.

From unknown Wed Jun 18 00:23:56 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Thu, 16 Mar 2017 11:24:03 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator

From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 28 10:51:28 2017
From: 林自均
Date: Tue, 28 Mar 2017 14:51:10 +0000
Subject:
To: control@debbugs.gnu.org

unarchive 25750

From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 28 10:52:54 2017
References: <20170216015435719986078@bob.proulx.com>
From: 林自均
Date: Tue, 28 Mar 2017 14:52:35 +0000
Subject: Re: bug#25750: [sed] Matching square brackets
To: Bob Proulx
Cc: 25750@debbugs.gnu.org

Hi Bob,

Thank you for the detailed explanation. That was so helpful.

Best,
John Lin

From unknown Wed Jun 18 00:23:56 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Wed, 26 Apr 2017 11:24:04 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator