bug#56018: sed bug when using extended regular expressions and [] with backslash character class identifiers.
Package: sed
Severity: normal
Submitter: Bob Power

From: Bob Power
To: bug-sed@gnu.org
Date: Thu, 16 Jun 2022 10:42:55 +0000 (UTC)
Subject: sed bug when using extended regular expressions and [] with backslash character class identifiers.
Message-ID: <47557494.180545.1655376175457@mail.yahoo.com>

# sed --version
sed (GNU sed) 4.5
Copyright (C) 2018 Free Software Foundation, Inc.

Given \s = whitespace, [\s] should also match whitespace.

We should get the same results if we use [\s] in place of \s, but we don't...

test 1: replace all whitespace sequences with xx using \s - OK, works as expected:
echo 'A  BC  D' | sed -E 's/\s+/xx/g'
AxxBCxxD

test 2: replace all whitespace sequences with xx using [\s] - fails, not as expected; the output should be the same as in test 1:
echo 'A  BC  C' | sed -E 's/[\s]+/xx/g'
A  BC  C
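
For comparison, here is a minimal sketch of the same substitution written with the POSIX character class [[:space:]], which is the form defined for use inside a bracket expression. The helper name squeeze_ws is made up for illustration and is not part of sed.

```shell
# Hypothetical helper: collapse each run of whitespace into "xx".
# [[:space:]] is the POSIX class name, usable inside a bracket expression.
squeeze_ws() {
    printf '%s\n' "$1" | sed -E 's/[[:space:]]+/xx/g'
}

squeeze_ws 'A  BC  C'   # prints AxxBCxxC
```

With this spelling the result has the same shape as the test 1 output.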

After some experimenting, it seems that inside [] sed treats every \ as a literal \ character rather than as part of a class identifier:

echo 'A  B\C  Csstt' | sed -E 's/[\s]+/xx/g'
A  BxxC  Cxxtt
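
The observation above can also be checked one character at a time. This is only a sketch: probe is a hypothetical helper, and the results noted in the comments are what the behaviour described in this report implies.

```shell
# Hypothetical helper: feed a single candidate string to sed and replace
# the first character matched by the bracket expression [\s] with "xx".
probe() {
    printf '%s\n' "$1" | sed -E 's/[\s]/xx/'
}

probe '\'   # literal backslash: prints xx
probe 's'   # letter s: prints xx
probe ' '   # space: printed unchanged
```

If [\s] were treated as a whitespace class, only the last call would print xx.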

