From unknown Sun Aug 17 01:43:48 2025
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.509 (Entity 5.509)
Content-Type: text/plain; charset=utf-8
From: bug#33793 <33793@debbugs.gnu.org>
To: bug#33793 <33793@debbugs.gnu.org>
Subject: Status: sed bug with regular expressions
Reply-To: bug#33793 <33793@debbugs.gnu.org>
Date: Sun, 17 Aug 2025 08:43:48 +0000

retitle 33793 sed bug with regular expressions
reassign 33793 sed
submitter 33793 Uladzimir Panasiuk
severity 33793 normal
tag 33793 notabug
thanks

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 18 12:18:44 2018
Received: (at submit) by debbugs.gnu.org; 18 Dec 2018 17:18:44 +0000
MIME-Version: 1.0
From: Uladzimir Panasiuk
Date: Tue, 18 Dec 2018 15:50:49 +0300
Subject: sed bug with regular expressions
To: bug-sed@gnu.org
Content-Type: multipart/alternative; boundary="0000000000007828a4057d4b5982"

--0000000000007828a4057d4b5982
Content-Type: text/plain; charset="UTF-8"

Hi. I've found the bug using sed. Here is how to reproduce:
1) Run bash
2) Exec command \
echo weather -5.0 | sed 's/[^0-9\-\.]//g'
3) You will get "5.0". Expected output is "-5.0"

BUT
If you exec
echo weather -5.0 | sed 's/[^0-9\.\-]//g'
you'll get the correct output "-5.0".

I am using GNU sed version 4.5 on Manjaro Linux.

--0000000000007828a4057d4b5982
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000007828a4057d4b5982--

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 18 13:23:26 2018
Received: (at control) by debbugs.gnu.org; 18 Dec 2018 18:23:26 +0000
Subject: Re: bug#33793: sed bug with regular expressions
To: Uladzimir Panasiuk, 33793-done@debbugs.gnu.org, GNU bug control
From: Eric Blake
Organization: Red Hat, Inc.
Message-ID: <1e16a005-8beb-8b86-01b8-5fabb4da6d33@redhat.com>
Date: Tue, 18 Dec 2018 12:23:16 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit

tag 33793 notabug
thanks

On 12/18/18 6:50 AM, Uladzimir Panasiuk wrote:
> Hi. I've found the bug using sed. Here is how to reproduce:
> 1) Run bash
> 2) Exec command \
> echo weather -5.0 | sed 's/[^0-9\-\.]//g'

You used two range expressions in this regex, but the result is the same as if you had used this regex with only one range expression:

's/[^0-9\.]//g'

Either way, you asked sed to delete every character except the 10 digits, a literal backslash, and a literal dot. Remember, the range expression \-\ runs from backslash to backslash, so it selects the single character backslash. Since '-' is not among the characters listed in the negated [^] expression, sed correctly strips it.

> 3) You will get "5.0". Expected output is "-5.0"

You might be remembering the behavior of perl regex, where \ inside [] is an escape character. But that's not how POSIX regex behaves - inside [], \ is literal, and there are no escape characters.

>
> BUT
> If you exec
> echo weather -5.0 | sed 's/[^0-9\.\-]//g'

Here, your regex only has one range expression, but lists \ twice. The repetition is harmless, but means that your expression is the same as this shorter one:

's/[^0-9\.-]//g'

It is not obvious from your input whether you intended to keep literal backslashes or not, but if not, you probably meant to write:

's/[^0-9.-]//g'

with no backslash, and with the - last (as that is one of the few places that you can write - to be matched as itself rather than treated as a range operator between neighboring characters).

I'm closing this as not a bug, but feel free to reply with further questions or comments.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

From unknown Sun Aug 17 01:43:48 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Wed, 16 Jan 2019 12:24:06 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
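
The bracket-expression behavior described in the reply above can be checked from a shell. What follows is a minimal sketch, not part of the original thread: the helper names strip_a, strip_b and strip_fixed are hypothetical, and it assumes a POSIX-conforming sed such as the GNU sed 4.5 named in the report. The expected outputs for the first two commands are the ones given in the report itself.

```shell
#!/bin/sh
# Sketch only: strip_a, strip_b and strip_fixed are hypothetical helper names.

# First pattern from the report. The set is 0-9, the range \-\ (backslash
# only), and a dot. The hyphen is not in the set, so the negated bracket
# matches it and it is deleted.
strip_a() { sed 's/[^0-9\-\.]//g'; }

# Second pattern from the report. The set is 0-9, backslash (listed twice),
# a dot, and a trailing hyphen, which is literal because it comes last.
strip_b() { sed 's/[^0-9\.\-]//g'; }

# Pattern suggested in the reply: no backslashes, hyphen last.
strip_fixed() { sed 's/[^0-9.-]//g'; }

echo 'weather -5.0' | strip_a       # prints 5.0
echo 'weather -5.0' | strip_b       # prints -5.0
echo 'weather -5.0' | strip_fixed   # prints -5.0

# A backslash in the input shows the difference between the last two
# patterns: strip_b keeps it, strip_fixed removes it.
printf '%s\n' 'a\b -5.0' | strip_b      # prints \-5.0
printf '%s\n' 'a\b -5.0' | strip_fixed  # prints -5.0
```

The last two commands illustrate why the reply recommends dropping the backslashes: inside [] they are ordinary characters, so they only matter when the input itself may contain a backslash.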