From unknown Fri Aug 15 04:04:44 2025
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
From: bug#22237 <22237@debbugs.gnu.org>
To: bug#22237 <22237@debbugs.gnu.org>
Subject: Status: sed no longer removes high-ascii characters as it did formerly.
Reply-To: bug#22237 <22237@debbugs.gnu.org>
Date: Fri, 15 Aug 2025 11:04:44 +0000

retitle 22237 sed no longer removes high-ascii characters as it did formerly.
reassign 22237 sed
submitter 22237 Brian Tew
severity 22237 normal
tag 22237 notabug
thanks

From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 25 13:51:39 2015
To: bug-sed@gnu.org
From: Brian Tew
User-Agent: edbrowse/3.4.4
Subject: sed no longer removes high-ascii characters as it did formerly.
Date: Fri, 25 Dec 2015 06:21:41 -0600
Message-ID: <20151125062141.montanalag@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Reply-To: Brian Tew

Well, sometimes it do and sometimes it don't.

Script started on Fri 25 Dec 2015 05:53:04 AM CS
~$ed sample
50
l
subject now that thanksgiving has come and gone\342\246$
q
~$
~$sed -i 's/[^a-z 0-9]//g' sample
~$ed sample
50
l
subject now that thanksgiving has come and gone\342\246$
q
~$
~$unsed --version
sed (GNU sed) 4.2.2
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Jay Fenlason, Tom Lord, Ken Pizzini,
and Paolo Bonzini.
GNU sed home page: .
General help using GNU software: .
E-mail bug reports to: .
Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
~$exit

Script done on Fri 25 Dec 2015 05:59:12 AM CS

From debbugs-submit-bounces@debbugs.gnu.org Sat Dec 26 16:19:34 2015
MIME-Version: 1.0
In-Reply-To: <20151125062141.montanalag@gmail.com>
References: <20151125062141.montanalag@gmail.com>
From: Jim Meyering
Date: Sat, 26 Dec 2015 13:19:07 -0800
Subject: Re: bug#22237: sed no longer removes high-ascii characters as it did formerly.
To: Brian Tew
Cc: 22237-done@debbugs.gnu.org
Content-Type: text/plain; charset=UTF-8

On Fri, Dec 25, 2015 at 4:21 AM, Brian Tew wrote:
> Well, sometimes it do and sometimes it don't.
>
> Script started on Fri 25 Dec 2015 05:53:04 AM CS
> ~$ed sample
> 50
> l
> subject now that thanksgiving has come and gone\342\246$
> q
> ~$
> ~$sed -i 's/[^a-z 0-9]//g' sample

To remove all but the matched bytes, you probably want something
like this instead:

  LC_ALL=C sed -i 's/[^[:alnum:] ]//g' sample

Note I've done two things: used LC_ALL=C to override your
default locale (probably a UTF-8 one), and used [:alnum:]
in place of that nonportable a-z range and 0-9.

In general, with UTF-8-based locales, a byte sequence like your
\342\246 will match no regular expression, since it is not a valid
UTF-8 character.

What probably changed is that older versions of sed did not
properly handle multi-byte locales, or your other experience
was using a single-byte locale.

If you still think there is a problem with sed-4.2.2, please provide
more detail and I'll reopen this issue.

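A minimal sketch of the suggestion above, assuming a POSIX shell with
mktemp and a sed that honors LC_ALL. The thread does not include the
reporter's file, so the sketch rebuilds a similar one with printf; the
temporary file and the variable names (tmp, cleaned, before, after) are
made up for illustration. It uses the g flag so that every offending
byte on the line is removed, and it writes to standard output rather
than editing in place.

```shell
# Recreate a file like the one in the report: ASCII text followed by
# the two stray bytes \342\246, which do not form a complete UTF-8 character.
tmp=$(mktemp)
printf 'subject now that thanksgiving has come and gone\342\246\n' > "$tmp"

# In the C locale every byte is its own character, so the bracket
# expression can match the high bytes and the substitution deletes them.
cleaned=$(LC_ALL=C sed 's/[^[:alnum:] ]//g' "$tmp")
printf '%s\n' "$cleaned"
# prints: subject now that thanksgiving has come and gone

# Compare byte counts before and after (trailing newline included).
before=$(wc -c < "$tmp" | tr -d ' ')
after=$(printf '%s\n' "$cleaned" | wc -c | tr -d ' ')
echo "bytes: $before -> $after"
# prints: bytes: 50 -> 48

rm -f "$tmp"
```

The "before" count of 50 agrees with the size ed reports in the
transcript. Note that [:alnum:] also keeps uppercase letters, which the
a-z range in the original command was presumably meant to drop; whether
that matters depends on the input.
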
From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 27 13:25:00 2015
MIME-Version: 1.0
From: Jim Meyering
Date: Sun, 27 Dec 2015 10:24:33 -0800
Subject:
To: GNU bug tracker automated control server
Content-Type: text/plain; charset=UTF-8

tags 22237 notabug

From unknown Fri Aug 15 04:04:44 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Mon, 25 Jan 2016 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator