From unknown Sun Jun 22 09:57:06 2025
From: bug#19985 <19985@debbugs.gnu.org>
To: bug#19985 <19985@debbugs.gnu.org>
Subject: Status: active locale impacts binary data detection?
Reply-To: bug#19985 <19985@debbugs.gnu.org>
Date: Sun, 22 Jun 2025 16:57:06 +0000

retitle 19985 active locale impacts binary data detection?
reassign 19985 grep
submitter 19985 Mike Frysinger
severity 19985 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 02 21:00:07 2015
Date: Mon, 2 Mar 2015 20:59:51 -0500
From: Mike Frysinger
To: bug-grep@gnu.org
Cc: proteuss@sdf.lonestar.org
Subject: active locale impacts binary data detection?
Message-ID: <20150303015951.GN24238@vapier>
Mail-Followup-To: bug-grep@gnu.org, proteuss@sdf.lonestar.org
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="r/w8vo2lxBmCPGjQ"
Content-Disposition: inline

--r/w8vo2lxBmCPGjQ
Content-Type: multipart/mixed; boundary="NT59pYSnj1ZLVgEN"
Content-Disposition: inline

--NT59pYSnj1ZLVgEN
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

i've got some users reporting diff behavior between 2.20 and 2.21. the example
file is attached and has mixed encoding. i think the new behavior is correct,
but want to make sure it's expected, or that the maintainers don't have
different ideas here.

with 2.20:
$ LC_ALL=en_US.UTF8 grep 8 test-mixed
852 cd ΘΕΜΑΤΑ\ ΠΑΝΕΛΛΗΝΙΩΝ/

with 2.21:
$ LC_ALL=en_US.UTF8 grep 8 test-mixed
Binary file test-mixed matches
$ LC_ALL=en_US.UTF8 grep -a 8 test-mixed
852 cd ΘΕΜΑΤΑ\ ΠΑΝΕΛΛΗΝΙΩΝ/
$ LC_ALL=C grep 8 test-mixed
852 cd ΘΕΜΑΤΑ\ ΠΑΝΕΛΛΗΝΙΩΝ/
-mike

--NT59pYSnj1ZLVgEN
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="test-mixed.gz"
Content-Transfer-Encoding: base64

H4sICCIV9VQAA3Rlc3QtbWl4ZWQAU1AwszRQUEhNzshXuPDww8OPDz8p1CgUp6Yo6BYpqBfr
x9jo6cfkqOmnq3MpKFiYGikoJKcoKJybcW7quTnnJp5bcm5ijMK5BUDWXKDIbCCcDmTNPLfy
3Fx9LgCmzIJ8XAAAAA==

--NT59pYSnj1ZLVgEN--

--r/w8vo2lxBmCPGjQ
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJU9RWXAAoJEEFjO5/oN/WB+y8P/jn8tUofIJZLQjQRXJQI1Lw9
Ww+SHsY4odhYXauJwKPXSeAgPM/R+2UOOaPvA1tkGj5y9p4973SCyJpDPbVy2O2g
virGUAbZJeiUenYNPcSqzfQ6yqXS73BfW4ViOzNUK/829tJ4dSwIWmKaarme9wkH
5kdZWous/KemlunrW0vSlLfmtAJ1VUI4dWvqFpb18TVnyxFJYIqawE5vZBN3YWx3
RTi3rxLtocH4RWUrRScM8ApEnkZ+bTi/Pyi34G4llYPLFpBDhxRSiBKFo0YsDbbf
X8sb5OPUzC6gqjrMDA0LWNv92BlJyAeJNVhnvu/kvub1+IgMtp5/dOXP2i9eBHgL
yKaYGUdMeueeqZiNntO6tPOsbdRVcrPe8buZdhCIqRXHP08ker6oezK4io0n2Fwe
olecqL/83Cd+q2SV1BXenzfBtPAzhQ+Ucj5OafAcZ+/NXorpW0apbICUf2HTlzY7
0Px5UtiTOGIMq1JLDet29mkw3FVjFeGrkjwtHd0VCbjCP8vW+La/nhM/prwAyhTy
Ri68EQD7x4PwdLHgHTBQJnUgj0EQdBa4z0JoFXjc3e8qxPP9vrMqCQH9wXQn/Oav
MqLWxUdNifl6IvmILos0zjy6lCmm2LVPVv5pkYVZgD41D+tkECsDEmQsDvVtC+3d
Tl1wD/tZrWXSeb87rVAu
=+3Ap
-----END PGP SIGNATURE-----

--r/w8vo2lxBmCPGjQ--

From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 02 21:31:42 2015
Message-ID: <54F51CFF.2080101@cs.ucla.edu>
Date: Mon, 02 Mar 2015 18:31:27 -0800
From: Paul Eggert
Organization: UCLA Computer Science Department
To: bug-grep@gnu.org, proteuss@sdf.lonestar.org
Subject: Re: bug#19985: active locale impacts binary data detection?
References: <20150303015951.GN24238@vapier>
In-Reply-To: <20150303015951.GN24238@vapier>
Content-Type: text/plain; charset=utf-8; format=flowed

The new behavior is expected, and this is mentioned in the NEWS file:

If a file contains data improperly encoded for the current locale,
and this is discovered before any of the file's contents are output,
grep now treats the file as binary.

In some cases one can get the old behavior with 'grep -a'.

This is not the first time the problem has been reported. Please see:

http://bugs.gnu.org/19230

If the problem occurs often enough, perhaps we should change grep's
behavior. For example, perhaps grep should fall back on the C locale if
the first part of a file contains an encoding error but no NUL bytes.

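The two policies discussed above (the 2.21 behavior described in NEWS, and the suggested fallback to the C locale) can be compared with a small Python model. This is only a sketch under stated assumptions, not grep's actual code: the function names, the PREFIX_SIZE value, and the choice of UTF-8 as "the current locale's encoding" are all hypothetical, and the sample bytes are made up rather than taken from the attached test-mixed file.

```python
import codecs

# Hypothetical size of the "first part of a file" that gets inspected.
PREFIX_SIZE = 32768


def has_encoding_error(chunk: bytes, encoding: str = "utf-8") -> bool:
    """Return True if chunk contains bytes that are invalid in encoding.

    An incremental decoder with final=False is used so that a multibyte
    character cut off at the end of the chunk is not counted as an error.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    try:
        decoder.decode(chunk, final=False)
    except UnicodeDecodeError:
        return True
    return False


def classify_2_21(data: bytes, encoding: str = "utf-8") -> str:
    """Simplified model of the 2.21 behavior: NUL bytes or an encoding
    error in the inspected prefix make the file 'binary'."""
    prefix = data[:PREFIX_SIZE]
    if b"\0" in prefix or has_encoding_error(prefix, encoding):
        return "binary"
    return "text"


def classify_proposed(data: bytes, encoding: str = "utf-8") -> str:
    """Simplified model of the suggested change: an encoding error
    without NUL bytes falls back to the C locale instead of 'binary'."""
    prefix = data[:PREFIX_SIZE]
    if b"\0" in prefix:
        return "binary"
    if has_encoding_error(prefix, encoding):
        return "text-c-locale"
    return "text"


# Made-up sample: one valid UTF-8 line followed by a line of bytes that
# are not valid UTF-8 (0xC8 is not followed by a continuation byte).
SAMPLE = "852 cd ΘΕΜΑΤΑ\n".encode("utf-8") + b"\xc8\xc5\xcc\xc1\n"

print(classify_2_21(SAMPLE))
print(classify_proposed(SAMPLE))
```

In this model the sample is reported as binary under the 2.21 rule and as text in the C locale under the suggested rule, which mirrors the difference between the plain 'grep' and 'LC_ALL=C grep' runs in the original report.
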
From debbugs-submit-bounces@debbugs.gnu.org Sat May 30 16:04:40 2015
Message-ID: <556A17D0.4000303@cs.ucla.edu>
Date: Sat, 30 May 2015 13:04:32 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: control@debbugs.gnu.org
Subject: grep bug maintainance
Content-Type: text/plain; charset=utf-8; format=flowed

tag 20605 notabug
close 20605
severity 20657 wishlist
tag 20638 notabug
close 20638
merge 20526 19985 19230
tag 19837 notabug
close 19837
merge 16444 19777
close 19563
close 19486
tag 19330 notabug
close 19330
tag 19193 notabug
close 19193
tag 19071 notabug
close 19071
tag 19005 notabug
close 19005
close 19000
tag 18888 notabug
close 18888

From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 25 14:04:27 2015
To: control@debbugs.gnu.org
From: Paul Eggert
Subject: merge 21558 into 20526
Organization: UCLA Computer Science Department
Message-ID: <56058CA3.2010804@cs.ucla.edu>
Date: Fri, 25 Sep 2015 11:04:19 -0700
Content-Type: text/plain; charset=utf-8; format=flowed

merge 20526 21558
thanks

From unknown Sun Jun 22 09:57:06 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 06 Feb 2016 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator