From unknown Fri Aug 15 20:50:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#51669: some patterns which should match 0x0 =?UTF-8?Q?don=E2=80=99t?= do so Resent-From: Christoph Anton Mitterer Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 07 Nov 2021 17:30:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 51669 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: 51669@debbugs.gnu.org X-Debbugs-Original-To: bug-grep@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.163630618023676 (code B ref -1); Sun, 07 Nov 2021 17:30:02 +0000 Received: (at submit) by debbugs.gnu.org; 7 Nov 2021 17:29:40 +0000 Received: from localhost ([127.0.0.1]:54563 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjlz9-00069o-Ud for submit@debbugs.gnu.org; Sun, 07 Nov 2021 12:29:40 -0500 Received: from lists.gnu.org ([209.51.188.17]:45252) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjlUQ-0005NM-2h for submit@debbugs.gnu.org; Sun, 07 Nov 2021 11:57:55 -0500 Received: from [2001:470:142:3::10] (port=46496 helo=eggs.gnu.org) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mjlUP-0004LU-QL for bug-grep@gnu.org; Sun, 07 Nov 2021 11:57:53 -0500 Received: from dragonfly.birch.relay.mailchannels.net ([23.83.209.51]:39842) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mjlUM-0007Z9-Bw for bug-grep@gnu.org; Sun, 07 Nov 2021 11:57:53 -0500 X-Sender-Id: instrampxe0y3a|x-authuser|calestyo@scientia.org Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 4F7199216BB for ; Sun, 7 Nov 2021 16:57:06 +0000 (UTC) Received: from cpanel-007-fra.hostingww.com (100-96-99-51.trex.outbound.svc.cluster.local [100.96.99.51]) (Authenticated sender: instrampxe0y3a) by relay.mailchannels.net (Postfix) with ESMTPA id 
19647921AB6 for ; Sun, 7 Nov 2021 16:57:04 +0000 (UTC) X-Sender-Id: instrampxe0y3a|x-authuser|calestyo@scientia.org Received: from cpanel-007-fra.hostingww.com (cpanel-007-fra.hostingww.com [3.69.87.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384) by 100.96.99.51 (trex/6.4.3); Sun, 07 Nov 2021 16:57:06 +0000 X-MC-Relay: Neutral X-MailChannels-SenderId: instrampxe0y3a|x-authuser|calestyo@scientia.org X-MailChannels-Auth-Id: instrampxe0y3a X-Befitting-Thread: 65a3d13e2cdcfa5d_1636304225810_3980150401 X-MC-Loop-Signature: 1636304225810:1965725044 X-MC-Ingress-Time: 1636304225809 Received: from ppp-46-244-253-194.dynamic.mnet-online.de ([46.244.253.194]:55584 helo=heisenberg.fritz.box) by cpanel-007-fra.hostingww.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1mjlTV-00CS45-JR for bug-grep@gnu.org; Sun, 07 Nov 2021 16:57:02 +0000 Message-ID: From: Christoph Anton Mitterer Date: Sun, 07 Nov 2021 17:56:56 +0100 Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.42.0-2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-OutGoing-Spam-Status: No, score=-1.0 X-AuthUser: calestyo@scientia.org Received-SPF: pass client-ip=23.83.209.51; envelope-from=calestyo@scientia.org; helo=dragonfly.birch.relay.mailchannels.net X-Spam_score_int: -14 X-Spam_score: -1.5 X-Spam_bar: - X-Spam_report: (-1.5 / 5.0 requ) BAYES_00=-1.9, HAS_X_OUTGOING_SPAM_STAT=0.378, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Mailman-Approved-At: Sun, 07 Nov 2021 12:29:38 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) Hey. 
Maybe this is not a bug at all, due to grep being focused on text
files and 0x0 being special anyway, but just for your information:

$ hd test-with-0x00-and-0x02
00000000  66 6f 6f 0a 62 61 72 0a  7a 65 02 00 0a 62 61 7a  |foo.bar.ze...baz|
00000010  0a 7a 65 72 00 0a 65 6e  64 0a                    |.zer..end.|
0000001a

If one now does:
$ grep '[^[:alnum:][:space:][:punct:]]' test-with-0x00-and-0x02
grep: test-with-0x00-and-0x02: binary file matches

it matches, presumably only the 0x02, though.

Having only 0x00 in the file:
$ hd test-with-0x00-only
00000000  66 6f 6f 0a 62 61 72 0a  7a 65 72 00 0a 62 61 7a  |foo.bar.zer..baz|
00000010  0a 7a 65 72 00 0a 65 6e  64 0a                    |.zer..end.|
0000001a

doesn’t cause a match:
$ grep '[^[:alnum:][:space:][:punct:]]' test-with-0x00-only
$

while naively I'd have assumed that 0x00 should be matched as well.

Cheers,
Chris.

From unknown Fri Aug 15 20:50:57 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Christoph Anton Mitterer Subject: bug#51669: closed (Re: bug#51669: some patterns which should match 0x0 =?UTF-8?Q?don=E2=80=99t?= do so) Message-ID: References: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> X-Gnu-PR-Message: they-closed 51669 X-Gnu-PR-Package: grep Reply-To: 51669@debbugs.gnu.org Date: Mon, 08 Nov 2021 00:38:01 +0000 Content-Type: multipart/mixed; boundary="----------=_1636331881-26982-1" This is a multi-part message in MIME format... ------------=_1636331881-26982-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Your bug report

#51669: some patterns which should match 0x0 don’t do so

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 51669@debbugs.gnu.org.
-- 
51669: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=51669
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1636331881-26982-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 51669-done) by debbugs.gnu.org; 8 Nov 2021 00:37:26 +0000 Received: from localhost ([127.0.0.1]:55323 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjsf8-00070P-9E for submit@debbugs.gnu.org; Sun, 07 Nov 2021 19:37:26 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:50074) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjsf6-00070C-MG for 51669-done@debbugs.gnu.org; Sun, 07 Nov 2021 19:37:25 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 320C01600BB; Sun, 7 Nov 2021 16:37:19 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id mkE1D--_reaa; Sun, 7 Nov 2021 16:37:18 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 5F5A61600C4; Sun, 7 Nov 2021 16:37:18 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id a0g2STwKjVAb; Sun, 7 Nov 2021 16:37:18 -0800 (PST) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3BC191600BB; Sun, 7 Nov 2021 16:37:18 -0800 (PST) Message-ID: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> Date: Sun, 7 Nov 2021 16:37:17 -0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.1 Subject: =?UTF-8?Q?Re=3a_bug=2351669=3a_some_patterns_which_should_match_0x0?= =?UTF-8?Q?_don=e2=80=99t_do_so?= Content-Language: en-US To: Christoph Anton
Mitterer References: From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 51669-done Cc: 51669-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

That's a feature not a bug; see:

https://www.gnu.org/software/grep/manual/html_node/File-and-Directory-Selection.html

and look for --binary-files. You can use 'grep -a' to pay more attention
to binary data.

------------=_1636331881-26982-1-- From unknown Fri Aug 15 20:50:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#51669: some patterns which should match 0x0 =?UTF-8?Q?don=E2=80=99t?= do so Resent-From: Christoph Anton Mitterer Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 08 Nov 2021 00:46:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 51669 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Paul Eggert Cc: 51669@debbugs.gnu.org Received: via spool by 51669-submit@debbugs.gnu.org id=B51669.163633232027678 (code B ref 51669); Mon, 08 Nov 2021 00:46:01 +0000 Received: (at 51669) by debbugs.gnu.org; 8 Nov 2021 00:45:20 +0000 Received: from localhost ([127.0.0.1]:55331 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjsmm-0007CM-3C for submit@debbugs.gnu.org; Sun, 07 Nov 2021 19:45:20 -0500 Received: from dog.elm.relay.mailchannels.net ([23.83.212.48]:63112) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjsmi-0007CB-Nl for 51669@debbugs.gnu.org; Sun, 07 Nov 2021 19:45:19 -0500 X-Sender-Id: instrampxe0y3a|x-authuser|calestyo@scientia.org Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 78DB7620B3C; Mon, 8 Nov 2021 00:45:15 +0000 (UTC) Received: from cpanel-007-fra.hostingww.com (unknown [127.0.0.6]) (Authenticated sender: instrampxe0y3a) by relay.mailchannels.net (Postfix) with ESMTPA id 521D9620541; Mon, 8 Nov 2021 00:45:14 +0000 (UTC) X-Sender-Id: instrampxe0y3a|x-authuser|calestyo@scientia.org Received: from cpanel-007-fra.hostingww.com (cpanel-007-fra.hostingww.com [3.69.87.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384) by 100.114.196.210 (trex/6.4.3); Mon, 08 Nov 2021 00:45:15 +0000 X-MC-Relay: Neutral X-MailChannels-SenderId: instrampxe0y3a|x-authuser|calestyo@scientia.org X-MailChannels-Auth-Id: instrampxe0y3a X-Squirrel-Lyrical: 
454d89a93f026ec5_1636332315150_3216985146 X-MC-Loop-Signature: 1636332315150:51492463 X-MC-Ingress-Time: 1636332315150 Received: from ppp-46-244-253-194.dynamic.mnet-online.de ([46.244.253.194]:55594 helo=heisenberg.fritz.box) by cpanel-007-fra.hostingww.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1mjsmZ-00DM6X-Tg; Mon, 08 Nov 2021 00:45:12 +0000 Message-ID: <027ae576523148e00a5643735180a2888a2a9f57.camel@scientia.org> From: Christoph Anton Mitterer Date: Mon, 08 Nov 2021 01:45:06 +0100 In-Reply-To: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> References: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.42.0-2 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-OutGoing-Spam-Status: No, score=-1.0 X-AuthUser: calestyo@scientia.org X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

On Sun, 2021-11-07 at 16:37 -0800, Paul Eggert wrote:
> https://www.gnu.org/software/grep/manual/html_node/File-and-Directory-Selection.html
>
> and look for --binary-files. You can use 'grep -a' to pay more attention
> to binary data.

Well, I had seen that, but why is 0x00 different from 0x02?
As shown in the example above, even *without* -a or similar, grep
detects 0x02. That's what feels a bit strange, IMO.

Cheers,
Chris.

From unknown Fri Aug 15 20:50:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#51669: some patterns which should match 0x0 =?UTF-8?Q?don=E2=80=99t?= do so Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 08 Nov 2021 07:21:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 51669 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Christoph Anton Mitterer Cc: 51669@debbugs.gnu.org Received: via spool by 51669-submit@debbugs.gnu.org id=B51669.16363560343624 (code B ref 51669); Mon, 08 Nov 2021 07:21:02 +0000 Received: (at 51669) by debbugs.gnu.org; 8 Nov 2021 07:20:34 +0000 Received: from localhost ([127.0.0.1]:55931 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjyxG-0000wO-KA for submit@debbugs.gnu.org; Mon, 08 Nov 2021 02:20:34 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:57256) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjyxE-0000wB-MP for 51669@debbugs.gnu.org; Mon, 08 Nov 2021 02:20:33 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 75A8C1600C4; Sun, 7 Nov 2021 23:20:26 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id IvnJQZjvpSi4; Sun, 7 Nov 2021 23:20:25 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 47AB01600FD; Sun, 7 Nov 2021 23:20:25 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 6KET-2gSC85U; Sun, 7 Nov 2021 23:20:25 -0800 (PST) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 225D81600C4; Sun, 7 Nov 2021 23:20:25 -0800 (PST) Message-ID: 
<32661210-2879-55ef-8801-882638e35180@cs.ucla.edu> Date: Sun, 7 Nov 2021 23:20:24 -0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.1 Content-Language: en-US References: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> <027ae576523148e00a5643735180a2888a2a9f57.camel@scientia.org> From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: <027ae576523148e00a5643735180a2888a2a9f57.camel@scientia.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.4 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

On 11/7/21 16:45, Christoph Anton Mitterer wrote:
> why is 0x00 different from 0x02?

POSIX says text files cannot contain NUL bytes. They can contain 0x02
bytes, though.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_403

More generally, in a POSIX system any method for deciding whether a file
is text vs binary is to some extent a heuristic, and there will always
be corner cases in any such heuristic.

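The POSIX definition linked above rules out NUL bytes in text files but has no objection to 0x02. A minimal sketch of that distinction, assuming a POSIX shell with printf, tr and wc. The first two files are rebuilt from the hexdumps in the original report; the helper name has_nul and the file test-with-0x02-only are hypothetical, added only for illustration. The check looks for NUL bytes only and is not how grep itself classifies files.

```shell
#!/bin/sh
# Recreate the two 26-byte files from the hexdumps in the report.
printf 'foo\nbar\nze\002\000\nbaz\nzer\000\nend\n' > test-with-0x00-and-0x02
printf 'foo\nbar\nzer\000\nbaz\nzer\000\nend\n'    > test-with-0x00-only
# Control file with 0x02 but no NUL (hypothetical, not from the report).
printf 'foo\nze\002\nend\n' > test-with-0x02-only

# has_nul FILE: succeed if FILE contains at least one NUL byte.
# It compares the byte count before and after deleting NULs with tr.
has_nul() {
    total=$(( $(wc -c < "$1") ))
    kept=$(( $(tr -d '\000' < "$1" | wc -c) ))
    [ "$total" -ne "$kept" ]
}

for f in test-with-0x00-and-0x02 test-with-0x00-only test-with-0x02-only; do
    if has_nul "$f"; then
        echo "$f: contains NUL, so not a POSIX text file"
    else
        echo "$f: no NUL bytes"
    fi
done
```

Both files from the report contain NUL, so neither counts as a text file under that definition; the 0x02 byte alone does not disqualify a file.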
From unknown Fri Aug 15 20:50:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#51669: some patterns which should match 0x0 =?UTF-8?Q?don=E2=80=99t?= do so Resent-From: Christoph Anton Mitterer Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 08 Nov 2021 15:01:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 51669 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Paul Eggert Cc: 51669@debbugs.gnu.org Received: via spool by 51669-submit@debbugs.gnu.org id=B51669.163638360924298 (code B ref 51669); Mon, 08 Nov 2021 15:01:01 +0000 Received: (at 51669) by debbugs.gnu.org; 8 Nov 2021 15:00:09 +0000 Received: from localhost ([127.0.0.1]:59087 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mk680-0006Jq-NL for submit@debbugs.gnu.org; Mon, 08 Nov 2021 10:00:08 -0500 Received: from cow.ash.relay.mailchannels.net ([23.83.222.41]:12504) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mk67w-0006Hp-RX for 51669@debbugs.gnu.org; Mon, 08 Nov 2021 10:00:07 -0500 X-Sender-Id: instrampxe0y3a|x-authuser|calestyo@scientia.org Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 807AF362632; Mon, 8 Nov 2021 15:00:00 +0000 (UTC) Received: from cpanel-007-fra.hostingww.com (100-96-3-13.trex.outbound.svc.cluster.local [100.96.3.13]) (Authenticated sender: instrampxe0y3a) by relay.mailchannels.net (Postfix) with ESMTPA id 4A95F36252C; Mon, 8 Nov 2021 14:59:58 +0000 (UTC) X-Sender-Id: instrampxe0y3a|x-authuser|calestyo@scientia.org Received: from cpanel-007-fra.hostingww.com (cpanel-007-fra.hostingww.com [3.69.87.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384) by 100.96.3.13 (trex/6.4.3); Mon, 08 Nov 2021 14:59:59 +0000 X-MC-Relay: Neutral X-MailChannels-SenderId: instrampxe0y3a|x-authuser|calestyo@scientia.org X-MailChannels-Auth-Id: instrampxe0y3a X-Tasty-Abortive: 
5089d6a32e9eca63_1636383599134_3449305653 X-MC-Loop-Signature: 1636383599134:3253219714 X-MC-Ingress-Time: 1636383599134 Received: from ppp-46-244-253-194.dynamic.mnet-online.de ([46.244.253.194]:58562 helo=heisenberg.fritz.box) by cpanel-007-fra.hostingww.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1mk67k-00EwiT-E3; Mon, 08 Nov 2021 14:59:56 +0000 Message-ID: <70885107f477b7b8486203fae7aa5b1b09caccf3.camel@scientia.org> From: Christoph Anton Mitterer Date: Mon, 08 Nov 2021 15:59:51 +0100 In-Reply-To: <32661210-2879-55ef-8801-882638e35180@cs.ucla.edu> References: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> <027ae576523148e00a5643735180a2888a2a9f57.camel@scientia.org> <32661210-2879-55ef-8801-882638e35180@cs.ucla.edu> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.42.0-2 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-OutGoing-Spam-Status: No, score=-1.0 X-AuthUser: calestyo@scientia.org X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

On Sun, 2021-11-07 at 23:20 -0800, Paul Eggert wrote:
> POSIX says text files cannot contain NUL bytes. They can contain 0x02
> bytes, though.

Well yes, that's clear... but at least the console output in the 0x02
case seems to imply that grep already considers it binary (and not a
text file).
So I thought it would make sense to do the same if it encounters 0x00.

Anyway... thanks for your help :-)

Cheers,
Chris.

From unknown Fri Aug 15 20:50:57 2025 X-Loop: help-debbugs@gnu.org Subject: bug#51669: some patterns which should match 0x0 =?UTF-8?Q?don=E2=80=99t?= do so Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 08 Nov 2021 19:06:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 51669 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Christoph Anton Mitterer Cc: 51669@debbugs.gnu.org Received: via spool by 51669-submit@debbugs.gnu.org id=B51669.163639835211067 (code B ref 51669); Mon, 08 Nov 2021 19:06:01 +0000 Received: (at 51669) by debbugs.gnu.org; 8 Nov 2021 19:05:52 +0000 Received: from localhost ([127.0.0.1]:59500 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mk9xn-0002sR-Lh for submit@debbugs.gnu.org; Mon, 08 Nov 2021 14:05:52 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:59226) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mk9xj-0002s9-JL for 51669@debbugs.gnu.org; Mon, 08 Nov 2021 14:05:51 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 7131C1600AE; Mon, 8 Nov 2021 11:05:41 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id gVVzpDdLnmXD; Mon, 8 Nov 2021 11:05:40 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id CE2BD1600FD; Mon, 8 Nov 2021 11:05:40 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Z8bsYpCvi1gu; Mon, 8 Nov 2021 11:05:40 -0800 (PST) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id A7B8E1600AE; Mon, 8 Nov 2021 11:05:40 -0800 (PST) Message-ID: 
<63fddb08-8724-60b7-11c6-ceb59a98feb5@cs.ucla.edu> Date: Mon, 8 Nov 2021 11:05:40 -0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.1 Content-Language: en-US References: <1158bc7b-f8d4-e692-1af0-1767edf7b630@cs.ucla.edu> <027ae576523148e00a5643735180a2888a2a9f57.camel@scientia.org> <32661210-2879-55ef-8801-882638e35180@cs.ucla.edu> <70885107f477b7b8486203fae7aa5b1b09caccf3.camel@scientia.org> From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: <70885107f477b7b8486203fae7aa5b1b09caccf3.camel@scientia.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.4 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

On 11/8/21 06:59, Christoph Anton Mitterer wrote:
> the console output in the 0x02
> case seems to imply that grep already considers it binary (and not text
> file).
> So I thought it would make sense to do the same if it encounters 0x00.

No, because grep is documented to treat 0x00 like newline in some cases,
such as the case you described. It does not treat 0x02 like newline.
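
One way to picture the answer above is to turn every NUL into a newline before searching. A rough sketch, assuming a POSIX shell with printf, tr and grep. The tr step is only a stand-in for the documented "treat 0x00 like newline" behaviour and is not grep's actual implementation; the function name count_after_nul_to_newline is hypothetical, and LC_ALL=C is an assumption made to keep the character classes predictable.

```shell
#!/bin/sh
# Recreate the two files from the hexdumps in the original report.
printf 'foo\nbar\nze\002\000\nbaz\nzer\000\nend\n' > test-with-0x00-and-0x02
printf 'foo\nbar\nzer\000\nbaz\nzer\000\nend\n'    > test-with-0x00-only

# The bracket expression used in the report.
pattern='[^[:alnum:][:space:][:punct:]]'

# count_after_nul_to_newline FILE: replace each NUL with a newline, then
# count the lines that still match the pattern. "|| true" keeps the
# function from failing when grep finds nothing (exit status 1).
count_after_nul_to_newline() {
    tr '\000' '\n' < "$1" | LC_ALL=C grep -c "$pattern" || true
}

echo "0x00 and 0x02: $(count_after_nul_to_newline test-with-0x00-and-0x02) matching line(s)"
echo "0x00 only:     $(count_after_nul_to_newline test-with-0x00-only) matching line(s)"
```

Once the NULs act as line terminators, the 0x00-only file has no character left for the bracket expression to match, while the line containing 0x02 still matches.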