From unknown Sun Jul 27 03:52:29 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#22461: problem with "Binary file messages" in latest snapshot
Resent-From: Paul Eggert
Original-Sender: "Debbugs-submit"
Resent-CC: bug-grep@gnu.org
Resent-Date: Mon, 25 Jan 2016 09:21:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: report 22461
X-GNU-PR-Package: grep
X-GNU-PR-Keywords:
To: 22461@debbugs.gnu.org
X-Debbugs-Original-To: grep mailing list
Received: via spool by submit@debbugs.gnu.org id=B.14537136044604 (code B ref -1); Mon, 25 Jan 2016 09:21:02 +0000
Received: (at submit) by debbugs.gnu.org; 25 Jan 2016 09:20:04 +0000
Received: from localhost ([127.0.0.1]:35461 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aNdJc-0001CB-Jo for submit@debbugs.gnu.org; Mon, 25 Jan 2016 04:20:04 -0500
Received: from eggs.gnu.org ([208.118.235.92]:54148) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aNdJa-0001Bf-WA for submit@debbugs.gnu.org; Mon, 25 Jan 2016 04:20:03 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aNdJU-0004IY-V9 for submit@debbugs.gnu.org; Mon, 25 Jan 2016 04:19:57 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:55693) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aNdJU-0004IU-SI for submit@debbugs.gnu.org; Mon, 25 Jan 2016 04:19:56 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45113) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aNdJU-00048p-0h for bug-grep@gnu.org; Mon, 25 Jan 2016 04:19:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aNdJO-0004I3-Rk for bug-grep@gnu.org; Mon, 25 Jan 2016 04:19:55 -0500
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:33636) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aNdJO-0004HO-LR for bug-grep@gnu.org; Mon, 25 Jan 2016 04:19:50 -0500
Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id F407D1605E1 for ; Mon, 25 Jan 2016 01:19:47 -0800 (PST)
Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id WOpsnStqchKn for ; Mon, 25 Jan 2016 01:19:47 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 572031607DA for ; Mon, 25 Jan 2016 01:19:47 -0800 (PST)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id PVA7mpHus5Z2 for ; Mon, 25 Jan 2016 01:19:47 -0800 (PST)
Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net [100.32.155.148]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3BE4A1605E1 for ; Mon, 25 Jan 2016 01:19:47 -0800 (PST)
From: Paul Eggert
Organization: UCLA Computer Science Department
Message-ID: <56A5E8AE.3090709@cs.ucla.edu>
Date: Mon, 25 Jan 2016 01:19:42 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -4.0 (----)

I ran into this problem when using 'grep' to search through GNU Emacs
installed files. Here's how to reproduce the problem:

$ (echo xxx && yes yyy | sed 100000q && printf '\0') >big
$ grep xxx big
xxx
Binary file big matches

The last line should not be output. I'll look into fixing this.

From unknown Sun Jul 27 03:52:29 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Paul Eggert
Subject: bug#22461: closed (Re: problem with "Binary file messages" in latest snapshot)
Message-ID:
References: <56AF09C6.8010408@cs.ucla.edu> <56A5E8AE.3090709@cs.ucla.edu>
X-Gnu-PR-Message: they-closed 22461
X-Gnu-PR-Package: grep
Reply-To: 22461@debbugs.gnu.org
Date: Mon, 01 Feb 2016 07:32:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1454311922-21829-1"

This is a multi-part message in MIME format...

------------=_1454311922-21829-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Your bug report

#22461: problem with "Binary file messages" in latest snapshot

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 22461@debbugs.gnu.org.

--=20
22461: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D22461
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1454311922-21829-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Received: (at 22461-done) by debbugs.gnu.org; 1 Feb 2016 07:31:28 +0000
Received: from localhost ([127.0.0.1]:43833 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aQ8xM-0005fF-2f for submit@debbugs.gnu.org; Mon, 01 Feb 2016 02:31:28 -0500
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:49630) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aQ8xK-0005f0-4e for 22461-done@debbugs.gnu.org; Mon, 01 Feb 2016 02:31:26 -0500
Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id A34D3160F53 for <22461-done@debbugs.gnu.org>; Sun, 31 Jan 2016 23:31:20 -0800 (PST)
Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id tEFkCEj7Y86g for <22461-done@debbugs.gnu.org>; Sun, 31 Jan 2016 23:31:18 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id B60DF160F54 for <22461-done@debbugs.gnu.org>; Sun, 31 Jan 2016 23:31:18 -0800 (PST)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Ngob9jaEW7Ht for <22461-done@debbugs.gnu.org>; Sun, 31 Jan 2016 23:31:18 -0800 (PST)
Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net [100.32.155.148]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 9539E160F53 for <22461-done@debbugs.gnu.org>; Sun, 31 Jan 2016 23:31:18 -0800 (PST)
To: 22461-done@debbugs.gnu.org
From: Paul Eggert
Subject: Re: problem with "Binary file messages" in latest snapshot
Organization: UCLA Computer Science Department
Message-ID:
 <56AF09C6.8010408@cs.ucla.edu>
Date: Sun, 31 Jan 2016 23:31:18 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------080407050007030205010604"
X-Spam-Score: -0.6 (/)
X-Debbugs-Envelope-To: 22461-done
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.6 (/)

This is a multi-part message in MIME format.
--------------080407050007030205010604
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

I installed the attached patch, which should fix the bug, and am closing this.

--------------080407050007030205010604
Content-Type: text/x-diff;
 name="0001-Omit-excess-Binary-file-.-matches.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-Omit-excess-Binary-file-.-matches.patch"

>From 1d6609c299d2a51747c9bc9e82a399d53c54f8ea Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Sun, 31 Jan 2016 23:29:01 -0800
Subject: [PATCH] Omit excess "Binary file ... matches"

Problem reported in: http://bugs.gnu.org/22461
* src/grep.c (grep): Don't report "Binary file ... matches" merely
because the file contained both matches and binary data. Insist
that the binary data contained a match.
* tests/null-byte: Add a test for this.
---
 src/grep.c      | 16 +++++++++++-----
 tests/null-byte |  5 +++++
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/src/grep.c b/src/grep.c
index 10aabf9..73c3651 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -1373,7 +1373,11 @@ grep (int fd, struct stat const *st)
   char nul_zapper = '\0';
   bool done_on_match_0 = done_on_match;
   bool out_quiet_0 = out_quiet;
-  bool has_nulls = false;
+
+  /* The value of NLINES when nulls were first deduced in the input;
+     this is not necessarily the same as the number of matching lines
+     before the first null. -1 if no input nulls have been deduced. */
+  intmax_t nlines_first_null = -1;
 
   if (! reset (fd, st))
     return 0;
@@ -1400,15 +1404,15 @@ grep (int fd, struct stat const *st)
 
   for (bool firsttime = true; ; firsttime = false)
     {
-      if (!has_nulls && eol && binary_files != TEXT_BINARY_FILES
+      if (nlines_first_null < 0 && eol && binary_files != TEXT_BINARY_FILES
           && (buf_has_nulls (bufbeg, buflim - bufbeg)
               || (firsttime && file_must_have_nulls (buflim - bufbeg, fd, st))))
         {
-          has_nulls = true;
           if (binary_files == WITHOUT_MATCH_BINARY_FILES)
             return 0;
           if (!count_matches)
             done_on_match = out_quiet = true;
+          nlines_first_null = nlines;
           nul_zapper = eol;
           skip_nuls = skip_empty_lines;
         }
@@ -1445,7 +1449,8 @@ grep (int fd, struct stat const *st)
       nlines += grepbuf (beg, lim);
       if (pending)
         prpending (lim);
-      if ((!outleft && !pending) || (nlines && done_on_match))
+      if ((!outleft && !pending)
+          || (done_on_match && MAX (0, nlines_first_null) < nlines))
         goto finish_grep;
     }
 
@@ -1490,7 +1495,8 @@ grep (int fd, struct stat const *st)
  finish_grep:
   done_on_match = done_on_match_0;
   out_quiet = out_quiet_0;
-  if ((has_nulls || encoding_error_output) && !out_quiet && nlines != 0)
+  if (!out_quiet && (encoding_error_output
+                     || (0 <= nlines_first_null && nlines_first_null < nlines)))
     {
       printf (_("Binary file %s matches\n"), filename);
       if (line_buffered)
diff --git a/tests/null-byte b/tests/null-byte
index 44dad92..9a76887 100755
--- a/tests/null-byte
+++ b/tests/null-byte
@@ -51,4 +51,9 @@ for left in '' a '#' '\0'; do
   done
 done
 
+(echo xxx && yes yyy | sed 100000q && printf '\0') >in || framework_failure_
+echo xxx >exp || framework_failure_
+grep xxx in >out || fail=1
+compare exp out || fail=1
+
 Exit $fail
-- 
2.5.0

--------------080407050007030205010604--

------------=_1454311922-21829-1--

From unknown Sun Jul 27 03:52:29 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#22461: bug#22443: Subject: new snapshot available: grep-2.22.31-8b6a
Resent-From: Jim Meyering
Original-Sender: "Debbugs-submit"
Resent-CC: bug-grep@gnu.org
Resent-Date: Mon, 01 Feb 2016 16:21:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 22461
X-GNU-PR-Package: grep
X-GNU-PR-Keywords:
To: Paul Eggert
Cc: 22461@debbugs.gnu.org
Received: via spool by 22461-submit@debbugs.gnu.org id=B22461.145434364315597 (code B ref 22461); Mon, 01 Feb 2016 16:21:02 +0000
Received: (at 22461) by debbugs.gnu.org; 1 Feb 2016 16:20:43 +0000
Received: from localhost ([127.0.0.1]:45337 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aQHDW-00043V-TZ for submit@debbugs.gnu.org; Mon, 01 Feb 2016 11:20:43 -0500
Received: from mail-oi0-f46.google.com ([209.85.218.46]:35445) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aQHDU-00042z-Uh for 22461@debbugs.gnu.org; Mon, 01 Feb 2016 11:20:41 -0500
Received: by mail-oi0-f46.google.com with SMTP id p187so92750066oia.2 for <22461@debbugs.gnu.org>; Mon, 01 Feb 2016 08:20:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=mime-version:sender:in-reply-to:references:from:date:message-id
         :subject:to:cc:content-type;
        bh=+YwM9p4sUPbRsY5/Mpkq789+sk7Cgg/bS8D+TD7Cu0c=;
        b=kiOcRIcnzaIolL+9rRiWeDYadKgN39QYwMjNz/uA6uI37GjkG170Za98e7i24ZGiUt
         lZkVdfLY3YENoqd2HyB+5YRR6EyNBLIDXoSHVQcnpMhjB/1caronwaz/JWS9bZGJ3rmQ
         6TdUDr9v9e28jTcYLCrDtNNi+jjSRwdQFAiA1Ubs39akWPlrdbceRqy/WwF8BFYKNtsg
         NNxFSyu3x6QCBI0uB33zkJVdcMsNMGKQ07y3rOicHOZy7dE7r32pZpqY7x/JljxY7ysf
         8MAzTOpKhvL5VXm3ljR1ZJg9kbqNefwCU5Pz9sR9Gk70JGrcXLZAr2FeceN8Ma7BVGoA
         5LLg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20130820;
        h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
         :date:message-id:subject:to:cc:content-type;
        bh=+YwM9p4sUPbRsY5/Mpkq789+sk7Cgg/bS8D+TD7Cu0c=;
        b=PjWhUVTCRNLOlNlEFpEA6vry0R7lz4PAzcMmywGFYMR3rCSV+LStn4Tl6k4FJiuXXI
         JLqo3IEdG/QuLu3Xn7kESqrLTUCwRsOLW5B+18k+vpp0eqTnb30B2cNBdoB2ZqhEMwxD
         jJ2S/TGUbHJ1Dx03MH+aqoK7CY7icnlc0Zj8BtQRwVIkSowF6GalzkQiyXk7jyFAz+3w
         17aPd95Z5JzBF3PhjqWopjKoSbA+9/DVpe9ezZm2r6iWX7xm4W/brXZN+vFxTnGiS4dF
         9h72npYb2CJa4IP50oxItg0eUy4lOqh42XTzADp7aYLTYEfoqNVBd/el+iiselHLoXBH
         ZUWA==
X-Gm-Message-State: AG10YORyI7IIDJ4vlVx6b/UnjD6LKs8z5fjEeJehxQp4KjoycoZ9buYMygBvyhaOoLDGIT1cS791xGaujg9s9w==
X-Received: by 10.202.49.211 with SMTP id x202mr8095174oix.130.1454343635155; Mon, 01 Feb 2016 08:20:35 -0800 (PST)
MIME-Version: 1.0
Received: by 10.202.64.134 with HTTP; Mon, 1 Feb 2016 08:20:15 -0800 (PST)
In-Reply-To: <56AF097A.3000703@cs.ucla.edu>
References: <56AAC17C.7060307@cs.ucla.edu> <56AF097A.3000703@cs.ucla.edu>
From: Jim Meyering
Date: Mon, 1 Feb 2016 08:20:15 -0800
X-Google-Sender-Auth: UqEVOOS84ppYMfwlp3Lh8yVl5CE
Message-ID:
Content-Type: text/plain; charset=UTF-8
X-Spam-Score: -0.7 (/)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.7 (/)

On Sun, Jan 31, 2016 at 11:30 PM, Paul Eggert wrote:
> Jim Meyering wrote:
>>
>> I looked into it and do not see a way to fix it with reasonable cost.
>> When grep finds a NUL byte, it punts on the entire block.
>
>
> Hmm, I think of it differently. When grep finds a NUL it records that the
> file is binary but if no match has been found so far, it keeps looking for
> the first match, either in the block containing the NUL or in a later block.
> In this case grep stops reading the file only after finding a match, and it
> works correctly.
>
> The problem occurs if grep finds one or more text matches before the first
> NUL: it reports those matches, records the fact that it found them, then
> sees the NUL, then stops reading that file, and then the calling code
> notices that this was (1) a binary file that (2) contained matches, so
> outputs the "Binary file ... matches" message. This is wrong, because no
> binary data actually matched.
>
> When grep finds a NUL, it should record that the file is binary and then
> look for one more match after the NUL, then quit reading the file and report
> "Binary file ... matches" only if it found that one more match.
>
> The code is complicated by the fact that the file could also be binary
> because an output line contains an encoding error, something that's detected
> in a different part of the code.
>
> Argh, I'm taking too long to explain this. It's easier to fix than to
> explain. I installed a patch; what do you think?

Oh, I see, now. Nicely done.
I noticed only now that I replied to you off-list. Didn't mean to.
So am adding the bug email in Cc, so your explanation is
recorded there.

I added one more test case:

  http://git.savannah.gnu.org/cgit/grep.git/commit/?id=43f6246fe82f1
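
The explanation quoted above can be modelled in a few lines of C. This is
only a sketch of the two reporting conditions at finish_grep, before and
after the change, and is not code from grep.c: the struct scan_result and
both function names are hypothetical, and the real grep () function keeps
these values in local and file-scope variables alongside other state
(outleft, pending, done_on_match) that is left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one file scan.  grep.c has no such struct;
   the field names merely mirror the variables used in the patch.  */
struct scan_result
{
  intmax_t nlines;            /* number of matching lines counted */
  intmax_t nlines_first_null; /* value of nlines when a NUL was first
                                 deduced, or -1 if none was */
  bool encoding_error_output; /* an output line had an encoding error */
  bool out_quiet;             /* output is suppressed */
};

/* Old condition: any binary file with at least one match is reported,
   even if every match came before the first NUL.  */
bool
reports_binary_match_old (struct scan_result const *r)
{
  bool has_nulls = 0 <= r->nlines_first_null;
  return ((has_nulls || r->encoding_error_output)
          && !r->out_quiet && r->nlines != 0);
}

/* New condition: insist that at least one match was counted after
   the NUL was first deduced.  */
bool
reports_binary_match_new (struct scan_result const *r)
{
  return (!r->out_quiet
          && (r->encoding_error_output
              || (0 <= r->nlines_first_null
                  && r->nlines_first_null < r->nlines)));
}
```

For the reproduction in the original report, the single match on "xxx" is
counted before the NUL is seen, so nlines and nlines_first_null are both 1:
the old condition is true (the spurious message) and the new one is false.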