From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: Rodrigo Jorge Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 19 Sep 2024 14:29:04 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: 73360@debbugs.gnu.org X-Debbugs-Original-To: bug-grep@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.17267561209437 (code B ref -1); Thu, 19 Sep 2024 14:29:04 +0000 Received: (at submit) by debbugs.gnu.org; 19 Sep 2024 14:28:40 +0000 Received: from localhost ([127.0.0.1]:33208 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srI98-0002S7-JH for submit@debbugs.gnu.org; Thu, 19 Sep 2024 10:28:39 -0400 Received: from lists.gnu.org ([209.51.188.17]:41152) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srHYL-00005S-RA for submit@debbugs.gnu.org; Thu, 19 Sep 2024 09:50:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1srHXz-0008QU-CH for bug-grep@gnu.org; Thu, 19 Sep 2024 09:50:19 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1srHXw-0006aS-Eb for bug-grep@gnu.org; Thu, 19 Sep 2024 09:50:15 -0400 Received: by mail-pj1-x1029.google.com with SMTP id 98e67ed59e1d1-2d88690837eso816946a91.2 for ; Thu, 19 Sep 2024 06:50:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726753808; x=1727358608; darn=gnu.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=yEyZ3N3qldm1QFnovN74h565gJZEbuG665P188vSxDE=; 
b=edc0xy73dwYOJWTg/EFwO7wdVDYU+5sIbRm7jQbu2oIBYixgmKjllS9oyn0r9adS1y YQg5RnH7wEDnM7L+e0zTSa3JIL69SzlnqFU3gR6G+4hEeKzLUUxnzFQmyOjuQmDWsaTs Pijx6FojSe82XRurk5f0A10mhzd5wDDtHGihV5qhLwPJ8taimFT2JTdoel8qx8/15rNm hv1rMWkbCBghI8Zmwa5UhJL3zIw8o+He2eDD1deXju1aVWAEEfFKQ8mf0amPMAoFJghG L3FoGxsIzrAh8TYRGrlubv2wUN6HK5aQlHLQFq9B/1XQMijq4lDA3GzyrzlsZTUhxSCy unPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726753808; x=1727358608; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=yEyZ3N3qldm1QFnovN74h565gJZEbuG665P188vSxDE=; b=pCD/mki6PKWtxqXO7g4eyurJeGbd5mBFJb89/dq+WmtKQxUqBExRHUHNBGWxxOT2QT 4MWWOPsBRTPlhKcAhYb3LPs5/AqxMEXuXGFa2Ais7w/hhVbHwQJkMC2jQF/8OZrmjDCx L3dyvgw4JGWKcHzjQHR1pDHvkZxLHW7cPA71AhKyHLuysK/AbSIxV4XaAgUqUNamSbez rbA5qgHGSnOTl/FPol/6GeCyMw7PGAeKWOBvnZm9Z+p0CrOBLIbbbwp9cHwJeyUqXLFz LeAVxOmzr9w6BuUFwoyrHrCcWlKNFsbOCSTVz5p5vzdRbXPHKpaDlmAcqkFCQH3d73gP DRTQ== X-Gm-Message-State: AOJu0Yy6tTuOHKQFyOrz+4Xhz+OtcszhQPaxSHbdzU0Ikly+MEBSGQkS oRU5Miq7dRvmHeLX2uKaQV44dvf1Zg6j4w1iPhSjvi3M8TzAolgCmOaWyxAYBOptjE6fsDxiDFm vzWBGRSWNz2EfQXFD8kOQFtBvARzdhqLY X-Google-Smtp-Source: AGHT+IHLtyxKQPtbvSo9oaBp7Irl9wSwyqwwQNEWxdaRQysQLX5Gz1muHSvrFAnrB28likMEuAahSuzXogKHW0OAVp4= X-Received: by 2002:a17:90a:3fc4:b0:2d8:ea11:1c68 with SMTP id 98e67ed59e1d1-2dba00624d2mr27860224a91.31.1726753808191; Thu, 19 Sep 2024 06:50:08 -0700 (PDT) MIME-Version: 1.0 From: Rodrigo Jorge Date: Thu, 19 Sep 2024 10:49:31 -0300 Message-ID: Content-Type: multipart/alternative; boundary="0000000000005a1940062279330d" Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=rodrigoaraujorge@gmail.com; helo=mail-pj1-x1029.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, 
SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Mailman-Approved-At: Thu, 19 Sep 2024 10:28:34 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--)

--0000000000005a1940062279330d
Content-Type: text/plain; charset="UTF-8"

Hello. I'm trying to use grep to get the list of all non-binary files in a given folder. I tried with the 2.20 and the 3.11 release.

For some reason, grep is providing 2 false negatives when the list is huge. This issue does not happen if I break the grep input with "xargs -n X".

Check below:

[opc@oradiff-core dbhome_1]$ grep -V
grep (GNU grep) 3.11
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.savannah.gnu.org/cgit/grep.git/tree/AUTHORS>.

[opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*" -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 -n 100 grep -Il '.' > /tmp/list1.list

[opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*" -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 grep -Il '.' > /tmp/list2.list

[opc@oradiff-core dbhome_1]$ diff /tmp/list1.list /tmp/list2.list
12268,12269d12267
< ./apex/images/apex_ui/psd/apex_5_ui.ai
< ./apex/images/apex_ui/psd/apex-logo.ai

[opc@oradiff-core dbhome_1]$ wc -l /tmp/list1.list /tmp/list2.list
  23397 /tmp/list1.list
  23395 /tmp/list2.list
  46792 total

The output should not show any difference.

The same issue was also reproduced in grep 2.20.

Thanks,
Rodrigo

--0000000000005a1940062279330d
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000005a1940062279330d-- From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: jackson@fastmail.com Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 20 Sep 2024 00:22:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: "Rodrigo Jorge" , 73360@debbugs.gnu.org Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.17267916983608 (code B ref 73360); Fri, 20 Sep 2024 00:22:02 +0000 Received: (at 73360) by debbugs.gnu.org; 20 Sep 2024 00:21:38 +0000 Received: from localhost ([127.0.0.1]:33654 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srROz-0000w7-Qu for submit@debbugs.gnu.org; Thu, 19 Sep 2024 20:21:38 -0400 Received: from fhigh6-smtp.messagingengine.com ([103.168.172.157]:34843) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srROx-0000vp-88 for 73360@debbugs.gnu.org; Thu, 19 Sep 2024 20:21:37 -0400 Received: from phl-compute-08.internal (phl-compute-08.phl.internal [10.202.2.48]) by mailfhigh.phl.internal (Postfix) with ESMTP id CB12E1140232; Thu, 19 Sep 2024 20:21:10 -0400 (EDT) Received: from phl-imap-13 ([10.202.2.103]) by phl-compute-08.internal (MEProxy); Thu, 19 Sep 2024 20:21:10 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fastmail.com; h= cc:content-transfer-encoding:content-type:content-type:date:date :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1726791670; x=1726878070; bh=3RQoliy4PcOzgkA6CQ2I5vwe1sub9bZehiGFv3eAbbg=; b= gc+kQOi8pZ08E24iw7nm3+d6O4hKIA6HJdScYXRub53ggBjVOfN8kL84HfnHGKtN CaVVwOlvvZMoFU5sBwdrzZUF/qmGDa09HBoHWTCWChb8OtI55r72GMr8j7jSQQeS L9BSNA55vPIbE5oH5bRiIomnEaFH9Ra1a9wq8mUiwoL6+yRTwgvt7uiWDMsGmP5w 
J5hW/rm208dsd2BjQ0IQGv+Gf4JWf6Frto7LwE2FAk1NMPXDq+euZ9r8AC0VzCwO xS85Vg/JEHtGefnifwAHNaGjOGDBb2TjpekGkwPZATiDXFXAMSh8Wz3VofcOwLxh /aCAKsyyYm5MFqeOQzkYhw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1726791670; x= 1726878070; bh=3RQoliy4PcOzgkA6CQ2I5vwe1sub9bZehiGFv3eAbbg=; b=i xsFty9agaz4fInFkywkSxtjCPimPy9bCz4ad5fazn72CtPVHyKAhQFmLg+Ei0nOX t51kmm9wUwj5MYboHnI0EqW4pnuVGbH7iG2cwEnLJAT0EUxJjam2UWGwqWR+CIZX JijrPt3T29P/IGWHTbIDlTT3SEd4FUnKXZW3nBsC/Mhto/9yz3hpDAV4ilU2FkEW WW0KqWMes5YtnEz41zYC65cJElrMHE1EJc0GKLgfZ5Hhsg3joL/GlHRjemrrrigk KcASRTMbXvJN/I0ESE0TJSgM+5VbIeCC+eIv+rC4AT97yFyg5K4b6Z/f8LybMQlC dR+UJQD1EXQpghWYnNaXQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrudelvddgfeegucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnecujfgurhepofggfffhvffkjghfufgtgfesthejredtredttden ucfhrhhomhepjhgrtghkshhonhesfhgrshhtmhgrihhlrdgtohhmnecuggftrfgrthhtvg hrnhepieeuhfejgeelgffgtddufeehvdegleevveduieetfeegueegfeehffelteejfeej necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepjhgrtg hkshhonhesfhgrshhtmhgrihhlrdgtohhmpdhnsggprhgtphhtthhopedvpdhmohguvgep shhmthhpohhuthdprhgtphhtthhopeejfeefiedtseguvggssghughhsrdhgnhhurdhorh hgpdhrtghpthhtoheprhhoughrihhgohgrrhgruhhjohhrghgvsehgmhgrihhlrdgtohhm X-ME-Proxy: Feedback-ID: i982440cf:Fastmail Received: by mailuser.phl.internal (Postfix, from userid 501) id 8AD101F00072; Thu, 19 Sep 2024 20:21:10 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface MIME-Version: 1.0 Date: Thu, 19 Sep 2024 19:19:26 -0500 From: jackson@fastmail.com Message-Id: 
<7c673366-9d2f-473f-b209-41b84f8d0f0e@app.fastmail.com> In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Spam-Score: 0.3 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

I can't reproduce this.

I am running "grep (GNU grep) 3.11" and "xargs (GNU findutils) 4.10.0" on an Artix distribution.

I have a directory with 52422 regular files in it, more than twice the some 23397 files in your example.

I get the same result whether or not I constrain xargs to "-n 100" arguments per exec.

Could you:

 1) Check whether there is any difference in whether or how xargs invokes grep on the two files that come up different, by looking for those two missing filenames in the tracing output of the xargs --verbose option.

 2) Probably not helpful, but is there anything strange about these two missing files:

      < ./apex/images/apex_ui/psd/apex_5_ui.ai
      < ./apex/images/apex_ui/psd/apex-logo.ai

    Are their sizes and file types, as reported by the 'file' command, similar to those of some of the other files?

My wild guess is that you're hitting some unknown limit in xargs when it is invoked with an argument list that is right at the limit of what your system allows. But I wouldn't bet a cheap beer on that guess being right.
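For item 1, a minimal sketch of what such a check could look like, assuming GNU findutils and GNU grep. The scratch directory and its three files are hypothetical stand-ins for the real tree. It relies on xargs -t (the short form of --verbose) writing each command line it runs to stderr, so the trace can be counted and searched for a filename:

```shell
#!/bin/sh
# Scratch tree with hypothetical file names; nothing here touches the real data.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/tree"
printf 'plain text\n' > "$workdir/tree/a.txt"
printf 'more text\n'  > "$workdir/tree/b.txt"
printf 'bin\000ary\n' > "$workdir/tree/c.bin"   # the NUL byte makes grep -I skip it
cd "$workdir/tree" || exit 1

# Batched run: -n 1 forces one grep per file; -t logs each command line to stderr.
find . -type f -print0 | xargs -0 -n 1 -t grep -Il '.' \
    > "$workdir/batched.out" 2> "$workdir/batched.trace"

# Unbatched run: xargs packs as many names as fit into one grep invocation.
find . -type f -print0 | xargs -0 -t grep -Il '.' \
    > "$workdir/single.out" 2> "$workdir/single.trace"

# Count how many times grep was started in each run (one trace line per command).
batched_calls=$(wc -l < "$workdir/batched.trace")
single_calls=$(wc -l < "$workdir/single.trace")
echo "grep invocations: batched=$batched_calls single=$single_calls"

# Show which traced command line received a given file.
grep -n -F 'b.txt' "$workdir/batched.trace"

# Compare the two result lists; sort first because find's order is unspecified.
sort "$workdir/batched.out" > "$workdir/batched.sorted"
sort "$workdir/single.out"  > "$workdir/single.sorted"
if cmp -s "$workdir/batched.sorted" "$workdir/single.sorted"; then
    same=yes
else
    same=no
fi
echo "same results: $same"

# The system's limit on argument size, relevant to the guess about limits.
getconf ARG_MAX
```

On the real tree, the same `grep -n -F` search of the trace for `apex-logo.ai` would show which grep invocation, if any, was handed that file.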
-- Paul Jackson jackson@fastmail.fm From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 20 Sep 2024 03:45:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Rodrigo Jorge Cc: 73360@debbugs.gnu.org Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.172680386913446 (code B ref 73360); Fri, 20 Sep 2024 03:45:02 +0000 Received: (at 73360) by debbugs.gnu.org; 20 Sep 2024 03:44:29 +0000 Received: from localhost ([127.0.0.1]:33711 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srUZI-0003Un-OK for submit@debbugs.gnu.org; Thu, 19 Sep 2024 23:44:28 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:56644) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srUZF-0003UV-SE for 73360@debbugs.gnu.org; Thu, 19 Sep 2024 23:44:27 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id D2C743C011BC5; Thu, 19 Sep 2024 20:44:01 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP id 1RUnAdKd68QI; Thu, 19 Sep 2024 20:44:01 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 6E2AD3C011BD4; Thu, 19 Sep 2024 20:44:01 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu 6E2AD3C011BD4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1726803841; bh=CMN2ooLNFwTjLmu4YTlKdWFR89JnZI+9CuNM8nfA5To=; h=Message-ID:Date:MIME-Version:To:From; b=hhx4S3OjWlWA/jz3Ftq08ztLYlMxbfwwUU4i2SkykGRgrRB/5QBWBWx7UCvYxJx// 
OTyoIR85JUIn/YO8sgnRxJmSSuLaFRL6k0Kwlat/pI8rMAVGOGXh+MOOOKJfQF8xTE eoi0xjoJMZSoZ/f55/El2/KIXshxWG0c31LacY1LaIjKu2WL6rgZt85HT9g27q5Uy8 z212yLjal+s/cE5NXcW1tBhu4kXHfoVgCUC6wY84XUsua/2X5AzrBmZqY0MJ8Kg0El l0wR5aWFp/C2TjJkT3FsesvDo6K0DmoQpluN/VZ+KEMyqn7gpy7sI1aTwa5xUbAiWC rZ82LmW3FDmyA== X-Virus-Scanned: amavis at mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id Y_20YAyZG9Hy; Thu, 19 Sep 2024 20:44:01 -0700 (PDT) Received: from [192.168.254.12] (unknown [47.150.137.250]) by mail.cs.ucla.edu (Postfix) with ESMTPSA id 4F41E3C011BC5; Thu, 19 Sep 2024 20:44:01 -0700 (PDT) Message-ID: Date: Thu, 19 Sep 2024 20:44:01 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird References: Content-Language: en-US From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) I suggest using xargs -t to see how 'grep' is actually being invoked. Then run the individual 'grep' commands that xargs -t reports, and see which one misbehaves (or possibly you'll find that none of the individual 'grep' commands are misbehaving, and the problem lies elsewhere). From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: "David G. 
Pickett" Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 20 Sep 2024 13:33:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: "73360@debbugs.gnu.org" <73360@debbugs.gnu.org>, Rodrigo Jorge Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.172683915710680 (code B ref 73360); Fri, 20 Sep 2024 13:33:02 +0000 Received: (at 73360) by debbugs.gnu.org; 20 Sep 2024 13:32:37 +0000 Received: from localhost ([127.0.0.1]:34389 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srdkS-0002mB-SO for submit@debbugs.gnu.org; Fri, 20 Sep 2024 09:32:37 -0400 Received: from sonic306-20.consmr.mail.gq1.yahoo.com ([98.137.68.83]:38754) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srdkQ-0002lv-Ae for 73360@debbugs.gnu.org; Fri, 20 Sep 2024 09:32:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aol.com; s=a2048; t=1726839129; bh=UdJUVvKgwsJckKIx5aZyO/rdKeRW/rP7UK3vHfMBLuM=; h=Date:From:To:In-Reply-To:References:Subject:From:Subject:Reply-To; b=Y7QexE4Cu+fBaO5weXis1hROGobgUz5vZgZCG3v1P9adO+gBdyNFtBiNxqMcFubvbVjuZEQxwIT2Vm6eQUN7Giyd4xR8phYo0PTUQrzQ9AlQzOTImdBY1EsV2lYs/+GNsa4CtQkJ0qy70a7iEQ3mLMqGfi7LR0ey9c11wTlcTcELFVZ2vTMRRCTIqco5/q5rcDp6Xap+UvA+yjvDSkifWDVn+ZP0UshkbOLkzQD/nwNyqgR1B7o8F0jy2ZLiMycfT2Pn8PPWHCrVqEI5GP7VXPshLzl6NvSJFzXwFRCSwmiTmAftcsKYjBDBjuzzTxcMxKHvHM3evVJPduZZJGVAEw== X-SONIC-DKIM-SIGN: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1726839129; bh=OL9BTSQbrYvLRmbmigUVVbJSCs5RRJa9eec6uzMHpD/=; h=X-Sonic-MF:Date:From:To:Subject:From:Subject; 
b=DZ+e2em50+i7UBujUReKDY9eVNHELxB68JHSiDIEiEClpLw7ogMme5ubw5ffVhwTI4wyWYwsfsazaoDe75t2UQjJN+XqRX7R98pUpix1xT3JjSkHIxLQfZ/AjpLNz06UxFYhtR+G2eXm8g1Pawrv0iWV6pn0ODeNbAxWHOBr3o8ij0s/YQ5aoD2w0Hf5RAlWIra6zfeCrCHJJcgIGRmLnXq/BQV/nKGoU9NGnIiTHPaOM2fVGtUwpl65zTqnns5tiw8G3BXT6bSMInCqLHWgQtgVht/h0KnEY10h9OJ9GIlMADSrW0xr5wHrJwAdXVTJy/6ZkSWugSOfSJ4oax2iJA== X-YMail-OSG: UhaOKtUVM1k1OFsJSvPQt7zQWTDvOqCYbn5.Ngz2tFC9.qDBXeZG_atXw70izbM gcwCF2mTyyz1uTCZN96Qweqm9GVgAARK1W116WJuSTJcG.9rJf5kpLhIM1M1oUJtbm9Y.gjtn6Mi cdKBDGaH1LEgK_3iFYsILHsiQnUJTEd5FIBSrEcQ90Hq8ScynQsNU1sUJZlBM1ySGKzBNVML40Po Bc23k8x0hzreaTymGY_y92BR.kHabKim5Y8yTZUXxfTbUig_oxS1TQ99doJEpeF8bW22XI6aAAe0 N5pHLLEni4Rhe.mbe3xRceehCeUxbdlPK6UVnqVqxt4rZ5hjtyxdpnQHBYs8VNOWf537.s0r3qLs RbsmVz40zXeR8gSFGurQE8KL8CAqgfQuTrnhBSi4MYyVUh9HsDlNNnWqZ5ZQh5vWFs0vIU5FqftY YhgZFPmd3Fvtwh86OIWGDvo3RXUUHNE.dnlROnu.GY3Hgj3KTkprt8ZADG0tstUQqV.CcXdCItp1 MyIHCC27D4ZlOWVMiPCMvh9hir3.It5Z7RhgvDdQqK9kLwtfrwHQ2kxeSL1uqHN.rTD94ejtkVdM o60VQElV9Gse0lG71v2dkIcOTiNOQ8sZA.4QDhQlzlmVbyztEHTCg61._FyQ0Wkfe8fo_aA1EJeS M_enJeUdnGMgO8m94m0tQqIuvX6HB549kYZ1cGm4st8bB5JmRViGADC3d2tguHoENcNd8VbGulTv Ot.mhGLGFaFK1s8w1wI5bg1_MY.yOZkzKQlMS10hpgwNz1fO06QTgWVZHHy7_Qi.RigV51_2g5tk XQ8Ca_t6n1AzDdPslgvA1OOB6qyblKIvrYkMG9nRei7wtAm0eovYsC4khWbGelSXSiz1sd6xB7R6 wXvi5dUENVQh3CHUkfuLZxDObZyRY8LY46mIoWOwaioctcwxT.LMK6gf8oP9pzp8bsq4I_jA3ohf zyjrWJwX4FXOdLQ1bZtHRQlmv3DI_BThBVapPVjBQxn1h7blbi9MXzVUoy19hPiI0g7vJBQ4OBO9 aTAmoIFelo2rezUVoPd2h_nAsL8tSjONE9LrvV4bwmS5xg8BulUU7dPClwhnjQffb4g9IX2mRrm4 c1EZ4sQkoLkyIL6BoAsy_2aKn8DVJ0.vRXVkQXHzUcwK2UGkqZgJH7ODoJCj9XB441_4J13myMfq JWEzN9ch.Pc3rrjuc54L_qaH9GdsYUSF8z2osEskN_jwDA2h0XNP8eOLp5HMwyzBQWQHdZQcaGs2 Ln5D0rywpnKHIDHNyLVgtDQwqxP1wvJ.MJfJnlf3a9EHKjq2YRPF4Rk_zQgs9kwYyzR2QNQWKeGg c85szvhUzZLC80PeeZadHc04qkRbsp.xvUBtFt2nFa63hbsaF4DN9_xku_yYStCfWG8pZLiwG4Qe CNXS4SgA2uP1O3ci5SL5Qzsb81MWwy7J0meQmum__dlBjxd6AQmOSbld2TSJfahpwLDb2sRxyBDn ItpX4wquMCTqPUMM0zv5mo0hF7Ryzv3LVqrU9iQlbyRU6hGuSNK_73ZYhadWn3cGSp.ftkL1A7en 
g8v8qGr9BKsNPKhybrXYzpa.N4NfH0lgJx.34vB1D6SnPnT.sQxSRbaxeGlbwyrmSRCBYPdqg3A2 o3EORWOFftI3hDA0XIAhiFHLPK05GKaqnJccopaZgV59FNiepLoDBOtdJ78rEYTwjgOBsdpCuOH_ u6W_tMzpgDUfuncQKdYB.24DeJ6SF5hoCYRB2JviqrU5Zn_fIO72qady.bS8FJVq72VfC.NzieF8 kOliQ9pYAeyWi.amKeWwS6lcj7FfwJs54xIGuD0BF9Kubxwa5zkt31ZuMY09QVaqwR8SK6E9hCMC 9z8eCEl1g9KqhjGfF8zdDotG2iWN81mEfuMLvR.C15gb4MhjLQ1iBIFMUNhnWeBE0bzOHcTcaslW ob74F2K785tKt_g2YLTDOLG5L8nzM6Uw0LhlWmpBxM1PsOmrikGl2t8tUHioYOPeiBwHBjD0AnOg dhO.RHHRqHAap4KBMeDRest1ZBZRrw4wLZsRaITwpWqpvtZFa51V06SZ97M2sb4iqVNPkdw.HmeN .shez3vuq.i5c6yQndCeNa15YGKRCWRW.VmxsVREtVfssxKN0J.B59k.3B2xuIzduB8AImAT6y.J U7l2G2fBavfDXWHI9F7c3GRVd3BL_sAqIuQu_5DgCZ3Hs5Gs5qlDSraquxReqarQXIDqSRy7kPsl BIRsnvCkOgVmJe9xRRHoZ4SR8t0qsWA-- X-Sonic-MF: X-Sonic-ID: cbda0992-321f-4abc-a123-9780df0002d8 Received: from sonic.gate.mail.ne1.yahoo.com by sonic306.consmr.mail.gq1.yahoo.com with HTTP; Fri, 20 Sep 2024 13:32:09 +0000 Date: Fri, 20 Sep 2024 13:32:03 +0000 (UTC) From: "David G. Pickett" Message-ID: <2063277599.8366131.1726839123608@mail.yahoo.com> In-Reply-To: References: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_8366130_223265259.1726839123606" X-Mailer: WebService/1.1.22645 AolMailNorrin Content-Length: 6643 X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

------=_Part_8366130_223265259.1726839123606
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

While the output may be bulky, on Linux you can try the strace command to see exactly what it is up to. It will show the execvp() call, for instance. You might need a bigger -s!

$ strace -f -v -s 262144 <YOUR_CMD>

------=_Part_8366130_223265259.1726839123606
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
------=_Part_8366130_223265259.1726839123606-- From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: Rodrigo Jorge Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 20 Sep 2024 13:57:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: "David G. Pickett" Cc: "73360@debbugs.gnu.org" <73360@debbugs.gnu.org> Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.172684059717635 (code B ref 73360); Fri, 20 Sep 2024 13:57:01 +0000 Received: (at 73360) by debbugs.gnu.org; 20 Sep 2024 13:56:37 +0000 Received: from localhost ([127.0.0.1]:36042 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sre7g-0004aN-7E for submit@debbugs.gnu.org; Fri, 20 Sep 2024 09:56:37 -0400 Received: from mail-pj1-f51.google.com ([209.85.216.51]:51355) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sre7c-0004aB-RX for 73360@debbugs.gnu.org; Fri, 20 Sep 2024 09:56:34 -0400 Received: by mail-pj1-f51.google.com with SMTP id 98e67ed59e1d1-2d88c5d76eeso1533075a91.2 for <73360@debbugs.gnu.org>; Fri, 20 Sep 2024 06:56:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726840508; x=1727445308; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=cAX7e0jF1hCQTZYN6UptNoalRryv95yoC1VZOFFp9SE=; b=S3GeHD1NdWpqXDuwqV7QIeoLmrtHrvEFW0clFb1d9mBBbtawOodweki+ScEh3oxBaX erL1+CKLomTsGthIstUspryOaoFF+dQf0CmY9NTUd7X8X19vQd/O9BG4nCiDnSegmSf+ SEoqNetyR8ahlR9DI2dCDNEkGWFf/dtCW/FcQkWQCnqFQeln5ouiBIru3ObCGME2tSkP CAtHlgvqSelqmFuRF4DtOWQAfjIOANLirrsDzdEWqAY/ZGyk2SEPM7j2YsdH2RkDz1O+ 9BEsk+fKEAj0enbjnQXgb3R6O3XQw2lj47963XTVXjU1jzoT49XVhHYtz1Pbe0svdFxZ LeFg== X-Google-DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726840508; x=1727445308; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=cAX7e0jF1hCQTZYN6UptNoalRryv95yoC1VZOFFp9SE=; b=PiikMoxWa39tlMyDNoVQ+7c3g0f1QvfeeAiZ7aoVJ3hVcUNfs2bu4bSQAmgyH5JN4o xGiDSEChKTfG6D2BHjwq2X9Xc9zjUpheVYSuzK0Aujd7BwNDLGfj8RbCTYatHZMgMMm+ wF50D/dS71jxbMUvvi++aqe7NiBRzIHvMuAftCmfYShNTCv8XsIz4n9M22p1UKU6WkWR AsWX6JjlbGLwZinDhRfqAoJf5Sdb9GfSqWZbdR6p/zRBVwCQnkVF1N+7GWFhw4R0sI11 HoL9mlDrvdYLs9aDJSOxN09Qb/jUVb56PT2yAwLAYv7NJu0Vij3UKjD+yc4d0hzwiFL1 9gyA== X-Gm-Message-State: AOJu0YywDh6q4iImmKg98evkJXRP7kJGW9t+VBfPlg5lmKWInvjXsifD BSwcquAGswEAfc43dvUgjn0ICCsD6AOdQ/ScrhgdB7iOuZjse+9QZlOZPg9KCdltMif0jTbj5uk btBw5wyF/eb+0vbjzXsSYLrU6RIA= X-Google-Smtp-Source: AGHT+IHhxXb/xvyryTPoCqZfcGxfLW7V6djNsyYmEN1r3TnNuQKD8SMPrUeLwE3wwJWJRWmfrqnZp7ooawxB7qPyimY= X-Received: by 2002:a17:90a:b310:b0:2d8:94f1:b572 with SMTP id 98e67ed59e1d1-2dd7f42a6c6mr3943899a91.18.1726840507637; Fri, 20 Sep 2024 06:55:07 -0700 (PDT) MIME-Version: 1.0 References: <2063277599.8366131.1726839123608@mail.yahoo.com> In-Reply-To: <2063277599.8366131.1726839123608@mail.yahoo.com> From: Rodrigo Jorge Date: Fri, 20 Sep 2024 10:54:31 -0300 Message-ID: Content-Type: multipart/alternative; boundary="0000000000000aaa9806228d63fc" X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --0000000000000aaa9806228d63fc Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable I could reproduce the same issue without xargs, so I think we can take it out of the picture: [user@server folder]$ find -type f -not -path "./.patch_storage/*" -not -name "tfa_setup" -print > /tmp/file.list 
[user@server folder]$ wc -l /tmp/file.list 37443 /tmp/file.list [user@server folder]$ cat /tmp/file.list | xargs -n 100 grep -Il '.' > /tmp/list1.list [user@server folder]$ wc -l /tmp/list1.list 23405 /tmp/list1.list [user@server folder]$ grep -Il '.' $(cat /tmp/file.list) > /tmp/list2.list [user@server folder]$ wc -l /tmp/list2.list 23403 /tmp/list2.list [user@server folder]$ diff /tmp/list1.list /tmp/list2.list 12268,12269d12267 < ./apex/images/apex_ui/psd/apex_5_ui.ai < ./apex/images/apex_ui/psd/apex-logo.ai [user@server folder]$ So we can see that running *"grep -Il '.' $(cat /tmp/file.list)"* will also skip those 2 files, unless the problem is actually bringing them, and xargs are adding those 2 files somehow. Those files are PDFs: [user@server folder]$ file ./apex/images/apex_ui/psd/apex_5_ui.ai ./apex/images/apex_ui/psd/apex_5_ui.ai: PDF document, version 1.5 [user@server folder]$ file ./apex/images/apex_ui/psd/apex-logo.ai ./apex/images/apex_ui/psd/apex-logo.ai: PDF document, version 1.5 [user@server folder]$ head ./apex/images/apex_ui/psd/apex_5_ui.ai %=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD1.5 <>stream8 0 R 209 0 R]/ON[6 0 R 7 0 R 210 0 R]/Order 211 0 R/RBGroups[]>>/OCGs[6 0 R 7 0 R 5 0 R 208 0 R 210 0 R 209 0 R]>>/Pages 3 0 R/Type/Catalog>> application/pdf I could also find exactly the point it breaks: [user@server folder]$ cat /tmp/file.list | xargs -n 100 grep -Il '.' | wc -= l 23405 [user@server folder]$ cat /tmp/file.list | xargs -n 1000 grep -Il '.' | wc -l 23405 [user@server folder]$ cat /tmp/file.list | xargs -n 2000 grep -Il '.' | wc -l 23405 [user@server folder]$ cat /tmp/file.list | xargs -n 2871 grep -Il '.' | wc -l 23405 [user@server folder]$ cat /tmp/file.list | xargs -n 2872 grep -Il '.' | wc -l 23403 I will reply shortly with the strace findings. On Fri, Sep 20, 2024 at 10:32=E2=80=AFAM David G. Pickett wrote: > While the output may be bulky, on Linux you can try the strace command to > see exactly what it is up to. 
It will show the execvp() call, for > instance. You might need a bigger -s! > > $ strace -f -v -s 262144 > > On Thursday, September 19, 2024 at 10:29:30 AM EDT, Rodrigo Jorge < > rodrigoaraujorge@gmail.com> wrote: > > > Hello. I'm trying to use grep to get the list of all non-binary files in = a > given folder. I tried with the 2.20 and the 3.11 release. > > For some reason, grep is providing 2 false negatives when the list is hug= e. > This issue does not happen if I break the grep input with "xargs -n X". > > Check below: > > [opc@oradiff-core dbhome_1]$ grep -V > grep (GNU grep) 3.11 > Copyright (C) 2023 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later < > https://gnu.org/licenses/gpl.html>. > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. > > Written by Mike Haertel and others; see > . > > [opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*" > -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 -n 100 grep > -Il '.' > /tmp/list1.list > > [opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*" > -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 grep -Il '.= ' > > /tmp/list2.list > > [opc@oradiff-core dbhome_1]$ diff /tmp/list1.list /tmp/list2.list > 12268,12269d12267 > < ./apex/images/apex_ui/psd/apex_5_ui.ai > < ./apex/images/apex_ui/psd/apex-logo.ai > > [opc@oradiff-core dbhome_1]$ wc -l /tmp/list1.list /tmp/list2.list > 23397 /tmp/list1.list > 23395 /tmp/list2.list > 46792 total > > The output should not show any difference. > > The same issue was also reproduced in grep 2.20. > > Thanks, > Rodrigo > --0000000000000aaa9806228d63fc Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000000aaa9806228d63fc-- From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: Rodrigo Jorge Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 20 Sep 2024 14:25:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: "David G. Pickett" Cc: "73360@debbugs.gnu.org" <73360@debbugs.gnu.org> Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.172684226323177 (code B ref 73360); Fri, 20 Sep 2024 14:25:01 +0000 Received: (at 73360) by debbugs.gnu.org; 20 Sep 2024 14:24:23 +0000 Received: from localhost ([127.0.0.1]:36067 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sreYX-00061h-Gm for submit@debbugs.gnu.org; Fri, 20 Sep 2024 10:24:23 -0400 Received: from mail-pj1-f47.google.com ([209.85.216.47]:61777) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sreYU-00061T-Ts for 73360@debbugs.gnu.org; Fri, 20 Sep 2024 10:24:20 -0400 Received: by mail-pj1-f47.google.com with SMTP id 98e67ed59e1d1-2db89fb53f9so1491030a91.3 for <73360@debbugs.gnu.org>; Fri, 20 Sep 2024 07:23:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726842174; x=1727446974; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=haqJJSH7Rn3JmZPcpmzzJUrP0UWMAsXAyAbSCg67jyU=; b=kwCwBJHsTDqqCrPgsQ/xdb5XQMsBPDKsl4bf1XopCngSbTRqVy5pcB1TDHzZWZwo5z vIFbm1t2LX2CdXt6j5AVeaqPaKkcuhyMsdQX9eog3zHV0/4yWMc11hSOwQasclhzt0kQ gbJF+JBQDHwvvwLJ1/uMZfUeeBVrjpAiVG1gLO1hVnevDgUuFsKd8J4gYlAAP4mlvfbg jaaM96F18qpZahNmvdNuP8IiTDuwpE384BkZuUje93/82NLiJXFBUzsD/iGIoPBQb0Z9 fTIPfjifgub9XmJqisHR2mU7CW923ef+N+mSXqQz/lNTs2Q3LUI3Pu6GbwrjOfB/Hs3K IHgw== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726842174; x=1727446974; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=haqJJSH7Rn3JmZPcpmzzJUrP0UWMAsXAyAbSCg67jyU=; b=qF4soYdkrhKeq42lEGEIM0YQo6XTiARYh4yzxF4iDJR8D72Kc17sQv39NdTLTq5OnL MU5uTsVkFfgF3oY6/Yqaw5BKmlA+sqJ9u5T0GXPGHrMS+Zyrq0FlPvTCFZgB2t0PkI/r 6fv2/hW34YqUcu7l8lXAnEGpCWo4b0hKEZl2ljy2x06rd3tAVlyJWj3k3IGGi7ZTD16c AgAz3pK+Ebbwt8d0JGm1rsGj+DkoPVf8s3LfAVEVdQ24/b7UXJGMltRSrlErgKT5C6OB ZCF61WyQNs5NhCCmLrI9YLEreGMGyoB2pc/SFyFRe5fd0ysctmBOj8mpVelAYT2Di9vr vfNw== X-Gm-Message-State: AOJu0YzAEyrkDXGBJPilBCyFWPYqyM5OoFwCx+VcewWxKhCflb+mw+2r 2JJMFj2L3TlH+LF9JDMmiNHW9YEr2GBY3rl89Elw8TB4VyTOotbCSBswFF4mLewcFp5JVyVmFX4 CePnJgFL1iyl8x+ayazD4poyL4alVWvrp X-Google-Smtp-Source: AGHT+IGAwSU9wI9DxAArBWBk1/ryoCL9bwXhnVXpx8/PuGDF6tGxvP4vK5GtD9ku3woPILemV8nEkCiv2iSd53Z0Ia4= X-Received: by 2002:a17:90b:4a10:b0:2d8:7307:3f73 with SMTP id 98e67ed59e1d1-2dd80cf0c70mr3387381a91.39.1726842174038; Fri, 20 Sep 2024 07:22:54 -0700 (PDT) MIME-Version: 1.0 References: <2063277599.8366131.1726839123608@mail.yahoo.com> In-Reply-To: From: Rodrigo Jorge Date: Fri, 20 Sep 2024 11:22:17 -0300 Message-ID: Content-Type: multipart/alternative; boundary="0000000000005deb8b06228dc673" X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --0000000000005deb8b06228dc673 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Ok, more things were discovered. After I had a problem exactly at the "xargs -n 2872", I ran the xargs again with the "-t" flag to get the command, and noticed that the 2 missing files were exactly the 2 last ones on the command file list. grep -Il . 
"{ 2870 files }" ./apex/images/apex_ui/psd/apex_5_ui.ai ./apex/images/apex_ui/psd/apex-logo.ai Now if I run: [user@server folder]$ cat /tmp/cmd1 grep -Il . ./apex/images/apex_ui/psd/apex_5_ui.ai ./apex/images/apex_ui/psd= / apex-logo.ai ... "{ 2870 files }" [user@server folder]$ wc -c /tmp/cmd1 131049 /tmp/cmd1 [user@server folder]$ cat /tmp/cmd2 grep -Il . "{ 2870 files }" ./apex/images/apex_ui/psd/apex_5_ui.ai ./apex/images/apex_ui/psd/apex-logo.ai [user@server folder]$ wc -c /tmp/cmd2 131049 /tmp/cmd2 [user@server folder]$ sh /tmp/cmd1 | wc -l 1072 [user@server folder]$ sh /tmp/cmd2 | wc -l 1070 In other words, depending on the location on the command line where those 2 files are provided to grep, we will have a different result. Can I run those 2 grep commands with some sort of debug flag and send them back for analysis? The file list is exactly the same, just changing the file order. Thanks, Rodrigo On Fri, Sep 20, 2024 at 10:54=E2=80=AFAM Rodrigo Jorge wrote: > I could reproduce the same issue without xargs, so I think we can take it > out of the picture: > > [user@server folder]$ find -type f -not -path "./.patch_storage/*" -not > -name "tfa_setup" -print > /tmp/file.list > [user@server folder]$ wc -l /tmp/file.list > 37443 /tmp/file.list > > [user@server folder]$ cat /tmp/file.list | xargs -n 100 grep -Il '.' > > /tmp/list1.list > [user@server folder]$ wc -l /tmp/list1.list > 23405 /tmp/list1.list > > [user@server folder]$ grep -Il '.' $(cat /tmp/file.list) > /tmp/list2.lis= t > [user@server folder]$ wc -l /tmp/list2.list > 23403 /tmp/list2.list > > [user@server folder]$ diff /tmp/list1.list /tmp/list2.list > 12268,12269d12267 > < ./apex/images/apex_ui/psd/apex_5_ui.ai > < ./apex/images/apex_ui/psd/apex-logo.ai > [user@server folder]$ > > So we can see that running *"grep -Il '.' $(cat /tmp/file.list)"* will > also skip those 2 files, unless the problem is actually bringing them, an= d > xargs are adding those 2 files somehow. 
> > Those files are PDFs: > > [user@server folder]$ file ./apex/images/apex_ui/psd/apex_5_ui.ai > ./apex/images/apex_ui/psd/apex_5_ui.ai: PDF document, version 1.5 > [user@server folder]$ file ./apex/images/apex_ui/psd/apex-logo.ai > ./apex/images/apex_ui/psd/apex-logo.ai: PDF document, version 1.5 > > [user@server folder]$ head ./apex/images/apex_ui/psd/apex_5_ui.ai > %=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD1.5 > <>stream8 0 R 209 0 R]/ON[6 0 R = 7 > 0 R 210 0 R]/Order 211 0 R/RBGroups[]>>/OCGs[6 0 R 7 0 R 5 0 R 208 0 R 21= 0 > 0 R 209 0 R]>>/Pages 3 0 R/Type/Catalog>> > > 66.145661, 2012/02/06-14:56:27 "> > > xmlns:dc=3D"http://purl.org/dc/elements/1.1/"> > application/pdf > > > > I could also find exactly the point it breaks: > > [user@server folder]$ cat /tmp/file.list | xargs -n 100 grep -Il '.' | wc > -l > 23405 > [user@server folder]$ cat /tmp/file.list | xargs -n 1000 grep -Il '.' | > wc -l > 23405 > [user@server folder]$ cat /tmp/file.list | xargs -n 2000 grep -Il '.' | > wc -l > 23405 > [user@server folder]$ cat /tmp/file.list | xargs -n 2871 grep -Il '.' | > wc -l > 23405 > [user@server folder]$ cat /tmp/file.list | xargs -n 2872 grep -Il '.' | > wc -l > 23403 > > I will reply shortly with the strace findings. > > On Fri, Sep 20, 2024 at 10:32=E2=80=AFAM David G. Pickett > wrote: > >> While the output may be bulky, on Linux you can try the strace command t= o >> see exactly what it is up to. It will show the execvp() call, for >> instance. You might need a bigger -s! >> >> $ strace -f -v -s 262144 >> >> On Thursday, September 19, 2024 at 10:29:30 AM EDT, Rodrigo Jorge < >> rodrigoaraujorge@gmail.com> wrote: >> >> >> Hello. I'm trying to use grep to get the list of all non-binary files in= a >> given folder. I tried with the 2.20 and the 3.11 release. >> >> For some reason, grep is providing 2 false negatives when the list is >> huge. >> This issue does not happen if I break the grep input with "xargs -n X". 
>> >> Check below: >> >> [opc@oradiff-core dbhome_1]$ grep -V >> grep (GNU grep) 3.11 >> Copyright (C) 2023 Free Software Foundation, Inc. >> License GPLv3+: GNU GPL version 3 or later < >> https://gnu.org/licenses/gpl.html>. >> This is free software: you are free to change and redistribute it. >> There is NO WARRANTY, to the extent permitted by law. >> >> Written by Mike Haertel and others; see >> . >> >> [opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*= " >> -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 -n 100 gre= p >> -Il '.' > /tmp/list1.list >> >> [opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*= " >> -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 grep -Il '= .' >> > /tmp/list2.list >> >> [opc@oradiff-core dbhome_1]$ diff /tmp/list1.list /tmp/list2.list >> 12268,12269d12267 >> < ./apex/images/apex_ui/psd/apex_5_ui.ai >> < ./apex/images/apex_ui/psd/apex-logo.ai >> >> [opc@oradiff-core dbhome_1]$ wc -l /tmp/list1.list /tmp/list2.list >> 23397 /tmp/list1.list >> 23395 /tmp/list2.list >> 46792 total >> >> The output should not show any difference. >> >> The same issue was also reproduced in grep 2.20. >> >> Thanks, >> Rodrigo >> > --0000000000005deb8b06228dc673 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000005deb8b06228dc673-- From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: jackson@fastmail.com Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sat, 21 Sep 2024 03:35:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: "Rodrigo Jorge" , "David G. Pickett" Cc: "73360@debbugs.gnu.org" <73360@debbugs.gnu.org> Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.172688964932108 (code B ref 73360); Sat, 21 Sep 2024 03:35:02 +0000 Received: (at 73360) by debbugs.gnu.org; 21 Sep 2024 03:34:09 +0000 Received: from localhost ([127.0.0.1]:36848 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srqsq-0008Lo-PT for submit@debbugs.gnu.org; Fri, 20 Sep 2024 23:34:09 -0400 Received: from fhigh7-smtp.messagingengine.com ([103.168.172.158]:35771) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srqso-0008LG-Kv for 73360@debbugs.gnu.org; Fri, 20 Sep 2024 23:34:07 -0400 Received: from phl-compute-08.internal (phl-compute-08.phl.internal [10.202.2.48]) by mailfhigh.phl.internal (Postfix) with ESMTP id 937A811401C5; Fri, 20 Sep 2024 23:33:40 -0400 (EDT) Received: from phl-imap-13 ([10.202.2.103]) by phl-compute-08.internal (MEProxy); Fri, 20 Sep 2024 23:33:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fastmail.com; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1726889620; x=1726976020; bh=f77EfSlrR7JkuSIXrSAridM4rzL0buy6W+PJIl3wIEo=; b= BlLwcosA5x/m0P32hSmyd/yjId5CvV65iC20F5qmdvL19rH3Vv99UteEh7gVf6s/ Davh9L3XHbf4+ey5i59O7CnUMqUnmpymQY6rgxFb1cctsIBIAv2KLL8kMUpwdtjz 
74wwvWPbX1G3LJ/eKEVj2PEhr9hDPjbdL+u8cVp/lRzKw3cmeW0LFYYYWHpyj5yP 0Tp362M0QTh5YK841VHgXKrpSMI9BBmOXm1H7zFuVK1l6wdcp0fggqh8TItE68gx 8JoUxghrfYpjTgZHsOSyVUnQNG9rbwUY3SY2HtHnNrr9z9XY49xHRWJKuEQUw/Kd KRz5uYIWz95Q3ubgDahW6w== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1726889620; x= 1726976020; bh=f77EfSlrR7JkuSIXrSAridM4rzL0buy6W+PJIl3wIEo=; b=J aHyXjkxvnMDsmGoBJM4UQYdy53AgJgX6z5hM+FEH55+3d8M/dccd5raRV0hLb/Xt h2rZI9ikwWtrO1WtZzwHOpTeaUiLdP6xUxYyZvrw8jo4EilLQrpVszsFze+5AUER 8wusDL1AEHZTNFZlhdtGyFfyMYfgzsmPNXbXMnatiHI3CMA5ZMpPfHniEaHONK0e 7tVm+QsYXOGT3OK1AzKe4USFXci/ryFB5A0QNbONaArXLWjnujQQmmznihYxfhAI hsNWk0F4MsoTjCEdlxxPUh5pw6n3o54L4k73tuz9jVNukPSYKPvAOcwbMG4NH2Nn I5qPICPJiH7AhCn6ZFvoQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrudelgedgjedvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnecujfgurhepofggfffhvfevkfgjfhfutgfgsehtjeertdertddt necuhfhrohhmpehjrggtkhhsohhnsehfrghsthhmrghilhdrtghomhenucggtffrrghtth gvrhhnpeekgfegjeefgeekudelhfejueefgfeufefhffelgfefkedutdetfeeivdefgfef keenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehjrg gtkhhsohhnsehfrghsthhmrghilhdrtghomhdpnhgspghrtghpthhtohepfedpmhhouggv pehsmhhtphhouhhtpdhrtghpthhtohepughgphhitghkvghtthesrgholhdrtghomhdprh gtphhtthhopeejfeefiedtseguvggssghughhsrdhgnhhurdhorhhgpdhrtghpthhtohep rhhoughrihhgohgrrhgruhhjohhrghgvsehgmhgrihhlrdgtohhm X-ME-Proxy: Feedback-ID: i982440cf:Fastmail Received: by mailuser.phl.internal (Postfix, from userid 501) id 3B4341F00072; Fri, 20 Sep 2024 23:33:40 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface 
MIME-Version: 1.0 Date: Fri, 20 Sep 2024 22:31:30 -0500 From: jackson@fastmail.com Message-Id: In-Reply-To: References: <2063277599.8366131.1726839123608@mail.yahoo.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Spam-Score: 0.3 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

Rodrigo wrote:
>> [user@server folder]$ cat /tmp/file.list | xargs -n 2872 grep -Il '.' | wc -l
>> 23403

Since this problem is reproduced using that particular /tmp/file.list,
if that file.list does not contain any confidential information and you
choose to let all of us see it, then any of us should be able to easily
reproduce this problem.

-- 
Paul Jackson
jackson@fastmail.fm

From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sat, 21 Sep 2024 05:43:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Rodrigo Jorge , "David G.
Pickett" Cc: "73360@debbugs.gnu.org" <73360@debbugs.gnu.org> Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.172689733027891 (code B ref 73360); Sat, 21 Sep 2024 05:43:02 +0000 Received: (at 73360) by debbugs.gnu.org; 21 Sep 2024 05:42:10 +0000 Received: from localhost ([127.0.0.1]:36906 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srssj-0007Fn-GL for submit@debbugs.gnu.org; Sat, 21 Sep 2024 01:42:09 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:54984) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srssh-0007FH-BV for 73360@debbugs.gnu.org; Sat, 21 Sep 2024 01:42:08 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 8887B3C011BD9; Fri, 20 Sep 2024 22:41:41 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP id cC643WjUphGV; Fri, 20 Sep 2024 22:41:41 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 3C4C33C011BDC; Fri, 20 Sep 2024 22:41:41 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu 3C4C33C011BDC DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1726897301; bh=ZGQ02q0C5xyQt/OQaQgloFpyJNGLaWC2iLU7Wc7P4tM=; h=Message-ID:Date:MIME-Version:To:From; b=YLAxyt7eIDSLy7ya37eZE7XP5C4gyWfrW888e3TA8FVlrufm2eCdeUswwh5adlFrv jHkjqFM9NdcRfk+/1PWhkcAC9cAfGzHlkY3Ycmxg72+iOcBA2xWGSOGrXVYGql4nq5 reFymwRoUZgfSILC9HdIbd/QDPNB0kXLe+mtD4J5NfTSe8kXFCvq/Mznb84Ccqh/Mt 31zsZFgiyfDF9DpS1an7y8vcsSmewGMNMxat8MqLS4li5wv1WG9mgYDlH54ZfC24Tu FUDD8Vn0MBrfJl9iaK3QkJ7PjCchThuQnDITCBq8iwdluZpGmdW2RN7aJokGDTgXXz U4S9FkZHRoSnw== X-Virus-Scanned: amavis at mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id qiCd6oXlcFcE; Fri, 20 Sep 2024 22:41:41 -0700 (PDT) Received: from 
[192.168.254.12] (unknown [47.150.137.250]) by mail.cs.ucla.edu (Postfix) with ESMTPSA id 1E85F3C011BD9; Fri, 20 Sep 2024 22:41:41 -0700 (PDT) Message-ID: <47257c29-9381-4765-8507-e897e44fae52@cs.ucla.edu> Date: Fri, 20 Sep 2024 22:41:40 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird References: <2063277599.8366131.1726839123608@mail.yahoo.com> Content-Language: en-US From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

On 2024-09-20 07:22, Rodrigo Jorge wrote:

> Can I run those 2 grep commands with some sort of debug flag and send them
> back for analysis? The file list is exactly the same, just changing the
> file order.

Unfortunately there's no debug flag. Of course you can run grep under
GDB but it will require some expertise to puzzle out why the last two
files are treated differently.

Do you see the same problem if you run in the C locale? That is, set
LC_ALL="C" in the environment.

What does 'strace' say about grep's reading of the two files in
question? Can you give the strace output for just those two files?

I have the sneaking suspicion that the script is assuming properties of
'grep' that are not documented and that are not guaranteed. grep -I's
heuristic for determining whether a file is "binary" is designed for
that particular grep run, and does not necessarily agree with what other
programs think are "binary files", or even what other instances of
'grep' think are "binary files". The strace output might help clear up
whether this is what is happening.
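
The `grep -I` behaviour discussed above can be seen in isolation with a minimal sketch. It assumes GNU grep and a POSIX shell; the scratch directory, the file names `text.txt` and `nul.dat`, and the helper `list_text_files` are hypothetical and not taken from the report. It only shows that a file containing a NUL byte is treated as binary and left out of the `-l` listing, run in the C locale as suggested above. It does not reproduce the order-dependent result described in this thread.

```sh
#!/bin/sh
# Hypothetical scratch files; none of these names come from the report.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir"

printf 'plain text\n' > text.txt        # no NUL bytes
printf 'text\0more text\n' > nul.dat    # NUL byte near the start

# List files that grep does not classify as binary.
# -I treats binary files as non-matching, -l prints only file names,
# and LC_ALL=C rules out locale-dependent encoding checks.
list_text_files() {
    LC_ALL=C grep -Il . "$@"
}

# With GNU grep this lists only text.txt.
list_text_files text.txt nul.dat
```

Swapping the argument order in the last line is a quick way to check whether the listing changes on a given system.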
From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: "David G. Pickett" Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sat, 21 Sep 2024 19:14:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Rodrigo Jorge , Paul Eggert Cc: "73360@debbugs.gnu.org" <73360@debbugs.gnu.org> Received: via spool by 73360-submit@debbugs.gnu.org id=B73360.17269460187829 (code B ref 73360); Sat, 21 Sep 2024 19:14:01 +0000 Received: (at 73360) by debbugs.gnu.org; 21 Sep 2024 19:13:38 +0000 Received: from localhost ([127.0.0.1]:40328 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ss5Y1-00022C-W4 for submit@debbugs.gnu.org; Sat, 21 Sep 2024 15:13:38 -0400 Received: from sonic318-20.consmr.mail.gq1.yahoo.com ([98.137.70.146]:40927) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ss5Xy-00021v-QA for 73360@debbugs.gnu.org; Sat, 21 Sep 2024 15:13:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aol.com; s=a2048; t=1726945987; bh=muDvAZks5Odbh0lfvxyw4kIpVpGJYLiQlxLlIxisQiI=; h=Date:From:To:Cc:In-Reply-To:References:Subject:From:Subject:Reply-To; b=boOEmMYClUJgtaAH5ZlnR4J4uOo28d9jtOHzIUohb68yuxnKUVA605luRXXXq8GrBd/jAgPAfA3T7BS1ODoBM8qkXEaF9HR3q4imwaauvSXr/7AOkukMlPR+gRdLFD/q73YPuldOF5yymZBKBr8H0NaasP2PeaxDb86GLMcmxXwg9ePsIB1O/FtsrAp8RWa5uL9dZi2+RDiPiCDzMz9apw4nDvL1ImDs71eov7TWIyOvc3GNxRdC2HYi7X82guQNbiFGCcLZXjP8o82weUDAQPbJB64x0+A6sbUAayQB4lJQt9Hy6XDvoy9oHg07bFM8mOavbVQTOL8AnUOPqNOzTQ== X-SONIC-DKIM-SIGN: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1726945987; bh=p8B8Pr12BwJViscNIG7Mb1NcxiPnPk/K+OGX0U3Ydu6=; h=X-Sonic-MF:Date:From:To:Subject:From:Subject; 
b=kJ7vSZSI6R14nrkW4wcucuG3MPlDUGixJcJG9rDuGR6zt465USXORse16w9tIHpnZ4kV5PPAMT6jZsr8L131F5lU19kZ4OxUsJRzAHdiQJ0o9qbiMeGBSYUpNXqz+OZJLiGFhmCA0T6q4VU6ai05A6QKGefrMx8vx1N84JIFWxqT1EXLZ331RW9tq4bt6Ew/xBnNxSYh/hsd9u6u92KhEuhcsg6ecDMON//ASMK5Sx8/DUcZpkjylDlDPLTJf72gENnk3Fxjwwg6s3Ojp5dtp7q9m5W+PVRykikTZOCkDbrwN0RGIQEfQJMCOkT2AXaSHGF2za9cxxQRHZeo9I6nYQ== X-YMail-OSG: 5cid.GAVM1l4NV7TyZhH9Zy9l6LEZmOSrlu_8ys_wRtVN1mK4v0VhnShpjRMjb7 6_bsY4clK8vSY_M2VlAc_WWq._l9emofTF.G_c0F6q5uoQFDLJbUOVocoaFfZcKs9o3YWSlTXEuF 9FZ9NhqDNhMpMOK..QVvnm.mBefmVfiDtXotPL2xgizMe2RI7tbSF_6mnJ72WuoTVZLTV22V3.Oc ajO2jBtaWC9oU.krh_8QGl2DBMsb8zfIkCgfJEgq6gQquiJjZSjnmIXMkdq1kU5hqcRN9VH2PZYV Ao54p.QPExYGwxNdzJVefslPM.GffzVsmQgkRogbxkqadwDbfn2Zb8k0jOkorCd1YjeFUm.smOHM p2aUlmf0YQ6Y356.FZhrm7Cn59kQ32Z_wSi37eskuC6b_fuX_YcvLuxGxA5.lujpOzKW4Qy23SNl Gg1.1juSa0stMnUjyE4LNDXdQhZ_EuZmBWDi12Txx_mvXmyvEH24oVJvijcpn5MwTf77nf16SEIc BB8LV3cW8PwhaeeeJi_pNlSHZybuwXsIbNV0WL.XB4hvJThQp98M1vmopbylq6_6i18c1Tw4pYjF H8z.38jowy5NQKaSquGi0hX7U73Q1MaZxRUt3ec1VIGXwHkaclFyhJkMvYG4n5ksMc9lDJbItwU3 fA9SJObi0ti3a9FvmZtVn97WegKMbGjodfA9NxBM_Fkh7YKN1fxQvkOahqusOUv7U2ED7TdJR0Hl dG3BJT8yhq5owrordmkTOFY0hr7eM4EcJQPZ8x.VT4MpaB8OgHBbgFqC_vdS.nC6_h1Ab6odc.K7 cZIwH78f4PVPYw4Vd11rTjkFwGBQE94GiJyn97b8aMn6gZCMComy5ciJAv8w.k8p9DhUqhgpeBYJ NCJbdVaMxaYsCiV.nF.swNkYsEhoLQwOf.xV.k0m6WSjuHgy_4bhJWVf8yCAyMKudJ8Q.L_F2qB5 4RfCAcNG135bvPO2hdhyWLOrepI_CPn1EpHd3CPMk7ThjfgFEjUNivGXK1IcAb.b2cxAVZMKft56 eDvnmNxnIXx.OVXSMqL0ybQOUay9X2l2tt3RF_CyEDM5BpPD3l1eZteLAthnlOFkikLSLl9eXvqD UjxGc4AZJ4gf_1HPx8n8Itj0GF5Nn.2HrkKTARAYh61uNONTqkQm7LQkMzpV59MOLqfWFE6e8q9A bUGgLRPYghl.kEr9ZmpjrRNj19nSlFKNbprSUVWDY6zyUvl2.W41KlODC28ceF2jvRGa7ZvHh_LX 7kKjgmDZpOu9vYS7aQiP5DiBNt86bkZdbrwmERCpTl5TBdtQL7srJ7rOBZUWvcg_t5uy7UbT_Ro7 SyHRrqQ7JIdj5OPHCVynEl4r4GUE6S2D4Gqiv5CP3ulDSj9Q0NbdIMFv1jdOJZnfLTpgXop8gIga u.bF6vMmz34S9DPfjwTZ4gGuUZBsXvdggnUvjOdRDlxKfUcMGisShqjvkjLCa941eoQ5RUwxBSUZ cD2yCkb9RvMPVFYXT76A64oD4lJ9CZtTOkxsurf3jEFL2eooze87zS5LZnmo_WiRPgsqiSoKkzQk 
Lh1TyOoHgJfcNgKU48fb51eiS2Xz42k52Tp559wsb24FIZv6tvTRgz6txJZ8y69D.czZbOGHecTq HyiO.wTJsqh_iyTuye4teomoMbXd0RYi7yfdN8Bt1k189LKGVfAJro_tzWeNbTPfkzbCGb2kB1P9 j4Mtdr4xywir3a9OnhaWitQxGJ1hJbmAO7T232fJQox8WMCz1jOLG4NRZBwEe5SsRLw1jo_9mlWQ yEvpVwjfDm2hheaYx05u4NWcfYZHxsk33oUcnVuDcKg_0Sl2RKrICIWQrZohSGkK3DI6sHSOrSWb CwDNU08fGKqXSGvsKAyjhzHL7rSfUSlSxJiEpdGe0p9Fs3l9lspZwvmwt3p6ZVk6HyddAoFQYFhE 9_8j3nJakoRxNB6ILn5b7G8XkQz3.v8lGZY3Ss5P8BzqxkIuqOzlvdi1IrnZwwBJq1NZ33KqIdRk HUDzwKuPrKqAZyOc3Wm7HDcgh1Pm.MNYO5VQ.GtnBeERkFvT9x4eUmzlvxJ3FurLlrIqvsFGV6uF LPD863g7Dl9VHYjB93B8VJm18hghQCi43GRIu4GAG77BDigkM3WC6yu4nN7gYfByAoH2qohTt3i0 NwwOeYtVqYTvRbbx54Kbr.MR6dLKwilk_J93F8fnWRK0H3wM400JVBlq17dFICJBj9s7UeZDGY_9 fuhcgyr6sqYgHqEhRHvWKjLnKmJ.sOCG1RGRD X-Sonic-MF: X-Sonic-ID: a7542b55-4543-4ae2-b0ad-29cfc27863a8 Received: from sonic.gate.mail.ne1.yahoo.com by sonic318.consmr.mail.gq1.yahoo.com with HTTP; Sat, 21 Sep 2024 19:13:07 +0000 Date: Sat, 21 Sep 2024 19:13:03 +0000 (UTC) From: "David G. Pickett" Message-ID: <4871111.8727936.1726945983631@mail.yahoo.com> In-Reply-To: <47257c29-9381-4765-8507-e897e44fae52@cs.ucla.edu> References: <2063277599.8366131.1726839123608@mail.yahoo.com> <47257c29-9381-4765-8507-e897e44fae52@cs.ucla.edu> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_8727935_1350945114.1726945983630" X-Mailer: WebService/1.1.22645 AolMailNorrin Content-Length: 4965 X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) ------=_Part_8727935_1350945114.1726945983630 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Linux strace (like Solaris truss) is a bit less confusing than gdb, and does not need assistance from a symbol preserving compile option -g and lack of strip. It can even start tracing running processes for which you have no source code.

On Saturday, September 21, 2024 at 01:41:42 AM EDT, Paul Eggert <eggert@cs.ucla.edu> wrote:

On 2024-09-20 07:22, Rodrigo Jorge wrote:
> Can I run those 2 grep commands with some sort of debug flag and send them
> back for analysis? The file list is exactly the same, just changing the
> file order.

Unfortunately there's no debug flag. Of course you can run grep under GDB but it will require some expertise to puzzle out why the last two files are treated differently.

Do you see the same problem if you run in the C locale? That is, set LC_ALL="C" in the environment.

What does 'strace' say about grep's reading of the two files in question? Can you give the strace output for just those two files?

I have the sneaking suspicion that the script is assuming properties of 'grep' that are not documented and that are not guaranteed. grep -I's heuristic for determining whether a file is "binary" is designed for that particular grep run, and does not necessarily agree with what other programs think are "binary files", or even what other instances of 'grep' think are "binary files". The strace output might help clear up whether this is what is happening.

------=_Part_8727935_1350945114.1726945983630 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
------=_Part_8727935_1350945114.1726945983630-- From unknown Sat Aug 16 19:17:21 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Rodrigo Jorge Subject: bug#73360: closed (Re: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option) Message-ID: References: <3acf1f78-7ac4-4391-8d68-f8683730b085@cs.ucla.edu> X-Gnu-PR-Message: they-closed 73360 X-Gnu-PR-Package: grep Reply-To: 73360@debbugs.gnu.org Date: Sun, 22 Sep 2024 06:41:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1726987262-22155-1" This is a multi-part message in MIME format... ------------=_1726987262-22155-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Your bug report

#73360: Error when a long list is provided to grep with "--binary-files=without-match" option

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 73360@debbugs.gnu.org.

-- 
73360: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=73360
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1726987262-22155-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 73360-done) by debbugs.gnu.org; 22 Sep 2024 06:40:10 +0000 Received: from localhost ([127.0.0.1]:40724 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ssGGP-0005br-NS for submit@debbugs.gnu.org; Sun, 22 Sep 2024 02:40:10 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:41134) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ssGGN-0005bL-4G for 73360-done@debbugs.gnu.org; Sun, 22 Sep 2024 02:40:08 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 776E33C013279; Sat, 21 Sep 2024 23:39:39 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP id 8I-P60ITfQpK; Sat, 21 Sep 2024 23:39:38 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id C791C3C00FB31; Sat, 21 Sep 2024 23:39:38 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu C791C3C00FB31 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1726987178; bh=5Rty6kwCdkJ1peLi1WrpjdK+BTCiKL0rZugZmtKKIJI=; h=Message-ID:Date:MIME-Version:From:To; b=Tygk0Faa2GtofUo3z3Q+ToYCAcDx0EcX4bnnSaV/VN+6taeJN0De8zch90brYGTrh COG+sKXh1Sy93N21Y/5Rj8x/UJ8KAj4WBzoKtLOyHn1VUpn0EeDlYBSF0UH+usT3Pv 7crI1ED5AFwF6b7t7hfyulilzXT6u4fFyzlc+f6RLxlbInsd+evHsQhUqGFcrLS8wD OezqMQZWbSKORveSempqai5xWUxMYOtxdJeoItTezQ0SI6YUcFk8p4Ro8NrLe4Qg4p PP7vU/dr6YuJOTMwm9VHm2Urc2WAxRKITJhdiMDsh1SBZDn/qftaypcir6heIWBqyp jCZ0WEDbE/oqA== X-Virus-Scanned: amavis at mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with
ESMTP id TpBIKwYJhqKv; Sat, 21 Sep 2024 23:39:38 -0700 (PDT) Received: from [192.168.254.12] (unknown [47.150.137.250]) by mail.cs.ucla.edu (Postfix) with ESMTPSA id 9A6183C013279; Sat, 21 Sep 2024 23:39:38 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------hjc01NkfAtXYkZm5HHMS3mpq" Message-ID: <3acf1f78-7ac4-4391-8d68-f8683730b085@cs.ucla.edu> Date: Sat, 21 Sep 2024 23:39:38 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option From: Paul Eggert To: Rodrigo Jorge References: <2063277599.8366131.1726839123608@mail.yahoo.com> <47257c29-9381-4765-8507-e897e44fae52@cs.ucla.edu> Content-Language: en-US Organization: UCLA Computer Science Department In-Reply-To: <47257c29-9381-4765-8507-e897e44fae52@cs.ucla.edu> X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 73360-done Cc: 73360-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) This is a multi-part message in MIME format. --------------hjc01NkfAtXYkZm5HHMS3mpq Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 2024-09-20 22:41, Paul Eggert wrote: > I have the sneaking suspicion that the script is assuming properties of > 'grep' that are not documented and that are not guaranteed. In looking into the code a bit more, I can see some places where that is what is happening. A couple of things. First, grep 3.11 uses buffer sizes that depend on earlier files that it has scanned, and this affects whether grep decides later files are binary. This can lead to the sort of confusion that you mentioned. 
There are performance reasons to think that grep should not grow buffer sizes for later files merely because earlier files had very long lines, as huge buffers can hurt performance; so I installed onto the development repository on Savannah the first attached patch to fix that. As a side effect this may fix the symptoms you observed. Second, 'grep' is not a good tool for determining whether a file is text or binary, since the definition of "text" vs "binary" is application-specific and grep's definition is suitable for 'grep' and it's problematic to use it elsewhere. I installed the second attached patch to try to document this better. Hope this helps. Boldly closing this bug as fixed; if I'm wrong we can reopen it. --------------hjc01NkfAtXYkZm5HHMS3mpq Content-Type: text/x-patch; charset=UTF-8; name="0001-grep-avoid-huge-reads.patch" Content-Disposition: attachment; filename="0001-grep-avoid-huge-reads.patch" Content-Transfer-Encoding: base64 RnJvbSA4ZmIxNWZiNWJmZjM1ZmYwNjljZTEwOGIyZmY5ODdkYzcxODNkZTM3IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTYXQsIDIxIFNlcCAyMDI0IDEyOjMyOjUzIC0wNzAwClN1YmplY3Q6IFtQQVRD SCAxLzJdIGdyZXA6IGF2b2lkIGh1Z2UgcmVhZHMKClRoZSBwcmV2aW91cyBjb2RlIGNvdWxk IGNhbGwgJ3JlYWQnIHdpdGggYSBuZWFybHkgdW5ib3VuZGVkIHNpemUKaWYgdGhlIGlucHV0 IGhhZCBsb25nIGxpbmVzLCBhbmQgdGhpcyB1bmJvdW5kZWQgc2l6ZSBwZXJzaXN0ZWQKZnJv bSBvbmUgZmlsZSB0byB0aGUgbmV4dCBvbmNlIHRoZSBpbnB1dCBidWZmZXIgZ3Jldy4KVGhp cyBjb3VsZCBoYXZlIGJhZCBlZmZlY3RzIG9uIHRoZSBDUFUncyBkYXRhIGNhY2hlLAphbmQg YWxzbyBjb3VsZCBjYXVzZSAnZ3JlcCcgdG8gbWFrZSBjb3VudGVyaW50dWl0aXZlIGRlY2lz aW9ucyBhcwp0byB3aGV0aGVyIGEgZmlsZSBpcyBiaW5hcnkgPGh0dHBzOi8vYnVncy5nbnUu b3JnLzczMzYwPi4KSW5zdGVhZCwgcGljayBhIGdvb2QgcmVhZCBzaXplIGFuZCBzdGljayB3 aXRoIGl0OyB0aGlzIGlzCm1vcmUgY29uc2lzdGVudCwgYW5kIG1vcmUgbGlrZWx5IHRvIGZp dCBpbiBhIGNhY2hlLgoqIHNyYy9ncmVwLmMgKGdvb2RfcmVhZHNpemUpOiBOZXcgc3RhdGlj IHZhci4KKEdPT0RfUkVBRFNJWkVfTUlOKTogUmVuYW1lIGZyb20gSU5JVElBTF9CVUZTSVpF 
LiAgQWxsIHVzZXMgY2hhbmdlZC4KKGZpbGxidWYpOiBSZWFkIGdvb2RfcmVhZHNpemUgYnl0 ZXMgcmF0aGVyIHRoYW4gdHJ5aW5nIHRvCmZpbGwgdGhlIHJlc3Qgb2YgdGhlIGlucHV0IGJ1 ZmZlci4KKGRyYWluX2lucHV0KTogUmVhZCBnb29kX3JlYWRzaXplIHJhdGhlciB0aGFuIEdP T0RfUkVBRFNJWkVfTUlOCmJ5dGVzLgoobWFpbik6IEluaXRpYWxpemUgZ29vZF9yZWFkc2l6 ZS4KLS0tCiBzcmMvZ3JlcC5jIHwgMzIgKysrKysrKysrKysrKysrKysrKy0tLS0tLS0tLS0t LS0KIDEgZmlsZSBjaGFuZ2VkLCAxOSBpbnNlcnRpb25zKCspLCAxMyBkZWxldGlvbnMoLSkK CmRpZmYgLS1naXQgYS9zcmMvZ3JlcC5jIGIvc3JjL2dyZXAuYwppbmRleCA0NTkyYjIwLi45 MTJiY2U0IDEwMDY0NAotLS0gYS9zcmMvZ3JlcC5jCisrKyBiL3NyYy9ncmVwLmMKQEAgLTg3 Miw2ICs4NzIsNyBAQCBzdGF0aWMgaW50IGJ1ZmRlc2M7CQkvKiBGaWxlIGRlc2NyaXB0b3Iu ICovCiBzdGF0aWMgY2hhciAqYnVmYmVnOwkJLyogQmVnaW5uaW5nIG9mIHVzZXItdmlzaWJs ZSBzdHVmZi4gKi8KIHN0YXRpYyBjaGFyICpidWZsaW07CQkvKiBMaW1pdCBvZiB1c2VyLXZp c2libGUgc3R1ZmYuICovCiBzdGF0aWMgaWR4X3QgcGFnZXNpemU7CQkvKiBhbGlnbm1lbnQg b2YgbWVtb3J5IHBhZ2VzICovCitzdGF0aWMgaWR4X3QgZ29vZF9yZWFkc2l6ZTsJLyogZ29v ZCBzaXplIHRvIHBhc3MgdG8gJ3JlYWQnICovCiBzdGF0aWMgb2ZmX3QgYnVmb2Zmc2V0OwkJ LyogUmVhZCBvZmZzZXQuICAqLwogc3RhdGljIG9mZl90IGFmdGVyX2xhc3RfbWF0Y2g7CS8q IFBvaW50ZXIgYWZ0ZXIgbGFzdCBtYXRjaGluZyBsaW5lIHRoYXQKICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgd291bGQgaGF2ZSBiZWVuIG91dHB1dCBpZiB3ZSB3ZXJl CkBAIC04ODAsOCArODgxLDE1IEBAIHN0YXRpYyBib29sIHNraXBfbnVsczsJCS8qIFNraXAg J1wwJyBpbiBkYXRhLiAgKi8KIHN0YXRpYyBib29sIHNraXBfZW1wdHlfbGluZXM7CS8qIFNr aXAgZW1wdHkgbGluZXMgaW4gZGF0YS4gICovCiBzdGF0aWMgaW50bWF4X3QgdG90YWxubDsJ LyogVG90YWwgbmV3bGluZSBjb3VudCBiZWZvcmUgbGFzdG5sLiAqLwogCi0vKiBJbml0aWFs IGJ1ZmZlciBzaXplLCBub3QgY291bnRpbmcgc2xvcC4gKi8KLWVudW0geyBJTklUSUFMX0JV RlNJWkUgPSA5NiAqIDEwMjQgfTsKKy8qIE1pbmltdW0gdmFsdWUgZm9yIGdvb2RfcmVhZHNp emUuCisgICBJZiBpdCdzIHRvbyBzbWFsbCwgdGhlcmUgYXJlIG1vcmUgc3lzY2FsbHM7Cisg ICBpZiB0b28gbGFyZ2UsIGl0IHdhc3RlcyBtZW1vcnkgYW5kIGxpa2VseSBjYWNoZS4KKyAg IFVzZSA5NiBLaUIgYXMgaXQgZ2F2ZSBnb29kIHJlc3VsdHMgaW4gYSBiZW5jaG1hcmsgaW4g MjAxOAorICAgKHNlZSAyMDE4LTA5LTA2IGNvbW1pdCBsYWJlbGVkICJncmVwOiB0cmlwbGUg 
aW5pdGlhbCBidWZmZXIgc2l6ZTogMzJrLT45NmsiKQorICAgZXZlbiB0aG91Z2ggdGhlIHNh bWUgYmVuY2htYXJrIGluIDIwMjQgZm91bmQgbm8gc2lnbmlmaWNhbnQKKyAgIGRpZmZlcmVu Y2UgZm9yIHZhbHVlcyBmcm9tIDMyIEtpQiB0byAxMDI0IEtpQiBvbiBVYnVudHUgMjQuMDQu MSBMVFMKKyAgIHdpdGggYW4gSW50ZWwgWGVvbiBXLTEzNTAuICAqLworZW51bSB7IEdPT0Rf UkVBRFNJWkVfTUlOID0gOTYgKiAxMDI0IH07CiAKIC8qIFJldHVybiBWQUwgYWxpZ25lZCB0 byB0aGUgbmV4dCBtdWx0aXBsZSBvZiBBTElHTk1FTlQuICBWQUwgY2FuIGJlCiAgICBhbiBp bnRlZ2VyIG9yIGEgcG9pbnRlci4gIEJvdGggYXJncyBtdXN0IGJlIGZyZWUgb2Ygc2lkZSBl ZmZlY3RzLiAgKi8KQEAgLTk0Niw5ICs5NTQsOSBAQCBmaWxsYnVmIChpZHhfdCBzYXZlLCBz dHJ1Y3Qgc3RhdCBjb25zdCAqc3QpCiB7CiAgIGNoYXIgKnJlYWRidWY7CiAKLSAgLyogQWZ0 ZXIgQlVGTElNLCB3ZSBuZWVkIHJvb20gZm9yIGF0IGxlYXN0IGEgcGFnZSBvZiBkYXRhIHBs dXMgYQorICAvKiBBZnRlciBCVUZMSU0sIHdlIG5lZWQgcm9vbSBmb3IgYSBnb29kLXNpemVk IHJlYWQgcGx1cyBhCiAgICAgIHRyYWlsaW5nIHV3b3JkLiAgKi8KLSAgaWR4X3QgbWluX2Fm dGVyX2J1ZmxpbSA9IHBhZ2VzaXplICsgdXdvcmRfc2l6ZTsKKyAgaWR4X3QgbWluX2FmdGVy X2J1ZmxpbSA9IGdvb2RfcmVhZHNpemUgKyB1d29yZF9zaXplOwogCiAgIGlmIChtaW5fYWZ0 ZXJfYnVmbGltIDw9IGJ1ZmZlciArIGJ1ZmFsbG9jIC0gYnVmbGltKQogICAgIHJlYWRidWYg PSBidWZsaW07CkBAIC05NTcsOCArOTY1LDggQEAgZmlsbGJ1ZiAoaWR4X3Qgc2F2ZSwgc3Ry dWN0IHN0YXQgY29uc3QgKnN0KQogICAgICAgY2hhciAqbmV3YnVmOwogCiAgICAgICAvKiBG b3IgZGF0YSB0byBiZSBzZWFyY2hlZCB3ZSBuZWVkIHJvb20gZm9yIHRoZSBzYXZlZCBieXRl cywKLSAgICAgICAgIHBsdXMgYXQgbGVhc3QgYSBwYWdlIG9mIGRhdGEgdG8gcmVhZC4gICov Ci0gICAgICBpZHhfdCBtaW5zaXplID0gc2F2ZSArIHBhZ2VzaXplOworICAgICAgICAgcGx1 cyBhdCBsZWFzdCBhIGdvb2Qtc2l6ZWQgcmVhZC4gICovCisgICAgICBpZHhfdCBtaW5zaXpl ID0gc2F2ZSArIGdvb2RfcmVhZHNpemU7CiAKICAgICAgIC8qIEFkZCBlbm91Z2ggcm9vbSBz byB0aGF0IHRoZSBidWZmZXIgaXMgYWxpZ25lZCBhbmQgaGFzIHJvb20KICAgICAgICAgIGZv ciBieXRlIHNlbnRpbmVscyBmb3JlIGFuZCBhZnQsIGFuZCBzbyB0aGF0IGEgdXdvcmQgY2Fu CkBAIC0xMDAxLDE1ICsxMDA5LDEyIEBAIGZpbGxidWYgKGlkeF90IHNhdmUsIHN0cnVjdCBz dGF0IGNvbnN0ICpzdCkKIAogICBjbGVhcl9hc2FuX3BvaXNvbiAoKTsKIAotICBpZHhfdCBy ZWFkc2l6ZSA9IGJ1ZmZlciArIGJ1ZmFsbG9jIC0gdXdvcmRfc2l6ZSAtIHJlYWRidWY7Ci0g 
IHJlYWRzaXplIC09IHJlYWRzaXplICUgcGFnZXNpemU7Ci0KICAgcHRyZGlmZl90IGZpbGxz aXplOwogICBib29sIGNjID0gdHJ1ZTsKIAogICB3aGlsZSAodHJ1ZSkKICAgICB7Ci0gICAg ICBmaWxsc2l6ZSA9IHNhZmVfcmVhZCAoYnVmZGVzYywgcmVhZGJ1ZiwgcmVhZHNpemUpOwor ICAgICAgZmlsbHNpemUgPSBzYWZlX3JlYWQgKGJ1ZmRlc2MsIHJlYWRidWYsIGdvb2RfcmVh ZHNpemUpOwogICAgICAgaWYgKGZpbGxzaXplIDwgMCkKICAgICAgICAgewogICAgICAgICAg IGZpbGxzaXplID0gMDsKQEAgLTE3NjksMTIgKzE3NzQsMTIgQEAgZHJhaW5faW5wdXQgKGlu dCBmZCwgc3RydWN0IHN0YXQgY29uc3QgKnN0KQogI2lmZGVmIFNQTElDRV9GX01PVkUKICAg ICAgIC8qIFNob3VsZCBiZSBmYXN0ZXIsIHNpbmNlIGl0IG5lZWQgbm90IGNvcHkgZGF0YSB0 byB1c2VyIHNwYWNlLiAgKi8KICAgICAgIG5ieXRlcyA9IHNwbGljZSAoZmQsIG51bGxwdHIs IFNURE9VVF9GSUxFTk8sIG51bGxwdHIsCi0gICAgICAgICAgICAgICAgICAgICAgIElOSVRJ QUxfQlVGU0laRSwgU1BMSUNFX0ZfTU9WRSk7CisgICAgICAgICAgICAgICAgICAgICAgIGdv b2RfcmVhZHNpemUsIFNQTElDRV9GX01PVkUpOwogICAgICAgaWYgKDAgPD0gbmJ5dGVzIHx8 IGVycm5vICE9IEVJTlZBTCkKICAgICAgICAgewogICAgICAgICAgIHdoaWxlICgwIDwgbmJ5 dGVzKQogICAgICAgICAgICAgbmJ5dGVzID0gc3BsaWNlIChmZCwgbnVsbHB0ciwgU1RET1VU X0ZJTEVOTywgbnVsbHB0ciwKLSAgICAgICAgICAgICAgICAgICAgICAgICAgICAgSU5JVElB TF9CVUZTSVpFLCBTUExJQ0VfRl9NT1ZFKTsKKyAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgZ29vZF9yZWFkc2l6ZSwgU1BMSUNFX0ZfTU9WRSk7CiAgICAgICAgICAgcmV0dXJuIG5i eXRlcyA9PSAwOwogICAgICAgICB9CiAjZW5kaWYKQEAgLTI5OTQsNyArMjk5OSw4IEBAIG1h aW4gKGludCBhcmdjLCBjaGFyICoqYXJndikKICAgaWYgKCEgKDAgPCBwc2l6ZSAmJiBwc2l6 ZSA8PSAoSURYX01BWCAtIHV3b3JkX3NpemUpIC8gMikpCiAgICAgYWJvcnQgKCk7CiAgIHBh Z2VzaXplID0gcHNpemU7Ci0gIGJ1ZmFsbG9jID0gQUxJR05fVE8gKElOSVRJQUxfQlVGU0la RSwgcGFnZXNpemUpICsgcGFnZXNpemUgKyB1d29yZF9zaXplOworICBnb29kX3JlYWRzaXpl ID0gQUxJR05fVE8gKEdPT0RfUkVBRFNJWkVfTUlOLCBwYWdlc2l6ZSk7CisgIGJ1ZmFsbG9j ID0gZ29vZF9yZWFkc2l6ZSArIHBhZ2VzaXplICsgdXdvcmRfc2l6ZTsKICAgYnVmZmVyID0g eGltYWxsb2MgKGJ1ZmFsbG9jKTsKIAogICBpZiAoZnRzX29wdGlvbnMgJiBGVFNfTE9HSUNB TCAmJiBkZXZpY2VzID09IFJFQURfQ09NTUFORF9MSU5FX0RFVklDRVMpCi0tIAoyLjQzLjAK Cg== --------------hjc01NkfAtXYkZm5HHMS3mpq Content-Type: text/x-patch; charset=UTF-8; 
name="0002-doc-warn-re-using-grep-to-detect-binary-files.patch" Content-Disposition: attachment; filename="0002-doc-warn-re-using-grep-to-detect-binary-files.patch" Content-Transfer-Encoding: base64 RnJvbSA5NDRjMmVjY2M3ZTg4Mjk5YzQ3YWRhMWQzMTlmYmI1NzA1YmM3MTNkIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTYXQsIDIxIFNlcCAyMDI0IDIyOjM3OjMzIC0wNzAwClN1YmplY3Q6IFtQQVRD SCAyLzJdID0/VVRGLTg/cT9kb2M6PTIwd2Fybj0yMHJlPTIwdXNpbmc9MjA9RTI9ODA9OThn cmVwPz0KID0/VVRGLTg/cT89RTI9ODA9OTk9MjB0bz0yMGRldGVjdD0yMGJpbmFyeT0yMGZp bGVzPz0KTUlNRS1WZXJzaW9uOiAxLjAKQ29udGVudC1UeXBlOiB0ZXh0L3BsYWluOyBjaGFy c2V0PVVURi04CkNvbnRlbnQtVHJhbnNmZXItRW5jb2Rpbmc6IDhiaXQKClRoaXMgaXMgaW4g cmVzcG9uc2UgdG8gYSBidWcgcmVwb3J0IGJ5IFJvZHJpZ28gSm9yZ2UKPGh0dHBzOi8vYnVn cy5nbnUub3JnLzczMzYwPi4KKiBkb2MvZ3JlcC50ZXhpIChGaWxlIGFuZCBEaXJlY3Rvcnkg U2VsZWN0aW9uKToKV2FybiB0aGF0IOKAmGdyZXDigJkgc2hvdWxkbuKAmXQgYmUgdXNlZCB0 byBkZXRlcm1pbmUgd2hldGhlcgphIGZpbGUgaXMgYmluYXJ5IGZvciBvdGhlciBhcHBsaWNh dGlvbnPigJkgcHVycG9zZXMsIGFzCnRoZWlyIGRlZmluaXRpb24gb2Yg4oCcYmluYXJ54oCd IG1heSB3ZWxsIGRpZmZlci4KSW1wcm92ZSBkb2N1bWVudGF0aW9uIGZvciBkaXNjb3Zlcnkg b2YgbnVsbCBpbnB1dC4KLS0tCiBkb2MvZ3JlcC50ZXhpIHwgMTggKysrKysrKysrKysrKysr LS0tCiAxIGZpbGUgY2hhbmdlZCwgMTUgaW5zZXJ0aW9ucygrKSwgMyBkZWxldGlvbnMoLSkK CmRpZmYgLS1naXQgYS9kb2MvZ3JlcC50ZXhpIGIvZG9jL2dyZXAudGV4aQppbmRleCA3ZDEw Zjg2Li45YzQ2ZTc2IDEwMDY0NAotLS0gYS9kb2MvZ3JlcC50ZXhpCisrKyBiL2RvYy9ncmVw LnRleGkKQEAgLTYzMSw5ICs2MzEsMTEgQEAgV2hlbiBzb21lIG91dHB1dCBpcyBzdXBwcmVz c2VkLCBAY29tbWFuZHtncmVwfSBmb2xsb3dzIGFueSBvdXRwdXQKIHdpdGggYSBtZXNzYWdl IHRvIHN0YW5kYXJkIGVycm9yIHNheWluZyB0aGF0IGEgYmluYXJ5IGZpbGUgbWF0Y2hlcy4K IAogSWYgQHZhcnt0eXBlfSBpcyBAc2FtcHt3aXRob3V0LW1hdGNofSwKLXdoZW4gQGNvbW1h bmR7Z3JlcH0gZGlzY292ZXJzIG51bGwgaW5wdXQgYmluYXJ5IGRhdGEKLWl0IGFzc3VtZXMg dGhhdCB0aGUgcmVzdCBvZiB0aGUgZmlsZSBkb2VzIG5vdCBtYXRjaDsKK3doZW4gQGNvbW1h bmR7Z3JlcH0gZGlzY292ZXJzIG51bGwgYmluYXJ5IGRhdGEgaW4gYW4gaW5wdXQgZmlsZQor 
aXQgYXNzdW1lcyB0aGF0IGFueSB1bnByb2Nlc3NlZCBpbnB1dCBkb2VzIG5vdCBtYXRjaDsK IHRoaXMgaXMgZXF1aXZhbGVudCB0byB0aGUgQG9wdGlvbnstSX0gb3B0aW9uLgorSW4gdGhp cyBjYXNlIHRoZSByZWdpb24gb2YgdW5wcm9jZXNzZWQgaW5wdXQgc3RhcnRzIG5vIGxhdGVy IHRoYW4gdGhlCitudWxsIGJpbmFyeSBkYXRhLCBhbmQgY29udGludWVzIHRvIGVuZCBvZiBm aWxlLgogCiBJZiBAdmFye3R5cGV9IGlzIEBzYW1we3RleHR9LAogQGNvbW1hbmR7Z3JlcH0g cHJvY2Vzc2VzIGJpbmFyeSBkYXRhIGFzIGlmIGl0IHdlcmUgdGV4dDsKQEAgLTY0OSw2ICs2 NTEsMTYgQEAgaXMgbm90IG1hdGNoZWQgd2hlbiBAdmFye3R5cGV9IGlzIEBzYW1we3RleHR9 LiAgQ29udmVyc2VseSwgd2hlbgogQHZhcnt0eXBlfSBpcyBAc2FtcHtiaW5hcnl9IHRoZSBw YXR0ZXJuIEBzYW1wey59IChwZXJpb2QpIG1pZ2h0IG5vdAogbWF0Y2ggYSBudWxsIGJ5dGUu CiAKK1RoZSBoZXVyaXN0aWMgdGhhdCBAY29tbWFuZHtncmVwfSB1c2VzIHRvIGludHVpdCB3 aGV0aGVyIGlucHV0IGlzCitiaW5hcnkgaXMgc3BlY2lmaWMgdG8gQGNvbW1hbmR7Z3JlcH0g YW5kIG1heSB3ZWxsIGJlIHVuc3VpdGFibGUgZm9yCitvdGhlciBhcHBsaWNhdGlvbnMsIGFz IGl0IGRlcGVuZHMgb24gY29tbWFuZC1saW5lIG9wdGlvbnMsIG9uIGxvY2FsZSwKK2FuZCBv biBoYXJkd2FyZSBhbmQgb3BlcmF0aW5nIHN5c3RlbSBjaGFyYWN0ZXJpc3RpY3Mgc3VjaCBh cyBzeXN0ZW0KK3BhZ2Ugc2l6ZSBhbmQgaW5wdXQgYnVmZmVyaW5nLiAgRm9yIGV4YW1wbGUs IGlmIHRoZSBpbnB1dCBjb25zaXN0cyBvZgorYSBtYXRjaGluZyB0ZXh0IGxpbmUgZm9sbG93 ZWQgYnkgbm9ubWF0Y2hpbmcgZGF0YSB0aGF0IGNvbnRhaW5zIGEgbnVsbAorYnl0ZSwgQGNv bW1hbmR7Z3JlcH0gbWlnaHQgZWl0aGVyIG91dHB1dCB0aGUgbWF0Y2hpbmcgbGluZSBvciB0 cmVhdAordGhlIGZpbGUgYXMgYmluYXJ5LCBkZXBlbmRpbmcgb24gd2hldGhlciB0aGUgdW5w cm9jZXNzZWQgaW5wdXQgaGFwcGVucwordG8gaW5jbHVkZSB0aGUgbWF0Y2hpbmcgdGV4dCBs aW5lLgorCiBAZW1waHtXYXJuaW5nOn0gVGhlIEBvcHRpb257LWF9IChAb3B0aW9uey0tYmlu YXJ5LWZpbGVzPXRleHR9KSBvcHRpb24KIG1pZ2h0IG91dHB1dCBiaW5hcnkgZ2FyYmFnZSwg d2hpY2ggY2FuIGhhdmUgbmFzdHkgc2lkZSBlZmZlY3RzIGlmIHRoZQogb3V0cHV0IGlzIGEg dGVybWluYWwgYW5kIGlmIHRoZSB0ZXJtaW5hbCBkcml2ZXIgaW50ZXJwcmV0cyBzb21lIG9m IGl0CkBAIC0yMDA0LDcgKzIwMTYsNyBAQCBOb3RlIHRoYXQgb24gc29tZSBwbGF0Zm9ybXMs CiBleGNlcHQgdGhlIGF2YWlsYWJsZSBtZW1vcnkuCiAKIEBpdGVtCi1XaHkgZG9lcyBAY29t bWFuZHtncmVwfSByZXBvcnQgYGBCaW5hcnkgZmlsZSBtYXRjaGVzJyc/CitXaHkgZG9lcyBA 
Y29tbWFuZHtncmVwfSByZXBvcnQgYGBiaW5hcnkgZmlsZSBtYXRjaGVzJyc/CiAKIElmIEBj b21tYW5ke2dyZXB9IGxpc3RlZCBhbGwgbWF0Y2hpbmcgYGBsaW5lcycnIGZyb20gYSBiaW5h cnkgZmlsZSwgaXQKIHdvdWxkIHByb2JhYmx5IGdlbmVyYXRlIG91dHB1dCB0aGF0IGlzIG5v dCB1c2VmdWwsIGFuZCBpdCBtaWdodCBldmVuCi0tIAoyLjQzLjAKCg== --------------hjc01NkfAtXYkZm5HHMS3mpq-- ------------=_1726987262-22155-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 19 Sep 2024 14:28:40 +0000 Received: from localhost ([127.0.0.1]:33208 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srI98-0002S7-JH for submit@debbugs.gnu.org; Thu, 19 Sep 2024 10:28:39 -0400 Received: from lists.gnu.org ([209.51.188.17]:41152) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1srHYL-00005S-RA for submit@debbugs.gnu.org; Thu, 19 Sep 2024 09:50:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1srHXz-0008QU-CH for bug-grep@gnu.org; Thu, 19 Sep 2024 09:50:19 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1srHXw-0006aS-Eb for bug-grep@gnu.org; Thu, 19 Sep 2024 09:50:15 -0400 Received: by mail-pj1-x1029.google.com with SMTP id 98e67ed59e1d1-2d88690837eso816946a91.2 for ; Thu, 19 Sep 2024 06:50:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1726753808; x=1727358608; darn=gnu.org; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=yEyZ3N3qldm1QFnovN74h565gJZEbuG665P188vSxDE=; b=edc0xy73dwYOJWTg/EFwO7wdVDYU+5sIbRm7jQbu2oIBYixgmKjllS9oyn0r9adS1y YQg5RnH7wEDnM7L+e0zTSa3JIL69SzlnqFU3gR6G+4hEeKzLUUxnzFQmyOjuQmDWsaTs Pijx6FojSe82XRurk5f0A10mhzd5wDDtHGihV5qhLwPJ8taimFT2JTdoel8qx8/15rNm 
hv1rMWkbCBghI8Zmwa5UhJL3zIw8o+He2eDD1deXju1aVWAEEfFKQ8mf0amPMAoFJghG L3FoGxsIzrAh8TYRGrlubv2wUN6HK5aQlHLQFq9B/1XQMijq4lDA3GzyrzlsZTUhxSCy unPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726753808; x=1727358608; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=yEyZ3N3qldm1QFnovN74h565gJZEbuG665P188vSxDE=; b=pCD/mki6PKWtxqXO7g4eyurJeGbd5mBFJb89/dq+WmtKQxUqBExRHUHNBGWxxOT2QT 4MWWOPsBRTPlhKcAhYb3LPs5/AqxMEXuXGFa2Ais7w/hhVbHwQJkMC2jQF/8OZrmjDCx L3dyvgw4JGWKcHzjQHR1pDHvkZxLHW7cPA71AhKyHLuysK/AbSIxV4XaAgUqUNamSbez rbA5qgHGSnOTl/FPol/6GeCyMw7PGAeKWOBvnZm9Z+p0CrOBLIbbbwp9cHwJeyUqXLFz LeAVxOmzr9w6BuUFwoyrHrCcWlKNFsbOCSTVz5p5vzdRbXPHKpaDlmAcqkFCQH3d73gP DRTQ== X-Gm-Message-State: AOJu0Yy6tTuOHKQFyOrz+4Xhz+OtcszhQPaxSHbdzU0Ikly+MEBSGQkS oRU5Miq7dRvmHeLX2uKaQV44dvf1Zg6j4w1iPhSjvi3M8TzAolgCmOaWyxAYBOptjE6fsDxiDFm vzWBGRSWNz2EfQXFD8kOQFtBvARzdhqLY X-Google-Smtp-Source: AGHT+IHLtyxKQPtbvSo9oaBp7Irl9wSwyqwwQNEWxdaRQysQLX5Gz1muHSvrFAnrB28likMEuAahSuzXogKHW0OAVp4= X-Received: by 2002:a17:90a:3fc4:b0:2d8:ea11:1c68 with SMTP id 98e67ed59e1d1-2dba00624d2mr27860224a91.31.1726753808191; Thu, 19 Sep 2024 06:50:08 -0700 (PDT) MIME-Version: 1.0 From: Rodrigo Jorge Date: Thu, 19 Sep 2024 10:49:31 -0300 Message-ID: Subject: Error when a long list is provided to grep with "--binary-files=without-match" option To: bug-grep@gnu.org Content-Type: multipart/alternative; boundary="0000000000005a1940062279330d" Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=rodrigoaraujorge@gmail.com; helo=mail-pj1-x1029.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: 
-1.3 (-) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Thu, 19 Sep 2024 10:28:34 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --0000000000005a1940062279330d Content-Type: text/plain; charset="UTF-8"

Hello. I'm trying to use grep to get the list of all non-binary files in a given folder. I tried with the 2.20 and the 3.11 release.

For some reason, grep is providing 2 false negatives when the list is huge. This issue does not happen if I break the grep input with "xargs -n X".

Check below:

[opc@oradiff-core dbhome_1]$ grep -V
grep (GNU grep) 3.11
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.savannah.gnu.org/cgit/grep.git/tree/AUTHORS>.

[opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*" -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 -n 100 grep -Il '.' > /tmp/list1.list

[opc@oradiff-core dbhome_1]$ find -type f -not -path "./.patch_storage/*" -not -name "tfa_setup" -print0 2>> /tmp/error.list | xargs -0 grep -Il '.' > /tmp/list2.list

[opc@oradiff-core dbhome_1]$ diff /tmp/list1.list /tmp/list2.list
12268,12269d12267
< ./apex/images/apex_ui/psd/apex_5_ui.ai
< ./apex/images/apex_ui/psd/apex-logo.ai
[opc@oradiff-core dbhome_1]$ wc -l /tmp/list1.list /tmp/list2.list
  23397 /tmp/list1.list
  23395 /tmp/list2.list
  46792 total

The output should not show any difference.

The same issue was also reproduced in grep 2.20.

Thanks,
Rodrigo

--0000000000005a1940062279330d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000005a1940062279330d-- ------------=_1726987262-22155-1-- From unknown Sat Aug 16 19:17:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#73360: Error when a long list is provided to grep with "--binary-files=without-match" option Resent-From: Rodrigo Jorge Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 23 Sep 2024 13:00:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73360 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Paul Eggert Cc: 73360-done@debbugs.gnu.org Received: via spool by 73360-done@debbugs.gnu.org id=D73360.172709639115957 (code D ref 73360); Mon, 23 Sep 2024 13:00:02 +0000 Received: (at 73360-done) by debbugs.gnu.org; 23 Sep 2024 12:59:51 +0000 Received: from localhost ([127.0.0.1]:43467 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ssifP-00049J-9u for submit@debbugs.gnu.org; Mon, 23 Sep 2024 08:59:51 -0400 Received: from mail-pl1-f169.google.com ([209.85.214.169]:52364) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ssifL-000491-UO for 73360-done@debbugs.gnu.org; Mon, 23 Sep 2024 08:59:50 -0400 Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-2053525bd90so39179345ad.0 for <73360-done@debbugs.gnu.org>; Mon, 23 Sep 2024 05:59:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727096299; x=1727701099; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Tbdj4AzxBqDo6dcX3v4+wiYy9y25B0+GCfePLMvQS94=; b=WT/zgky2ahBom9aS3FvOiLzo62e76jqBNld12ojmB52BsdbVPPmL3rGH2VsfuR9o2p uuTOnzm3SIuNcG3murzbL8KIiKmjY8cdTf/3UfPdCRg2kCKYlzMhdjeb5D0Wj4HYnnEM b0EaPLfwhbyxqoyuXGC02eOvAbfYttw0mz/XILmWqqy8jcRj5tg/6UqbseNSK95pd0BK i03dE5Upv3Hd8JWA4PwwHFkSvga9GQhv88RNavGFqh+hTLDQlJi6+CxXXolxmSd70Xoe FxoscNUXnD+5ODDo7NZS9whjP27fPScRBNdhJU39xf7dsIlj+JFh4IUMg5JVwqAIXyTr UYkA== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727096299; x=1727701099; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Tbdj4AzxBqDo6dcX3v4+wiYy9y25B0+GCfePLMvQS94=; b=LfwqGmoX1n66XLURIS/qhYpomKIGijRuQyAPBdhyTIiZv75Eyk14sZFdTf/pvjYObY wMuQMqVTd+rElnMBQ2Q7cQCyeU0kPvm3n9VKc9Yu3tdwAQtMAUOfAt5CnsigDN+Ft6Kb duQuve6WUEFyLC2UULrT7SxHFSA6TlSQVo9hjIlAF4U137RT1Y/UAX7uEU8/nw+VS7Op XeueVUs3x0pHYm0/w/u97mjKoLLzK3cASTguKEDhmJobNEHOA8D9SxbcnL+rovFf44c+ rE7gZBJkXW1MX8crBPS694qGcxTIy4wWQOtANwbLVmce8msHeEEylqQx8tSuPOgSJtV6 0bUw== X-Gm-Message-State: AOJu0YxklUmeBeASs+Kou4zA/tMbC6IVN0uv9k/zYXvl6ZAtAF5hsYL0 RKSoL4HYdsr+XV69mm/J0thrMEpzZcrW1kVskTB+UOccKCQeS7E/HN5rxsLaS8vwerwikbZ1ebt GSptX0QD2iQj4BFHgSS4bCQG66AI= X-Google-Smtp-Source: AGHT+IGWgqWsbptMoXL2aTiY8udntCY/i5abuR5FaWV/yvPSAFlEc8x0TPWcsALK7M3n7qyMNngCKE81gG96XTn0FyE= X-Received: by 2002:a17:902:d488:b0:206:8c4a:7bbb with SMTP id d9443c01a7336-208d8592f02mr164174395ad.58.1727096298819; Mon, 23 Sep 2024 05:58:18 -0700 (PDT) MIME-Version: 1.0 References: <2063277599.8366131.1726839123608@mail.yahoo.com> <47257c29-9381-4765-8507-e897e44fae52@cs.ucla.edu> <3acf1f78-7ac4-4391-8d68-f8683730b085@cs.ucla.edu> In-Reply-To: <3acf1f78-7ac4-4391-8d68-f8683730b085@cs.ucla.edu> From: Rodrigo Jorge Date: Mon, 23 Sep 2024 09:57:41 -0300 Message-ID: Content-Type: multipart/alternative; boundary="000000000000625bc90622c8f134" X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --000000000000625bc90622c8f134 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks, Paul. 
I tried to clone and compile your latest changes from the Savannah repo but since some extra requirements are probably needed to compile from master branch (that are beyond my knowledge), I ended up not being able to validate it. Anyway, thanks for the correction and fix implementation!

Regards,
Rodrigo

On Sun, Sep 22, 2024 at 3:39 AM Paul Eggert <eggert@cs.ucla.edu> wrote:

> On 2024-09-20 22:41, Paul Eggert wrote:
> > I have the sneaking suspicion that the script is assuming properties of
> > 'grep' that are not documented and that are not guaranteed.
>
> In looking into the code a bit more, I can see some places where that is
> what is happening.
>
> A couple of things.
>
> First, grep 3.11 uses buffer sizes that depend on earlier files that it
> has scanned, and this affects whether grep decides later files are
> binary. This can lead to the sort of confusion that you mentioned. There
> are performance reasons to think that grep should not grow buffer sizes
> for later files merely because earlier files had very long lines, as
> huge buffers can hurt performance; so I installed onto the development
> repository on Savannah the first attached patch to fix that. As a side
> effect this may fix the symptoms you observed.
>
> Second, 'grep' is not a good tool for determining whether a file is text
> or binary, since the definition of "text" vs "binary" is
> application-specific and grep's definition is suitable for 'grep' and
> it's problematic to use it elsewhere. I installed the second attached
> patch to try to document this better.
>
> Hope this helps.
>
> Boldly closing this bug as fixed; if I'm wrong we can reopen it.

--000000000000625bc90622c8f134 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000625bc90622c8f134--