From unknown Wed Jun 18 23:07:13 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#34951 <34951@debbugs.gnu.org> To: bug#34951 <34951@debbugs.gnu.org> Subject: Status: [PATCH] grep: a kwset matcher not work in a grep matcher Reply-To: bug#34951 <34951@debbugs.gnu.org> Date: Thu, 19 Jun 2025 06:07:13 +0000 retitle 34951 [PATCH] grep: a kwset matcher not work in a grep matcher reassign 34951 grep submitter 34951 Norihiro Tanaka severity 34951 normal tag 34951 patch thanks From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 22 22:29:37 2019 Received: (at submit) by debbugs.gnu.org; 23 Mar 2019 02:29:37 +0000 Received: from localhost ([127.0.0.1]:54860 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7WPe-0000Be-DI for submit@debbugs.gnu.org; Fri, 22 Mar 2019 22:29:34 -0400 Received: from eggs.gnu.org ([209.51.188.92]:49072) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7WPb-0000BQ-Mr for submit@debbugs.gnu.org; Fri, 22 Mar 2019 22:29:32 -0400 Received: from lists.gnu.org ([209.51.188.17]:36603) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1h7Tmz-0002mA-HG for submit@debbugs.gnu.org; Fri, 22 Mar 2019 19:42:36 -0400 Received: from eggs.gnu.org ([209.51.188.92]:51189) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h7TMz-00018e-Q5 for bug-grep@gnu.org; Fri, 22 Mar 2019 19:14:39 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1h7TFO-0000RS-Cx for bug-grep@gnu.org; Fri, 22 Mar 2019 19:06:47 -0400 Received: from mailgw06.kcn.ne.jp ([61.86.7.213]:48846) by eggs.gnu.org with esmtp (Exim 
4.71) (envelope-from ) id 1h7TFK-0000IR-2o for bug-grep@gnu.org; Fri, 22 Mar 2019 19:06:45 -0400 Received: from mxs02-s (mailgw2.kcn.ne.jp [61.86.15.234]) by mailgw06.kcn.ne.jp (Postfix) with ESMTP id EFAACE8047D for ; Sat, 23 Mar 2019 08:06:36 +0900 (JST) X-matriXscan-loop-detect: d48ed8e6547c90cf45c696b68d2f541f74cc0c67 Received: from mail10.kcn.ne.jp ([61.86.6.128]) by mxs02-s with ESMTP; Sat, 23 Mar 2019 08:06:35 +0900 (JST) Received: from [10.120.1.89] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail10.kcn.ne.jp (Postfix) with ESMTPA id CEA75412923E; Sat, 23 Mar 2019 08:06:35 +0900 (JST) Date: Sat, 23 Mar 2019 08:06:35 +0900 From: Norihiro Tanaka To: Subject: [PATCH] grep: a kwset matcher not work in a grep matcher Message-Id: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_5C9567D300000000E6DC_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.73 [ja] X-matriXscan-Sophos-AV: Clean X-matriXscan-Action: Approve X-matriXscan: Uncategorized X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 61.86.7.213 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: submit Cc: bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --------_5C9567D300000000E6DC_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit A kwset matcher has not been built in the grep matcher since token re-ordering was introduced in dfa by commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98. This causes performance degradation in some typical cases. The bug was introduced in grep-3.2. DFAMUST() does not work if the tokens parsed by the dfa matcher have been re-ordered.
Therefore, the patch arranges for it to be called after parsing and before the tokens are re-ordered. BTW, this change does not affect programs that do not use DFAMUST(), such as sed or gawk. $ yes $(printf '%040d' 0) | head -10000000 >inp $ grep-2.2/src/grep 01.2 inp real 1.61 user 1.53 sys 0.07 $ grep-2.3/src/grep 01.2 inp real 1.57 user 1.48 sys 0.08 $ grep-2.4/src/grep 01.2 inp real 1.50 user 1.44 sys 0.05 $ grep-2.4.1/src/grep 01.2 inp real 1.53 user 1.48 sys 0.05 $ grep-2.4.2/src/grep 01.2 inp real 1.52 user 1.47 sys 0.04 $ grep-2.5.4/src/grep 01.2 inp real 1.53 user 1.47 sys 0.05 $ grep-2.6/src/grep 01.2 inp real 1.51 user 1.47 sys 0.04 $ grep-2.6.1/src/grep 01.2 inp real 1.50 user 1.44 sys 0.05 $ grep-2.6.2/src/grep 01.2 inp real 1.52 user 1.46 sys 0.05 $ grep-2.6.3/src/grep 01.2 inp real 1.52 user 1.47 sys 0.05 $ grep-2.7/src/grep 01.2 inp real 1.53 user 1.49 sys 0.04 $ grep-2.8/src/grep 01.2 inp real 1.52 user 1.46 sys 0.05 $ grep-2.9/src/grep 01.2 inp real 1.54 user 1.50 sys 0.04 $ grep-2.10/src/grep 01.2 inp real 1.51 user 1.46 sys 0.05 $ grep-2.11/src/grep 01.2 inp real 1.53 user 1.48 sys 0.05 $ grep-2.12/src/grep 01.2 inp real 1.51 user 1.47 sys 0.03 $ grep-2.13/src/grep 01.2 inp real 1.52 user 1.47 sys 0.03 $ grep-2.14/src/grep 01.2 inp real 1.52 user 1.47 sys 0.04 $ grep-2.15/src/grep 01.2 inp real 1.55 user 1.49 sys 0.05 $ grep-2.16/src/grep 01.2 inp real 1.53 user 1.48 sys 0.04 $ grep-2.17/src/grep 01.2 inp real 1.53 user 1.48 sys 0.05 $ grep-2.18/src/grep 01.2 inp real 1.51 user 1.44 sys 0.06 $ grep-2.19/src/grep 01.2 inp real 0.06 user 0.02 sys 0.04 $ grep-2.20/src/grep 01.2 inp real 0.07 user 0.01 sys 0.05 $ grep-2.21/src/grep 01.2 inp real 0.06 user 0.02 sys 0.04 $ grep-2.22/src/grep 01.2 inp real 0.06 user 0.01 sys 0.05 $ grep-2.23/src/grep 01.2 inp real 0.09 user 0.04 sys 0.05 $ grep-2.24/src/grep 01.2 inp real 0.09 user 0.04 sys 0.04 $ grep-2.25/src/grep 01.2 inp real 0.09 user 0.05 sys 0.04 $ grep-2.26/src/grep 01.2 inp real 0.09 user 0.04 sys 0.05 $ 
grep-2.27/src/grep 01.2 inp real 0.09 user 0.04 sys 0.04 $ grep-2.28/src/grep 01.2 inp real 0.09 user 0.04 sys 0.04 $ grep-3.0/src/grep 01.2 inp real 0.09 user 0.04 sys 0.04 $ grep-3.1/src/grep 01.2 inp real 0.11 user 0.05 sys 0.06 $ grep-3.2/src/grep 01.2 inp real 0.37 user 0.32 sys 0.04 $ grep-3.3/src/grep 01.2 inp real 0.29 user 0.25 sys 0.04 Thanks, Norihiro --------_5C9567D300000000E6DC_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="0001-dfa-separate-parse-and-compile-phase.patch" Content-Disposition: attachment; filename="0001-dfa-separate-parse-and-compile-phase.patch" Content-Transfer-Encoding: base64 RnJvbSBmY2E2YTRjM2I5ZTA3NTc2MzdiN2EyMDA5Y2E4YjkwNzBhNjg3NGY1IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBTYXQsIDIzIE1hciAyMDE5IDA3OjE4OjM3ICswOTAwClN1YmplY3Q6IFtQQVRDSF0gZGZh OiBzZXBhcmF0ZSBwYXJzZSBhbmQgY29tcGlsZSBwaGFzZQoKREZBTVVTVCgpIG11c3QgYmUgY2Fs bGVkIGFmdGVyIHBhcnNlIGFuZCBiZWZvcmUgdG9rZW5zIHJlLW9yZGVyIHdoaWNoIGlzCmludHJv ZHVjZWQgaW4gY29tbWl0IDVjN2EwMzcxODIzODc2Y2NhN2ExMzQ3ZmEwOWNhMjZiYmJmZjBjOTgs IGJ1dCBib3RoIGFyZQpleGVjdXRlZCBpbiBjb21waWxhdGlvbiBwaGFzZS4KCiogbGliL2RmYS5j IChkZmFwYXJzZSk6IENoYW5nZSBpdCB0byBnbG9iYWwgZnVuY3Rpb24uCihkZmFjb21wKTogSWYg Zmlyc3QgYXJndW1lbnQgaXMgTlVMTCwgc2tpcCBwYXJzZS4KKiBsaWIvZGZhLmg6IChkZmFwYXJz ZSk6IEFkZCBhIHByb3RvdHlwZS4KLS0tCiBzcmMvZGZhc2VhcmNoLmMgfCAgICAzICsrLQogMSBm aWxlcyBjaGFuZ2VkLCAyIGluc2VydGlvbnMoKyksIDEgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0 IGEvc3JjL2RmYXNlYXJjaC5jIGIvc3JjL2RmYXNlYXJjaC5jCmluZGV4IDNlYmQyNWUuLmYzZjg4 OWYgMTAwNjQ0Ci0tLSBhL3NyYy9kZmFzZWFyY2guYworKysgYi9zcmMvZGZhc2VhcmNoLmMKQEAg LTIwMiw4ICsyMDIsOSBAQCBHRUFjb21waWxlIChjaGFyICpwYXR0ZXJuLCBzaXplX3Qgc2l6ZSwg cmVnX3N5bnRheF90IHN5bnRheF9iaXRzKQogICBlbHNlCiAgICAgbW90aWYgPSBOVUxMOwogCi0g IGRmYWNvbXAgKHBhdHRlcm4sIHNpemUsIGRjLT5kZmEsIDEpOworICBkZmFwYXJzZSAocGF0dGVy biwgc2l6ZSwgZGMtPmRmYSk7CiAgIGt3c211c3RzIChkYyk7CisgIGRmYWNvbXAgKE5VTEwsIDAs 
IGRjLT5kZmEsIDEpOwogCiAgIGZyZWUgKG1vdGlmKTsKIAotLSAKMS43LjEKCg== --------_5C9567D300000000E6DC_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="0001-grep-a-kwset-matcher-not-work-in-a-grep-matcher.patch" Content-Disposition: attachment; filename="0001-grep-a-kwset-matcher-not-work-in-a-grep-matcher.patch" Content-Transfer-Encoding: base64 RnJvbSA2MGQ0N2UxZDlhNWFmMThlZWI2MTcxOWM3YWM4YzhhZTA5N2EwNmU0IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBTYXQsIDIzIE1hciAyMDE5IDA3OjM5OjA0ICswOTAwClN1YmplY3Q6IFtQQVRDSF0gZ3Jl cDogYSBrd3NldCBtYXRjaGVyIG5vdCB3b3JrIGluIGEgZ3JlcCBtYXRjaGVyCgpBIGt3c2V0IG1h dGNoZXIgaXMgbm90IGJ1aWx0IGluIGEgZ3JlcCBtYXRjaGVyIGFmdGVyIHRva2VucyByZS1vcmRl ciBpcwppbnRyb2R1Y2VkIGluIGNvbW1pdCA1YzdhMDM3MTgyMzg3NmNjYTdhMTM0N2ZhMDljYTI2 YmJiZmYwYzk4IGluIGRmYS4KTm93IERGQU1VU1QoKSBtdXN0IGJlIGNhbGxlZCBhZnRlciBwYXJz ZSBhbmQgYmVmb3JlIGNvbXBpbGUuCgoqIHNyYy9kZmFzZWFyY2guYyAoR0VBY29tcGlsZSk6IEJ1 aWxkIGEga3dzZXQgbWF0Y2ggYmVmb3JlIGNvbXBpbGUgZGZhLgotLS0KIHNyYy9kZmFzZWFyY2gu YyB8ICAgIDMgKystCiAxIGZpbGVzIGNoYW5nZWQsIDIgaW5zZXJ0aW9ucygrKSwgMSBkZWxldGlv bnMoLSkKCmRpZmYgLS1naXQgYS9zcmMvZGZhc2VhcmNoLmMgYi9zcmMvZGZhc2VhcmNoLmMKaW5k ZXggM2ViZDI1ZS4uZjNmODg5ZiAxMDA2NDQKLS0tIGEvc3JjL2RmYXNlYXJjaC5jCisrKyBiL3Ny Yy9kZmFzZWFyY2guYwpAQCAtMjAyLDggKzIwMiw5IEBAIEdFQWNvbXBpbGUgKGNoYXIgKnBhdHRl cm4sIHNpemVfdCBzaXplLCByZWdfc3ludGF4X3Qgc3ludGF4X2JpdHMpCiAgIGVsc2UKICAgICBt b3RpZiA9IE5VTEw7CiAKLSAgZGZhY29tcCAocGF0dGVybiwgc2l6ZSwgZGMtPmRmYSwgMSk7Cisg IGRmYXBhcnNlIChwYXR0ZXJuLCBzaXplLCBkYy0+ZGZhKTsKICAga3dzbXVzdHMgKGRjKTsKKyAg ZGZhY29tcCAoTlVMTCwgMCwgZGMtPmRmYSwgMSk7CiAKICAgZnJlZSAobW90aWYpOwogCi0tIAox LjcuMQoK --------_5C9567D300000000E6DC_MULTIPART_MIXED_-- From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 22 22:49:14 2019 Received: (at 34951) by debbugs.gnu.org; 23 Mar 2019 02:49:14 +0000 Received: from localhost ([127.0.0.1]:54882 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 
1h7Wif-0000kD-Ru for submit@debbugs.gnu.org; Fri, 22 Mar 2019 22:49:14 -0400 Received: from mailgw06.kcn.ne.jp ([61.86.7.213]:39132) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7Wid-0000jp-Ox for 34951@debbugs.gnu.org; Fri, 22 Mar 2019 22:49:13 -0400 Received: from mxs02-s (mailgw2.kcn.ne.jp [61.86.15.234]) by mailgw06.kcn.ne.jp (Postfix) with ESMTP id 5D4C1E80612 for <34951@debbugs.gnu.org>; Sat, 23 Mar 2019 11:49:03 +0900 (JST) X-matriXscan-loop-detect: f5892a53e0215d157fd3ec2055933e41d141c31b Received: from mail10.kcn.ne.jp ([61.86.6.128]) by mxs02-s with ESMTP; Sat, 23 Mar 2019 11:49:03 +0900 (JST) Received: from [10.120.1.89] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail10.kcn.ne.jp (Postfix) with ESMTPA id 1108640AA1CC; Sat, 23 Mar 2019 11:49:03 +0900 (JST) Date: Sat, 23 Mar 2019 11:49:02 +0900 From: Norihiro Tanaka To: 34951@debbugs.gnu.org Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher In-Reply-To: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> Message-Id: <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_5C959DEC00000000E6FE_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 
2.73 [ja] X-matriXscan-Sophos-AV: Clean X-matriXscan-Action: Approve X-matriXscan: Uncategorized X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34951 Cc: bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --------_5C959DEC00000000E6FE_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit On Sat, 23 Mar 2019 08:06:35 +0900 Norihiro Tanaka wrote: > A kwset matcher is not built in a grep matcher after token re-order is > introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98 in dfa. > It caused performance degradation in some typical cases. This bug is > introduced in grep-3.2. > > DFAMUST() does not work if tokens which are parsed in dfa matcher are > re-ordered. Therefore, change as it is called after parse and before > tokens re-order. > > BTW, this change does not affect programs that do not use DFAMUST(), > such as sed or gawk. 
> > $ yes $(printf '%040d' 0) | head -10000000 >inp > $ grep-2.2/src/grep 01.2 inp > real 1.61 > user 1.53 > sys 0.07 > $ grep-2.3/src/grep 01.2 inp > real 1.57 > user 1.48 > sys 0.08 > $ grep-2.4/src/grep 01.2 inp > real 1.50 > user 1.44 > sys 0.05 > $ grep-2.4.1/src/grep 01.2 inp > real 1.53 > user 1.48 > sys 0.05 > $ grep-2.4.2/src/grep 01.2 inp > real 1.52 > user 1.47 > sys 0.04 > $ grep-2.5.4/src/grep 01.2 inp > real 1.53 > user 1.47 > sys 0.05 > $ grep-2.6/src/grep 01.2 inp > real 1.51 > user 1.47 > sys 0.04 > $ grep-2.6.1/src/grep 01.2 inp > real 1.50 > user 1.44 > sys 0.05 > $ grep-2.6.2/src/grep 01.2 inp > real 1.52 > user 1.46 > sys 0.05 > $ grep-2.6.3/src/grep 01.2 inp > real 1.52 > user 1.47 > sys 0.05 > $ grep-2.7/src/grep 01.2 inp > real 1.53 > user 1.49 > sys 0.04 > $ grep-2.8/src/grep 01.2 inp > real 1.52 > user 1.46 > sys 0.05 > $ grep-2.9/src/grep 01.2 inp > real 1.54 > user 1.50 > sys 0.04 > $ grep-2.10/src/grep 01.2 inp > real 1.51 > user 1.46 > sys 0.05 > $ grep-2.11/src/grep 01.2 inp > real 1.53 > user 1.48 > sys 0.05 > $ grep-2.12/src/grep 01.2 inp > real 1.51 > user 1.47 > sys 0.03 > $ grep-2.13/src/grep 01.2 inp > real 1.52 > user 1.47 > sys 0.03 > $ grep-2.14/src/grep 01.2 inp > real 1.52 > user 1.47 > sys 0.04 > $ grep-2.15/src/grep 01.2 inp > real 1.55 > user 1.49 > sys 0.05 > $ grep-2.16/src/grep 01.2 inp > real 1.53 > user 1.48 > sys 0.04 > $ grep-2.17/src/grep 01.2 inp > real 1.53 > user 1.48 > sys 0.05 > $ grep-2.18/src/grep 01.2 inp > real 1.51 > user 1.44 > sys 0.06 > $ grep-2.19/src/grep 01.2 inp > real 0.06 > user 0.02 > sys 0.04 > $ grep-2.20/src/grep 01.2 inp > real 0.07 > user 0.01 > sys 0.05 > $ grep-2.21/src/grep 01.2 inp > real 0.06 > user 0.02 > sys 0.04 > $ grep-2.22/src/grep 01.2 inp > real 0.06 > user 0.01 > sys 0.05 > $ grep-2.23/src/grep 01.2 inp > real 0.09 > user 0.04 > sys 0.05 > $ grep-2.24/src/grep 01.2 inp > real 0.09 > user 0.04 > sys 0.04 > $ grep-2.25/src/grep 01.2 inp > real 0.09 > user 0.05 > sys 0.04 > $ 
grep-2.26/src/grep 01.2 inp > real 0.09 > user 0.04 > sys 0.05 > $ grep-2.27/src/grep 01.2 inp > real 0.09 > user 0.04 > sys 0.04 > $ grep-2.28/src/grep 01.2 inp > real 0.09 > user 0.04 > sys 0.04 > $ grep-3.0/src/grep 01.2 inp > real 0.09 > user 0.04 > sys 0.04 > $ grep-3.1/src/grep 01.2 inp > real 0.11 > user 0.05 > sys 0.06 > $ grep-3.2/src/grep 01.2 inp > real 0.37 > user 0.32 > sys 0.04 > $ grep-3.3/src/grep 01.2 inp > real 0.29 > user 0.25 > sys 0.04 > > Thanks, > Norihiro Missing a patch for dfa. Re-send correct patch file. --------_5C959DEC00000000E6FE_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="0001-dfa-separate-parse-and-compile-phase.patch" Content-Disposition: attachment; filename="0001-dfa-separate-parse-and-compile-phase.patch" Content-Transfer-Encoding: base64 RnJvbSBkMTJkMzI1NjA0M2E3OTJmZTY1ODMwZmY2NDY5YmJhNjQxODg3NmUxIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBTYXQsIDIzIE1hciAyMDE5IDA4OjE5OjExICswOTAwClN1YmplY3Q6IFtQQVRDSF0gZGZh OiBzZXBhcmF0ZSBwYXJzZSBhbmQgY29tcGlsZSBwaGFzZQoKREZBTVVTVCgpIG11c3QgYmUgY2Fs bGVkIGFmdGVyIHBhcnNlIGFuZCBiZWZvcmUgdG9rZW5zIHJlLW9yZGVyIHdoaWNoIGlzCmludHJv ZHVjZWQgaW4gY29tbWl0IDVjN2EwMzcxODIzODc2Y2NhN2ExMzQ3ZmEwOWNhMjZiYmJmZjBjOTgs IGJ1dCBib3RoIGFyZQpleGVjdXRlZCBpbiBjb21waWxhdGlvbiBwaGFzZS4KCiogbGliL2RmYS5j IChkZmFwYXJzZSk6IENoYW5nZSBpdCB0byBnbG9iYWwgZnVuY3Rpb24uCihkZmFjb21wKTogSWYg Zmlyc3QgYXJndW1lbnQgaXMgTlVMTCwgc2tpcCBwYXJzZS4KKiBsaWIvZGZhLmg6IChkZmFwYXJz ZSk6IEFkZCBhIHByb3RvdHlwZS4KLS0tCiBsaWIvZGZhLmMgfCAgICA2ICsrKystLQogbGliL2Rm YS5oIHwgICAgMyArKysKIDIgZmlsZXMgY2hhbmdlZCwgNyBpbnNlcnRpb25zKCspLCAyIGRlbGV0 aW9ucygtKQoKZGlmZiAtLWdpdCBhL2xpYi9kZmEuYyBiL2xpYi9kZmEuYwppbmRleCAzMjlhMjA5 Li4xZTEyNWI0IDEwMDY0NAotLS0gYS9saWIvZGZhLmMKKysrIGIvbGliL2RmYS5jCkBAIC0xOTY5 LDcgKzE5NjksNyBAQCByZWdleHAgKHN0cnVjdCBkZmEgKmRmYSkKIC8qIE1haW4gZW50cnkgcG9p bnQgZm9yIHRoZSBwYXJzZXIuICBTIGlzIGEgc3RyaW5nIHRvIGJlIHBhcnNlZCwgbGVuIGlzIHRo 
ZQogICAgbGVuZ3RoIG9mIHRoZSBzdHJpbmcsIHNvIHMgY2FuIGluY2x1ZGUgTlVMIGNoYXJhY3Rl cnMuICBEIGlzIGEgcG9pbnRlciB0bwogICAgdGhlIHN0cnVjdCBkZmEgdG8gcGFyc2UgaW50by4g ICovCi1zdGF0aWMgdm9pZAordm9pZAogZGZhcGFyc2UgKGNoYXIgY29uc3QgKnMsIHNpemVfdCBs ZW4sIHN0cnVjdCBkZmEgKmQpCiB7CiAgIGQtPmxleC5wdHIgPSBzOwpAQCAtMzc0NSw3ICszNzQ1 LDkgQEAgZGZhc3NidWlsZCAoc3RydWN0IGRmYSAqZCkKIHZvaWQKIGRmYWNvbXAgKGNoYXIgY29u c3QgKnMsIHNpemVfdCBsZW4sIHN0cnVjdCBkZmEgKmQsIGJvb2wgc2VhcmNoZmxhZykKIHsKLSAg ZGZhcGFyc2UgKHMsIGxlbiwgZCk7CisgIGlmIChzICE9IE5VTEwpCisgICAgZGZhcGFyc2UgKHMs IGxlbiwgZCk7CisKICAgZGZhc3NidWlsZCAoZCk7CiAKICAgaWYgKGRmYV9zdXBwb3J0ZWQgKGQp KQpkaWZmIC0tZ2l0IGEvbGliL2RmYS5oIGIvbGliL2RmYS5oCmluZGV4IDYwNTEyZTIuLjIyMWY3 ZDEgMTAwNjQ0Ci0tLSBhL2xpYi9kZmEuaAorKysgYi9saWIvZGZhLmgKQEAgLTcxLDYgKzcxLDkg QEAgZXh0ZXJuIHN0cnVjdCBkZmFtdXN0ICpkZmFtdXN0IChzdHJ1Y3QgZGZhIGNvbnN0ICopOwog LyogRnJlZSB0aGUgc3RvcmFnZSBoZWxkIGJ5IHRoZSBjb21wb25lbnRzIG9mIGEgc3RydWN0IGRm YW11c3QuICovCiBleHRlcm4gdm9pZCBkZmFtdXN0ZnJlZSAoc3RydWN0IGRmYW11c3QgKik7CiAK Ky8qIFBhcnNlIHRoZSBnaXZlbiBzdHJpbmcgb2YgZ2l2ZW4gbGVuZ3RoIGludG8gdGhlIGdpdmVu IHN0cnVjdCBkZmEuICAqLworZXh0ZXJuIHZvaWQgZGZhcGFyc2UgKGNoYXIgY29uc3QgKiwgc2l6 ZV90LCBzdHJ1Y3QgZGZhICopOworCiAvKiBDb21waWxlIHRoZSBnaXZlbiBzdHJpbmcgb2YgdGhl IGdpdmVuIGxlbmd0aCBpbnRvIHRoZSBnaXZlbiBzdHJ1Y3QgZGZhLgogICAgRmluYWwgYXJndW1l bnQgaXMgYSBmbGFnIHNwZWNpZnlpbmcgd2hldGhlciB0byBidWlsZCBhIHNlYXJjaGluZyBvciBh bgogICAgZXhhY3QgbWF0Y2hlci4gKi8KLS0gCjEuNy4xCgo= --------_5C959DEC00000000E6FE_MULTIPART_MIXED_-- From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 22 22:58:46 2019 Received: (at 34951) by debbugs.gnu.org; 23 Mar 2019 02:58:46 +0000 Received: from localhost ([127.0.0.1]:54886 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7Wru-00010O-0q for submit@debbugs.gnu.org; Fri, 22 Mar 2019 22:58:46 -0400 Received: from mail-yw1-f67.google.com ([209.85.161.67]:46286) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7Wrr-000108-RH for 34951@debbugs.gnu.org; 
Fri, 22 Mar 2019 22:58:44 -0400 Received: by mail-yw1-f67.google.com with SMTP id v127so3270388ywe.13 for <34951@debbugs.gnu.org>; Fri, 22 Mar 2019 19:58:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=rm246jZtY9Po0AL9f7xcFkov9c4zLduiq04OPc399ac=; b=Gevw1lKl/5rQUc8qU7l2yJIZUHXElwiZ/kF1e/nH6TPu5mdCK7Rw/5moggjthWYKo8 J1Nf9hE8ucOTJ8o19d/0qEn0yIn13+RLLrOMNiuBUVyOjYEu6TscsAauo2zoW5JlULst TFM8N5NASw1JtGtK8kSngTMGl0X932lz4Cha2GQhowfL206SBZcoEgrc2EV1tj//tDBL 5iIj91YdSm7YgtxeejiATfO63Zk7DkQLAflUw2inVPulsMODLGJqtajM0GBPyb9KAubl Sz3omPIDRSSNkH1xG+HPVL8d+btagZlHgbje0zKFarLAbWjStYIQrbGZYUVRG93ckHnW gl9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=rm246jZtY9Po0AL9f7xcFkov9c4zLduiq04OPc399ac=; b=JbIjUqcrHJY7LpYMsid9Ozp0kE96UdoFJz/E1c9sCKf7jZyfmG2AJFWog2NBrtI/b0 3wMlDrfbuxoMZ22Ta0HWUUNPeg9SiNhDphBJX5e9RC+gos6A73zBy7ICDDuVs7lD+tXG o21+55THhh1fmPX9zgxqPR2GKLKeeM+HBk17Ty6rSPbCYwqJGImGXhqL+eOYNksbhyLS WjVrs++fCIXDP6QO9IpKYeJuj6qbfobeYp8+j+Q55eZZ8Jwny3uQ3CTMBnE1ZUyFDn5f NoWTgCjBMbxaoMEofzZpM/4oITjpOYoSPoO2E5rvWIjpuf280vt4gsU2AjUIP+gNCrWY o1vQ== X-Gm-Message-State: APjAAAXbgyWYarLqu0T1J4TXnilWBtc6ZH7mS85Gef6PlNdGzFStJNTh zKfHOWbCmhui/M7sf4DjQdJN5zfgwCzZMbAtFWY= X-Google-Smtp-Source: APXvYqwzZT0xYUJ1Wd+KVj4hOVMJ0WocadLE1QBBIJVo4UaEr3JOvDiZTYFNXnQByhEjqFH7hTDyv3Z86aVH/YnIfjM= X-Received: by 2002:a25:b4a:: with SMTP id 71mr11700083ybl.104.1553309918137; Fri, 22 Mar 2019 19:58:38 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a25:3794:0:0:0:0:0 with HTTP; Fri, 22 Mar 2019 19:58:37 -0700 (PDT) In-Reply-To: <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> From: Budi Date: Sat, 23 Mar 2019 09:58:37 +0700 Message-ID: Subject: Re: bug#34951: 
[PATCH] grep: a kwset matcher not work in a grep matcher To: Norihiro Tanaka Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) How make grep walinh through FS by scanning breadth first instead of the usual depth On 3/23/19, Norihiro Tanaka wrote: > On Sat, 23 Mar 2019 08:06:35 +0900 > Norihiro Tanaka wrote: > >> A kwset matcher is not built in a grep matcher after token re-order is >> introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98 in dfa. >> It caused performance degradation in some typical cases. This bug is >> introduced in grep-3.2. >> >> DFAMUST() does not work if tokens which are parsed in dfa matcher are >> re-ordered. Therefore, change as it is called after parse and before >> tokens re-order. >> >> BTW, this change does not affect programs that do not use DFAMUST(), >> such as sed or gawk. 
>> >> $ yes $(printf '%040d' 0) | head -10000000 >inp >> $ grep-2.2/src/grep 01.2 inp >> real 1.61 >> user 1.53 >> sys 0.07 >> $ grep-2.3/src/grep 01.2 inp >> real 1.57 >> user 1.48 >> sys 0.08 >> $ grep-2.4/src/grep 01.2 inp >> real 1.50 >> user 1.44 >> sys 0.05 >> $ grep-2.4.1/src/grep 01.2 inp >> real 1.53 >> user 1.48 >> sys 0.05 >> $ grep-2.4.2/src/grep 01.2 inp >> real 1.52 >> user 1.47 >> sys 0.04 >> $ grep-2.5.4/src/grep 01.2 inp >> real 1.53 >> user 1.47 >> sys 0.05 >> $ grep-2.6/src/grep 01.2 inp >> real 1.51 >> user 1.47 >> sys 0.04 >> $ grep-2.6.1/src/grep 01.2 inp >> real 1.50 >> user 1.44 >> sys 0.05 >> $ grep-2.6.2/src/grep 01.2 inp >> real 1.52 >> user 1.46 >> sys 0.05 >> $ grep-2.6.3/src/grep 01.2 inp >> real 1.52 >> user 1.47 >> sys 0.05 >> $ grep-2.7/src/grep 01.2 inp >> real 1.53 >> user 1.49 >> sys 0.04 >> $ grep-2.8/src/grep 01.2 inp >> real 1.52 >> user 1.46 >> sys 0.05 >> $ grep-2.9/src/grep 01.2 inp >> real 1.54 >> user 1.50 >> sys 0.04 >> $ grep-2.10/src/grep 01.2 inp >> real 1.51 >> user 1.46 >> sys 0.05 >> $ grep-2.11/src/grep 01.2 inp >> real 1.53 >> user 1.48 >> sys 0.05 >> $ grep-2.12/src/grep 01.2 inp >> real 1.51 >> user 1.47 >> sys 0.03 >> $ grep-2.13/src/grep 01.2 inp >> real 1.52 >> user 1.47 >> sys 0.03 >> $ grep-2.14/src/grep 01.2 inp >> real 1.52 >> user 1.47 >> sys 0.04 >> $ grep-2.15/src/grep 01.2 inp >> real 1.55 >> user 1.49 >> sys 0.05 >> $ grep-2.16/src/grep 01.2 inp >> real 1.53 >> user 1.48 >> sys 0.04 >> $ grep-2.17/src/grep 01.2 inp >> real 1.53 >> user 1.48 >> sys 0.05 >> $ grep-2.18/src/grep 01.2 inp >> real 1.51 >> user 1.44 >> sys 0.06 >> $ grep-2.19/src/grep 01.2 inp >> real 0.06 >> user 0.02 >> sys 0.04 >> $ grep-2.20/src/grep 01.2 inp >> real 0.07 >> user 0.01 >> sys 0.05 >> $ grep-2.21/src/grep 01.2 inp >> real 0.06 >> user 0.02 >> sys 0.04 >> $ grep-2.22/src/grep 01.2 inp >> real 0.06 >> user 0.01 >> sys 0.05 >> $ grep-2.23/src/grep 01.2 inp >> real 0.09 >> user 0.04 >> sys 0.05 >> $ grep-2.24/src/grep 01.2 
inp >> real 0.09 >> user 0.04 >> sys 0.04 >> $ grep-2.25/src/grep 01.2 inp >> real 0.09 >> user 0.05 >> sys 0.04 >> $ grep-2.26/src/grep 01.2 inp >> real 0.09 >> user 0.04 >> sys 0.05 >> $ grep-2.27/src/grep 01.2 inp >> real 0.09 >> user 0.04 >> sys 0.04 >> $ grep-2.28/src/grep 01.2 inp >> real 0.09 >> user 0.04 >> sys 0.04 >> $ grep-3.0/src/grep 01.2 inp >> real 0.09 >> user 0.04 >> sys 0.04 >> $ grep-3.1/src/grep 01.2 inp >> real 0.11 >> user 0.05 >> sys 0.06 >> $ grep-3.2/src/grep 01.2 inp >> real 0.37 >> user 0.32 >> sys 0.04 >> $ grep-3.3/src/grep 01.2 inp >> real 0.29 >> user 0.25 >> sys 0.04 >> >> Thanks, >> Norihiro > > Missing a patch for dfa. Re-send correct patch file. > From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 22 22:59:07 2019 Received: (at 34951) by debbugs.gnu.org; 23 Mar 2019 02:59:07 +0000 Received: from localhost ([127.0.0.1]:54890 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7WsF-00011O-Dm for submit@debbugs.gnu.org; Fri, 22 Mar 2019 22:59:07 -0400 Received: from mail-yw1-f66.google.com ([209.85.161.66]:39979) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7WsD-00010o-SE for 34951@debbugs.gnu.org; Fri, 22 Mar 2019 22:59:06 -0400 Received: by mail-yw1-f66.google.com with SMTP id p64so3304787ywg.7 for <34951@debbugs.gnu.org>; Fri, 22 Mar 2019 19:59:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=p+CUNJp5YF2WRU+tpa1AnfOrHlZehLbVI5nKD3iSTeY=; b=LDUoGsupS6fcci47wFhwGXChn0EyImvgbSIK8lCJKjgUKbru8KA9LCxpV5AfscgciP NQx5NH0npqTC1UsJWvCm1FFkKUOx3nWm2OhuXWjZdGmS7M7g9frVikRYEpRH2R6yis0H TuA6snh3A6wjJ7ZrONngYE2umUG5hP15VFj2aehpZTNOOCm/bo/HRTjbrbi/7H7nuYjA vrkci03XNpg89IH2TRbYrDTmDy+ACFBG83errX6YtGEAjl+Fx3zLU+hnNZlpO86NDNbo /H3m+QDYgigasDLxbnD1o4dQvJHV6J2HaaiPuLlQvBlA9JP2CjV80+apMswRyFRSt7/E DUSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=p+CUNJp5YF2WRU+tpa1AnfOrHlZehLbVI5nKD3iSTeY=; b=iCP8PlSqrHH5Ae0lN5pD6PvFpG+070XwNcwyiA6gvu1JInGibPwiMWjBvXOCjG4Snm pJro6IHFZpSLEUxgMjT1+SKT8dy+fJhqd9Eu+o/NF7TsWt+1mnvUd6laHT3SU8dp4qJY lHciQf8DtLnFKL3xISth9NeurX2FCwka1deK5zOWCx9Yi51o+D4TdkQVUl/5FMnFI+Jt m49gG9WeIjRpA3qWowziqP7odE/i5/3Gn3XnWoPilPdekmH+KA5T68TqQbF0m3/CD2ZB HL1fXkyyZpLwDTPNGiwTe9wq1qPXVXDPtK+94RJPDxgYDKhlUinWQ/uJv2ETL8GuKmgp g63A== X-Gm-Message-State: APjAAAWsIUeusc/aZMj21TKqvQVUR7X5ZAvHLKRCLm/M/r5YET+ZuL1W HK8WhNQtNf7hF3TT3mQ9erWZgOAk/ZJCPB0u0SSbY01c X-Google-Smtp-Source: APXvYqwsSTwoUyQFdgclPnmGMQrC3Ctj+TCbMK5DOfYHF1nKfF+NzmJyy78MPecaRGdc6pS1Wo16koJtW7DHVoXG5V0= X-Received: by 2002:a25:b31b:: with SMTP id l27mr11643745ybj.67.1553309940360; Fri, 22 Mar 2019 19:59:00 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a25:3794:0:0:0:0:0 with HTTP; Fri, 22 Mar 2019 19:59:00 -0700 (PDT) In-Reply-To: References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> From: Budi Date: Sat, 23 Mar 2019 09:59:00 +0700 Message-ID: Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: Norihiro Tanaka Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) How make grep walking through FS by scanning breadth first instead of On 3/23/19, Budi wrote: > How make grep walinh through FS by scanning breadth first instead of > the usual depth > > On 3/23/19, Norihiro Tanaka wrote: >> On Sat, 23 Mar 2019 08:06:35 +0900 >> Norihiro Tanaka wrote: >> >>> A kwset matcher is not built in a grep matcher after 
token re-order is >>> introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98 in dfa. >>> It caused performance degradation in some typical cases. This bug is >>> introduced in grep-3.2. >>> >>> DFAMUST() does not work if tokens which are parsed in dfa matcher are >>> re-ordered. Therefore, change as it is called after parse and before >>> tokens re-order. >>> >>> BTW, this change does not affect programs that do not use DFAMUST(), >>> such as sed or gawk. >>> >>> $ yes $(printf '%040d' 0) | head -10000000 >inp >>> $ grep-2.2/src/grep 01.2 inp >>> real 1.61 >>> user 1.53 >>> sys 0.07 >>> $ grep-2.3/src/grep 01.2 inp >>> real 1.57 >>> user 1.48 >>> sys 0.08 >>> $ grep-2.4/src/grep 01.2 inp >>> real 1.50 >>> user 1.44 >>> sys 0.05 >>> $ grep-2.4.1/src/grep 01.2 inp >>> real 1.53 >>> user 1.48 >>> sys 0.05 >>> $ grep-2.4.2/src/grep 01.2 inp >>> real 1.52 >>> user 1.47 >>> sys 0.04 >>> $ grep-2.5.4/src/grep 01.2 inp >>> real 1.53 >>> user 1.47 >>> sys 0.05 >>> $ grep-2.6/src/grep 01.2 inp >>> real 1.51 >>> user 1.47 >>> sys 0.04 >>> $ grep-2.6.1/src/grep 01.2 inp >>> real 1.50 >>> user 1.44 >>> sys 0.05 >>> $ grep-2.6.2/src/grep 01.2 inp >>> real 1.52 >>> user 1.46 >>> sys 0.05 >>> $ grep-2.6.3/src/grep 01.2 inp >>> real 1.52 >>> user 1.47 >>> sys 0.05 >>> $ grep-2.7/src/grep 01.2 inp >>> real 1.53 >>> user 1.49 >>> sys 0.04 >>> $ grep-2.8/src/grep 01.2 inp >>> real 1.52 >>> user 1.46 >>> sys 0.05 >>> $ grep-2.9/src/grep 01.2 inp >>> real 1.54 >>> user 1.50 >>> sys 0.04 >>> $ grep-2.10/src/grep 01.2 inp >>> real 1.51 >>> user 1.46 >>> sys 0.05 >>> $ grep-2.11/src/grep 01.2 inp >>> real 1.53 >>> user 1.48 >>> sys 0.05 >>> $ grep-2.12/src/grep 01.2 inp >>> real 1.51 >>> user 1.47 >>> sys 0.03 >>> $ grep-2.13/src/grep 01.2 inp >>> real 1.52 >>> user 1.47 >>> sys 0.03 >>> $ grep-2.14/src/grep 01.2 inp >>> real 1.52 >>> user 1.47 >>> sys 0.04 >>> $ grep-2.15/src/grep 01.2 inp >>> real 1.55 >>> user 1.49 >>> sys 0.05 >>> $ grep-2.16/src/grep 01.2 inp >>> real 1.53 >>> 
user 1.48 >>> sys 0.04 >>> $ grep-2.17/src/grep 01.2 inp >>> real 1.53 >>> user 1.48 >>> sys 0.05 >>> $ grep-2.18/src/grep 01.2 inp >>> real 1.51 >>> user 1.44 >>> sys 0.06 >>> $ grep-2.19/src/grep 01.2 inp >>> real 0.06 >>> user 0.02 >>> sys 0.04 >>> $ grep-2.20/src/grep 01.2 inp >>> real 0.07 >>> user 0.01 >>> sys 0.05 >>> $ grep-2.21/src/grep 01.2 inp >>> real 0.06 >>> user 0.02 >>> sys 0.04 >>> $ grep-2.22/src/grep 01.2 inp >>> real 0.06 >>> user 0.01 >>> sys 0.05 >>> $ grep-2.23/src/grep 01.2 inp >>> real 0.09 >>> user 0.04 >>> sys 0.05 >>> $ grep-2.24/src/grep 01.2 inp >>> real 0.09 >>> user 0.04 >>> sys 0.04 >>> $ grep-2.25/src/grep 01.2 inp >>> real 0.09 >>> user 0.05 >>> sys 0.04 >>> $ grep-2.26/src/grep 01.2 inp >>> real 0.09 >>> user 0.04 >>> sys 0.05 >>> $ grep-2.27/src/grep 01.2 inp >>> real 0.09 >>> user 0.04 >>> sys 0.04 >>> $ grep-2.28/src/grep 01.2 inp >>> real 0.09 >>> user 0.04 >>> sys 0.04 >>> $ grep-3.0/src/grep 01.2 inp >>> real 0.09 >>> user 0.04 >>> sys 0.04 >>> $ grep-3.1/src/grep 01.2 inp >>> real 0.11 >>> user 0.05 >>> sys 0.06 >>> $ grep-3.2/src/grep 01.2 inp >>> real 0.37 >>> user 0.32 >>> sys 0.04 >>> $ grep-3.3/src/grep 01.2 inp >>> real 0.29 >>> user 0.25 >>> sys 0.04 >>> >>> Thanks, >>> Norihiro >> >> Missing a patch for dfa. Re-send correct patch file. 
>> > From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 23 08:39:37 2019 Received: (at 34951) by debbugs.gnu.org; 23 Mar 2019 12:39:37 +0000 Received: from localhost ([127.0.0.1]:55057 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7fvz-0002FX-67 for submit@debbugs.gnu.org; Sat, 23 Mar 2019 08:39:36 -0400 Received: from mx1.redhat.com ([209.132.183.28]:48648) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h7fvw-0002FH-VQ for 34951@debbugs.gnu.org; Sat, 23 Mar 2019 08:39:33 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2CEF08830A; Sat, 23 Mar 2019 12:39:27 +0000 (UTC) Received: from [10.3.116.65] (ovpn-116-65.phx2.redhat.com [10.3.116.65]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 5EA1E1001DC9; Sat, 23 Mar 2019 12:39:26 +0000 (UTC) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: Budi , Norihiro Tanaka References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> From: Eric Blake Openpgp: preference=signencrypt Autocrypt: addr=eblake@redhat.com; keydata= xsBNBEvHyWwBCACw7DwsQIh0kAbUXyqhfiKAKOTVu6OiMGffw2w90Ggrp4bdVKmCaEXlrVLU xphBM8mb+wsFkU+pq9YR621WXo9REYVIl0FxKeQo9dyQBZ/XvmUMka4NOmHtFg74nvkpJFCD TUNzmqfcjdKhfFV0d7P/ixKQeZr2WP1xMcjmAQY5YvQ2lUoHP43m8TtpB1LkjyYBCodd+LkV GmCx2Bop1LSblbvbrOm2bKpZdBPjncRNob73eTpIXEutvEaHH72LzpzksfcKM+M18cyRH+nP sAd98xIbVjm3Jm4k4d5oQyE2HwOur+trk2EcxTgdp17QapuWPwMfhaNq3runaX7x34zhABEB AAHNHkVyaWMgQmxha2UgPGVibGFrZUByZWRoYXQuY29tPsLAegQTAQgAJAIbAwULCQgHAwUV CgkICwUWAgMBAAIeAQIXgAUCS8fL9QIZAQAKCRCnoWtKJSdDahBHCACbl/5FGkUqJ89GAjeX RjpAeJtdKhujir0iS4CMSIng7fCiGZ0fNJCpL5RpViSo03Q7l37ss+No+dJI8KtAp6ID+PMz wTJe5Egtv/KGUKSDvOLYJ9WIIbftEObekP+GBpWP2+KbpADsc7EsNd70sYxExD3liwVJYqLc 
Rw7so1PEIFp+Ni9A1DrBR5NaJBnno2PHzHPTS9nmZVYm/4I32qkLXOcdX0XElO8VPDoVobG6 gELf4v/vIImdmxLh/w5WctUpBhWWIfQDvSOW2VZDOihm7pzhQodr3QP/GDLfpK6wI7exeu3P pfPtqwa06s1pae3ad13mZGzkBdNKs1HEm8x6zsBNBEvHyWwBCADGkMFzFjmmyqAEn5D+Mt4P zPdO8NatsDw8Qit3Rmzu+kUygxyYbz52ZO40WUu7EgQ5kDTOeRPnTOd7awWDQcl1gGBXgrkR pAlQ0l0ReO57Q0eglFydLMi5bkwYhfY+TwDPMh3aOP5qBXkm4qIYSsxb8A+i00P72AqFb9Q7 3weG/flxSPApLYQE5qWGSXjOkXJv42NGS6o6gd4RmD6Ap5e8ACo1lSMPfTpGzXlt4aRkBfvb NCfNsQikLZzFYDLbQgKBA33BDeV6vNJ9Cj0SgEGOkYyed4I6AbU0kIy1hHAm1r6+sAnEdIKj cHi3xWH/UPrZW5flM8Kqo14OTDkI9EtlABEBAAHCwF8EGAEIAAkFAkvHyWwCGwwACgkQp6Fr SiUnQ2q03wgAmRFGDeXzc58NX0NrDijUu0zx3Lns/qZ9VrkSWbNZBFjpWKaeL1fdVeE4TDGm I5mRRIsStjQzc2R9b+2VBUhlAqY1nAiBDv0Qnt+9cLiuEICeUwlyl42YdwpmY0ELcy5+u6wz mK/jxrYOpzXKDwLq5k4X+hmGuSNWWAN3gHiJqmJZPkhFPUIozZUCeEc76pS/IUN72NfprZmF Dp6/QDjDFtfS39bHSWXKVZUbqaMPqlj/z6Ugk027/3GUjHHr8WkeL1ezWepYDY7WSoXwfoAL 2UXYsMAr/uUncSKlfjvArhsej0S4zbqim2ZY6S8aRWw94J3bSvJR+Nwbs34GPTD4Pg== Organization: Red Hat, Inc. Message-ID: Date: Sat, 23 Mar 2019 07:39:25 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="KNOFsrLXqVMtyarpXlxuyoN6VICNPRART" X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Sat, 23 Mar 2019 12:39:27 +0000 (UTC) X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.0 (------) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --KNOFsrLXqVMtyarpXlxuyoN6VICNPRART Content-Type: multipart/mixed; 
 boundary="KUETXhy6PFjvv4kK0hvmY4GHjTV5paGWc"; protected-headers="v1"
From: Eric Blake
To: Budi , Norihiro Tanaka
Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org
Message-ID:
Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher
References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp>
In-Reply-To:

--KUETXhy6PFjvv4kK0hvmY4GHjTV5paGWc
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

On 3/22/19 9:59 PM, Budi wrote:
> How make grep walking through FS by scanning breadth first instead of
>
> On 3/23/19, Budi wrote:
>> How make grep walking through FS by scanning breadth first instead of
>> the usual depth
>>
>> On 3/23/19, Norihiro Tanaka wrote:
>>> On Sat, 23 Mar 2019 08:06:35 +0900
>>> Norihiro Tanaka wrote:
>>>
>>>> A kwset matcher is not built in a grep matcher after token re-order is

Budi,

Hijacking a thread on a posted patch to ask an unrelated question via
top-posting is not very nice netiquette. Better is to start a new
thread for asking questions, and to use bottom posting for technical lists.

That said, the answer to your question is that there is no way to change
the way that grep walks the file system when using 'grep -r'. And when
you consider that 'grep -r' is a GNU extension not required by POSIX
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html),
and that we are reluctant to bloat grep any further when 'find' already
exists as the POSIX-sanctioned file walker, you are better off getting
'find' to do the traversal you want (where find or xargs is used to
invoke plain 'grep' on the resulting files) rather than trying to
convince us to patch 'grep -r' to have more flexibility.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
+1-919-301-3226 Virtualization: qemu.org | libvirt.org --KUETXhy6PFjvv4kK0hvmY4GHjTV5paGWc-- --KNOFsrLXqVMtyarpXlxuyoN6VICNPRART Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAlyWKP0ACgkQp6FrSiUn Q2rm4ggAn52MxuaJE52EBjp9rBxkBXiMcujTwN6GAfj7s8w1pqhTLn3NAM9WC9By fAzsFi41vlQPb2KutWnTpKaCN1fNOtRyQ2ALGcc6BCGMLxxKM+apgZwLQF5GVReQ GrH93doCoTvTOK4R5WRptT7ckvvs0QVgr5KBkEnobFuWA8BJ8ahrLtSowb0Cm5Jm CSpALFhlKbvXW6PXNkQJCADdwjcLRFRU6hwSx01uvyOZN9dfb0Zt+DUX0BY0cEDj 4LVDoaWz4WPVtdmFjecicnGT2y5WJVrwOW+ZsQQztVRBNdHKPbl9P4dsZOtEZqaq yvE4Wu2fz6i/mSmnSFGZ+R9LxM5XBw== =zNfe -----END PGP SIGNATURE----- --KNOFsrLXqVMtyarpXlxuyoN6VICNPRART-- From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 29 06:59:06 2019 Received: (at 34951) by debbugs.gnu.org; 29 Mar 2019 10:59:06 +0000 Received: from localhost ([127.0.0.1]:35065 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h9pE1-0002oa-Re for submit@debbugs.gnu.org; Fri, 29 Mar 2019 06:59:05 -0400 Received: from freefriends.org ([96.88.95.60]:60246) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h9pDy-0002oH-5Y for 34951@debbugs.gnu.org; Fri, 29 Mar 2019 06:59:02 -0400 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id x2TAx0u2027188 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 29 Mar 2019 04:59:00 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id x2TAww9u027187; Fri, 29 Mar 2019 04:58:58 -0600 From: arnold@skeeve.com Message-Id: <201903291058.x2TAww9u027187@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Fri, 29 Mar 2019 04:58:58 -0600 To: noritnk@kcn.ne.jp, 
34951@debbugs.gnu.org Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> In-Reply-To: <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi. Norihiro Tanaka wrote: > Missing a patch for dfa. Re-send correct patch file. Paul - is this going to be merged into GNULIB? If so, I'll put it into gawk now; I want to make a release soon. Thanks, Arnold [ From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 11 18:25:59 2019 Received: (at 34951) by debbugs.gnu.org; 11 Dec 2019 23:25:59 +0000 Received: from localhost ([127.0.0.1]:58765 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifBMl-0004F2-7K for submit@debbugs.gnu.org; Wed, 11 Dec 2019 18:25:59 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:46530) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifBMi-0004Eo-Mo for 34951@debbugs.gnu.org; Wed, 11 Dec 2019 18:25:57 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 12DB116008F; Wed, 11 Dec 2019 15:25:51 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 3cjkt_Oqnm8M; Wed, 11 Dec 2019 15:25:49 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id CF8D1160158; Wed, 11 Dec 2019 15:25:49 -0800 (PST) X-Virus-Scanned: amavisd-new at 
zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 5rE2ZMxojoCv; Wed, 11 Dec 2019 15:25:49 -0800 (PST) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id AFB0816008F; Wed, 11 Dec 2019 15:25:49 -0800 (PST) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: Norihiro Tanaka , 34951@debbugs.gnu.org References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> Date: Wed, 11 Dec 2019 15:25:48 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------D01426E8E1E2BC03E164985F" Content-Language: en-US X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: Aharon Robbins , bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) This is a multi-part message in MIME format. --------------D01426E8E1E2BC03E164985F Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit On 3/22/19 7:49 PM, Norihiro Tanaka wrote: > Missing a patch for dfa. Re-send correct patch file. Thanks, I installed the DFA-relevant parts of your proposed fix into Gnulib. (The grep parts still need doing.) I also installed the attached commentary followup. While I was at it I installed a patch to fix an unlikely integer overflow that I noticed while reviewing your fix. 
I also installed some
internal changes to prefer signed to unsigned integers for indexes, as
this should make future integer overflows easier to catch. See:

https://lists.gnu.org/r/bug-gnulib/2019-12/msg00058.html
https://lists.gnu.org/r/bug-gnulib/2019-12/msg00059.html

I'd also like to change dfa.h's API to prefer ptrdiff_t to size_t, for
the same integer-overflow reason. This would be a (minor) API change so
I thought I'd ask first. Any objections?

PS. Arnold, the above discusses all the changes I know about for dfa.c
and dfa.h. The proposed API change (size_t->ptrdiff_t) could be
installed either before or after the next Gawk release.

--------------D01426E8E1E2BC03E164985F
Content-Type: text/x-patch; charset=UTF-8;
 name="0001-dfa-update-commentary-for-previous-change.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-dfa-update-commentary-for-previous-change.patch"

>From 360cbd3b17a314807e808626e100ef47dcf4d162 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Wed, 11 Dec 2019 13:40:01 -0800
Subject: [PATCH] dfa: update commentary for previous change

* NEWS: Mention the change.
* lib/dfa.c, lib/dfa.h (dfaparse, dfamust, dfacomp): Update comments.
---
 ChangeLog |  6 ++++++
 NEWS      |  4 ++++
 lib/dfa.c |  9 +++++----
 lib/dfa.h | 14 ++++++++------
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f80f33b38..bc912c771 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2019-12-11  Paul Eggert
+
+	dfa: update commentary for previous change
+	* NEWS: Mention the change.
+	* lib/dfa.c, lib/dfa.h (dfaparse, dfamust, dfacomp): Update comments.
+
 2019-12-11  Norihiro Tanaka
 
 	dfa: separate parse and compile phase
diff --git a/NEWS b/NEWS
index 8085c353e..b73c9088a 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,10 @@ User visible incompatible changes
 
 Date        Modules         Changes
 
+2019-12-11  dfa             To call dfamust, one must now call dfaparse
+                            without yet calling dfacomp.  This fixes a bug
+                            introduced on 2018-10-22 that broke dfamust.
+
 2019-12-07  xstrtol         This module no longer defines the function
             xstrtoll        'xstrtol_fatal'.  Program that need this function
             xstrtoimax      should add the module 'xstrtol-error' to the list
diff --git a/lib/dfa.c b/lib/dfa.c
index 1e125b4d2..2347a91c1 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1966,9 +1966,8 @@ regexp (struct dfa *dfa)
     }
 }
 
-/* Main entry point for the parser.  S is a string to be parsed, len is the
-   length of the string, so s can include NUL characters.  D is a pointer to
-   the struct dfa to parse into.  */
+/* Parse a string S of length LEN into D.  S can include NUL characters.
+   This is the main entry point for the parser.  */
 void
 dfaparse (char const *s, size_t len, struct dfa *d)
 {
@@ -3741,7 +3740,9 @@ dfassbuild (struct dfa *d)
     }
 }
 
-/* Parse and analyze a single string of the given length.  */
+/* Parse a string S of length LEN into D (but skip this step if S is null).
+   Then analyze D and build a matcher for it.
+   SEARCHFLAG says whether to build a searching or an exact matcher.  */
 void
 dfacomp (char const *s, size_t len, struct dfa *d, bool searchflag)
 {
diff --git a/lib/dfa.h b/lib/dfa.h
index 221f7d172..bf87703e8 100644
--- a/lib/dfa.h
+++ b/lib/dfa.h
@@ -65,18 +65,20 @@ enum
 extern void dfasyntax (struct dfa *, struct localeinfo const *,
                        reg_syntax_t, int);
 
-/* Build and return the struct dfamust from the given struct dfa.  */
+/* Parse the given string of given length into the given struct dfa.  */
+extern void dfaparse (char const *, size_t, struct dfa *);
+
+/* Allocate and return a struct dfamust from a struct dfa that was
+   initialized by dfaparse and not yet given to dfacomp.  */
 extern struct dfamust *dfamust (struct dfa const *);
 
 /* Free the storage held by the components of a struct dfamust.  */
 extern void dfamustfree (struct dfamust *);
 
-/* Parse the given string of given length into the given struct dfa.  */
-extern void dfaparse (char const *, size_t, struct dfa *);
-
 /* Compile the given string of the given length into the given struct dfa.
-   Final argument is a flag specifying whether to build a searching or an
-   exact matcher.  */
+   The last argument says whether to build a searching or an exact matcher.
+   A null first argument means the struct dfa has already been
+   initialized by dfaparse; the second argument is ignored.  */
 extern void dfacomp (char const *, size_t, struct dfa *, bool);
 
 /* Search through a buffer looking for a match to the given struct dfa.
-- 
2.23.0

--------------D01426E8E1E2BC03E164985F--

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 12 02:23:18 2019
Received: (at 34951) by debbugs.gnu.org; 12 Dec 2019 07:23:19 +0000
Received: from localhost ([127.0.0.1]:58966 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifIog-000361-LC for submit@debbugs.gnu.org; Thu, 12 Dec 2019 02:23:18 -0500
Received: from freefriends.org ([96.88.95.60]:43520) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifIod-00035r-2R for 34951@debbugs.gnu.org; Thu, 12 Dec 2019 02:23:16 -0500
X-Envelope-From: arnold@skeeve.com
Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBC7N8Of031362 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 12 Dec 2019 00:23:08 -0700
Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBC7N62j031361; Thu, 12 Dec 2019 00:23:06 -0700
From: arnold@skeeve.com
Message-Id: <201912120723.xBC7N62j031361@freefriends.org>
X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f
Date: Thu, 12 Dec 2019 00:23:06 -0700
To: noritnk@kcn.ne.jp, eggert@cs.ucla.edu, 34951@debbugs.gnu.org
Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher
References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp>
<20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> In-Reply-To: <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: arnold@skeeve.com, bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi Paul. Paul Eggert wrote: > On 3/22/19 7:49 PM, Norihiro Tanaka wrote: > > Missing a patch for dfa. Re-send correct patch file. > > Thanks, I installed the DFA-relevant parts of your proposed fix into > Gnulib. (The grep parts still need doing.) I also installed the attached > commentary followup. > > While I was at it I installed a patch to fix an unlikely integer > overflow that I noticed while reviewing your fix. I also installed some > internal changes to prefer signed to unsigned integers for indexes, as > this should make future integer overflows easier to catch. See: > > https://lists.gnu.org/r/bug-gnulib/2019-12/msg00058.html > https://lists.gnu.org/r/bug-gnulib/2019-12/msg00059.html I am reviewing these. In general using signed integers internally looks OK to me. > I'd also like to change dfa.h's API to prefer ptrdiff_t to size_t, for > the same integer-overflow reason. This would be a (minor) API change so > I thought I'd ask first. Any objections? Yes. I object. Strongly. We're passing length and count values and those are supposed to be size_t. If you REALLY want signed values, then I could live with ssize_t (as returned by read(2), for example), but I would find ptrdiff_t to be ugly and unintuitive. > PS. Arnold, the above discusses all the changes I know about for dfa.c > and dfa.h. 
The proposed API change (size_t->ptrdiff_t) could be > installed either before or after the next Gawk release. Thanks. I'm skimming the other changes now. Arnold From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 12 02:31:12 2019 Received: (at 34951) by debbugs.gnu.org; 12 Dec 2019 07:31:12 +0000 Received: from localhost ([127.0.0.1]:58971 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifIwK-0003I4-Fw for submit@debbugs.gnu.org; Thu, 12 Dec 2019 02:31:12 -0500 Received: from freefriends.org ([96.88.95.60]:43640) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifIwJ-0003Hx-4M for 34951@debbugs.gnu.org; Thu, 12 Dec 2019 02:31:11 -0500 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBC7V7FM031768 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 12 Dec 2019 00:31:07 -0700 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBC7V6gD031767; Thu, 12 Dec 2019 00:31:06 -0700 From: arnold@skeeve.com Message-Id: <201912120731.xBC7V6gD031767@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Thu, 12 Dec 2019 00:31:06 -0700 To: noritnk@kcn.ne.jp, eggert@cs.ucla.edu, 34951@debbugs.gnu.org Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> In-Reply-To: <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: arnold@skeeve.com, bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: 
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -3.3 (---)

Hi Paul.

Paul Eggert wrote:

> https://lists.gnu.org/r/bug-gnulib/2019-12/msg00058.html
> https://lists.gnu.org/r/bug-gnulib/2019-12/msg00059.html

Looking at this:

| @@ -1733,11 +1733,11 @@ add_utf8_anychar (struct dfa *dfa)
|      /* f0-f7: 4-byte sequence.  */
|      CHARCLASS_INIT (0, 0, 0, 0, 0, 0, 0, 0xff0000)
|    };
| -  const unsigned int n = sizeof (utf8_classes) / sizeof (utf8_classes[0]);
| +  int n = sizeof utf8_classes / sizeof *utf8_classes;

Why are you throwing away const here?

Other than this, I think internally too, I'd prefer that you

	1,$s/ptrdiff_t/ssize_t/g

(and fix any printf calls). It just feels like an abuse of the type,
which is for representing differences between pointers, and not
regular large signed integers.

However, I'm not going to insist about it internally, whereas I would
object strongly to the use of ptrdiff_t in the API.

Thanks!
Arnold From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 12 02:48:03 2019 Received: (at 34951) by debbugs.gnu.org; 12 Dec 2019 07:48:03 +0000 Received: from localhost ([127.0.0.1]:59001 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifJCd-00061M-Lb for submit@debbugs.gnu.org; Thu, 12 Dec 2019 02:48:03 -0500 Received: from freefriends.org ([96.88.95.60]:43788) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifJCc-00061D-Hd for 34951@debbugs.gnu.org; Thu, 12 Dec 2019 02:48:02 -0500 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBC7lvlc032273 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 12 Dec 2019 00:47:58 -0700 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBC7lv1V032272; Thu, 12 Dec 2019 00:47:57 -0700 From: arnold@skeeve.com Message-Id: <201912120747.xBC7lv1V032272@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Thu, 12 Dec 2019 00:47:57 -0700 To: noritnk@kcn.ne.jp, eggert@cs.ucla.edu, arnold@skeeve.com, 34951@debbugs.gnu.org Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> In-Reply-To: <201912120731.xBC7V6gD031767@freefriends.org> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: arnold@skeeve.com, bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) arnold@skeeve.com wrote: > Other than this, I think internally too, I'd prefer that you > > 1,$s/ptrdiff_t/ssize_t/g I did this, just to see. gawk passes its test suite, both in 64- and 32-bit mode. FWIW. Thanks, Arnold From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 12 17:26:56 2019 Received: (at 34951) by debbugs.gnu.org; 12 Dec 2019 22:26:56 +0000 Received: from localhost ([127.0.0.1]:32774 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifWv9-00024c-Qz for submit@debbugs.gnu.org; Thu, 12 Dec 2019 17:26:56 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:35642) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifWv7-00024P-4v for 34951@debbugs.gnu.org; Thu, 12 Dec 2019 17:26:54 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 7022B160080; Thu, 12 Dec 2019 14:26:47 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id gmcafTyd0Qhb; Thu, 12 Dec 2019 14:26:46 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id B593E16027C; Thu, 12 Dec 2019 14:26:46 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id GK0R_M__goPm; Thu, 12 Dec 2019 14:26:46 -0800 (PST) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 98981160080; Thu, 12 Dec 2019 14:26:46 -0800 (PST) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: arnold@skeeve.com, noritnk@kcn.ne.jp, 34951@debbugs.gnu.org References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> 
<75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> Date: Thu, 12 Dec 2019 14:26:46 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <201912120731.xBC7V6gD031767@freefriends.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) On 12/11/19 11:31 PM, arnold@skeeve.com wrote: > 1,$s/ptrdiff_t/ssize_t/g ssize_t can be narrower than ptrdiff_t, so it's not a good type to use for this notion. Its original motivation was "the type that 'read' returns", and on systems where 'read' can return at most INT_MAX, ssize_t can be 32 bits even if size_t is 64 bits. 
From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 13 03:10:10 2019 Received: (at 34951) by debbugs.gnu.org; 13 Dec 2019 08:10:10 +0000 Received: from localhost ([127.0.0.1]:32966 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifg1a-0003SO-Ig for submit@debbugs.gnu.org; Fri, 13 Dec 2019 03:10:10 -0500 Received: from freefriends.org ([96.88.95.60]:56862) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifg1Y-0003SE-Kj for 34951@debbugs.gnu.org; Fri, 13 Dec 2019 03:10:09 -0500 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBD89whI021731 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 13 Dec 2019 01:09:58 -0700 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBD89uUG021729; Fri, 13 Dec 2019 01:09:56 -0700 From: arnold@skeeve.com Message-Id: <201912130809.xBD89uUG021729@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Fri, 13 Dec 2019 01:09:56 -0700 To: noritnk@kcn.ne.jp, eggert@cs.ucla.edu, arnold@skeeve.com, 34951@debbugs.gnu.org Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> In-Reply-To: <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: bug-gnulib@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -3.3 (---)

Hi Paul.

Paul Eggert wrote:

> On 12/11/19 11:31 PM, arnold@skeeve.com wrote:
>
> > 1,$s/ptrdiff_t/ssize_t/g
>
> ssize_t can be narrower than ptrdiff_t, so it's not a good type to use
> for this notion. Its original motivation was "the type that 'read'
> returns", and on systems where 'read' can return at most INT_MAX,
> ssize_t can be 32 bits even if size_t is 64 bits.

In practice, how many systems are there where ssize_t is 32 bits and
size_t is 64? If that number is <= 5 then I wouldn't worry about
using ssize_t.

In any case, as I said, I can live with ptrdiff_t in the implementation,
even though I don't like it that much. (A nice block comment at the
top of dfa.c explaining why ptrdiff_t is used would be appropriate.)

But I really don't want ptrdiff_t in the API.

Thanks,

Arnold

From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 13 07:08:37 2019
Received: (at 34951) by debbugs.gnu.org; 13 Dec 2019 12:08:37 +0000
Received: from localhost ([127.0.0.1]:33118 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifjkK-0003qk-TA for submit@debbugs.gnu.org; Fri, 13 Dec 2019 07:08:37 -0500
Received: from freefriends.org ([96.88.95.60]:58984) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifjkJ-0003qd-TF for 34951@debbugs.gnu.org; Fri, 13 Dec 2019 07:08:36 -0500
X-Envelope-From: arnold@skeeve.com
Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBDC8Sac032710 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 13 Dec 2019 05:08:28 -0700
Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBDC8QLo032709; Fri, 13 Dec 2019 05:08:26 -0700
From: arnold@skeeve.com
Message-Id: <201912131208.xBDC8QLo032709@freefriends.org>
X-Authentication-Warning: frenzy.freefriends.org: arnold
set sender to arnold@skeeve.com using -f Date: Fri, 13 Dec 2019 05:08:26 -0700 To: noritnk@kcn.ne.jp, eggert@cs.ucla.edu, arnold@skeeve.com, 34951@debbugs.gnu.org Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> In-Reply-To: <201912130809.xBD89uUG021729@freefriends.org> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: bug-gnulib@gnu.org, jim@meyering.net X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) arnold@skeeve.com wrote: > But I really don't want ptrdiff_t in the API. I see that Paul has made the change to the API over my objections. Jim --- do you have an opinion on this? 
Thanks, Arnold From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 13 12:53:22 2019 Received: (at 34951) by debbugs.gnu.org; 13 Dec 2019 17:53:22 +0000 Received: from localhost ([127.0.0.1]:34612 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifp7y-0006IR-GC for submit@debbugs.gnu.org; Fri, 13 Dec 2019 12:53:22 -0500 Received: from mail-wr1-f66.google.com ([209.85.221.66]:43147) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifp7v-0006IE-RZ for 34951@debbugs.gnu.org; Fri, 13 Dec 2019 12:53:21 -0500 Received: by mail-wr1-f66.google.com with SMTP id d16so337989wre.10 for <34951@debbugs.gnu.org>; Fri, 13 Dec 2019 09:53:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=oII2DAOcpoqa0v2twFF1fpdt9182MErbN9oYY0+S1Yc=; b=oZSjX83sYVqWVmnFyY4tBv1d22tpsfjH/n/X4qywnHyGrsAbaPRUkBw2zxi12ybwsC 2jARHXOnr3abQtzD1pbotY9fKzFz9D9fl/ZdkoSBPDeEC2hBZJclJsSLZ6ONOrTw+WHW jVFp8VOPx/67tGogls+xptDOUzPIJMKGvdvX+YJHtLhtT0ZtG8Ncf+swIxYkwmPnxf7U v3944yLkoO6IJN1u95Yzdhm+azL8snG53iLyDIAzhbQCHpP+yakoaAJY4SgFjNyB3XHv cAO1gWkxE/FuUzJp/Mjivl5s9QcHy2xkk2Aae20J3gnM0/mTSYy6xumbv96t33164SY1 AuwQ== X-Gm-Message-State: APjAAAWgnB4PXGc68r0ahk5VMZOfd53o18QQEACDjLFmhyOBwezw7xy/ qurTUMjNWZQE3qC7SEHSLgKltu73pFp2MKuziuM= X-Google-Smtp-Source: APXvYqwQ9N7L/5ZGPjPyj03swHoWW8e8PDL72Ld2K1CUdr9zLmNf6S8WKeHSqF6wi4wtv18cXPI1Sd9z1xn67LiI1i4= X-Received: by 2002:a5d:4386:: with SMTP id i6mr14221502wrq.63.1576259594008; Fri, 13 Dec 2019 09:53:14 -0800 (PST) MIME-Version: 1.0 References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> In-Reply-To: 
<201912131208.xBDC8QLo032709@freefriends.org> From: Jim Meyering Date: Fri, 13 Dec 2019 09:53:01 -0800 Message-ID: Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: Aharon Robbins Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 34951 Cc: Paul Eggert , "bug-gnulib@gnu.org List" , Norihiro Tanaka , 34951@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.5 (/) On Fri, Dec 13, 2019 at 4:08 AM wrote: > arnold@skeeve.com wrote: > > > But I really don't want ptrdiff_t in the API. > > I see that Paul has made the change to the API over my objections. > > Jim --- do you have an opinion on this? Hi Aharon, I used to feel the way you do. However, given the way compilers and static/dynamic analysis have evolved, I have come around to Paul's point of view. It still feels "wrong" in some sense, but using the signed type makes the code more robust, enabling automatic detection/avoidance of more bugs than with unsigned types. Thus, a net improvement. Paul, can you point to a link that lists the benefits/tradeoffs? If I had such a link handy, I would have provided it here. 
Jim From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 13 15:01:09 2019 Received: (at 34951) by debbugs.gnu.org; 13 Dec 2019 20:01:09 +0000 Received: from localhost ([127.0.0.1]:34674 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifr7d-0002RS-Gw for submit@debbugs.gnu.org; Fri, 13 Dec 2019 15:01:09 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:51462) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ifr7b-0002R5-20 for 34951@debbugs.gnu.org; Fri, 13 Dec 2019 15:01:08 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 68DFA160549; Fri, 13 Dec 2019 12:01:01 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id Vaj09CIN7oYi; Fri, 13 Dec 2019 12:00:59 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 82D2B160514; Fri, 13 Dec 2019 12:00:59 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id grRCVS-lJTZ1; Fri, 13 Dec 2019 12:00:59 -0800 (PST) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 5230D1604FF; Fri, 13 Dec 2019 12:00:59 -0800 (PST) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: Jim Meyering , Aharon Robbins References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: 
<20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> Date: Fri, 13 Dec 2019 12:00:58 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, Gnulib bugs , Norihiro Tanaka X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) >> I see that Paul has made the change to the API over my objections. I made the change while responding to Bruno's objections, but before seeing yours. Ooops. Sorry about that. However, I hope the followup emails have addressed your comments, at least to some extent. > Paul, can you point to a link that lists the benefits/tradeoffs? If I > had such a link handy, I would have provided it here. Avoiding unsigned types for indexes and sizes seems to be a growing movement. Admittedly there are arguments for unsigned, but these arguments are getting weaker with time. Here are a couple of links, the first for C and the second for C++: https://www.gnu.org/software/emacs/manual/html_node/elisp/C-Integer-Types.html http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf As for ssize_t vs ptrdiff_t: ssize_t is less central to the C language (ptrdiff_t is in the C standard but ssize_t is not). And ssize_t is less convenient: for example, there's no simple, portable way to printf an ssize_t value, as there is with "%td" and ptrdiff_t. So there are technical reasons for preferring ptrdiff_t to ssize_t for this sort of thing (even though "ssize_t" is a shorter and better name). 
This is why Emacs, other parts of Gnulib, and other Gnu applications have used ptrdiff_t instead of ssize_t for this sort of thing. From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 15 03:14:43 2019 Received: (at 34951) by debbugs.gnu.org; 15 Dec 2019 08:14:43 +0000 Received: from localhost ([127.0.0.1]:35975 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1igP35-0006Z5-BT for submit@debbugs.gnu.org; Sun, 15 Dec 2019 03:14:43 -0500 Received: from freefriends.org ([96.88.95.60]:52846) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1igP33-0006Yx-CR for 34951@debbugs.gnu.org; Sun, 15 Dec 2019 03:14:42 -0500 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBF8EQW2005060 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 15 Dec 2019 01:14:27 -0700 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBF8EOXW005059; Sun, 15 Dec 2019 01:14:24 -0700 From: arnold@skeeve.com Message-Id: <201912150814.xBF8EOXW005059@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Sun, 15 Dec 2019 01:14:24 -0700 To: jim@meyering.net, eggert@cs.ucla.edu, arnold@skeeve.com Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> <20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> In-Reply-To: <20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org, noritnk@kcn.ne.jp X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) OK. I skimmed the links. But why not write the code to say what we mean? For example: #include typedef int64_t dfa_size_t; extern void dfaparse (char const *, dfa_size_t, struct dfa *); extern void dfacomp (char const *, dfa_size_t, struct dfa *, bool); bool allow_nl, dfa_size_t *count, bool *backref); Using ptrdiff_t directly simply because it is defined to be the largest signed integer remains ugly (and Paul has already moved to a typedef in the implementation.) int64_t is just as standard as ptrdiff_t and just as clear. Thanks, Arnold Paul Eggert wrote: > >> I see that Paul has made the change to the API over my objections. > > I made the change while responding to Bruno's objections, but before > seeing yours. Ooops. Sorry about that. However, I hope the followup > emails have addressed your comments, at least to some extent. > > > Paul, can you point to a link that lists the benefits/tradeoffs? If I > > had such a link handy, I would have provided it here. > > Avoiding unsigned types for indexes and sizes seems to be a growing > movement. Admittedly there are arguments for unsigned, but these > arguments are getting weaker with time. Here are a couple of links, the > first for C and the second for C++: > > https://www.gnu.org/software/emacs/manual/html_node/elisp/C-Integer-Types.html > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0.pdf > > As for ssize_t vs ptrdiff_t: ssize_t is less central to the C language > (ptrdiff_t is in the C standard but ssize_t is not). 
And ssize_t is less > convenient: for example, there's no simple, portable way to printf an > ssize_t value, as there is with "%td" and ptrdiff_t. So there are > technical reasons for preferring ptrdiff_t to ssize_t for this sort of > thing (even though "ssize_t" is a shorter and better name). This is why > Emacs, other parts of Gnulib, and other Gnu applications have used > ptrdiff_t instead of ssize_t for this sort of thing. From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 16 04:56:22 2019 Received: (at 34951) by debbugs.gnu.org; 16 Dec 2019 09:56:23 +0000 Received: from localhost ([127.0.0.1]:37921 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ign70-0005Iz-L2 for submit@debbugs.gnu.org; Mon, 16 Dec 2019 04:56:22 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:33032) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ign6x-0005Ik-SM for 34951@debbugs.gnu.org; Mon, 16 Dec 2019 04:56:20 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 1E2C01600FC; Mon, 16 Dec 2019 01:56:14 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id iwD4f3uTtqnJ; Mon, 16 Dec 2019 01:56:13 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 6D0A91605CA; Mon, 16 Dec 2019 01:56:13 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id BDGSEVE_XzVz; Mon, 16 Dec 2019 01:56:13 -0800 (PST) Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com [23.242.74.103]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 35D401600FC; Mon, 16 Dec 2019 01:56:13 -0800 (PST) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: arnold@skeeve.com, jim@meyering.net 
References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> <20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> <201912150814.xBF8EOXW005059@freefriends.org> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <206e5f61-2c80-d6f0-c4a7-b7365c0b523d@cs.ucla.edu> Date: Mon, 16 Dec 2019 01:56:12 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <201912150814.xBF8EOXW005059@freefriends.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org, noritnk@kcn.ne.jp X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) On 12/15/19 12:14 AM, arnold@skeeve.com wrote: > int64_t is just as standard as ptrdiff_t and just as clear. Actually, int64_t is optional (as even C18 and POSIX-2018 do not require it), whereas ptrdiff_t has been required since C89. More importantly, int64_t would be overkill on 32-bit GNU/Linux, whereas ptrdiff_t suffices and is typically more efficient. (Besides, what would we do if 72-bit pointers came back into vogue? 
:-) From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 16 05:12:29 2019 Received: (at 34951) by debbugs.gnu.org; 16 Dec 2019 10:12:29 +0000 Received: from localhost ([127.0.0.1]:37930 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ignMb-0005m1-F7 for submit@debbugs.gnu.org; Mon, 16 Dec 2019 05:12:29 -0500 Received: from freefriends.org ([96.88.95.60]:38198) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ignMZ-0005ls-GV for 34951@debbugs.gnu.org; Mon, 16 Dec 2019 05:12:27 -0500 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBGACCld023524 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 16 Dec 2019 03:12:12 -0700 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBGACBmm023523; Mon, 16 Dec 2019 03:12:11 -0700 From: arnold@skeeve.com Message-Id: <201912161012.xBGACBmm023523@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Mon, 16 Dec 2019 03:12:11 -0700 To: jim@meyering.net, eggert@cs.ucla.edu, arnold@skeeve.com Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> <20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> <201912150814.xBF8EOXW005059@freefriends.org> <206e5f61-2c80-d6f0-c4a7-b7365c0b523d@cs.ucla.edu> In-Reply-To: <206e5f61-2c80-d6f0-c4a7-b7365c0b523d@cs.ucla.edu> User-Agent: Heirloom mailx 12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 
(--) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org, noritnk@kcn.ne.jp X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Paul Eggert wrote: > On 12/15/19 12:14 AM, arnold@skeeve.com wrote: > > > int64_t is just as standard as ptrdiff_t and just as clear. > > Actually, int64_t is optional (as even C18 and POSIX-2018 do not require it), > whereas ptrdiff_t has been required since C89. More importantly, int64_t would > be overkill on 32-bit GNU/Linux, whereas ptrdiff_t suffices and is typically > more efficient. > > (Besides, what would we do if 72-bit pointers came back into vogue? :-) Fine. What about typedef ptrdiff_t dfa_size_t ? From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 19 22:18:25 2019 Received: (at 34951) by debbugs.gnu.org; 20 Dec 2019 03:18:25 +0000 Received: from localhost ([127.0.0.1]:45817 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ii8o5-0004ah-41 for submit@debbugs.gnu.org; Thu, 19 Dec 2019 22:18:25 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:48068) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ii8o3-0004aT-8X for 34951@debbugs.gnu.org; Thu, 19 Dec 2019 22:18:23 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 64B2616027E; Thu, 19 Dec 2019 19:18:16 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id Ogv5zpe6uKWZ; Thu, 19 Dec 2019 19:18:15 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 9F37216016A; Thu, 19 Dec 2019 19:18:15 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu 
([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id osvktYHLUJH5; Thu, 19 Dec 2019 19:18:15 -0800 (PST) Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com [23.242.74.103]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 6AF6016027E; Thu, 19 Dec 2019 19:18:15 -0800 (PST) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher To: arnold@skeeve.com, jim@meyering.net References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> <20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> <201912150814.xBF8EOXW005059@freefriends.org> <206e5f61-2c80-d6f0-c4a7-b7365c0b523d@cs.ucla.edu> <201912161012.xBGACBmm023523@freefriends.org> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <948ed75f-f2c0-e3e4-01c5-1425bd54520a@cs.ucla.edu> Date: Thu, 19 Dec 2019 19:18:15 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <201912161012.xBGACBmm023523@freefriends.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org, noritnk@kcn.ne.jp X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) On 12/16/19 2:12 AM, arnold@skeeve.com wrote: > What about > > typedef ptrdiff_t dfa_size_t That declaration would imply that the type is specific to DFAs. 
However, the type is used (with exactly the same meaning) in a lot of other places. This is why I used the more-generic name "idx_t" internally in dfa.c. From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 19 22:41:57 2019 Received: (at 34951-done) by debbugs.gnu.org; 20 Dec 2019 03:41:57 +0000 Received: from localhost ([127.0.0.1]:45822 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ii9Ar-0005A7-6a for submit@debbugs.gnu.org; Thu, 19 Dec 2019 22:41:57 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:50476) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ii9Ap-00059p-0X for 34951-done@debbugs.gnu.org; Thu, 19 Dec 2019 22:41:56 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 0F23516016A; Thu, 19 Dec 2019 19:41:49 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id BJJMxqqam9oh; Thu, 19 Dec 2019 19:41:48 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 597F2160231; Thu, 19 Dec 2019 19:41:48 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 73QGXDKCnJyW; Thu, 19 Dec 2019 19:41:48 -0800 (PST) Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com [23.242.74.103]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3706F16016A; Thu, 19 Dec 2019 19:41:48 -0800 (PST) Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher From: Paul Eggert To: Norihiro Tanaka , 34951-done@debbugs.gnu.org References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> Organization: UCLA Computer Science Department Message-ID: 
<3ff805f9-89dc-fb26-29c7-acf7f25ce895@cs.ucla.edu> Date: Thu, 19 Dec 2019 19:41:47 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) On 12/11/19 3:25 PM, Paul Eggert wrote: > On 3/22/19 7:49 PM, Norihiro Tanaka wrote: >> Missing a patch for dfa. Re-send correct patch file. > > Thanks, I installed the DFA-relevant parts of your proposed fix into Gnulib. > (The grep parts still need doing.) I finally got around to reviewing the grep parts, and installed them into 'grep' master. Thanks again for the fix, and sorry about the delay. Closing the bug report, as the original bug has been fixed (though we can still talk about what name to give ptrdiff_t, in bug-gnulib perhaps). I followed up with this NEWS entry: A performance bug has been fixed for patterns like '01.2' that cause grep to reorder tokens internally. 
[Bug#34951 introduced in grep 3.2] From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 20 05:35:36 2019 Received: (at 34951) by debbugs.gnu.org; 20 Dec 2019 10:35:36 +0000 Received: from localhost ([127.0.0.1]:45939 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1iiFd9-000712-Tv for submit@debbugs.gnu.org; Fri, 20 Dec 2019 05:35:36 -0500 Received: from freefriends.org ([96.88.95.60]:44254) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1iiFd8-00070u-BW for 34951@debbugs.gnu.org; Fri, 20 Dec 2019 05:35:35 -0500 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (freefriends.org [96.88.95.60]) by freefriends.org (8.14.7/8.14.7) with ESMTP id xBKAZLMs015870 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 20 Dec 2019 03:35:21 -0700 Received: (from arnold@localhost) by freefriends.org (8.14.7/8.14.7/Submit) id xBKAZJcx015869; Fri, 20 Dec 2019 03:35:19 -0700 From: arnold@skeeve.com Message-Id: <201912201035.xBKAZJcx015869@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Fri, 20 Dec 2019 03:35:19 -0700 To: jim@meyering.net, eggert@cs.ucla.edu, arnold@skeeve.com Subject: Re: bug#34951: [PATCH] grep: a kwset matcher not work in a grep matcher References: <20190323080618.E6EB.27F6AC2D@kcn.ne.jp> <20190323114902.E6F6.27F6AC2D@kcn.ne.jp> <75091466-e105-c35c-fcd6-19ccca325914@cs.ucla.edu> <201912120731.xBC7V6gD031767@freefriends.org> <38194689-56ba-41f4-3810-50df1ff9019c@cs.ucla.edu> <201912130809.xBD89uUG021729@freefriends.org> <201912131208.xBDC8QLo032709@freefriends.org> <20db9cb5-da09-fe26-f7fc-884fc194daaa@cs.ucla.edu> <201912150814.xBF8EOXW005059@freefriends.org> <206e5f61-2c80-d6f0-c4a7-b7365c0b523d@cs.ucla.edu> <201912161012.xBGACBmm023523@freefriends.org> <948ed75f-f2c0-e3e4-01c5-1425bd54520a@cs.ucla.edu> In-Reply-To: <948ed75f-f2c0-e3e4-01c5-1425bd54520a@cs.ucla.edu> User-Agent: Heirloom mailx 
12.5 7/5/10 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34951 Cc: 34951@debbugs.gnu.org, bug-gnulib@gnu.org, noritnk@kcn.ne.jp X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Paul Eggert wrote: > On 12/16/19 2:12 AM, arnold@skeeve.com wrote: > > What about > > > > typedef ptrdiff_t dfa_size_t > > That declaration would imply that the type is specific to DFAs. However, the > type is used (with exactly the same meaning) in a lot of other places. This is > why I used the more-generic name "idx_t" internally in dfa.c. I give up. Leave it ptrdiff_t. I may submit comment changes for dfa.h later. Arnold From unknown Wed Jun 18 23:07:13 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 17 Jan 2020 12:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator