From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Wed, 09 Apr 2014 13:56:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: 17229@debbugs.gnu.org X-Debbugs-Original-To: bug-grep@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.139705173010378 (code B ref -1); Wed, 09 Apr 2014 13:56:02 +0000 Received: (at submit) by debbugs.gnu.org; 9 Apr 2014 13:55:30 +0000 Received: from localhost ([127.0.0.1]:38696 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WXsyO-0002hI-P6 for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:29 -0400 Received: from eggs.gnu.org ([208.118.235.92]:45960) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WXsyH-0002ge-K1 for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WXsy4-0006Rt-Vr for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:16 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:46956) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WXsy4-0006Rp-Sf for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:08 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47632) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WXsxy-0000by-Jp for bug-grep@gnu.org; Wed, 09 Apr 2014 09:55:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WXsxr-00067l-Ou for bug-grep@gnu.org; Wed, 09 Apr 2014 09:55:02 -0400 Received: from pbsg500.nifty.com ([202.248.238.70]:34749) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WXsxr-00066W-9F for bug-grep@gnu.org; Wed, 09 Apr 2014 09:54:55 -0400 Received: from [10.120.1.47] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) (authenticated) by pbsg500.nifty.com with ESMTP id s39Dsfv5017066 for ; Wed, 9 Apr 2014 22:54:42 +0900 X-Nifty-SrcIP: [118.21.128.66] Date: Wed, 09 Apr 2014 22:54:43 +0900 From: Norihiro Tanaka Message-Id: <20140409225442.7863.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_5343FA290000000010DE_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) --------_5343FA290000000010DE_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit memchr() of glibc is faster than seeking by delta1 on some platforms. So, when there is no chance to match for a while, use it on them. Speed-up about 3x in best case, 10% in normal case by this patch. It's still available only for x86 and x86-64 platform. Norihiro --------_5343FA290000000010DE_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA5MGFjMjU5NzUxMWRkOGVkNDRlZWU2MThkZjM5YzFlZjk0N2VhMTI5IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBUaHUsIDMgQXByIDIwMTQgMjI6NTc6MDkgKzA5MDAKU3ViamVjdDogW1BBVENIIDIvMl0g Z3JlcDogc3BlZWQtdXAgYnkgdXNpbmcgbWVtY2hyKCkgaW4gQm95ZXItTW9vcmUgc2VhcmNoaW5n CgptZW1jaHIoKSBvZiBnbGliYyBpcyBmYXN0ZXIgdGhhbiBzZWVraW5nIGJ5IGRlbHRhMSBvbiBz b21lIHBsYXRmb3Jtcy4KV2hlbiB0aGVyZSBpcyBubyBjaGFuY2UgdG8gbWF0Y2ggZm9yIGEgd2hp bGUsIHVzZSBpdCBvbiB0aGVtLgoqIHNyYy9rd3NldC5jIChibWV4ZWMpOiBVc2UgbWVtY2hyKCkg aW4gQm95ZXItTW9vcmUgc2VhcmNoaW5nLgotLS0KIHNyYy9rd3NldC5jIHwgMjMgKysrKysrKysr KysrKysrKysrKysrLS0KIDEgZmlsZSBjaGFuZ2VkLCAyMSBpbnNlcnRpb25zKCspLCAyIGRlbGV0 aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9rd3NldC5jIGIvc3JjL2t3c2V0LmMKaW5kZXggOGQ4 YWEwNy4uYzk1YmM1NiAxMDA2NDQKLS0tIGEvc3JjL2t3c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMK QEAgLTUzMSw4ICs1MzEsMjcgQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0 ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBk OwogICAgICAgICAgICAgaWYgKGQgPT0gMCkKICAgICAgICAgICAgICAgZ290byBmb3VuZDsKLSAg ICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9IGQx W1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICAvKiBtZW1jaGFyKCkgb2YgZ2xpYmMg aXMgZmFzdGVyIHRoYW4gc2Vla2luZyBieSBkZWx0YTEgb24KKyAgICAgICAgICAgICAgIHNvbWUg cGxhdGZvcm1zLiAgV2hlbiB0aGVyZSBpcyBubyBjaGFuY2UgdG8gbWF0Y2ggZm9yIGEKKyAgICAg ICAgICAgICAgIHdoaWxlLCB1c2UgaXQgb24gdGhlbS4gICovCisjaWYgZGVmaW5lZChfX0dMSUJD X18pICYmIChkZWZpbmVkKF9faTM4Nl9fKSB8fCBkZWZpbmVkKF9feDg2XzY0X18pKQorICAgICAg ICAgICAgaWYgKCF0cmFucykKKyAgICAgICAgICAgICAgeworICAgICAgICAgICAgICAgIHRwID0g bWVtY2hyICh0cCAtIDEsIGdjMSwgc2l6ZSArIHRleHQgLSB0cCArIDEpOworICAgICAgICAgICAg ICAgIGlmICh0cCkKKyAgICAgICAgICAgICAgICAgIHsKKyAgICAgICAgICAgICAgICAgICAgKyt0 cDsKKyAgICAgICAgICAgICAgICAgICAgZ290byBmb3VuZDsKKyAgICAgICAgICAgICAgICAgIH0K KyAgICAgICAgICAgICAgICBlbHNlCisgICAgICAgICAgICAgICAgICByZXR1cm4gLTE7CisgICAg ICAgICAgICAgIH0KKyAgICAgICAgICAgIGVsc2UKKyNlbmRpZgorICAgICAgICAgICAgICB7Cisg ICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICAg ICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICAgIH0KICAgICAgICAg ICB9CiAgICAgICAgIGJyZWFrOwogICAgICAgZm91bmQ6Ci0tIAoxLjkuMQoK --------_5343FA290000000010DE_MULTIPART_MIXED_-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 10 Apr 2014 00:36:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: Norihiro Tanaka , 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139709014917193 (code B ref 17229); Thu, 10 Apr 2014 00:36:02 +0000 Received: (at 17229) by debbugs.gnu.org; 10 Apr 2014 00:35:49 +0000 Received: from localhost ([127.0.0.1]:39438 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WY2y4-0004TF-OI for submit@debbugs.gnu.org; Wed, 09 Apr 2014 20:35:49 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:40554) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WY2y2-0004Sz-8J for 17229@debbugs.gnu.org; Wed, 09 Apr 2014 20:35:47 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 3607239E8017; Wed, 9 Apr 2014 17:35:40 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HCwug0lreQrs; Wed, 9 Apr 2014 17:35:31 -0700 (PDT) Received: from penguin.cs.ucla.edu (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 9CA5C39E8008; Wed, 9 Apr 2014 17:35:31 -0700 (PDT) Message-ID: <5345E753.6040108@cs.ucla.edu> Date: Wed, 09 Apr 2014 17:35:31 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140409225442.7863.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140409225442.7863.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------040905010709030803080706" X-Spam-Score: -2.6 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.6 (--) This is a multi-part message in MIME format. --------------040905010709030803080706 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 04/09/2014 06:54 AM, Norihiro Tanaka wrote: > memchr() of glibc is faster than seeking by delta1 on some platforms. > So, when there is no chance to match for a while, use it on them. > > Speed-up about 3x in best case, 10% in normal case by this patch. > > It's still available only for x86 and x86-64 platform. > Thanks, I have some questions about this one. What benchmark did you use to time this? THe patch refers to a variable called 'trans' that is not in the current savannah git master. Is there some other patch that establishes this variable, a patch that is a prerequisite for this one? If trans is a cached copy of kwset->trans, isn't that guaranteed to be NULL here? What is delta1? It's mentioned in a comment but not in the code. It'd be simpler to use memchr on all platforms; is there a major performance downside to that? For example, what about the attached patch instead? It also simplifies things a bit to avoid a label and its associated gotos, under the assumption that trans is NULL here. --------------040905010709030803080706 Content-Type: text/x-patch; name="memchr.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="memchr.diff" diff --git a/src/kwset.c b/src/kwset.c index d5db12f..9163482 100644 --- a/src/kwset.c +++ b/src/kwset.c @@ -494,29 +494,32 @@ bmexec (kwset_t kwset, char const *text, size_t size) /* Significance of 12: 1 (initial offset) + 10 (skip loop) + 1 (md2). */ if (size > 12 * len) /* 11 is not a bug, the initial offset happens only once. */ - for (ep = text + size - 11 * len;;) + for (ep = text + size - 11 * len; tp <= ep; ) { - while (tp <= ep) + d = d1[U(tp[-1])], tp += d; + d = d1[U(tp[-1])], tp += d; + if (d != 0) { d = d1[U(tp[-1])], tp += d; d = d1[U(tp[-1])], tp += d; - if (d == 0) - goto found; - d = d1[U(tp[-1])], tp += d; - d = d1[U(tp[-1])], tp += d; - d = d1[U(tp[-1])], tp += d; - if (d == 0) - goto found; - d = d1[U(tp[-1])], tp += d; - d = d1[U(tp[-1])], tp += d; - d = d1[U(tp[-1])], tp += d; - if (d == 0) - goto found; - d = d1[U(tp[-1])], tp += d; d = d1[U(tp[-1])], tp += d; + if (d != 0) + { + d = d1[U(tp[-1])], tp += d; + d = d1[U(tp[-1])], tp += d; + d = d1[U(tp[-1])], tp += d; + if (d != 0) + { + /* It's typically faster to use memchr when there is + no chance to match for a while. */ + tp = memchr (tp - 1, gc1, text + size - (tp - 1)); + if (! tp) + return -1; + tp++; + } + } } - break; - found: + d = len; while (true) { --------------040905010709030803080706-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 10 Apr 2014 11:38:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13971298721851 (code B ref 17229); Thu, 10 Apr 2014 11:38:02 +0000 Received: (at 17229) by debbugs.gnu.org; 10 Apr 2014 11:37:52 +0000 Received: from localhost ([127.0.0.1]:39621 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYDIl-0000Tk-BG for submit@debbugs.gnu.org; Thu, 10 Apr 2014 07:37:52 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:36064) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYDIh-0000TX-OM for 17229@debbugs.gnu.org; Thu, 10 Apr 2014 07:37:49 -0400 Received: from imp03 (mailgw7.kcn.ne.jp [61.86.15.238]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id E68288030B for <17229@debbugs.gnu.org>; Thu, 10 Apr 2014 20:37:43 +0900 (JST) Received: from mail04.kcn.ne.jp ([61.86.6.183]) by imp03 with bizsmtp id oBdj1n00K3wvxAM01Bdjz3; Thu, 10 Apr 2014 20:37:43 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.33] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail04.kcn.ne.jp (Postfix) with ESMTPA id 8A99D1290022; Thu, 10 Apr 2014 20:37:43 +0900 (JST) Date: Thu, 10 Apr 2014 20:37:45 +0900 From: Norihiro Tanaka In-Reply-To: <5345E753.6040108@cs.ucla.edu> References: <20140409225442.7863.27F6AC2D@kcn.ne.jp> <5345E753.6040108@cs.ucla.edu> Message-Id: <20140410203744.5CF7.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.6 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) Hi Paul, Hi Paul, Thanks for the review for the patch. > What benchmark did you use to time this? I measured below. $ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >k Before: $ time -p env LANG=C src/grep jk k real 1.53 user 1.21 sys 0.31 After: $ time -p env LANG=C src/grep jk k real 0.33 user 0.03 sys 0.29 > Is there some other patch that establishes this variable, a patch that > is a prerequisite for this one? The patch requires below. bug#17230 [PATCH 1/2] grep: may also use Boyer-Moore algorithm for case-insensitive matching > What is delta1? It's mentioned in a comment but not in the code. It means `d1' in kwset.c:bmexec. > It'd be simpler to use memchr on all platforms; > is there a major performance downside to that? Yes. As far as I was confirmed, it's slow in HP-UX on Itanium and Solaris on SPARC. I think that that it depends on the implementation of memchr() and the rate of the increment instruction for the `add' instruction on the platform. Norihiro From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 10 Apr 2014 21:12:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: Norihiro Tanaka Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13971643043381 (code B ref 17229); Thu, 10 Apr 2014 21:12:01 +0000 Received: (at 17229) by debbugs.gnu.org; 10 Apr 2014 21:11:44 +0000 Received: from localhost ([127.0.0.1]:44885 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYMG6-0000sS-Sw for submit@debbugs.gnu.org; Thu, 10 Apr 2014 17:11:43 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:36189) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYMG3-0000sE-TL for 17229@debbugs.gnu.org; Thu, 10 Apr 2014 17:11:40 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id C8737A60004; Thu, 10 Apr 2014 14:11:33 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3jc7DMdAHul1; Thu, 10 Apr 2014 14:11:25 -0700 (PDT) Received: from penguin.cs.ucla.edu (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 4519839E8019; Thu, 10 Apr 2014 14:11:25 -0700 (PDT) Message-ID: <534708F6.8090708@cs.ucla.edu> Date: Thu, 10 Apr 2014 14:11:18 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140409225442.7863.27F6AC2D@kcn.ne.jp> <5345E753.6040108@cs.ucla.edu> <20140410203744.5CF7.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140410203744.5CF7.27F6AC2D@kcn.ne.jp> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.9 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.9 (--) On 04/10/2014 04:37 AM, Norihiro Tanaka wrote: > The patch requires below. bug#17230 [PATCH 1/2] grep: may also use > Boyer-Moore algorithm for case-insensitive matching OK, thanks, I'll work on Bug #17230 first, before worrying about 17229. > It'd be simpler to use memchr on all platforms; > is there a major performance downside to that? > Yes. As far as I was confirmed, it's slow in HP-UX on Itanium and > Solaris on SPARC. I think that that it depends on the implementation of > memchr() and the rate of the increment instruction for the `add' > instruction on the platform. If it's *way* slower with memchr and HP-UX/Itanium or Solaris/SPARC we may need to complicate the code, but if it's just a bit slower then it's probably better to use the simpler implementation. There's a tradeoff between simplified maintenance and performance on non-GNU hosts; sometimes the former is more important than the latter, and this may be one of those times. From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Eric Blake Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 10 Apr 2014 21:34:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: Paul Eggert , Norihiro Tanaka Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13971656085602 (code B ref 17229); Thu, 10 Apr 2014 21:34:01 +0000 Received: (at 17229) by debbugs.gnu.org; 10 Apr 2014 21:33:28 +0000 Received: from localhost ([127.0.0.1]:44896 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYMb9-0001SI-Hz for submit@debbugs.gnu.org; Thu, 10 Apr 2014 17:33:27 -0400 Received: from mx1.redhat.com ([209.132.183.28]:23223) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYMb5-0001S4-Sm for 17229@debbugs.gnu.org; Thu, 10 Apr 2014 17:33:25 -0400 Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s3ALXAM0032572 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 10 Apr 2014 17:33:11 -0400 Received: from [10.3.113.140] (ovpn-113-140.phx2.redhat.com [10.3.113.140]) by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s3ALXAfv031148; Thu, 10 Apr 2014 17:33:10 -0400 Message-ID: <53470E16.3060500@redhat.com> Date: Thu, 10 Apr 2014 15:33:10 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140409225442.7863.27F6AC2D@kcn.ne.jp> <5345E753.6040108@cs.ucla.edu> <20140410203744.5CF7.27F6AC2D@kcn.ne.jp> <534708F6.8090708@cs.ucla.edu> In-Reply-To: <534708F6.8090708@cs.ucla.edu> X-Enigmail-Version: 1.6 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="BDrIpalHm1GJx3HT5X2vG5nuDOTHgaELg" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25 X-Spam-Score: -5.6 (-----) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.6 (-----) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --BDrIpalHm1GJx3HT5X2vG5nuDOTHgaELg Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 04/10/2014 03:11 PM, Paul Eggert wrote: > On 04/10/2014 04:37 AM, Norihiro Tanaka wrote: >> The patch requires below. bug#17230 [PATCH 1/2] grep: may also use >> Boyer-Moore algorithm for case-insensitive matching >=20 > OK, thanks, I'll work on Bug #17230 first, before worrying about 17229.= >=20 >> It'd be simpler to use memchr on all platforms; >> is there a major performance downside to that? >> Yes. As far as I was confirmed, it's slow in HP-UX on Itanium and >> Solaris on SPARC. I think that that it depends on the implementation = of >> memchr() and the rate of the increment instruction for the `add' >> instruction on the platform. >=20 > If it's *way* slower with memchr and HP-UX/Itanium or Solaris/SPARC we > may need to complicate the code, but if it's just a bit slower then it'= s > probably better to use the simpler implementation. There's a tradeoff > between simplified maintenance and performance on non-GNU hosts; > sometimes the former is more important than the latter, and this may be= > one of those times. In the past, we've used gnulib to work around painfully slow implementations. Remember that many platforms have a quadratic strstr(); gnulib has the ability to replace it with a guaranteed-linear version written in straight C code (side note - glibc has since taken gnulib's implementation and further optimized it; maybe we should consider feeding the glibc improvements back to gnulib). If the native memchr can be demonstrated to be slower on certain platforms in comparison to an implementation that we can write in straight C, then having gnulib supply the faster replacement is so much easier to maintain than to special-case non-GNU hosts in the main body of grep code= =2E --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --BDrIpalHm1GJx3HT5X2vG5nuDOTHgaELg Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJTRw4WAAoJEKeha0olJ0NqYRgH/1V6xGLeECpnaBPODwKz4CWy nQgWk4k1H0eanSkKHC7J2OGhDbmOQkffUEnQAyP94Tk82mak6OvGQPs3f48M6Bu7 ElhOyE/ibTIssYn5ojeYUNpWMhbmtc+6cifeVdvxiVvWvG/6v1jye97N7rMHhAgD 8Wg7SOLqVsAchgcNBtV1yorerYLtgiNyXlWyIUKqiejQ9sXYYyWlzl0VWLdmc+V9 NEQLvBIifnsde9dHJPUgQHlAU/BZ/MDseEPeuw8UqJwCDzyEvEc/TuMJZxEQ00z6 TTMFuuIcTNpjR1xuU0f5HwB5bcCDeBt8mOO0hOSXcJzt1ie5A/zyXj3bXwoMKaI= =tQN/ -----END PGP SIGNATURE----- --BDrIpalHm1GJx3HT5X2vG5nuDOTHgaELg-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 10 Apr 2014 23:26:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139717233421189 (code B ref 17229); Thu, 10 Apr 2014 23:26:01 +0000 Received: (at 17229) by debbugs.gnu.org; 10 Apr 2014 23:25:34 +0000 Received: from localhost ([127.0.0.1]:44929 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYOLc-0005Vd-Gy for submit@debbugs.gnu.org; Thu, 10 Apr 2014 19:25:33 -0400 Received: from mailgw04.kcn.ne.jp ([61.86.7.211]:42864) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYOLX-0005VO-Jr for 17229@debbugs.gnu.org; Thu, 10 Apr 2014 19:25:30 -0400 Received: from imp01 (mailgw5.kcn.ne.jp [61.86.15.231]) by mailgw04.kcn.ne.jp (Postfix) with ESMTP id 6C9B66C1D6A for <17229@debbugs.gnu.org>; Fri, 11 Apr 2014 08:25:24 +0900 (JST) Received: from mail07.kcn.ne.jp ([61.86.6.186]) by imp01 with bizsmtp id oPRQ1n00440oyB901PRQPM; Fri, 11 Apr 2014 08:25:24 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.19] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail07.kcn.ne.jp (Postfix) with ESMTPA id 232F8D50020; Fri, 11 Apr 2014 08:25:24 +0900 (JST) Date: Fri, 11 Apr 2014 08:25:25 +0900 From: Norihiro Tanaka In-Reply-To: <534708F6.8090708@cs.ucla.edu> References: <20140410203744.5CF7.27F6AC2D@kcn.ne.jp> <534708F6.8090708@cs.ucla.edu> Message-Id: <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_534725B1000000000E2B_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.6 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) --------_534725B1000000000E2B_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit I believe that memchr() isn't slower on HP-UX and Solaris but more faster on x86 processors. Now I wrote the additional patch. It replaces `tp += d' to the other code in bmexec() and cwexec() for x86 processors. I see that the code is more complex and slower theoretically, but it's faster actually when `d' is small. (We can test it by `grep -i jk k' after patch#17230 and this patch.) Therefore, I think that memchr() is faster than `tp += d' because the increment instruction is more faster than the add instruction on x86 processors. KWSet may be slower than DFA on them in the worst case, even if doesn't use delta2. Try to run and compare below and compare without this patches. The former uses only KWSet and the later uses only DFA. Later is faster on Linux x86, because DFA uses increment instrument for shift of text pointer. It may generate a wrong usage among users. I think that it should be avoided. $ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >k $ env LANG=C time -p src/grep jk k $ env LANG=C time -p src/grep -i jk k > sometimes the former is more important than the latter, and this may be > one of those times. I see that grep attaches a high value to performance. It uses not only regex but also DFA. Further more, DFA has the specific codes for utf-8 locale, which is used most frequently. I think that grep may also have specific codes for x86 processors, which are also used most frequently. Norihiro --------_534725B1000000000E2B_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA3MGZkODFmN2NmNzMxMTU4OWIxNDNlMGFkYWEyYWFkNDFlN2FiMGRhIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBUaHUsIDEwIEFwciAyMDE0IDIwOjUyOjU0ICswOTAwClN1YmplY3Q6IFtQQVRDSCAzLzNd IGdyZXA6IHNwZWVkLXVwIGJ5IHJlcGxhY2luZyBgaW5jcicgdG8gYGFkZCcgaW4geDg2IGFuZAog eDg2LTY0CgpUaGUgaW5jcmVtZW50IGluc3RydW1lbnQgaXMgbW9yZSBmYXN0ZXIgdGhhbiB0aGUg YWRkIGluc3R1Y3Rpb24gaW4geDg2CmFuZCB4ODYtNjQgYXJjaGl0ZWN0dXJlLiAgU28gcHJlZmVy IHRoZSBhZGQgaW5zdHJ1bWVudCB0byB0aGUgaW5jcmVtZW50Cmluc3RydW1lbnQgZm9yIHRoZW0u Ciogc3JjL2t3c2V0LmMgKFNISUZUKTogTmV3IG1hY3JvLiAgSXQgdXNlcyB0aGUgaW5jcmVtZW50 IGluc3RydW1lbnQKaW5zdGVhZCBvZiB0aGUgYWRkIGluc3RydW1lbnQgZm9yIHg4NiBhbmQgeDg2 LTY0LgooYm1leGVjLCBjd2V4ZWMpOiBVc2UgaXQuCi0tLQogc3JjL2t3c2V0LmMgfCA5MiArKysr KysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKy0tLS0tLS0tLS0tLS0tLS0tLS0t CiAxIGZpbGUgY2hhbmdlZCwgNjIgaW5zZXJ0aW9ucygrKSwgMzAgZGVsZXRpb25zKC0pCgpkaWZm IC0tZ2l0IGEvc3JjL2t3c2V0LmMgYi9zcmMva3dzZXQuYwppbmRleCBjOTViYzU2Li45OThhYWI3 IDEwMDY0NAotLS0gYS9zcmMva3dzZXQuYworKysgYi9zcmMva3dzZXQuYwpAQCAtNTUsNiArNTUs MzUgQEAKIAogI2RlZmluZSBVKGMpICh0b191Y2hhciAoYykpCiAKKyNpZiBkZWZpbmVkKF9faTM4 Nl9fKSB8fCBkZWZpbmVkKF9feDg2XzY0X18pCisjZGVmaW5lIFNISUZUKHAsIGQpCVwKKyAgc3dp dGNoIChkKQkJXAorICAgIHsJCQlcCisgICAgZGVmYXVsdDoJCVwKKyAgICAgIHAgKz0gZDsJCVwK KyAgICAgIGJyZWFrOwkJXAorICAgIGNhc2UgODoJCVwKKyAgICAgICsrcDsJCVwKKyAgICBjYXNl IDc6CQlcCisgICAgICArK3A7CQlcCisgICAgY2FzZSA2OgkJXAorICAgICAgKytwOwkJXAorICAg IGNhc2UgNToJCVwKKyAgICAgICsrcDsJCVwKKyAgICBjYXNlIDQ6CQlcCisgICAgICArK3A7CQlc CisgICAgY2FzZSAzOgkJXAorICAgICAgKytwOwkJXAorICAgIGNhc2UgMjoJCVwKKyAgICAgICsr cDsJCVwKKyAgICBjYXNlIDE6CQlcCisgICAgICArK3A7CQlcCisgICAgfQorI2Vsc2UKKyNkZWZp bmUgU0hJRlQocCwgZCkJXAorICBwICs9IGQ7CisjZW5kaWYKKwogLyogQmFsYW5jZWQgdHJlZSBv ZiBlZGdlcyBhbmQgbGFiZWxzIGxlYXZpbmcgYSBnaXZlbiB0cmllIG5vZGUuICovCiBzdHJ1Y3Qg dHJlZQogewpAQCAtNTE3LDE4ICs1NDYsMTggQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFy IGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgIHsKICAgICAgICAgd2hpbGUgKHRwIDw9 IGVwKQogICAgICAgICAgIHsKLSAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBk OwotICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICBk ID0gZDFbVSh0cFstMV0pXTsgU0hJRlQodHAsIGQpOworICAgICAgICAgICAgZCA9IGQxW1UodHBb LTFdKV07IFNISUZUKHRwLCBkKTsKICAgICAgICAgICAgIGlmIChkID09IDApCiAgICAgICAgICAg ICAgIGdvdG8gZm91bmQ7Ci0gICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsK LSAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9 IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXTsg U0hJRlQodHAsIGQpOworICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IFNISUZUKHRwLCBk KTsKKyAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBTSElGVCh0cCwgZCk7CiAgICAgICAg ICAgICBpZiAoZCA9PSAwKQogICAgICAgICAgICAgICBnb3RvIGZvdW5kOwotICAgICAgICAgICAg ZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7Ci0gICAgICAgICAgICBkID0gZDFbVSh0cFstMV0p XSwgdHAgKz0gZDsKLSAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOworICAg ICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IFNISUZUKHRwLCBkKTsKKyAgICAgICAgICAgIGQg PSBkMVtVKHRwWy0xXSldOyBTSElGVCh0cCwgZCk7CisgICAgICAgICAgICBkID0gZDFbVSh0cFst MV0pXTsgU0hJRlQodHAsIGQpOwogICAgICAgICAgICAgaWYgKGQgPT0gMCkKICAgICAgICAgICAg ICAgZ290byBmb3VuZDsKICAgICAgICAgICAgIC8qIG1lbWNoYXIoKSBvZiBnbGliYyBpcyBmYXN0 ZXIgdGhhbiBzZWVraW5nIGJ5IGRlbHRhMSBvbgpAQCAtNTQ5LDggKzU3OCw4IEBAIGJtZXhlYyAo a3dzZXRfdCBrd3NldCwgY2hhciBjb25zdCAqdGV4dCwgc2l6ZV90IHNpemUpCiAgICAgICAgICAg ICBlbHNlCiAjZW5kaWYKICAgICAgICAgICAgICAgewotICAgICAgICAgICAgICAgIGQgPSBkMVtV KHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0 cCArPSBkOworICAgICAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBTSElGVCh0cCwgZCk7 CisgICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IFNISUZUKHRwLCBkKTsKICAgICAg ICAgICAgICAgfQogICAgICAgICAgIH0KICAgICAgICAgYnJlYWs7CkBAIC01NzMsMjQgKzYwMiwy NCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXpl KQogICAgICAgICAgICAgICAgICAgICAgICAgaWYgKGkgPiBsZW4pCiAgICAgICAgICAgICAgICAg ICAgICAgICAgIHJldHVybiB0cCAtIGxlbiAtIHRleHQ7CiAgICAgICAgICAgICAgICAgICAgICAg fQotICAgICAgICAgICAgICAgICAgICBkID0ga3dzZXQtPnNoaWZ0W2kgLSAyXTsgdHAgKz0gZDsK KyAgICAgICAgICAgICAgICAgICAgZCA9IGt3c2V0LT5zaGlmdFtpIC0gMl07IFNISUZUKHRwLCBk KTsKICAgICAgICAgICAgICAgICAgICAgaWYgKHRwID4gZXApCiAgICAgICAgICAgICAgICAgICAg ICAgYnJlYWs7CiAgICAgICAgICAgICAgICAgICAgIGlmICh0cmFuc1tVKHRwWy0xXSldICE9IGdj MSkKICAgICAgICAgICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAgICAgICBkID0g ZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgICAgICAgICAgICAgIGQgPSBkMVtV KHRwWy0xXSldOyBTSElGVCh0cCwgZCk7CiAgICAgICAgICAgICAgICAgICAgICAgICBicmVhazsK ICAgICAgICAgICAgICAgICAgICAgICB9CiAgICAgICAgICAgICAgICAgICAgIHNraXAgPSBpIC0g MTsKICAgICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgICAgICBlbHNlCiAgICAgICAgICAg ICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAgIGQgPSBrd3NldC0+c2hpZnRbMF07IHRwICs9 IGQ7CisgICAgICAgICAgICAgICAgICAgIGQgPSBrd3NldC0+c2hpZnRbMF07IFNISUZUKHRwLCBk KTsKICAgICAgICAgICAgICAgICAgICAgaWYgKHRwID4gZXApCiAgICAgICAgICAgICAgICAgICAg ICAgYnJlYWs7CiAgICAgICAgICAgICAgICAgICAgIGlmICh0cmFuc1tVKHRwWy0xXSldICE9IGdj MSkKICAgICAgICAgICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAgICAgICBkID0g ZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgICAgICAgICAgICAgIGQgPSBkMVtV KHRwWy0xXSldOyBTSElGVCh0cCwgZCk7CiAgICAgICAgICAgICAgICAgICAgICAgICBicmVhazsK ICAgICAgICAgICAgICAgICAgICAgICB9CiAgICAgICAgICAgICAgICAgICAgIHNraXAgPSAxOwpA QCAtNjE0LDI0ICs2NDMsMjQgQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0 ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgICAgICAgICAgICAgICAgICAgIGlmIChpID4gbGVuKQog ICAgICAgICAgICAgICAgICAgICAgICAgICByZXR1cm4gdHAgLSBsZW4gLSB0ZXh0OwogICAgICAg ICAgICAgICAgICAgICAgIH0KLSAgICAgICAgICAgICAgICAgICAgZCA9IGt3c2V0LT5zaGlmdFtp IC0gMl07IHRwICs9IGQ7CisgICAgICAgICAgICAgICAgICAgIGQgPSBrd3NldC0+c2hpZnRbaSAt IDJdOyBTSElGVCh0cCwgZCk7CiAgICAgICAgICAgICAgICAgICAgIGlmICh0cCA+IGVwKQogICAg ICAgICAgICAgICAgICAgICAgIGJyZWFrOwogICAgICAgICAgICAgICAgICAgICBpZiAodHBbLTFd ICE9IGdjMSkKICAgICAgICAgICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAgICAg ICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgICAgICAgICAgICAgIGQg PSBkMVtVKHRwWy0xXSldOyBTSElGVCh0cCwgZCk7CiAgICAgICAgICAgICAgICAgICAgICAgICBi cmVhazsKICAgICAgICAgICAgICAgICAgICAgICB9CiAgICAgICAgICAgICAgICAgICAgIHNraXAg PSBpIC0gMTsKICAgICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgICAgICBlbHNlCiAgICAg ICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAgIGQgPSBrd3NldC0+c2hpZnRbMF07 IHRwICs9IGQ7CisgICAgICAgICAgICAgICAgICAgIGQgPSBrd3NldC0+c2hpZnRbMF07IFNISUZU KHRwLCBkKTsKICAgICAgICAgICAgICAgICAgICAgaWYgKHRwID4gZXApCiAgICAgICAgICAgICAg ICAgICAgICAgYnJlYWs7CiAgICAgICAgICAgICAgICAgICAgIGlmICh0cFstMV0gIT0gZ2MxKQog ICAgICAgICAgICAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgICAgICAgICAgIGQgPSBkMVtV KHRwWy0xXSldLCB0cCArPSBkOworICAgICAgICAgICAgICAgICAgICAgICAgZCA9IGQxW1UodHBb LTFdKV07IFNISUZUKHRwLCBkKTsKICAgICAgICAgICAgICAgICAgICAgICAgIGJyZWFrOwogICAg ICAgICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgICAgICAgICAgc2tpcCA9IDE7CkBAIC02 NDYsNyArNjc1LDggQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBz aXplX3Qgc2l6ZSkKICAgZCA9IGQxW1UodHBbLTFdKV07CiAgIHdoaWxlIChkIDw9IGVwIC0gdHAp CiAgICAgewotICAgICAgZCA9IGQxW1UoKHRwICs9IGQpWy0xXSldOworICAgICAgU0hJRlQodHAs IGQpOworICAgICAgZCA9IGQxW1UodHBbLTFdKV07CiAgICAgICBpZiAoZCAhPSAwKQogICAgICAg ICBjb250aW51ZTsKICAgICAgIGQgPSBsZW47CkBAIC02NjcsNyArNjk3LDcgQEAgYm1leGVjIChr d3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgICAgICAg ICAgICAgICAgICBpZiAoaSA+IGxlbikKICAgICAgICAgICAgICAgICAgICAgICAgIHJldHVybiB0 cCAtIGxlbiAtIHRleHQ7CiAgICAgICAgICAgICAgICAgICAgIH0KLSAgICAgICAgICAgICAgICAg IGQgPSBrd3NldC0+c2hpZnRbaSAtIDJdOyB0cCArPSBkOworICAgICAgICAgICAgICAgICAgZCA9 IGt3c2V0LT5zaGlmdFtpIC0gMl07IFNISUZUKHRwLCBkKTsKICAgICAgICAgICAgICAgICAgIGlm ICh0cCA+IGVwKQogICAgICAgICAgICAgICAgICAgICBicmVhazsKICAgICAgICAgICAgICAgICAg IGlmICh0cmFuc1tVKHRwWy0xXSldICE9IGdjMSkKQEAgLTY3OSw3ICs3MDksNyBAQCBibWV4ZWMg KGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogICAgICAgICAg ICAgICAgIH0KICAgICAgICAgICAgICAgZWxzZQogICAgICAgICAgICAgICAgIHsKLSAgICAgICAg ICAgICAgICAgIGQgPSBrd3NldC0+c2hpZnRbMF07IHRwICs9IGQ7CisgICAgICAgICAgICAgICAg ICBkID0ga3dzZXQtPnNoaWZ0WzBdOyBTSElGVCh0cCwgZCk7CiAgICAgICAgICAgICAgICAgICBp ZiAodHAgPiBlcCkKICAgICAgICAgICAgICAgICAgICAgYnJlYWs7CiAgICAgICAgICAgICAgICAg ICBpZiAodHJhbnNbVSh0cFstMV0pXSAhPSBnYzEpCkBAIC03MDgsNyArNzM4LDcgQEAgYm1leGVj IChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgICAg ICAgICAgICAgICAgICBpZiAoaSA+IGxlbikKICAgICAgICAgICAgICAgICAgICAgICAgIHJldHVy biB0cCAtIGxlbiAtIHRleHQ7CiAgICAgICAgICAgICAgICAgICAgIH0KLSAgICAgICAgICAgICAg ICAgIGQgPSBrd3NldC0+c2hpZnRbaSAtIDJdOyB0cCArPSBkOworICAgICAgICAgICAgICAgICAg ZCA9IGt3c2V0LT5zaGlmdFtpIC0gMl07IFNISUZUKHRwLCBkKTsKICAgICAgICAgICAgICAgICAg IGlmICh0cCA+IGVwKQogICAgICAgICAgICAgICAgICAgICBicmVhazsKICAgICAgICAgICAgICAg ICAgIGlmICh0cFstMV0gIT0gZ2MxKQpAQCAtNzIwLDcgKzc1MCw3IEBAIGJtZXhlYyAoa3dzZXRf dCBrd3NldCwgY2hhciBjb25zdCAqdGV4dCwgc2l6ZV90IHNpemUpCiAgICAgICAgICAgICAgICAg fQogICAgICAgICAgICAgICBlbHNlCiAgICAgICAgICAgICAgICAgewotICAgICAgICAgICAgICAg ICAgZCA9IGt3c2V0LT5zaGlmdFswXTsgdHAgKz0gZDsKKyAgICAgICAgICAgICAgICAgIGQgPSBr d3NldC0+c2hpZnRbMF07IFNISUZUKHRwLCBkKTsKICAgICAgICAgICAgICAgICAgIGlmICh0cCA+ IGVwKQogICAgICAgICAgICAgICAgICAgICBicmVhazsKICAgICAgICAgICAgICAgICAgIGlmICh0 cFstMV0gIT0gZ2MxKQpAQCAtNzgxLDE3ICs4MTEsMTggQEAgY3dleGVjIChrd3NldF90IGt3c2V0 LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3QgbGVuLCBzdHJ1Y3Qga3dzbWF0Y2ggKmt3c21hdGNo KQogICAgIHsKICAgICAgIGlmIChxbGltICYmIGVuZCA8PSBxbGltKQogICAgICAgICB7Ci0gICAg ICAgICAgZW5kICs9IGQgLSAxOwogICAgICAgICAgIHdoaWxlICgoZCA9IGRlbHRhW2MgPSAqZW5k XSkgJiYgZW5kIDwgcWxpbSkKICAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgZW5kICs9IGQ7 Ci0gICAgICAgICAgICAgIGVuZCArPSBkZWx0YVtVKCplbmQpXTsKLSAgICAgICAgICAgICAgZW5k ICs9IGRlbHRhW1UoKmVuZCldOworICAgICAgICAgICAgICBTSElGVChlbmQsIGQpOworICAgICAg ICAgICAgICBkID0gZGVsdGFbVShlbmRbLTFdKV07IFNISUZUKGVuZCwgZCk7CisgICAgICAgICAg ICAgIGQgPSBkZWx0YVtVKGVuZFstMV0pXTsgU0hJRlQoZW5kLCBkKTsKICAgICAgICAgICAgIH0K LSAgICAgICAgICArK2VuZDsKICAgICAgICAgfQogICAgICAgZWxzZQotICAgICAgICBkID0gZGVs dGFbYyA9IChlbmQgKz0gZClbLTFdXTsKKyAgICAgICAgeworICAgICAgICAgIFNISUZUKGVuZCwg ZCk7CisgICAgICAgICAgZCA9IGRlbHRhW2MgPSBlbmRbLTFdXTsKKyAgICAgICAgfQogICAgICAg aWYgKGQpCiAgICAgICAgIGNvbnRpbnVlOwogICAgICAgYmVnID0gZW5kIC0gMTsKQEAgLTg0MCw3 ICs4NzEsOCBAQCBjd2V4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVf dCBsZW4sIHN0cnVjdCBrd3NtYXRjaCAqa3dzbWF0Y2gpCiAgIGQgPSAxOwogICB3aGlsZSAobGlt IC0gZW5kID49IGQpCiAgICAgewotICAgICAgaWYgKChkID0gZGVsdGFbYyA9IChlbmQgKz0gZClb LTFdXSkgIT0gMCkKKyAgICAgIFNISUZUKGVuZCwgZCk7CisgICAgICBpZiAoKGQgPSBkZWx0YVtj ID0gZW5kWy0xXV0pICE9IDApCiAgICAgICAgIGNvbnRpbnVlOwogICAgICAgYmVnID0gZW5kIC0g MTsKICAgICAgIGlmICghKHRyaWUgPSBuZXh0W2NdKSkKLS0gCjEuOS4yCgo= --------_534725B1000000000E2B_MULTIPART_MIXED_-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 11 Apr 2014 00:00:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: 17229@debbugs.gnu.org Cc: Paul Eggert Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139717439024649 (code B ref 17229); Fri, 11 Apr 2014 00:00:02 +0000 Received: (at 17229) by debbugs.gnu.org; 10 Apr 2014 23:59:50 +0000 Received: from localhost ([127.0.0.1]:44944 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYOsn-0006PU-8o for submit@debbugs.gnu.org; Thu, 10 Apr 2014 19:59:49 -0400 Received: from mailgw04.kcn.ne.jp ([61.86.7.211]:49880) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WYOsk-0006PD-5U for 17229@debbugs.gnu.org; Thu, 10 Apr 2014 19:59:47 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw04.kcn.ne.jp (Postfix) with ESMTP id E01966C13CF for <17229@debbugs.gnu.org>; Fri, 11 Apr 2014 08:59:39 +0900 (JST) Received: from mail08.kcn.ne.jp ([61.86.6.187]) by imp02 with bizsmtp id oPzf1n00F426eXR01PzfoF; Fri, 11 Apr 2014 08:59:39 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.19] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail08.kcn.ne.jp (Postfix) with ESMTPA id A58D612B809A; Fri, 11 Apr 2014 08:59:39 +0900 (JST) Date: Fri, 11 Apr 2014 08:59:41 +0900 From: Norihiro Tanaka In-Reply-To: <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> References: <534708F6.8090708@cs.ucla.edu> <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> Message-Id: <20140411085940.0DED.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.6 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) Paul Eggert wrote: > sometimes the former is more important than the latter, and this may be one of those times. I also like simple, and I don't like so much platform specific optimization. However, I confirmed 10% speed-up with wikipedia database and the simple word `Wikipedia'. I think that 10% speed-up cannot ignorable on most frequently used platform and in most simple and frequently used usage. $ env LANG=C time -p src/grep Wikipedia pages-articles.xml http://dumps.wikimedia.org/jawiki/latest/ Norihiro From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Wed, 23 Apr 2014 07:10:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: patch To: Norihiro Tanaka Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139823695117863 (code B ref 17229); Wed, 23 Apr 2014 07:10:03 +0000 Received: (at 17229) by debbugs.gnu.org; 23 Apr 2014 07:09:11 +0000 Received: from localhost ([127.0.0.1]:55632 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WcrIo-0004dt-WE for submit@debbugs.gnu.org; Wed, 23 Apr 2014 03:09:11 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:53266) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WcrIh-0004dB-7s for 17229@debbugs.gnu.org; Wed, 23 Apr 2014 03:09:04 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 3E21F39E8014; Wed, 23 Apr 2014 00:08:58 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id BNGE-bLnGJq5; Wed, 23 Apr 2014 00:08:52 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 80C6739E8008; Wed, 23 Apr 2014 00:08:52 -0700 (PDT) Message-ID: <53576704.5020202@cs.ucla.edu> Date: Wed, 23 Apr 2014 00:08:52 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140410203744.5CF7.27F6AC2D@kcn.ne.jp> <534708F6.8090708@cs.ucla.edu> <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------010201060701080407040300" X-Spam-Score: -3.0 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) This is a multi-part message in MIME format. --------------010201060701080407040300 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Norihiro Tanaka wrote: > Now I wrote the additional patch. Sorry, I don't understand that patch. I applied your earlier patch (attachment 1) and then merged that patch (attachment 2), but the resulting 'grep' failed 'make check'; I got a "FAIL: fedora" and the test after "file" never finished. This was on Fedora 20 x86-64 with my own installation of GCC 4.9.0. Instead of trying to understand that patch, I wrote a different patch (attachment 3) that simplifies the code and removes the part that is x86-specific. This improves the performance quite a bit for the test case given in the ChangeLog entry. I also tested performance on Sparc Solaris 10, and memchr was a big win there. Perhaps there are platforms where memchr is a big loss, but we can cross that bridge if we come to it. Perhaps I merged that patch incorrectly, so I'll leave this bug open for now. To be honest I hope that patch doesn't improve performance significantly if we get it to work correctly, as it looks tricky. --------------010201060701080407040300 Content-Type: text/plain; charset=UTF-8; name="0001-grep-speed-up-by-using-memchr-in-Boyer-Moore-searchi.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename*0="0001-grep-speed-up-by-using-memchr-in-Boyer-Moore-searchi.pa"; filename*1="tch" RnJvbSAwZmIyYTUzMTAxMWVlY2E4OTNhMmQ3NjZiMDE2MzI1Mzk5YmZmOGIxIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5l LmpwPgpEYXRlOiBUaHUsIDMgQXByIDIwMTQgMjI6NTc6MDkgKzA5MDAKU3ViamVjdDogW1BB VENIIDEvMl0gZ3JlcDogc3BlZWQtdXAgYnkgdXNpbmcgbWVtY2hyKCkgaW4gQm95ZXItTW9v cmUgc2VhcmNoaW5nCgptZW1jaHIoKSBvZiBnbGliYyBpcyBmYXN0ZXIgdGhhbiBzZWVraW5n IGJ5IGRlbHRhMSBvbiBzb21lIHBsYXRmb3Jtcy4KV2hlbiB0aGVyZSBpcyBubyBjaGFuY2Ug dG8gbWF0Y2ggZm9yIGEgd2hpbGUsIHVzZSBpdCBvbiB0aGVtLgoqIHNyYy9rd3NldC5jIChi bWV4ZWMpOiBVc2UgbWVtY2hyKCkgaW4gQm95ZXItTW9vcmUgc2VhcmNoaW5nLgotLS0KIHNy Yy9rd3NldC5jIHwgMjMgKysrKysrKysrKysrKysrKysrKysrLS0KIDEgZmlsZSBjaGFuZ2Vk LCAyMSBpbnNlcnRpb25zKCspLCAyIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9r d3NldC5jIGIvc3JjL2t3c2V0LmMKaW5kZXggNjliMGE1NS4uNzhmYjBiMiAxMDA2NDQKLS0t IGEvc3JjL2t3c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMKQEAgLTU4Miw4ICs1ODIsMjcgQEAg Ym1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6ZSkK ICAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwogICAgICAgICAgICAg aWYgKGQgPT0gMCkKICAgICAgICAgICAgICAgZ290byBmb3VuZDsKLSAgICAgICAgICAgIGQg PSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFd KV0sIHRwICs9IGQ7CisgICAgICAgICAgICAvKiBtZW1jaGFyKCkgb2YgZ2xpYmMgaXMgZmFz dGVyIHRoYW4gc2Vla2luZyBieSBkZWx0YTEgb24KKyAgICAgICAgICAgICAgIHNvbWUgcGxh dGZvcm1zLiAgV2hlbiB0aGVyZSBpcyBubyBjaGFuY2UgdG8gbWF0Y2ggZm9yIGEKKyAgICAg ICAgICAgICAgIHdoaWxlLCB1c2UgaXQgb24gdGhlbS4gICovCisjaWYgZGVmaW5lZChfX0dM SUJDX18pICYmIChkZWZpbmVkKF9faTM4Nl9fKSB8fCBkZWZpbmVkKF9feDg2XzY0X18pKQor ICAgICAgICAgICAgaWYgKCF0cmFucykKKyAgICAgICAgICAgICAgeworICAgICAgICAgICAg ICAgIHRwID0gbWVtY2hyICh0cCAtIDEsIGdjMSwgc2l6ZSArIHRleHQgLSB0cCArIDEpOwor ICAgICAgICAgICAgICAgIGlmICh0cCkKKyAgICAgICAgICAgICAgICAgIHsKKyAgICAgICAg ICAgICAgICAgICAgKyt0cDsKKyAgICAgICAgICAgICAgICAgICAgZ290byBmb3VuZDsKKyAg ICAgICAgICAgICAgICAgIH0KKyAgICAgICAgICAgICAgICBlbHNlCisgICAgICAgICAgICAg ICAgICByZXR1cm4gLTE7CisgICAgICAgICAgICAgIH0KKyAgICAgICAgICAgIGVsc2UKKyNl bmRpZgorICAgICAgICAgICAgICB7CisgICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFd KV0sIHRwICs9IGQ7CisgICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9 IGQ7CisgICAgICAgICAgICAgIH0KICAgICAgICAgICB9CiAgICAgICAgIGJyZWFrOwogICAg ICAgZm91bmQ6Ci0tIAoxLjkuMAoK --------------010201060701080407040300 Content-Type: text/plain; charset=UTF-8; name="0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename*0="0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.pa"; filename*1="tch" RnJvbSBkZDBkZTBkY2U1OTZhZGQ5MDk4ZTFmYjZlMjk0YzE5MGM5YjA0ZjE4IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5l LmpwPgpEYXRlOiBUaHUsIDEwIEFwciAyMDE0IDIwOjUyOjU0ICswOTAwClN1YmplY3Q6IFtQ QVRDSF0gZ3JlcDogc3BlZWQtdXAgYnkgcmVwbGFjaW5nIGBpbmNyJyB0byBgYWRkJyBpbiB4 ODYgYW5kIHg4Ni02NAoKVGhlIGluY3JlbWVudCBpbnN0cnVtZW50IGlzIG1vcmUgZmFzdGVy IHRoYW4gdGhlIGFkZCBpbnN0dWN0aW9uIGluIHg4NgphbmQgeDg2LTY0IGFyY2hpdGVjdHVy ZS4gIFNvIHByZWZlciB0aGUgYWRkIGluc3RydW1lbnQgdG8gdGhlIGluY3JlbWVudAppbnN0 cnVtZW50IGZvciB0aGVtLgoqIHNyYy9rd3NldC5jIChTSElGVCk6IE5ldyBtYWNyby4gIEl0 IHVzZXMgdGhlIGluY3JlbWVudCBpbnN0cnVtZW50Cmluc3RlYWQgb2YgdGhlIGFkZCBpbnN0 cnVtZW50IGZvciB4ODYgYW5kIHg4Ni02NC4KKGJtZXhlYywgY3dleGVjKTogVXNlIGl0Lgot LS0KIHNyYy9rd3NldC5jIHwgNzMgKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrKystLS0tLS0tLS0tLS0tLS0tLQogMSBmaWxlIGNoYW5nZWQsIDUzIGluc2Vy dGlvbnMoKyksIDIwIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9rd3NldC5jIGIv c3JjL2t3c2V0LmMKaW5kZXggNzhmYjBiMi4uY2UxMjdlMiAxMDA2NDQKLS0tIGEvc3JjL2t3 c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMKQEAgLTU1LDYgKzU1LDM1IEBACiAKICNkZWZpbmUg VShjKSAodG9fdWNoYXIgKGMpKQogCisjaWYgZGVmaW5lZChfX2kzODZfXykgfHwgZGVmaW5l ZChfX3g4Nl82NF9fKQorI2RlZmluZSBTSElGVChwLCBkKQlcCisgIHN3aXRjaCAoZCkJCVwK KyAgICB7CQkJXAorICAgIGRlZmF1bHQ6CQlcCisgICAgICBwICs9IGQ7CQlcCisgICAgICBi cmVhazsJCVwKKyAgICBjYXNlIDg6CQlcCisgICAgICArK3A7CQlcCisgICAgY2FzZSA3OgkJ XAorICAgICAgKytwOwkJXAorICAgIGNhc2UgNjoJCVwKKyAgICAgICsrcDsJCVwKKyAgICBj YXNlIDU6CQlcCisgICAgICArK3A7CQlcCisgICAgY2FzZSA0OgkJXAorICAgICAgKytwOwkJ XAorICAgIGNhc2UgMzoJCVwKKyAgICAgICsrcDsJCVwKKyAgICBjYXNlIDI6CQlcCisgICAg ICArK3A7CQlcCisgICAgY2FzZSAxOgkJXAorICAgICAgKytwOwkJXAorICAgIH0KKyNlbHNl CisjZGVmaW5lIFNISUZUKHAsIGQpCVwKKyAgcCArPSBkOworI2VuZGlmCisKIC8qIEJhbGFu Y2VkIHRyZWUgb2YgZWRnZXMgYW5kIGxhYmVscyBsZWF2aW5nIGEgZ2l2ZW4gdHJpZSBub2Rl LiAqLwogc3RydWN0IHRyZWUKIHsKQEAgLTUwOCwxMyArNTM3LDE0IEBAIGJtX2RlbHRhMl9z ZWFyY2ggKGNoYXIgY29uc3QgKip0cHAsIGNoYXIgY29uc3QgKmVwLCBjaGFyIGNvbnN0ICpz cCwgaW50IGxlbiwKICAgICAgICAgICAgIH0KICAgICAgICAgfQogCi0gICAgICB0cCArPSBk ID0ga3dzZXQtPnNoaWZ0W2kgLSAyXTsKKyAgICAgIGQgPSBrd3NldC0+c2hpZnRbaSAtIDJd OworICAgICAgU0hJRlQodHAsIGQpOwogICAgICAgaWYgKHRwID4gZXApCiAgICAgICAgIGJy ZWFrOwogICAgICAgaWYgKHRyICh0cmFucywgdHBbLTFdKSAhPSBnYzEpCiAgICAgICAgIHsK ICAgICAgICAgICBpZiAoZDEpCi0gICAgICAgICAgICB0cCArPSBkMVtVKHRwWy0xXSldOwor ICAgICAgICAgICAgU0hJRlQodHAsIGQxW1UodHBbLTFdKV0pOwogICAgICAgICAgIGJyZWFr OwogICAgICAgICB9CiAgICAgICBza2lwID0gaSAtIDE7CkBAIC01NjgsMTggKzU5OCwxOCBA QCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXpl KQogICAgICAgewogICAgICAgICB3aGlsZSAodHAgPD0gZXApCiAgICAgICAgICAgewotICAg ICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7Ci0gICAgICAgICAgICBkID0g ZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSld OyBTSElGVCh0cCwgZCk7CisgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXTsgU0hJRlQo dHAsIGQpOwogICAgICAgICAgICAgaWYgKGQgPT0gMCkKICAgICAgICAgICAgICAgZ290byBm b3VuZDsKLSAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAg ICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7Ci0gICAgICAgICAgICBkID0gZDFb VSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBT SElGVCh0cCwgZCk7CisgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXTsgU0hJRlQodHAs IGQpOworICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IFNISUZUKHRwLCBkKTsKICAg ICAgICAgICAgIGlmIChkID09IDApCiAgICAgICAgICAgICAgIGdvdG8gZm91bmQ7Ci0gICAg ICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKLSAgICAgICAgICAgIGQgPSBk MVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0s IHRwICs9IGQ7CisgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXTsgU0hJRlQodHAsIGQp OworICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IFNISUZUKHRwLCBkKTsKKyAgICAg ICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBTSElGVCh0cCwgZCk7CiAgICAgICAgICAgICBp ZiAoZCA9PSAwKQogICAgICAgICAgICAgICBnb3RvIGZvdW5kOwogICAgICAgICAgICAgLyog bWVtY2hhcigpIG9mIGdsaWJjIGlzIGZhc3RlciB0aGFuIHNlZWtpbmcgYnkgZGVsdGExIG9u CkBAIC02MDAsOCArNjMwLDggQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0 ICp0ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgICAgICAgIGVsc2UKICNlbmRpZgogICAgICAg ICAgICAgICB7Ci0gICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7 Ci0gICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAg ICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IFNISUZUKHRwLCBkKTsKKyAgICAgICAgICAg ICAgICBkID0gZDFbVSh0cFstMV0pXTsgU0hJRlQodHAsIGQpOwogICAgICAgICAgICAgICB9 CiAgICAgICAgICAgfQogICAgICAgICBicmVhazsKQEAgLTYxNiw3ICs2NDYsOCBAQCBibWV4 ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogICBk ID0gZDFbVSh0cFstMV0pXTsKICAgd2hpbGUgKGQgPD0gZXAgLSB0cCkKICAgICB7Ci0gICAg ICBkID0gZDFbVSgodHAgKz0gZClbLTFdKV07CisgICAgICBTSElGVCh0cCwgZCk7CisgICAg ICBkID0gZDFbVSh0cFstMV0pXTsKICAgICAgIGlmIChkICE9IDApCiAgICAgICAgIGNvbnRp bnVlOwogICAgICAgaWYgKGJtX2RlbHRhMl9zZWFyY2ggKCZ0cCwgZXAsIHNwLCBsZW4sIHRy YW5zLCBnYzEsIGdjMiwgTlVMTCwga3dzZXQpKQpAQCAtNjcwLDE3ICs3MDEsMTggQEAgY3dl eGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3QgbGVuLCBzdHJ1 Y3Qga3dzbWF0Y2ggKmt3c21hdGNoKQogICAgIHsKICAgICAgIGlmIChxbGltICYmIGVuZCA8 PSBxbGltKQogICAgICAgICB7Ci0gICAgICAgICAgZW5kICs9IGQgLSAxOwogICAgICAgICAg IHdoaWxlICgoZCA9IGRlbHRhW2MgPSAqZW5kXSkgJiYgZW5kIDwgcWxpbSkKICAgICAgICAg ICAgIHsKLSAgICAgICAgICAgICAgZW5kICs9IGQ7Ci0gICAgICAgICAgICAgIGVuZCArPSBk ZWx0YVtVKCplbmQpXTsKLSAgICAgICAgICAgICAgZW5kICs9IGRlbHRhW1UoKmVuZCldOwor ICAgICAgICAgICAgICBTSElGVChlbmQsIGQpOworICAgICAgICAgICAgICBkID0gZGVsdGFb VShlbmRbLTFdKV07IFNISUZUKGVuZCwgZCk7CisgICAgICAgICAgICAgIGQgPSBkZWx0YVtV KGVuZFstMV0pXTsgU0hJRlQoZW5kLCBkKTsKICAgICAgICAgICAgIH0KLSAgICAgICAgICAr K2VuZDsKICAgICAgICAgfQogICAgICAgZWxzZQotICAgICAgICBkID0gZGVsdGFbYyA9IChl bmQgKz0gZClbLTFdXTsKKyAgICAgICAgeworICAgICAgICAgIFNISUZUKGVuZCwgZCk7Cisg ICAgICAgICAgZCA9IGRlbHRhW2MgPSBlbmRbLTFdXTsKKyAgICAgICAgfQogICAgICAgaWYg KGQpCiAgICAgICAgIGNvbnRpbnVlOwogICAgICAgYmVnID0gZW5kIC0gMTsKQEAgLTcyOSw3 ICs3NjEsOCBAQCBjd2V4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNp emVfdCBsZW4sIHN0cnVjdCBrd3NtYXRjaCAqa3dzbWF0Y2gpCiAgIGQgPSAxOwogICB3aGls ZSAobGltIC0gZW5kID49IGQpCiAgICAgewotICAgICAgaWYgKChkID0gZGVsdGFbYyA9IChl bmQgKz0gZClbLTFdXSkgIT0gMCkKKyAgICAgIFNISUZUKGVuZCwgZCk7CisgICAgICBpZiAo KGQgPSBkZWx0YVtjID0gZW5kWy0xXV0pICE9IDApCiAgICAgICAgIGNvbnRpbnVlOwogICAg ICAgYmVnID0gZW5kIC0gMTsKICAgICAgIGlmICghKHRyaWUgPSBuZXh0W2NdKSkKLS0gCjEu OS4wCgo= --------------010201060701080407040300 Content-Type: text/plain; charset=UTF-8; name="0002-kwset-simplify-and-speed-up-Boyer-Moore-unibyte-i-in.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename*0="0002-kwset-simplify-and-speed-up-Boyer-Moore-unibyte-i-in.pa"; filename*1="tch" RnJvbSAyMDVhMGU3YWFlNDFmYjgxOWM4ZGJkNzhlMTQ1ZjM0MTYxZWM4YTRjIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBUdWUsIDIyIEFwciAyMDE0IDIzOjM0OjIyIC0wNzAwClN1YmplY3Q6IFtQQVRD SCAyLzJdIGt3c2V0OiBzaW1wbGlmeSBhbmQgc3BlZWQgdXAgQm95ZXItTW9vcmUgdW5pYnl0 ZSAtaSBpbgogc29tZSBjYXNlcwoKVGhpcyBpbXByb3ZlcyB0aGUgcGVyZm9ybWFuY2Ugb2Ys IGZvciBleGFtcGxlLAp5ZXMgampqampqampqampqampqampqampqampqampqampqampqampq aiB8IGhlYWQgLTEwMDAwMDAwIHwgZ3JlcCAtaSBqawppbiBhIHVuaWJ5dGUgbG9jYWxlLgoq IHNyYy9rd3NldC5jIChtZW1jaHJfdHJhbnMpOiBOZXcgZnVuY3Rpb24uCihibWV4ZWMpOiBV c2UgaXQuICBTaW1wbGlmeSB0aGUgY29kZSBhbmQgcmVtb3ZlIHNvbWUgb2YgdGhlCmNvbmZ1 c2luZyBnb3RvcyBhbmQgYnJlYWtzIGFuZCBsYWJlbHMuICBEbyBub3QgdHJlYXQgZ2xpYmMg bWVtY2hyCmFzIGEgc3BlY2lhbCBjYXNlOyBpZiBub24tZ2xpYmMgbWVtY2hyIGlzIHNsb3cs IHRoYXQgaXMgbG93ZXIKcHJpb3JpdHkgYW5kIEkgc3VwcG9zZSB3ZSBjYW4gdHJ5IHRvIHdv cmsgYXJvdW5kIHRoZSBwcm9ibGVtIGluCmdudWxpYi4KLS0tCiBzcmMva3dzZXQuYyB8IDc3 ICsrKysrKysrKysrKysrKysrKysrKysrKysrLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0KIDEgZmlsZSBjaGFuZ2VkLCAzMyBpbnNlcnRpb25zKCspLCA0NCBkZWxldGlv bnMoLSkKCmRpZmYgLS1naXQgYS9zcmMva3dzZXQuYyBiL3NyYy9rd3NldC5jCmluZGV4IDc4 ZmIwYjIuLmY4NmVlMDMgMTAwNjQ0Ci0tLSBhL3NyYy9rd3NldC5jCisrKyBiL3NyYy9rd3Nl dC5jCkBAIC01MjQsNiArNTI0LDIwIEBAIGJtX2RlbHRhMl9zZWFyY2ggKGNoYXIgY29uc3Qg Kip0cHAsIGNoYXIgY29uc3QgKmVwLCBjaGFyIGNvbnN0ICpzcCwgaW50IGxlbiwKICAgcmV0 dXJuIGZhbHNlOwogfQogCisvKiBSZXR1cm4gdGhlIGFkZHJlc3Mgb2YgdGhlIGZpcnN0IGJ5 dGUgaW4gdGhlIGJ1ZmZlciBTIHRoYXQgZXF1YWxzIEMuCisgICBTIGNvbnRhaW5zIE4gYnl0 ZXMuICBJZiBUUkFOUyBpcyBub25udWxsLCB1c2UgaXQgdG8gdHJhbnNsaXRlcmF0ZQorICAg UydzIGJ5dGVzIGJlZm9yZSBjb21wYXJpbmcgdGhlbS4gICovCitzdGF0aWMgY2hhciBjb25z dCAqCittZW1jaHJfdHJhbnMgKGNoYXIgY29uc3QgKnMsIGNoYXIgYywgc2l6ZV90IG4sIGNo YXIgY29uc3QgKnRyYW5zKQoreworICBpZiAoISB0cmFucykKKyAgICByZXR1cm4gbWVtY2hy IChzLCBjLCBuKTsKKyAgY2hhciBjb25zdCAqc2xpbSA9IHMgKyBuOworICBmb3IgKDsgcyA8 IHNsaW07IHMrKykKKyAgICBpZiAodHJhbnNbVSgqcyldID09IGMpCisgICAgICByZXR1cm4g czsKKyAgcmV0dXJuIE5VTEw7Cit9CiAKIC8qIEZhc3QgYm95ZXItbW9vcmUgc2VhcmNoLiAq Lwogc3RhdGljIHNpemVfdCBfR0xfQVRUUklCVVRFX1BVUkUKQEAgLTU0MSwxOCArNTU1LDgg QEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6 ZSkKICAgICByZXR1cm4gLTE7CiAgIGlmIChsZW4gPT0gMSkKICAgICB7Ci0gICAgICBpZiAo dHJhbnMpCi0gICAgICAgIHsKLSAgICAgICAgICBmb3IgKHRwID0gdGV4dDsgdHAgPCB0ZXh0 ICsgc2l6ZTsgdHArKykKLSAgICAgICAgICAgIGlmICh0cmFuc1tVKCp0cCldID09IGt3c2V0 LT50YXJnZXRbMF0pCi0gICAgICAgICAgICAgIHJldHVybiB0cCAtIHRleHQ7Ci0gICAgICAg ICAgcmV0dXJuIC0xOwotICAgICAgICB9Ci0gICAgICBlbHNlCi0gICAgICAgIHsKLSAgICAg ICAgICB0cCA9IG1lbWNociAodGV4dCwga3dzZXQtPnRhcmdldFswXSwgc2l6ZSk7Ci0gICAg ICAgICAgcmV0dXJuIHRwID8gdHAgLSB0ZXh0IDogLTE7Ci0gICAgICAgIH0KKyAgICAgIHRw ID0gbWVtY2hyX3RyYW5zICh0ZXh0LCBrd3NldC0+dGFyZ2V0WzBdLCBzaXplLCB0cmFucyk7 CisgICAgICByZXR1cm4gdHAgPyB0cCAtIHRleHQgOiAtMTsKICAgICB9CiAKICAgZDEgPSBr d3NldC0+ZGVsdGE7CkBAIC01NjQsNDggKzU2OCwzMyBAQCBibWV4ZWMgKGt3c2V0X3Qga3dz ZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogICAvKiBTaWduaWZpY2FuY2Ug b2YgMTI6IDEgKGluaXRpYWwgb2Zmc2V0KSArIDEwIChza2lwIGxvb3ApICsgMSAobWQyKS4g Ki8KICAgaWYgKHNpemUgPiAxMiAqIGxlbikKICAgICAvKiAxMSBpcyBub3QgYSBidWcsIHRo ZSBpbml0aWFsIG9mZnNldCBoYXBwZW5zIG9ubHkgb25jZS4gKi8KLSAgICBmb3IgKGVwID0g dGV4dCArIHNpemUgLSAxMSAqIGxlbjs7KQorICAgIGZvciAoZXAgPSB0ZXh0ICsgc2l6ZSAt IDExICogbGVuOyB0cCA8PSBlcDsgKQogICAgICAgewotICAgICAgICB3aGlsZSAodHAgPD0g ZXApCisgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOworICAgICAgICBkID0g ZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgaWYgKGQgIT0gMCkKICAgICAgICAg ICB7CiAgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKICAgICAgICAg ICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgaWYgKGQgPT0g MCkKLSAgICAgICAgICAgICAgZ290byBmb3VuZDsKLSAgICAgICAgICAgIGQgPSBkMVtVKHRw Wy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9 IGQ7Ci0gICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKLSAgICAgICAg ICAgIGlmIChkID09IDApCi0gICAgICAgICAgICAgIGdvdG8gZm91bmQ7Ci0gICAgICAgICAg ICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKICAgICAgICAgICAgIGQgPSBkMVtVKHRw Wy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9 IGQ7Ci0gICAgICAgICAgICBpZiAoZCA9PSAwKQotICAgICAgICAgICAgICBnb3RvIGZvdW5k OwotICAgICAgICAgICAgLyogbWVtY2hhcigpIG9mIGdsaWJjIGlzIGZhc3RlciB0aGFuIHNl ZWtpbmcgYnkgZGVsdGExIG9uCi0gICAgICAgICAgICAgICBzb21lIHBsYXRmb3Jtcy4gIFdo ZW4gdGhlcmUgaXMgbm8gY2hhbmNlIHRvIG1hdGNoIGZvciBhCi0gICAgICAgICAgICAgICB3 aGlsZSwgdXNlIGl0IG9uIHRoZW0uICAqLwotI2lmIGRlZmluZWQoX19HTElCQ19fKSAmJiAo ZGVmaW5lZChfX2kzODZfXykgfHwgZGVmaW5lZChfX3g4Nl82NF9fKSkKLSAgICAgICAgICAg IGlmICghdHJhbnMpCi0gICAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgICB0cCA9IG1l bWNociAodHAgLSAxLCBnYzEsIHNpemUgKyB0ZXh0IC0gdHAgKyAxKTsKLSAgICAgICAgICAg ICAgICBpZiAodHApCi0gICAgICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAg ICsrdHA7Ci0gICAgICAgICAgICAgICAgICAgIGdvdG8gZm91bmQ7Ci0gICAgICAgICAgICAg ICAgICB9Ci0gICAgICAgICAgICAgICAgZWxzZQotICAgICAgICAgICAgICAgICAgcmV0dXJu IC0xOwotICAgICAgICAgICAgICB9Ci0gICAgICAgICAgICBlbHNlCi0jZW5kaWYKKyAgICAg ICAgICAgIGlmIChkICE9IDApCiAgICAgICAgICAgICAgIHsKICAgICAgICAgICAgICAgICBk ID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKICAgICAgICAgICAgICAgICBkID0gZDFbVSh0 cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwg dHAgKz0gZDsKKyAgICAgICAgICAgICAgICBpZiAoZCAhPSAwKQorICAgICAgICAgICAgICAg ICAgeworICAgICAgICAgICAgICAgICAgICAvKiBUeXBpY2FsbHkgbWVtY2hyIGlzIGZhc3Rl ciB0aGFuIHNlZWtpbmcgYnkKKyAgICAgICAgICAgICAgICAgICAgICAgZGVsdGExIHdoZW4g dGhlcmUgaXMgbm8gY2hhbmNlIHRvIG1hdGNoIGZvcgorICAgICAgICAgICAgICAgICAgICAg ICBhIHdoaWxlLiAgKi8KKyAgICAgICAgICAgICAgICAgICAgdHAtLTsKKyAgICAgICAgICAg ICAgICAgICAgdHAgPSBtZW1jaHJfdHJhbnMgKHRwLCBnYzEsIHRleHQgKyBzaXplIC0gdHAs IHRyYW5zKTsKKyAgICAgICAgICAgICAgICAgICAgaWYgKCEgdHApCisgICAgICAgICAgICAg ICAgICAgICAgcmV0dXJuIC0xOworICAgICAgICAgICAgICAgICAgICB0cCsrOworICAgICAg ICAgICAgICAgICAgfQogICAgICAgICAgICAgICB9CiAgICAgICAgICAgfQotICAgICAgICBi cmVhazsKLSAgICAgIGZvdW5kOgogICAgICAgICBpZiAoYm1fZGVsdGEyX3NlYXJjaCAoJnRw LCBlcCwgc3AsIGxlbiwgdHJhbnMsIGdjMSwgZ2MyLCBkMSwga3dzZXQpKQogICAgICAgICAg IHJldHVybiB0cCAtIHRleHQ7CiAgICAgICB9Ci0tIAoxLjkuMAoK --------------010201060701080407040300-- From debbugs-submit-bounces@debbugs.gnu.org Wed Apr 23 03:12:02 2014 Received: (at control) by debbugs.gnu.org; 23 Apr 2014 07:12:02 +0000 Received: from localhost ([127.0.0.1]:55636 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WcrLe-0004jp-0i for submit@debbugs.gnu.org; Wed, 23 Apr 2014 03:12:02 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:53385) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WcrLb-0004jM-Dx for control@debbugs.gnu.org; Wed, 23 Apr 2014 03:12:00 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 2599739E8016 for ; Wed, 23 Apr 2014 00:11:59 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QYbIHnYWXndZ for ; Wed, 23 Apr 2014 00:11:50 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id A4BEC39E8014 for ; Wed, 23 Apr 2014 00:11:50 -0700 (PDT) Message-ID: <535767B6.6010506@cs.ucla.edu> Date: Wed, 23 Apr 2014 00:11:50 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: control@debbugs.gnu.org Subject: 17229 needs more info Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) tags 17229 + moreinfo From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Wed, 23 Apr 2014 17:52:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139827550717600 (code B ref 17229); Wed, 23 Apr 2014 17:52:01 +0000 Received: (at 17229) by debbugs.gnu.org; 23 Apr 2014 17:51:47 +0000 Received: from localhost ([127.0.0.1]:56399 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wd1Kk-0004Zl-Tn for submit@debbugs.gnu.org; Wed, 23 Apr 2014 13:51:47 -0400 Received: from mailgw06.kcn.ne.jp ([61.86.7.213]:46403) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wd1Ki-0004Zc-1G for 17229@debbugs.gnu.org; Wed, 23 Apr 2014 13:51:44 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw06.kcn.ne.jp (Postfix) with ESMTP id 4D791E80337 for <17229@debbugs.gnu.org>; Thu, 24 Apr 2014 02:51:42 +0900 (JST) Received: from mail06.kcn.ne.jp ([61.86.6.185]) by imp02 with bizsmtp id tVri1n0053zXHqt01Vri9G; Thu, 24 Apr 2014 02:51:42 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.12] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail06.kcn.ne.jp (Postfix) with ESMTPA id 1323E1BF0092; Thu, 24 Apr 2014 02:51:42 +0900 (JST) Date: Thu, 24 Apr 2014 02:51:43 +0900 From: Norihiro Tanaka In-Reply-To: <53576704.5020202@cs.ucla.edu> References: <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> <53576704.5020202@cs.ucla.edu> Message-Id: <20140424025143.3203.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Paul Eggert wrote: > This improves the performance quite a bit for the test case given in > the ChangeLog entry. I also tested performance on Sparc Solaris 10, > and memchr was a big win there. You are right. I also ran the tests on Solaris 10 and HP-UX 11iv2. memchr() was win on both machines. Especially on HP-UX 11iv2, memchr was 10x faster than delta1 searching. By the way, could you also test below for master and original grep-2.18? $ yes abcdabc | head -50000000 >../k $ env LANG=C time -p src/grep abcd.bd ../k Perhaps, later will be faster. 0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch will fix it. delta2 searching is higher cost than mind2 searching in original grep-2.18. We need to reduce it for delta2 searching. Norihiro From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 24 Apr 2014 06:34:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Norihiro Tanaka Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139832120310408 (code B ref 17229); Thu, 24 Apr 2014 06:34:02 +0000 Received: (at 17229) by debbugs.gnu.org; 24 Apr 2014 06:33:23 +0000 Received: from localhost ([127.0.0.1]:56801 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdDDm-0002hn-IX for submit@debbugs.gnu.org; Thu, 24 Apr 2014 02:33:22 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:57164) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdDDj-0002hY-9T for 17229@debbugs.gnu.org; Thu, 24 Apr 2014 02:33:20 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 9945439E8016; Wed, 23 Apr 2014 23:33:18 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QsGnkbOyI-zI; Wed, 23 Apr 2014 23:33:14 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 31ADC39E8012; Wed, 23 Apr 2014 23:33:14 -0700 (PDT) Message-ID: <5358B029.6050203@cs.ucla.edu> Date: Wed, 23 Apr 2014 23:33:13 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140411082524.0DE5.27F6AC2D@kcn.ne.jp> <53576704.5020202@cs.ucla.edu> <20140424025143.3203.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140424025143.3203.27F6AC2D@kcn.ne.jp> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) Norihiro Tanaka wrote: > could you also test below for master and original grep-2.18? > > $ yes abcdabc | head -50000000 >../k > $ env LANG=C time -p src/grep abcd.bd ../k > > Perhaps, later will be faster. Yes, grep 2.18 is a bit faster on that benchmark for me; it's about 3.6s real-time, whereas the master is about 4.2s. > 0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch will fix > it. delta2 searching is higher cost than mind2 searching in original > grep-2.18. We need to reduce it for delta2 searching. Unfortunately it doesn't help for me; it causes the same benchmark to take about 4.3s real-time. Here, I am talking about the version resulting from applying the patch in to the master. From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 24 Apr 2014 12:31:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139834261623778 (code B ref 17229); Thu, 24 Apr 2014 12:31:02 +0000 Received: (at 17229) by debbugs.gnu.org; 24 Apr 2014 12:30:16 +0000 Received: from localhost ([127.0.0.1]:56876 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdIn8-0006BO-LY for submit@debbugs.gnu.org; Thu, 24 Apr 2014 08:30:16 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:58811) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdImz-0006A1-AP for 17229@debbugs.gnu.org; Thu, 24 Apr 2014 08:30:07 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id C035C802AE for <17229@debbugs.gnu.org>; Thu, 24 Apr 2014 21:30:02 +0900 (JST) Received: from mail01.kcn.ne.jp ([61.86.6.180]) by imp02 with bizsmtp id toW21n00J3t2w9Z01oW2Pw; Thu, 24 Apr 2014 21:30:02 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.44] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail01.kcn.ne.jp (Postfix) with ESMTPA id 58A625A8227; Thu, 24 Apr 2014 21:30:02 +0900 (JST) Date: Thu, 24 Apr 2014 21:30:04 +0900 From: Norihiro Tanaka In-Reply-To: <5358B029.6050203@cs.ucla.edu> References: <20140424025143.3203.27F6AC2D@kcn.ne.jp> <5358B029.6050203@cs.ucla.edu> Message-Id: <20140424213003.B534.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Paul Eggert wrote: > Unfortunately it doesn't help for me; it causes the same benchmark to > take about 4.3s real-time. Here, I am talking about the version > resulting from applying the patch in > to the master. Sorry, the improvement for `abcd.bd' is not 0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch but 0001-kwset-simplify-Boyer-Moore-with-unibyte-i.patch. Therefore, Ignore this case. From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 25 Apr 2014 16:15:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13984424735183 (code B ref 17229); Fri, 25 Apr 2014 16:15:02 +0000 Received: (at 17229) by debbugs.gnu.org; 25 Apr 2014 16:14:33 +0000 Received: from localhost ([127.0.0.1]:58791 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wdilj-0001LV-C3 for submit@debbugs.gnu.org; Fri, 25 Apr 2014 12:14:32 -0400 Received: from mailgw04.kcn.ne.jp ([61.86.7.211]:56680) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wdilf-0001LF-Tn for 17229@debbugs.gnu.org; Fri, 25 Apr 2014 12:14:28 -0400 Received: from imp03 (mailgw7.kcn.ne.jp [61.86.15.238]) by mailgw04.kcn.ne.jp (Postfix) with ESMTP id 191B96C1DCB for <17229@debbugs.gnu.org>; Sat, 26 Apr 2014 01:14:26 +0900 (JST) Received: from mail03.kcn.ne.jp ([61.86.6.182]) by imp03 with bizsmtp id uGES1n0013veGq501GES1y; Sat, 26 Apr 2014 01:14:26 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.44] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail03.kcn.ne.jp (Postfix) with ESMTPA id CA43B141009A; Sat, 26 Apr 2014 01:14:25 +0900 (JST) Date: Sat, 26 Apr 2014 01:14:22 +0900 From: Norihiro Tanaka In-Reply-To: <5358B029.6050203@cs.ucla.edu> References: <20140424025143.3203.27F6AC2D@kcn.ne.jp> <5358B029.6050203@cs.ucla.edu> Message-Id: <20140426011422.B551.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) grep 2.18 is slow for below, because always d == 1. $ env LANG=C src/grep jk k Therefore, I wrote 0001-grep-speed-up-by-using-memchr-in-Boyer-Moore-searchi.patch. When `d' is small, speeds up. memchr() is faster than or as fast as delta1 search even when `d' is sufficiently large. However, this patch can't apply to case-insensitive matching. In fast, when `d' is large as following case, memchr_trans() imitated memchr() will occur slowdown. $ env LANG=C src/grep -i kkkkkkkkkkkkkkkkkkkkl k So my patch works for only case-sensitive matching. By the way, for below it's still slow with my patch. Especially, on x86 machines, it's slower than grep 2.18. $ env LANG=C src/grep -i jk k I think that the master should be faster than original version (grep 2.18), which uses not kwset but DFA for it. Accordingly, I added 0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch I noticed that DFA uses increments to proceed text pointer, and found that the repeat of `++tp;' is faster than tp += d on x86 machines, when `d' is small. `++tp; ++tp; ++tp' is faster than `tp += d;' on it. This patch inprements it. I looked at the performance with below. $ env LANG=C src/grep -i jk k $ env LANG=C src/grep -i jkk k $ env LANG=C src/grep -i jkkk k $ env LANG=C src/grep -i jkkkk k $ env LANG=C src/grep -i jkkkkk k $ env LANG=C src/grep -i jkkkkkk k $ env LANG=C src/grep -i jkkkkkkk k $ env LANG=C src/grep -i jkkkkkkkk k $ env LANG=C src/grep -i jkkkkkkkkk k When `d' is greater than 8, It uses tp += d so that it's as fast as repeat of `++tp;'. OTOH, when `d' is less than or equal to 8, uses the repeat of `++tp', so that it's as fast as or faster than `tp += d'. I used some macros in my patches, because I don't know how to replace it to other expression without slowdown. Norihiro From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Eric Blake Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Fri, 25 Apr 2014 16:23:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Norihiro Tanaka , Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13984429776245 (code B ref 17229); Fri, 25 Apr 2014 16:23:02 +0000 Received: (at 17229) by debbugs.gnu.org; 25 Apr 2014 16:22:57 +0000 Received: from localhost ([127.0.0.1]:58800 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wdits-0001cd-Oh for submit@debbugs.gnu.org; Fri, 25 Apr 2014 12:22:57 -0400 Received: from mx1.redhat.com ([209.132.183.28]:8800) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wdito-0001cR-Nn for 17229@debbugs.gnu.org; Fri, 25 Apr 2014 12:22:54 -0400 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s3PGMhMT004082 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 25 Apr 2014 12:22:43 -0400 Received: from [10.3.113.60] (ovpn-113-60.phx2.redhat.com [10.3.113.60]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id s3PGMgdY017827; Fri, 25 Apr 2014 12:22:42 -0400 Message-ID: <535A8BD2.4050709@redhat.com> Date: Fri, 25 Apr 2014 10:22:42 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140424025143.3203.27F6AC2D@kcn.ne.jp> <5358B029.6050203@cs.ucla.edu> <20140426011422.B551.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140426011422.B551.27F6AC2D@kcn.ne.jp> X-Enigmail-Version: 1.6 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="seKK0J0kjmSbXKodueQINS9Dewt2KM0l9" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-Spam-Score: -5.7 (-----) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.7 (-----) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --seKK0J0kjmSbXKodueQINS9Dewt2KM0l9 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 04/25/2014 10:14 AM, Norihiro Tanaka wrote: > grep 2.18 is slow for below, because always d =3D=3D 1. >=20 > $ env LANG=3DC src/grep jk k >=20 > Therefore, I wrote 0001-grep-speed-up-by-using-memchr-in-Boyer-Moore-se= archi.patch. >=20 > When `d' is small, speeds up. memchr() is faster than or as fast as > delta1 search even when `d' is sufficiently large. >=20 > However, this patch can't apply to case-insensitive matching. > In fast, when `d' is large as following case, memchr_trans() imitated > memchr() will occur slowdown. Gnulib includes a memchr2() interface, which efficiently searches for one of two byte values across a known memory length. It is not quite as fast as optimized assembly memchr for a single byte search, but when searching for two bytes in parallel, it is hands down faster than two sequential memchr() operations or any naive byte-by-byte comparisons. I suspect that using memchr2() for case-insensitive searches may allow you a speedup when searching for (the first byte of) two potential matches in the search string to the first character of a case-insensitive pattern= =2E --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --seKK0J0kjmSbXKodueQINS9Dewt2KM0l9 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJTWovSAAoJEKeha0olJ0Nq+YkH/3HvcXmJfQaHc4O58/M0Qzyg 1om2+8tTVw6mfawxI+a2idVzm1pNHYu+JdDhxIfBrI2v8PfYLYHJg2jNEMN29vsK vAMTQTOzlw6/a5Cl3UjhSDIzrkvoMoZEiZFzN3b2p07YesB7WGiEmN0Pqo1MCYRf M85eD+0Dzx9oKJIUXAFrchjkQnBT/y8moMsrJVLgKWGAV5zQAKf4u6Q4uS6R8UMz qQ3YwozDa8/N5ArWAm/33fSjyJGDH57GdM0Npq9DAlG/qTZsrzv3xLMxr0uzdt6G 6M8hwUN8Y1tYVEboml3QGzmG3fOPCu7Wh79Iv4ERMLDodCN7GnETsydL1QFd4wg= =6xJs -----END PGP SIGNATURE----- --seKK0J0kjmSbXKodueQINS9Dewt2KM0l9-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sat, 26 Apr 2014 00:27:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Eric Blake Cc: Paul Eggert , 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139847200011139 (code B ref 17229); Sat, 26 Apr 2014 00:27:01 +0000 Received: (at 17229) by debbugs.gnu.org; 26 Apr 2014 00:26:40 +0000 Received: from localhost ([127.0.0.1]:58879 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdqRz-0002tZ-HX for submit@debbugs.gnu.org; Fri, 25 Apr 2014 20:26:39 -0400 Received: from mailgw06.kcn.ne.jp ([61.86.7.213]:54034) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdqRw-0002tI-1E for 17229@debbugs.gnu.org; Fri, 25 Apr 2014 20:26:37 -0400 Received: from imp01 (mailgw5.kcn.ne.jp [61.86.15.231]) by mailgw06.kcn.ne.jp (Postfix) with ESMTP id C0D7E13800C for <17229@debbugs.gnu.org>; Sat, 26 Apr 2014 09:26:33 +0900 (JST) Received: from mail08.kcn.ne.jp ([61.86.6.187]) by imp01 with bizsmtp id uQSZ1n00d426eXR01QSZ1M; Sat, 26 Apr 2014 09:26:33 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.8] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail08.kcn.ne.jp (Postfix) with ESMTPA id 7D54012B802E; Sat, 26 Apr 2014 09:26:33 +0900 (JST) Date: Sat, 26 Apr 2014 09:26:34 +0900 From: Norihiro Tanaka In-Reply-To: <535A8BD2.4050709@redhat.com> References: <20140426011422.B551.27F6AC2D@kcn.ne.jp> <535A8BD2.4050709@redhat.com> Message-Id: <20140426092633.94C8.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Eric Blake wrote: > It is not quite as fast as optimized assembly memchr for a single byte > search, but when searching for two bytes in parallel, it is hands down > faster than two sequential memchr() operations or any naive byte-by-byte > comparisons. Thanks, I look at the performance and compare them. memchr2() is the fastest for case-insensitive except for large d Solaris or HP-UX. However, kwset doesn't know that `trans' doesn't have counterparts more than one for any character. So we need to check it before run memchr2(). << Linux 86-64 >> memchr2 : real 0.55 user 0.07 sys 0.48 memchr_trans : real 0.71 user 0.14 sys 0.57 my patch small d: real 1.06 user 0.57 sys 0.48 my patch large d: real 0.54 user 0.01 sys 0.52 DFA : real 1.45 user 0.91 sys 0.53 << Solaris 10 >> memchr2 : real 3.17 user 2.45 sys 0.71 memchr_trans : real 4.16 user 3.45 sys 0.70 my patch small d: real 6.43 user 5.71 sys 0.71 my patch large d: real 1.29 user 0.57 sys 0.71 DFA : real 11.08 user 10.35 sys 0.71 << HP-UX ia64 >> memchr2 : real 0.9 user 0.6 sys 0.2 memchr_trans : real 2.5 user 2.3 sys 0.2 my patch small d: real 2.1 user 1.9 sys 0.2 my patch large d: real 0.3 user 0.1 sys 0.2 DFA : real 3.2 user 3.0 sys 0.2 For small d: $ env LANG=C time -p src/grep -i jk ../k For large d: $ env LANG=C time -p src/grep -i kkkkkkkkkkkkkkkkkkkk ../k For DFA: $ env LANG=C time -p src/grep -i 'k\|l' ../k By the way, 0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch is effective still, even when we use memchr2(). It's useful for below, which doesn't reach at memchr2() to run alternately with delta1 searching and delta2 searching. $ yes kjkjkjkjkjkjkjkjkjkjkjkjkjkjkjkjkjkjkjkj | head -10000000 >k $ env LANG=C src/grep -i jj k before: real 0.85 user 0.64 sys 0.21 after : real 0.54 user 0.29 sys 0.25 Norihiro From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sat, 26 Apr 2014 08:45:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: 17229@debbugs.gnu.org Cc: Paul Eggert , Eric Blake Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13985018485307 (code B ref 17229); Sat, 26 Apr 2014 08:45:01 +0000 Received: (at 17229) by debbugs.gnu.org; 26 Apr 2014 08:44:08 +0000 Received: from localhost ([127.0.0.1]:58964 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdyDO-0001NV-DJ for submit@debbugs.gnu.org; Sat, 26 Apr 2014 04:44:07 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:53441) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WdyDJ-0001Mp-Km for 17229@debbugs.gnu.org; Sat, 26 Apr 2014 04:44:03 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id F3EBD802A3 for <17229@debbugs.gnu.org>; Sat, 26 Apr 2014 17:43:58 +0900 (JST) Received: from mail03.kcn.ne.jp ([61.86.6.182]) by imp02 with bizsmtp id uYjy1n00M3veGq501Yjy3W; Sat, 26 Apr 2014 17:43:58 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.8] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail03.kcn.ne.jp (Postfix) with ESMTPA id A8AC514100BB; Sat, 26 Apr 2014 17:43:58 +0900 (JST) Date: Sat, 26 Apr 2014 17:43:57 +0900 From: Norihiro Tanaka In-Reply-To: <20140426092633.94C8.27F6AC2D@kcn.ne.jp> References: <535A8BD2.4050709@redhat.com> <20140426092633.94C8.27F6AC2D@kcn.ne.jp> Message-Id: <20140426174357.94D5.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_535B6EAB000000009513_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --------_535B6EAB000000009513_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit I submit the patch to use memchr2() for case-insensitive matching in bmexec(). It counts last characters of the patterns in kwsprep(), if it's just two, use memchr2(). The new version uses memchr() still, if last character of the pattern is non-alphabetical. Norihiro --------_535B6EAB000000009513_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA0N2Q5ZTdjOTljZDM3ZDE2ZDEzZGIwNWE3YmYyYmRiYTY3MjNhZTJlIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBTYXQsIDI2IEFwciAyMDE0IDE3OjAzOjMxICswOTAwClN1YmplY3Q6IFtQQVRDSCAxLzJd IG1haW50OiBSZXZlcnQgImt3c2V0OiBzaW1wbGlmeSBCb3llci1Nb29yZSB3aXRoIHVuaWJ5dGUK IC1pIgoKVGhpcyByZXZlcnRzIGNvbW1pdCBjN2VhNWFlYTkxMWI5NTBiMjM5ODQ1NGNhODljY2Uy M2NhYmQzYTQwLgpJdCBvY2N1cnMgc2xvd2Rvd24gaW4gc29tZSBjYXNlcy4KCkNvbmZsaWN0czoK CXNyYy9rd3NldC5jCi0tLQogc3JjL2t3c2V0LmMgfCAxNzMgKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrKysrKysrKy0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tCiAxIGZpbGUgY2hhbmdlZCwg MTA4IGluc2VydGlvbnMoKyksIDY1IGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9rd3Nl dC5jIGIvc3JjL2t3c2V0LmMKaW5kZXggZjg2ZWUwMy4uOGE1ZmEyMyAxMDA2NDQKLS0tIGEvc3Jj L2t3c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMKQEAgLTQ2NCw2NiArNDY0LDc1IEBAIGt3c3ByZXAg KGt3c2V0X3Qga3dzZXQpCiAgICAgICBrd3NldC0+ZGVsdGFbaV0gPSBkZWx0YVtVKHRyYW5zW2ld KV07CiB9CiAKLS8qIFVzZSBUUkFOUyB0byB0cmFuc2xpdGVyYXRlIEMuICBBIG51bGwgVFJBTlMg ZG9lcyBubyB0cmFuc2xpdGVyYXRpb24uICAqLwotc3RhdGljIGNoYXIKLXRyIChjaGFyIGNvbnN0 ICp0cmFucywgY2hhciBjKQotewotICByZXR1cm4gdHJhbnMgPyB0cmFuc1tVKGMpXSA6IGM7Ci19 Ci0KLS8qIERlbHRhMiBwb3J0aW9uIG9mIGEgQm95ZXItTW9vcmUgc2VhcmNoLiAgKlRQIGlzIHRo ZSBzdHJpbmcgdGV4dAotICAgcG9pbnRlcjsgaXQgaXMgdXBkYXRlZCBpbiBwbGFjZS4gIEVQIGlz IHRoZSBlbmQgb2YgdGhlIHN0cmluZyB0ZXh0LAotICAgYW5kIFNQIHRoZSBlbmQgb2YgdGhlIHBh dHRlcm4uICBMRU4gaXMgdGhlIHBhdHRlcm4gbGVuZ3RoOyBpdCBtdXN0Ci0gICBiZSBhdCBsZWFz dCAyLiAgVFJBTlMsIGlmIG5vbm51bGwsIGlzIHRoZSBpbnB1dCB0cmFuc2xhdGlvbiB0YWJsZS4K LSAgIEdDMSBhbmQgR0MyIGFyZSB0aGUgbGFzdCBhbmQgc2Vjb25kLWZyb20gbGFzdCBieXRlcyBv ZiB0aGUgcGF0dGVybiwKLSAgIHRyYW5zbGl0ZXJhdGVkIGJ5IFRSQU5TOyB0aGUgY2FsbGVyIHBy ZWNvbXB1dGVzIHRoZW0gZm9yCi0gICBlZmZpY2llbmN5LiAgSWYgRDEgaXMgbm9ubnVsbCwgaXQg aXMgYSBkZWx0YTEgdGFibGUgZm9yIHNoaWZ0aW5nICpUUAotICAgd2hlbiBmYWlsaW5nLiAgS1dT RVQtPnNoaWZ0IHNheXMgaG93IG11Y2ggdG8gc2hpZnQuICAqLwotc3RhdGljIGlubGluZSBib29s Ci1ibV9kZWx0YTJfc2VhcmNoIChjaGFyIGNvbnN0ICoqdHBwLCBjaGFyIGNvbnN0ICplcCwgY2hh ciBjb25zdCAqc3AsIGludCBsZW4sCi0gICAgICAgICAgICAgICAgICBjaGFyIGNvbnN0ICp0cmFu cywgY2hhciBnYzEsIGNoYXIgZ2MyLAotICAgICAgICAgICAgICAgICAgdW5zaWduZWQgY2hhciBj b25zdCAqZDEsIGt3c2V0X3Qga3dzZXQpCi17Ci0gIGNoYXIgY29uc3QgKnRwID0gKnRwcDsKLSAg aW50IGQgPSBsZW4sIHNraXAgPSAwOwotCi0gIHdoaWxlICh0cnVlKQotICAgIHsKLSAgICAgIGlu dCBpID0gMjsKLSAgICAgIGlmICh0ciAodHJhbnMsIHRwWy0yXSkgPT0gZ2MyKQotICAgICAgICB7 Ci0gICAgICAgICAgd2hpbGUgKCsraSA8PSBkKQotICAgICAgICAgICAgaWYgKHRyICh0cmFucywg dHBbLWldKSAhPSB0ciAodHJhbnMsIHNwWy1pXSkpCi0gICAgICAgICAgICAgIGJyZWFrOwotICAg ICAgICAgIGlmIChpID4gZCkKLSAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgZm9yIChpID0g ZCArIHNraXAgKyAxOyBpIDw9IGxlbjsgKytpKQotICAgICAgICAgICAgICAgIGlmICh0ciAodHJh bnMsIHRwWy1pXSkgIT0gdHIgKHRyYW5zLCBzcFstaV0pKQotICAgICAgICAgICAgICAgICAgYnJl YWs7Ci0gICAgICAgICAgICAgIGlmIChpID4gbGVuKQotICAgICAgICAgICAgICAgIHsKLSAgICAg ICAgICAgICAgICAgICp0cHAgPSB0cCAtIGxlbjsKLSAgICAgICAgICAgICAgICAgIHJldHVybiB0 cnVlOwotICAgICAgICAgICAgICAgIH0KLSAgICAgICAgICAgIH0KLSAgICAgICAgfQotCi0gICAg ICB0cCArPSBkID0ga3dzZXQtPnNoaWZ0W2kgLSAyXTsKLSAgICAgIGlmICh0cCA+IGVwKQotICAg ICAgICBicmVhazsKLSAgICAgIGlmICh0ciAodHJhbnMsIHRwWy0xXSkgIT0gZ2MxKQotICAgICAg ICB7Ci0gICAgICAgICAgaWYgKGQxKQotICAgICAgICAgICAgdHAgKz0gZDFbVSh0cFstMV0pXTsK LSAgICAgICAgICBicmVhazsKLSAgICAgICAgfQotICAgICAgc2tpcCA9IGkgLSAxOworI2RlZmlu ZSBCTV9ERUxUQTJfU0VBUkNIIAkJCQlcCisgIGlmIChUUkFOUyh0cFstMl0pID09IGdjMikJCQkJ XAorICAgIHsJCQkJCQkJXAorICAgICAgZm9yIChpID0gMzsgaSA8PSBsZW47ICsraSkJCQlcCisg ICAgICAgIGlmIChUUkFOUyh0cFstaV0pICE9IFRSQU5TKHNwWy1pXSkpCQlcCisgICAgICAgICAg YnJlYWs7CQkJCQlcCisgICAgICBpZiAoaSA+IGxlbikJCQkJCVwKKyAgICAgICAgcmV0dXJuIHRw IC0gbGVuIC0gdGV4dDsJCQkJXAorICAgICAgZCA9IGt3c2V0LT5zaGlmdFtpIC0gMl07IHRwICs9 IGQ7CQkJXAorICAgICAgaWYgKHRwID4gZXApCQkJCQlcCisgICAgICAgIGJyZWFrOwkJCQkJCVwK KyAgICAgIGlmIChUUkFOUyh0cFstMV0pICE9IGdjMSkJCQkJXAorICAgICAgICB7CQkJCQkJXAor ICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBMQVNUX1NISUZUOwkJXAorICAgICAgICAgIGNv bnRpbnVlOwkJCQkJXAorICAgICAgICB9CQkJCQkJXAorICAgICAgc2tpcCA9IGkgLSAxOwkJCQkJ XAorICAgIH0JCQkJCQkJXAorICBlbHNlCQkJCQkJCVwKKyAgICB7CQkJCQkJCVwKKyAgICAgIGQg PSBrd3NldC0+c2hpZnRbMF07IHRwICs9IGQ7CQkJXAorICAgICAgaWYgKHRwID4gZXApCQkJCQlc CisgICAgICAgIGJyZWFrOwkJCQkJCVwKKyAgICAgIGlmIChUUkFOUyh0cFstMV0pICE9IGdjMSkJ CQkJXAorICAgICAgICB7CQkJCQkJXAorICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBMQVNU X1NISUZUOwkJXAorICAgICAgICAgIGNvbnRpbnVlOwkJCQkJXAorICAgICAgICB9CQkJCQkJXAor ICAgICAgc2tpcCA9IDE7CQkJCQkJXAorICAgIH0JCQkJCQkJXAorICB3aGlsZSAodHJ1ZSkJCQkJ CQlcCisgICAgewkJCQkJCQlcCisgICAgICBpZiAoVFJBTlModHBbLTJdKSA9PSBnYzIpCQkJCVwK KyAgICAgICAgewkJCQkJCVwKKyAgICAgICAgICBmb3IgKGkgPSAzOyBpIDw9IGQ7ICsraSkJCQlc CisgICAgICAgICAgICBpZiAoVFJBTlModHBbLWldKSAhPSBUUkFOUyhzcFstaV0pKQkJXAorICAg ICAgICAgICAgICBicmVhazsJCQkJCVwKKyAgICAgICAgICBpZiAoaSA+IGQpCQkJCQlcCisgICAg ICAgICAgICB7CQkJCQkJXAorICAgICAgICAgICAgICBmb3IgKGkgPSBkICsgc2tpcCArIDE7IGkg PD0gbGVuOyArK2kpCVwKKyAgICAgICAgICAgICAgICBpZiAoVFJBTlModHBbLWldKSAhPSBUUkFO UyhzcFstaV0pKQlcCisgICAgICAgICAgICAgICAgICBicmVhazsJCQkJXAorICAgICAgICAgICAg ICBpZiAoaSA+IGxlbikJCQkJXAorICAgICAgICAgICAgICAgIHJldHVybiB0cCAtIGxlbiAtIHRl eHQ7CQkJXAorICAgICAgICAgICAgfQkJCQkJCVwKKyAgICAgICAgICBkID0ga3dzZXQtPnNoaWZ0 W2kgLSAyXTsgdHAgKz0gZDsJCVwKKyAgICAgICAgICBpZiAodHAgPiBlcCkJCQkJCVwKKyAgICAg ICAgICAgIGJyZWFrOwkJCQkJXAorICAgICAgICAgIGlmIChUUkFOUyh0cFstMV0pICE9IGdjMSkJ CQlcCisgICAgICAgICAgICB7CQkJCQkJXAorICAgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0p XTsgTEFTVF9TSElGVDsJCVwKKyAgICAgICAgICAgICAgYnJlYWs7CQkJCQlcCisgICAgICAgICAg ICB9CQkJCQkJXAorICAgICAgICAgIHNraXAgPSBpIC0gMTsJCQkJCVwKKyAgICAgICAgfQkJCQkJ CVwKKyAgICAgIGVsc2UJCQkJCQlcCisgICAgICAgIHsJCQkJCQlcCisgICAgICAgICAgZCA9IGt3 c2V0LT5zaGlmdFswXTsgdHAgKz0gZDsJCQlcCisgICAgICAgICAgaWYgKHRwID4gZXApCQkJCQlc CisgICAgICAgICAgICBicmVhazsJCQkJCVwKKyAgICAgICAgICBpZiAoVFJBTlModHBbLTFdKSAh PSBnYzEpCQkJXAorICAgICAgICAgICAgewkJCQkJCVwKKyAgICAgICAgICAgICAgZCA9IGQxW1Uo dHBbLTFdKV07IExBU1RfU0hJRlQ7CQlcCisgICAgICAgICAgICAgIGJyZWFrOwkJCQkJXAorICAg ICAgICAgICAgfQkJCQkJCVwKKyAgICAgICAgICBza2lwID0gMTsJCQkJCVwKKyAgICAgICAgfQkJ CQkJCVwKICAgICB9CiAKLSAgKnRwcCA9IHRwOwotICByZXR1cm4gZmFsc2U7Ci19Ci0KIC8qIFJl dHVybiB0aGUgYWRkcmVzcyBvZiB0aGUgZmlyc3QgYnl0ZSBpbiB0aGUgYnVmZmVyIFMgdGhhdCBl cXVhbHMgQy4KICAgIFMgY29udGFpbnMgTiBieXRlcy4gIElmIFRSQU5TIGlzIG5vbm51bGwsIHVz ZSBpdCB0byB0cmFuc2xpdGVyYXRlCiAgICBTJ3MgYnl0ZXMgYmVmb3JlIGNvbXBhcmluZyB0aGVt LiAgKi8KQEAgLTU0NSw3ICs1NTQsOCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29u c3QgKnRleHQsIHNpemVfdCBzaXplKQogewogICB1bnNpZ25lZCBjaGFyIGNvbnN0ICpkMTsKICAg Y2hhciBjb25zdCAqZXAsICpzcCwgKnRwOwotICBpbnQgZDsKKyAgaW50IGQsIGksIHNraXA7Cisg IGNoYXIgZ2MxLCBnYzI7CiAgIGludCBsZW4gPSBrd3NldC0+bWluZDsKICAgY2hhciBjb25zdCAq dHJhbnMgPSBrd3NldC0+dHJhbnM7CiAKQEAgLTU2Miw4ICs1NzIsMTcgQEAgYm1leGVjIChrd3Nl dF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6ZSkKICAgZDEgPSBrd3NldC0+ ZGVsdGE7CiAgIHNwID0ga3dzZXQtPnRhcmdldCArIGxlbjsKICAgdHAgPSB0ZXh0ICsgbGVuOwot ICBjaGFyIGdjMSA9IHRyICh0cmFucywgc3BbLTFdKTsKLSAgY2hhciBnYzIgPSB0ciAodHJhbnMs IHNwWy0yXSk7CisgIGlmICh0cmFucykKKyAgICB7CisgICAgICBnYzEgPSB0cmFuc1tVKHNwWy0x XSldOworICAgICAgZ2MyID0gdHJhbnNbVShzcFstMl0pXTsKKyAgICB9CisgIGVsc2UKKyAgICB7 CisgICAgICBnYzEgPSBzcFstMV07CisgICAgICBnYzIgPSBzcFstMl07CisgICAgfQorICBza2lw ID0gMDsKIAogICAvKiBTaWduaWZpY2FuY2Ugb2YgMTI6IDEgKGluaXRpYWwgb2Zmc2V0KSArIDEw IChza2lwIGxvb3ApICsgMSAobWQyKS4gKi8KICAgaWYgKHNpemUgPiAxMiAqIGxlbikKQEAgLTU5 NSw4ICs2MTQsMjAgQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBz aXplX3Qgc2l6ZSkKICAgICAgICAgICAgICAgICAgIH0KICAgICAgICAgICAgICAgfQogICAgICAg ICAgIH0KLSAgICAgICAgaWYgKGJtX2RlbHRhMl9zZWFyY2ggKCZ0cCwgZXAsIHNwLCBsZW4sIHRy YW5zLCBnYzEsIGdjMiwgZDEsIGt3c2V0KSkKLSAgICAgICAgICByZXR1cm4gdHAgLSB0ZXh0Owor I3VuZGVmIExBU1RfU0hJRlQKKyNkZWZpbmUgTEFTVF9TSElGVCB0cCArPSBkCisgICAgICAgIGlm ICh0cmFucykKKyAgICAgICAgICB7CisjdW5kZWYgVFJBTlMKKyNkZWZpbmUgVFJBTlMoY2gpIHRy YW5zW1UoY2gpXQorICAgICAgICAgICAgQk1fREVMVEEyX1NFQVJDSDsKKyAgICAgICAgICB9Cisg ICAgICAgIGVsc2UKKyAgICAgICAgICB7CisjdW5kZWYgVFJBTlMKKyNkZWZpbmUgVFJBTlMoY2gp IGNoCisgICAgICAgICAgICBCTV9ERUxUQTJfU0VBUkNIOworICAgICAgICAgIH0KICAgICAgIH0K IAogICAvKiBOb3cgd2UgaGF2ZSBvbmx5IGEgZmV3IGNoYXJhY3RlcnMgbGVmdCB0byBzZWFyY2gu ICBXZQpAQCAtNjA4LDggKzYzOSwyMCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29u c3QgKnRleHQsIHNpemVfdCBzaXplKQogICAgICAgZCA9IGQxW1UoKHRwICs9IGQpWy0xXSldOwog ICAgICAgaWYgKGQgIT0gMCkKICAgICAgICAgY29udGludWU7Ci0gICAgICBpZiAoYm1fZGVsdGEy X3NlYXJjaCAoJnRwLCBlcCwgc3AsIGxlbiwgdHJhbnMsIGdjMSwgZ2MyLCBOVUxMLCBrd3NldCkp Ci0gICAgICAgIHJldHVybiB0cCAtIHRleHQ7CisjdW5kZWYgTEFTVF9TSElGVAorI2RlZmluZSBM QVNUX1NISUZUCisgICAgICBpZiAodHJhbnMpCisgICAgICAgIHsKKyN1bmRlZiBUUkFOUworI2Rl ZmluZSBUUkFOUyhjaCkgdHJhbnNbVShjaCldCisgICAgICAgICAgQk1fREVMVEEyX1NFQVJDSDsK KyAgICAgICAgfQorICAgICAgZWxzZQorICAgICAgICB7CisjdW5kZWYgVFJBTlMKKyNkZWZpbmUg VFJBTlMoY2gpIGNoCisgICAgICAgICAgQk1fREVMVEEyX1NFQVJDSDsKKyAgICAgICAgfQogICAg IH0KIAogICByZXR1cm4gLTE7Ci0tIAoxLjkuMgoKCkZyb20gNjQzMzVkZWUyMzkwMTgyNTY2YzZi ZTI5ZGI5ZDk3MGM0OWQ3ZjgwNCBNb24gU2VwIDE3IDAwOjAwOjAwIDIwMDEKRnJvbTogTm9yaWhp cm8gVGFuYWthIDxub3JpdG5rQGtjbi5uZS5qcD4KRGF0ZTogU2F0LCAyNiBBcHIgMjAxNCAxNzoy NTo1NCArMDkwMApTdWJqZWN0OiBbUEFUQ0ggMi8yXSBrd3NldDogc3BlZWQgdXAgQm95ZXItTW9v cmUgdW5pYnl0ZSAtaSBieSB1c2luZyBtZW1jaHIyCgoqIHNyYy9rd3NldC5jIChzdHJ1Y3Qga3dz ZXQpOiBOZXcgbWVtYmVycyBgbGFzdGNoYXJzJyBhbmQgYG5sYXN0Y2hhcicuCihrd3NwcmVwKTog Q291bnQgbGFzdCBjaGFyYWN0ZXJzIGluIHRoZSBwYXR0ZXJucy4KKGJtZXhlYyk6IFdoZW4gdHdv IGxhc3QgY2hhcmFjdGVycywgdXNlIG1lbWNocjIoKS4KLS0tCiBzcmMva3dzZXQuYyB8IDEwOCAr KysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKystLS0tLS0tLS0tLS0tLS0tLS0t LS0KIDEgZmlsZSBjaGFuZ2VkLCA3MSBpbnNlcnRpb25zKCspLCAzNyBkZWxldGlvbnMoLSkKCmRp ZmYgLS1naXQgYS9zcmMva3dzZXQuYyBiL3NyYy9rd3NldC5jCmluZGV4IDhhNWZhMjMuLjViYzVk YTUgMTAwNjQ0Ci0tLSBhL3NyYy9rd3NldC5jCisrKyBiL3NyYy9rd3NldC5jCkBAIC00MCw2ICs0 MCw3IEBACiAjaW5jbHVkZSAic3lzdGVtLmgiCiAjaW5jbHVkZSAib2JzdGFjay5oIgogI2luY2x1 ZGUgInhhbGxvYy5oIgorI2luY2x1ZGUgIm1lbWNocjIuaCIKIAogI2RlZmluZSBsaW5rIGt3c2V0 X2xpbmsKIApAQCAtOTEsNiArOTIsOCBAQCBzdHJ1Y3Qga3dzZXQKICAgY2hhciAqdGFyZ2V0OwkJ CS8qIFRhcmdldCBzdHJpbmcgaWYgdGhlcmUncyBvbmx5IG9uZS4gKi8KICAgaW50ICpzaGlmdDsJ CQkvKiBVc2VkIGluIEJveWVyLU1vb3JlIHNlYXJjaCBmb3Igb25lIHN0cmluZy4gKi8KICAgY2hh ciBjb25zdCAqdHJhbnM7CQkvKiBDaGFyYWN0ZXIgdHJhbnNsYXRpb24gdGFibGUuICovCisgIGNo YXIgbGFzdGNoYXJzWzJdOwkJLyogTGFzdCBjaGFyYWN0ZXJzIHVwIHRvIHR3byBpbiB0aGUgcGF0 dGVybnMuICAqLworICBpbnQgbmxhc3RjaGFyOwkJLyogTnVtYmVyIG9mIGxhc3QgY2hhcmFjdGVy IGluIHRoZSBwYXR0ZXJucy4gICovCiB9OwogCiAvKiBBbGxvY2F0ZSBhbmQgaW5pdGlhbGl6ZSBh IGtleXdvcmQgc2V0IG9iamVjdCwgcmV0dXJuaW5nIGFuIG9wYXF1ZQpAQCAtMzY2LDcgKzM2OSw3 IEBAIHZvaWQKIGt3c3ByZXAgKGt3c2V0X3Qga3dzZXQpCiB7CiAgIGNoYXIgY29uc3QgKnRyYW5z ID0ga3dzZXQtPnRyYW5zOwotICBpbnQgaTsKKyAgaW50IGksIGo7CiAgIHVuc2lnbmVkIGNoYXIg ZGVsdGFidWZbTkNIQVJdOwogICB1bnNpZ25lZCBjaGFyICpkZWx0YSA9IHRyYW5zID8gZGVsdGFi dWYgOiBrd3NldC0+ZGVsdGE7CiAKQEAgLTQyOCw2ICs0MzEsMjEgQEAga3dzcHJlcCAoa3dzZXRf dCBrd3NldCkKICAgc3RydWN0IHRyaWUgKipuZXh0ID0gdHJhbnMgPyBuZXh0YnVmIDoga3dzZXQt Pm5leHQ7CiAgIG1lbXNldCAobmV4dCwgMCwgc2l6ZW9mIG5leHRidWYpOwogICB0cmVlbmV4dCAo a3dzZXQtPnRyaWUtPmxpbmtzLCBuZXh0KTsKKyAga3dzZXQtPm5sYXN0Y2hhciA9IDA7CisgIGZv ciAoaSA9IDA7IGkgPCBOQ0hBUjsgKytpKQorICAgIHsKKyAgICAgIGlmICghbmV4dFtpXSkKKyAg ICAgICAgY29udGludWU7CisgICAgICBmb3IgKGogPSAwOyBqIDwgTkNIQVI7ICsraikKKyAgICAg ICAgeworICAgICAgICAgIGlmICgodHJhbnMgJiYgdHJhbnNbaV0gPT0gdHJhbnNbal0pIHx8IGkg PT0gaikKKyAgICAgICAgICAgIHsKKyAgICAgICAgICAgICAgaWYgKGt3c2V0LT5ubGFzdGNoYXIg PCAyKQorICAgICAgICAgICAgICAgIGt3c2V0LT5sYXN0Y2hhcnNba3dzZXQtPm5sYXN0Y2hhcl0g PSBqOworICAgICAgICAgICAgICBrd3NldC0+bmxhc3RjaGFyKys7CisgICAgICAgICAgICB9Cisg ICAgICAgIH0KKyAgICB9CiAgIGlmICh0cmFucykKICAgICBmb3IgKGkgPSAwOyBpIDwgTkNIQVI7 ICsraSkKICAgICAgIGt3c2V0LT5uZXh0W2ldID0gbmV4dFtVKHRyYW5zW2ldKV07CkBAIC01MzMs MjEgKzU1MSw2IEBAIGt3c3ByZXAgKGt3c2V0X3Qga3dzZXQpCiAgICAgICAgIH0JCQkJCQlcCiAg ICAgfQogCi0vKiBSZXR1cm4gdGhlIGFkZHJlc3Mgb2YgdGhlIGZpcnN0IGJ5dGUgaW4gdGhlIGJ1 ZmZlciBTIHRoYXQgZXF1YWxzIEMuCi0gICBTIGNvbnRhaW5zIE4gYnl0ZXMuICBJZiBUUkFOUyBp cyBub25udWxsLCB1c2UgaXQgdG8gdHJhbnNsaXRlcmF0ZQotICAgUydzIGJ5dGVzIGJlZm9yZSBj b21wYXJpbmcgdGhlbS4gICovCi1zdGF0aWMgY2hhciBjb25zdCAqCi1tZW1jaHJfdHJhbnMgKGNo YXIgY29uc3QgKnMsIGNoYXIgYywgc2l6ZV90IG4sIGNoYXIgY29uc3QgKnRyYW5zKQotewotICBp ZiAoISB0cmFucykKLSAgICByZXR1cm4gbWVtY2hyIChzLCBjLCBuKTsKLSAgY2hhciBjb25zdCAq c2xpbSA9IHMgKyBuOwotICBmb3IgKDsgcyA8IHNsaW07IHMrKykKLSAgICBpZiAodHJhbnNbVSgq cyldID09IGMpCi0gICAgICByZXR1cm4gczsKLSAgcmV0dXJuIE5VTEw7Ci19Ci0KIC8qIEZhc3Qg Ym95ZXItbW9vcmUgc2VhcmNoLiAqLwogc3RhdGljIHNpemVfdCBfR0xfQVRUUklCVVRFX1BVUkUK IGJtZXhlYyAoa3dzZXRfdCBrd3NldCwgY2hhciBjb25zdCAqdGV4dCwgc2l6ZV90IHNpemUpCkBA IC01NTgsNiArNTYxLDggQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0 LCBzaXplX3Qgc2l6ZSkKICAgY2hhciBnYzEsIGdjMjsKICAgaW50IGxlbiA9IGt3c2V0LT5taW5k OwogICBjaGFyIGNvbnN0ICp0cmFucyA9IGt3c2V0LT50cmFuczsKKyAgY2hhciBjb25zdCAqbGFz dGNoYXJzID0ga3dzZXQtPmxhc3RjaGFyczsKKyAgaW50IG5sYXN0Y2hhciA9IGt3c2V0LT5ubGFz dGNoYXI7CiAKICAgaWYgKGxlbiA9PSAwKQogICAgIHJldHVybiAwOwpAQCAtNTY1LDggKzU3MCwy MCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXpl KQogICAgIHJldHVybiAtMTsKICAgaWYgKGxlbiA9PSAxKQogICAgIHsKLSAgICAgIHRwID0gbWVt Y2hyX3RyYW5zICh0ZXh0LCBrd3NldC0+dGFyZ2V0WzBdLCBzaXplLCB0cmFucyk7Ci0gICAgICBy ZXR1cm4gdHAgPyB0cCAtIHRleHQgOiAtMTsKKyAgICAgIHN3aXRjaCAobmxhc3RjaGFyKQorICAg ICAgICB7CisgICAgICAgIGNhc2UgMToKKyAgICAgICAgICB0cCA9IG1lbWNociAodGV4dCwgbGFz dGNoYXJzWzBdLCBzaXplKTsKKyAgICAgICAgICByZXR1cm4gdHAgPyB0cCAtIHRleHQgOiAtMTsK KyAgICAgICAgY2FzZSAyOgorICAgICAgICAgIHRwID0gbWVtY2hyMiAodGV4dCwgbGFzdGNoYXJz WzBdLCBsYXN0Y2hhcnNbMV0sIHNpemUpOworICAgICAgICAgIHJldHVybiB0cCA/IHRwIC0gdGV4 dCA6IC0xOworICAgICAgICBkZWZhdWx0OgorICAgICAgICAgIGZvciAodHAgPSB0ZXh0OyB0cCA8 IHRleHQgKyBzaXplOyB0cCsrKQorICAgICAgICAgICAgaWYgKHRyYW5zW1UoKnRwKV0gPT0ga3dz ZXQtPnRhcmdldFswXSkKKyAgICAgICAgICAgICAgcmV0dXJuIHRwIC0gdGV4dDsKKyAgICAgICAg ICByZXR1cm4gLTE7CisgICAgICAgIH0KICAgICB9CiAKICAgZDEgPSBrd3NldC0+ZGVsdGE7CkBA IC01ODcsMzMgKzYwNCw1MCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRl eHQsIHNpemVfdCBzaXplKQogICAvKiBTaWduaWZpY2FuY2Ugb2YgMTI6IDEgKGluaXRpYWwgb2Zm c2V0KSArIDEwIChza2lwIGxvb3ApICsgMSAobWQyKS4gKi8KICAgaWYgKHNpemUgPiAxMiAqIGxl bikKICAgICAvKiAxMSBpcyBub3QgYSBidWcsIHRoZSBpbml0aWFsIG9mZnNldCBoYXBwZW5zIG9u bHkgb25jZS4gKi8KLSAgICBmb3IgKGVwID0gdGV4dCArIHNpemUgLSAxMSAqIGxlbjsgdHAgPD0g ZXA7ICkKKyAgICBmb3IgKGVwID0gdGV4dCArIHNpemUgLSAxMSAqIGxlbjs7KQogICAgICAgewot ICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKLSAgICAgICAgZCA9IGQxW1UodHBb LTFdKV0sIHRwICs9IGQ7Ci0gICAgICAgIGlmIChkICE9IDApCisgICAgICAgIHdoaWxlICh0cCA8 PSBlcCkKICAgICAgICAgICB7CiAgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0g ZDsKICAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOworICAgICAgICAgICAg aWYgKGQgPT0gMCkKKyAgICAgICAgICAgICAgZ290byBmb3VuZDsKKyAgICAgICAgICAgIGQgPSBk MVtVKHRwWy0xXSldLCB0cCArPSBkOworICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRw ICs9IGQ7CisgICAgICAgICAgICBkID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAg ICAgIGlmIChkID09IDApCisgICAgICAgICAgICAgIGdvdG8gZm91bmQ7CisgICAgICAgICAgICBk ID0gZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSld LCB0cCArPSBkOwogICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7Ci0gICAg ICAgICAgICBpZiAoZCAhPSAwKQorICAgICAgICAgICAgaWYgKGQgPT0gMCkKKyAgICAgICAgICAg ICAgZ290byBmb3VuZDsKKyAgICAgICAgICAgIC8qIFR5cGljYWxseSBtZW1jaHIgaXMgZmFzdGVy IHRoYW4gc2Vla2luZyBieSBkZWx0YTEgd2hlbgorICAgICAgICAgICAgICAgdGhlcmUgaXMgbm8g Y2hhbmNlIHRvIG1hdGNoIGZvciBhIHdoaWxlLiAgKi8KKyAgICAgICAgICAgIHN3aXRjaCAobmxh c3RjaGFyKQogICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFd KV0sIHRwICs9IGQ7Ci0gICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7 Ci0gICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7Ci0gICAgICAgICAg ICAgICAgaWYgKGQgIT0gMCkKLSAgICAgICAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgICAg ICAgLyogVHlwaWNhbGx5IG1lbWNociBpcyBmYXN0ZXIgdGhhbiBzZWVraW5nIGJ5Ci0gICAgICAg ICAgICAgICAgICAgICAgIGRlbHRhMSB3aGVuIHRoZXJlIGlzIG5vIGNoYW5jZSB0byBtYXRjaCBm b3IKLSAgICAgICAgICAgICAgICAgICAgICAgYSB3aGlsZS4gICovCi0gICAgICAgICAgICAgICAg ICAgIHRwLS07Ci0gICAgICAgICAgICAgICAgICAgIHRwID0gbWVtY2hyX3RyYW5zICh0cCwgZ2Mx LCB0ZXh0ICsgc2l6ZSAtIHRwLCB0cmFucyk7Ci0gICAgICAgICAgICAgICAgICAgIGlmICghIHRw KQotICAgICAgICAgICAgICAgICAgICAgIHJldHVybiAtMTsKLSAgICAgICAgICAgICAgICAgICAg dHArKzsKLSAgICAgICAgICAgICAgICAgIH0KKyAgICAgICAgICAgICAgY2FzZSAxOgorICAgICAg ICAgICAgICAgIC0tdHA7CisgICAgICAgICAgICAgICAgdHAgPSBtZW1jaHIgKHRwLCBsYXN0Y2hh cnNbMF0sIHNpemUgKyB0ZXh0IC0gdHAgKyAxKTsKKyAgICAgICAgICAgICAgICBpZiAoIXRwKQor ICAgICAgICAgICAgICAgICAgcmV0dXJuIC0xOworICAgICAgICAgICAgICAgICsrdHA7CisgICAg ICAgICAgICAgICAgZ290byBmb3VuZDsKKyAgICAgICAgICAgICAgY2FzZSAyOgorICAgICAgICAg ICAgICAgIC0tdHA7CisgICAgICAgICAgICAgICAgdHAgPSBtZW1jaHIyICh0cCwgbGFzdGNoYXJz WzBdLCBsYXN0Y2hhcnNbMV0sCisgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBzaXplICsg dGV4dCAtIHRwICsgMSk7CisgICAgICAgICAgICAgICAgaWYgKCF0cCkKKyAgICAgICAgICAgICAg ICAgIHJldHVybiAtMTsKKyAgICAgICAgICAgICAgICArK3RwOworICAgICAgICAgICAgICAgIGdv dG8gZm91bmQ7CisgICAgICAgICAgICAgIGRlZmF1bHQ6CisgICAgICAgICAgICAgICAgZCA9IGQx W1UodHBbLTFdKV07IHRwICs9IGQ7CisgICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07 IHRwICs9IGQ7CiAgICAgICAgICAgICAgIH0KICAgICAgICAgICB9CisgICAgICAgIGJyZWFrOwor ICAgICAgZm91bmQ6CiAjdW5kZWYgTEFTVF9TSElGVAogI2RlZmluZSBMQVNUX1NISUZUIHRwICs9 IGQKICAgICAgICAgaWYgKHRyYW5zKQotLSAKMS45LjIKCg== --------_535B6EAB000000009513_MULTIPART_MIXED_-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 27 Apr 2014 04:05:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Norihiro Tanaka , 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139857146523835 (code B ref 17229); Sun, 27 Apr 2014 04:05:02 +0000 Received: (at 17229) by debbugs.gnu.org; 27 Apr 2014 04:04:25 +0000 Received: from localhost ([127.0.0.1]:59449 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WeGKG-0006CM-7j for submit@debbugs.gnu.org; Sun, 27 Apr 2014 00:04:25 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:37213) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WeGKB-0006C9-C8 for 17229@debbugs.gnu.org; Sun, 27 Apr 2014 00:04:21 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 1C4A739E801F; Sat, 26 Apr 2014 21:04:18 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rUONQNZA-5rM; Sat, 26 Apr 2014 21:04:13 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id B418C39E801A; Sat, 26 Apr 2014 21:04:13 -0700 (PDT) Message-ID: <535C81BD.7070200@cs.ucla.edu> Date: Sat, 26 Apr 2014 21:04:13 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <535A8BD2.4050709@redhat.com> <20140426092633.94C8.27F6AC2D@kcn.ne.jp> <20140426174357.94D5.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140426174357.94D5.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------020204080802070809080202" X-Spam-Score: -3.0 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) This is a multi-part message in MIME format. --------------020204080802070809080202 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Thanks for the test cases and patch. In my tests, switching to macros does not help performance, and that SWITCH macro's implementation actually slows things down a bit, which is what I'd expect. If there is a reason to use macros I'd like to see a patch that simply changes functions to macros without changing the algorithm, so that we can measure this effect separately from the algorithm change. I attempted to suss out the performance improvements in that patch without using macros, and installed the attached changes. With these changes grep performs about as well as it does with that patch, on the benchmarks you mentioned that I tried (as before, I'm using the default optimization with GCC 4.9.0 x86-64 on an AMD Phenom II X4 910e). Quite possibly I've missed something, of course. The two "advance_*" constants used in the heuristics are guesses: I haven't measured rigorously to try to come up with good values. --------------020204080802070809080202 Content-Type: text/plain; charset=UTF-8; name="0001-kwset-improve-performance-when-large-Boyer-Moore-key.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename*0="0001-kwset-improve-performance-when-large-Boyer-Moore-key.pa"; filename*1="tch" RnJvbSBjM2YyYTg0OTM1Y2EwOGUyNzZhMzQyNzYxODYwMjJmZmNmN2Q3OWJlIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBGcmksIDI1IEFwciAyMDE0IDE4OjMxOjQ3IC0wNzAwClN1YmplY3Q6IFtQQVRD SCAxLzJdIGt3c2V0OiBpbXByb3ZlIHBlcmZvcm1hbmNlIHdoZW4gbGFyZ2UgQm95ZXItTW9v cmUga2V5CiBkb2Vzbid0IG1hdGNoCgoqIHNyYy9rd3NldC5jIChibWV4ZWMpOiBBcyBhIGhl dXJpc3RpYywgcHJlZmVyIG1lbWNociB0byBzZWVraW5nCmJ5IGRlbHRhMSBvbmx5IHdoZW4g dGhlIGxhdHRlciBkb2Vzbid0IGFkdmFuY2UgbXVjaC4KLS0tCiBzcmMva3dzZXQuYyB8IDEz ICsrKysrKysrKystLS0KIDEgZmlsZSBjaGFuZ2VkLCAxMCBpbnNlcnRpb25zKCspLCAzIGRl bGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9rd3NldC5jIGIvc3JjL2t3c2V0LmMKaW5k ZXggZjg2ZWUwMy4uNDA2ZmEwZSAxMDA2NDQKLS0tIGEvc3JjL2t3c2V0LmMKKysrIGIvc3Jj L2t3c2V0LmMKQEAgLTU3MCw2ICs1NzAsNyBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNo YXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogICAgIC8qIDExIGlzIG5vdCBhIGJ1Zywg dGhlIGluaXRpYWwgb2Zmc2V0IGhhcHBlbnMgb25seSBvbmNlLiAqLwogICAgIGZvciAoZXAg PSB0ZXh0ICsgc2l6ZSAtIDExICogbGVuOyB0cCA8PSBlcDsgKQogICAgICAgeworICAgICAg ICBjaGFyIGNvbnN0ICp0cDAgPSB0cDsKICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRw ICs9IGQ7CiAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwogICAgICAgICBp ZiAoZCAhPSAwKQpAQCAtNTg0LDkgKzU4NSwxNCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQs IGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogICAgICAgICAgICAgICAgIGQgPSBk MVtVKHRwWy0xXSldLCB0cCArPSBkOwogICAgICAgICAgICAgICAgIGlmIChkICE9IDApCiAg ICAgICAgICAgICAgICAgICB7Ci0gICAgICAgICAgICAgICAgICAgIC8qIFR5cGljYWxseSBt ZW1jaHIgaXMgZmFzdGVyIHRoYW4gc2Vla2luZyBieQotICAgICAgICAgICAgICAgICAgICAg ICBkZWx0YTEgd2hlbiB0aGVyZSBpcyBubyBjaGFuY2UgdG8gbWF0Y2ggZm9yCi0gICAgICAg ICAgICAgICAgICAgICAgIGEgd2hpbGUuICAqLworICAgICAgICAgICAgICAgICAgICBkID0g ZDFbVSh0cFstMV0pXSwgdHAgKz0gZDsKKyAgICAgICAgICAgICAgICAgICAgZCA9IGQxW1Uo dHBbLTFdKV0sIHRwICs9IGQ7CisKKyAgICAgICAgICAgICAgICAgICAgLyogQXMgYSBoZXVy aXN0aWMsIHByZWZlciBtZW1jaHIgdG8gc2Vla2luZyBieQorICAgICAgICAgICAgICAgICAg ICAgICBkZWx0YTEgd2hlbiB0aGUgbGF0dGVyIGRvZXNuJ3QgYWR2YW5jZSBtdWNoLiAgKi8K KyAgICAgICAgICAgICAgICAgICAgaW50IGFkdmFuY2VfaGV1cmlzdGljID0gNCAqIHNpemVv ZiAobG9uZyk7CisgICAgICAgICAgICAgICAgICAgIGlmIChhZHZhbmNlX2hldXJpc3RpYyA8 PSB0cCAtIHRwMCkKKyAgICAgICAgICAgICAgICAgICAgICBnb3RvIGJpZ19hZHZhbmNlOwog ICAgICAgICAgICAgICAgICAgICB0cC0tOwogICAgICAgICAgICAgICAgICAgICB0cCA9IG1l bWNocl90cmFucyAodHAsIGdjMSwgdGV4dCArIHNpemUgLSB0cCwgdHJhbnMpOwogICAgICAg ICAgICAgICAgICAgICBpZiAoISB0cCkKQEAgLTU5Nyw2ICs2MDMsNyBAQCBibWV4ZWMgKGt3 c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogICAgICAgICAg IH0KICAgICAgICAgaWYgKGJtX2RlbHRhMl9zZWFyY2ggKCZ0cCwgZXAsIHNwLCBsZW4sIHRy YW5zLCBnYzEsIGdjMiwgZDEsIGt3c2V0KSkKICAgICAgICAgICByZXR1cm4gdHAgLSB0ZXh0 OworICAgICAgYmlnX2FkdmFuY2U6OwogICAgICAgfQogCiAgIC8qIE5vdyB3ZSBoYXZlIG9u bHkgYSBmZXcgY2hhcmFjdGVycyBsZWZ0IHRvIHNlYXJjaC4gIFdlCi0tIAoxLjkuMAoK --------------020204080802070809080202 Content-Type: text/plain; charset=UTF-8; name="0002-kwset-speed-up-by-using-memchr2.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="0002-kwset-speed-up-by-using-memchr2.patch" RnJvbSA4NzNkZDlkNjI2ODUzZTE5NTAxMWVlZGQxOGI2ZjdhMjFjOWZhZWM0IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTYXQsIDI2IEFwciAyMDE0IDE4OjE4OjU5IC0wNzAwClN1YmplY3Q6IFtQQVRD SCAyLzJdIGt3c2V0OiBzcGVlZCB1cCBieSB1c2luZyBtZW1jaHIyCgpJZGVhIHN1Z2dlc3Rl ZCBieSBFcmljIEJsYWtlIGluOiBodHRwOi8vYnVncy5nbnUub3JnLzE3MjI5IzQzCiogYm9v dHN0cmFwLmNvbmYgKGdudWxpYl9tb2R1bGVzKTogQWRkIG1lbWNocjIuCiogc3JjL2t3c2V0 LmM6IEluY2x1ZGUgc3RkaW50LmgsIGZvciB1aW50cHRyX3QuICBJbmNsdWRlIG1lbWNocjIu aC4KKHN0cnVjdCBrd3NldCk6IE5ldyBtZW1iZXJzIGdjMSwgZ2MyLCBnYzFoZWxwLgoodHIp OiBNb3ZlIGVhcmxpZXIsIHNvIGl0IGNhbiBiZSB1c2VkIGVhcmxpZXIuCihrd3NwcmVwKTog SW5pdGlhbGl6ZSBzdHJ1Y3Qga3dzZXQncyBuZXcgbWVtYmVycy4KKG1lbWNocl9rd3NldCk6 IFJlbmFtZSBmcm9tIG1lbWNocl90cmFucy4gIENvbWJpbmUgQyBhbmQgVFJBTlMgYXJncyBp bnRvCm5ldyBhcmcgS1dTRVQuICBBbGwgdXNlcyBjaGFuZ2VkLiAgVXNlIG1lbWNocjIgd2hl biBhcHByb3ByaWF0ZS4KKGJtZXhlYyk6IFVzZSBuZXcgbWVtYmVycyBpbnN0ZWFkIG9mIHJl Y29tcHV0aW5nIHRoZWlyIHZhbHVlcy4KSW5jcmVhc2UgYWR2YW5jZV9oZXVyaXN0aWM7IGl0 J3MganVzdCBhIGd1ZXNzLCBidXQgbWVtY2hyMiBwcm9iYWJseQptYWtlcyBpdCByZWFzb25h YmxlIHRvIGluY3JlYXNlIGl0LgotLS0KIGJvb3RzdHJhcC5jb25mIHwgIDEgKwogc3JjL2t3 c2V0LmMgICAgfCA4NyArKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysr KysrKy0tLS0tLS0tLS0tLS0tCiAyIGZpbGVzIGNoYW5nZWQsIDY3IGluc2VydGlvbnMoKyks IDIxIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL2Jvb3RzdHJhcC5jb25mIGIvYm9vdHN0 cmFwLmNvbmYKaW5kZXggMzY3NDI3ZC4uOWYzNDU3ZCAxMDA2NDQKLS0tIGEvYm9vdHN0cmFw LmNvbmYKKysrIGIvYm9vdHN0cmFwLmNvbmYKQEAgLTU3LDYgKzU3LDcgQEAgbWFueXdhcm5p bmdzCiBtYnJsZW4KIG1icnRvd2MKIG1lbWNocgorbWVtY2hyMgogbWVtcGNweQogbWlubWF4 CiBvYnN0YWNrCmRpZmYgLS1naXQgYS9zcmMva3dzZXQuYyBiL3NyYy9rd3NldC5jCmluZGV4 IDQwNmZhMGUuLjgyNzBmMDUgMTAwNjQ0Ci0tLSBhL3NyYy9rd3NldC5jCisrKyBiL3NyYy9r d3NldC5jCkBAIC0zNiw4ICszNiwxMCBAQAogI2luY2x1ZGUgImt3c2V0LmgiCiAKICNpbmNs dWRlIDxzdGRib29sLmg+CisjaW5jbHVkZSA8c3RkaW50Lmg+CiAjaW5jbHVkZSA8c3lzL3R5 cGVzLmg+CiAjaW5jbHVkZSAic3lzdGVtLmgiCisjaW5jbHVkZSAibWVtY2hyMi5oIgogI2lu Y2x1ZGUgIm9ic3RhY2suaCIKICNpbmNsdWRlICJ4YWxsb2MuaCIKIApAQCAtOTEsOCArOTMs MzMgQEAgc3RydWN0IGt3c2V0CiAgIGNoYXIgKnRhcmdldDsJCQkvKiBUYXJnZXQgc3RyaW5n IGlmIHRoZXJlJ3Mgb25seSBvbmUuICovCiAgIGludCAqc2hpZnQ7CQkJLyogVXNlZCBpbiBC b3llci1Nb29yZSBzZWFyY2ggZm9yIG9uZSBzdHJpbmcuICovCiAgIGNoYXIgY29uc3QgKnRy YW5zOwkJLyogQ2hhcmFjdGVyIHRyYW5zbGF0aW9uIHRhYmxlLiAqLworCisgIC8qIElmIHRo ZXJlJ3Mgb25seSBvbmUgc3RyaW5nLCB0aGlzIGlzIHRoZSBzdHJpbmcncyBsYXN0IGJ5dGUs CisgICAgIHRyYW5zbGF0ZWQgdmlhIFRSQU5TIGlmIFRSQU5TIGlzIG5vbm51bGwuICAqLwor ICBjaGFyIGdjMTsKKworICAvKiBMaWtld2lzZSBmb3IgdGhlIHN0cmluZydzIHBlbnVsdGlt YXRlIGJ5dGUsIGlmIGl0IGhhcyB0d28gb3IgbW9yZQorICAgICBieXRlcy4gICovCisgIGNo YXIgZ2MyOworCisgIC8qIElmIHRoZXJlJ3Mgb25seSBvbmUgc3RyaW5nLCB0aGlzIGhlbHBz IHRvIG1hdGNoIHRoZSBzdHJpbmcncyBsYXN0IGJ5dGUuCisgICAgIElmIEdDMUhFTFAgaXMg bmVnYXRpdmUsIG9ubHkgR0MxIG1hdGNoZXMgdGhlIHN0cmluZydzIGxhc3QgYnl0ZTsKKyAg ICAgb3RoZXJ3aXNlIGF0IGxlYXN0IHR3byBieXRlcyBtYXRjaCwgYW5kIEIgbWF0Y2hlcyBp ZiBUUkFOU1tCXSA9PSBHQzEuCisgICAgIElmIEdDMUhFTFAgaXMgaW4gdGhlIHJhbmdlIDAu LihOQ0hBUiAtIDEpLCB0aGVyZSBhcmUgZXhhY3RseSB0d28KKyAgICAgc3VjaCBtYXRjaGVz LCBhbmQgR0MxSEVMUCBpcyB0aGUgb3RoZXIgbWF0Y2ggYWZ0ZXIgY29udmVyc2lvbiB0bwor ICAgICB1bnNpZ25lZCBjaGFyLiAgSWYgR0MxSEVMUCBpcyBhdCBsZWFzdCBOQ0hBUiwgdGhl cmUgYXJlIHRocmVlIG9yCisgICAgIG1vcmUgc3VjaCBtYXRjaGVzOyBlLmcuLCBHcmVlayBo YXMgdGhyZWUgc2lnbWEgY2hhcmFjdGVycyB0aGF0CisgICAgIGFsbCBtYXRjaCB3aGVuIGNh c2UtZm9sZGluZy4gICovCisgIGludCBnYzFoZWxwOwogfTsKIAorLyogVXNlIFRSQU5TIHRv IHRyYW5zbGl0ZXJhdGUgQy4gIEEgbnVsbCBUUkFOUyBkb2VzIG5vIHRyYW5zbGl0ZXJhdGlv bi4gICovCitzdGF0aWMgY2hhcgordHIgKGNoYXIgY29uc3QgKnRyYW5zLCBjaGFyIGMpCit7 CisgIHJldHVybiB0cmFucyA/IHRyYW5zW1UoYyldIDogYzsKK30KKwogLyogQWxsb2NhdGUg YW5kIGluaXRpYWxpemUgYSBrZXl3b3JkIHNldCBvYmplY3QsIHJldHVybmluZyBhbiBvcGFx dWUKICAgIHBvaW50ZXIgdG8gaXQuICAqLwoga3dzZXRfdApAQCAtNDU2LDYgKzQ4MywyNyBA QCBrd3NwcmVwIChrd3NldF90IGt3c2V0KQogICAgICAgICAgICAgICBjdXJyID0gY3Vyci0+ bmV4dDsKICAgICAgICAgICAgIH0KICAgICAgICAgfQorCisgICAgICBjaGFyIGdjMSA9IHRy ICh0cmFucywga3dzZXQtPnRhcmdldFtrd3NldC0+bWluZCAtIDFdKTsKKworICAgICAgLyog U2V0IEdDMUhFTFAgYWNjb3JkaW5nIHRvIHdoZXRoZXIgZXhhY3RseSBvbmUsIGV4YWN0bHkg dHdvLCBvcgorICAgICAgICAgdGhyZWUtb3ItbW9yZSBjaGFyYWN0ZXJzIG1hdGNoIEdDMS4g ICovCisgICAgICBpbnQgZ2MxaGVscCA9IC0xOworICAgICAgaWYgKHRyYW5zKQorICAgICAg ICB7CisgICAgICAgICAgY2hhciBjb25zdCAqZXF1aXYxID0gbWVtY2hyICh0cmFucywgZ2Mx LCBOQ0hBUik7CisgICAgICAgICAgY2hhciBjb25zdCAqZXF1aXYyID0gbWVtY2hyIChlcXVp djEgKyAxLCBnYzEsCisgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICB0 cmFucyArIE5DSEFSIC0gKGVxdWl2MSArIDEpKTsKKyAgICAgICAgICBpZiAoZXF1aXYyKQor ICAgICAgICAgICAgZ2MxaGVscCA9IChtZW1jaHIgKGVxdWl2MiArIDEsIGdjMSwgdHJhbnMg KyBOQ0hBUiAtIChlcXVpdjIgKyAxKSkKKyAgICAgICAgICAgICAgICAgICAgICAgPyBOQ0hB UgorICAgICAgICAgICAgICAgICAgICAgICA6IFUoZ2MxKSBeIChlcXVpdjEgLSB0cmFucykg XiAoZXF1aXYyIC0gdHJhbnMpKTsKKyAgICAgICAgfQorCisgICAgICBrd3NldC0+Z2MxID0g Z2MxOworICAgICAga3dzZXQtPmdjMWhlbHAgPSBnYzFoZWxwOworICAgICAgaWYgKGt3c2V0 LT5taW5kID4gMSkKKyAgICAgICAga3dzZXQtPmdjMiA9IHRyICh0cmFucywga3dzZXQtPnRh cmdldFtrd3NldC0+bWluZCAtIDJdKTsKICAgICB9CiAKICAgLyogRml4IHRoaW5ncyB1cCBm b3IgYW55IHRyYW5zbGF0aW9uIHRhYmxlLiAqLwpAQCAtNDY0LDEzICs1MTIsNiBAQCBrd3Nw cmVwIChrd3NldF90IGt3c2V0KQogICAgICAga3dzZXQtPmRlbHRhW2ldID0gZGVsdGFbVSh0 cmFuc1tpXSldOwogfQogCi0vKiBVc2UgVFJBTlMgdG8gdHJhbnNsaXRlcmF0ZSBDLiAgQSBu dWxsIFRSQU5TIGRvZXMgbm8gdHJhbnNsaXRlcmF0aW9uLiAgKi8KLXN0YXRpYyBjaGFyCi10 ciAoY2hhciBjb25zdCAqdHJhbnMsIGNoYXIgYykKLXsKLSAgcmV0dXJuIHRyYW5zID8gdHJh bnNbVShjKV0gOiBjOwotfQotCiAvKiBEZWx0YTIgcG9ydGlvbiBvZiBhIEJveWVyLU1vb3Jl IHNlYXJjaC4gICpUUCBpcyB0aGUgc3RyaW5nIHRleHQKICAgIHBvaW50ZXI7IGl0IGlzIHVw ZGF0ZWQgaW4gcGxhY2UuICBFUCBpcyB0aGUgZW5kIG9mIHRoZSBzdHJpbmcgdGV4dCwKICAg IGFuZCBTUCB0aGUgZW5kIG9mIHRoZSBwYXR0ZXJuLiAgTEVOIGlzIHRoZSBwYXR0ZXJuIGxl bmd0aDsgaXQgbXVzdApAQCAtNTI0LDE5ICs1NjUsMjMgQEAgYm1fZGVsdGEyX3NlYXJjaCAo Y2hhciBjb25zdCAqKnRwcCwgY2hhciBjb25zdCAqZXAsIGNoYXIgY29uc3QgKnNwLCBpbnQg bGVuLAogICByZXR1cm4gZmFsc2U7CiB9CiAKLS8qIFJldHVybiB0aGUgYWRkcmVzcyBvZiB0 aGUgZmlyc3QgYnl0ZSBpbiB0aGUgYnVmZmVyIFMgdGhhdCBlcXVhbHMgQy4KLSAgIFMgY29u dGFpbnMgTiBieXRlcy4gIElmIFRSQU5TIGlzIG5vbm51bGwsIHVzZSBpdCB0byB0cmFuc2xp dGVyYXRlCi0gICBTJ3MgYnl0ZXMgYmVmb3JlIGNvbXBhcmluZyB0aGVtLiAgKi8KKy8qIFJl dHVybiB0aGUgYWRkcmVzcyBvZiB0aGUgZmlyc3QgYnl0ZSBpbiB0aGUgYnVmZmVyIFMgKG9m IHNpemUgTikKKyAgIHRoYXQgbWF0Y2hlcyB0aGUgbGFzdCBieXRlIHNwZWNpZmllZCBieSBL V1NFVCwgYSBzaW5nbGV0b24uICAqLwogc3RhdGljIGNoYXIgY29uc3QgKgotbWVtY2hyX3Ry YW5zIChjaGFyIGNvbnN0ICpzLCBjaGFyIGMsIHNpemVfdCBuLCBjaGFyIGNvbnN0ICp0cmFu cykKK21lbWNocl9rd3NldCAoY2hhciBjb25zdCAqcywgc2l6ZV90IG4sIGt3c2V0X3Qga3dz ZXQpCiB7Ci0gIGlmICghIHRyYW5zKQotICAgIHJldHVybiBtZW1jaHIgKHMsIGMsIG4pOwot ICBjaGFyIGNvbnN0ICpzbGltID0gcyArIG47CisgIGlmIChrd3NldC0+Z2MxaGVscCA8IDAp CisgICAgcmV0dXJuIG1lbWNociAocywga3dzZXQtPmdjMSwgbik7CisgIGludCBzbWFsbF9o ZXVyaXN0aWMgPSAyOworICBpbnQgc21hbGwgPSAoLSAodWludHB0cl90KSBzICUgc2l6ZW9m IChsb25nKQorICAgICAgICAgICAgICAgKyBzbWFsbF9oZXVyaXN0aWMgKiBzaXplb2YgKGxv bmcpKTsKKyAgc2l6ZV90IG50cmFucyA9IGt3c2V0LT5nYzFoZWxwIDwgTkNIQVIgJiYgc21h bGwgPCBuID8gc21hbGwgOiBuOworICBjaGFyIGNvbnN0ICpzbGltID0gcyArIG50cmFuczsK ICAgZm9yICg7IHMgPCBzbGltOyBzKyspCi0gICAgaWYgKHRyYW5zW1UoKnMpXSA9PSBjKQor ICAgIGlmIChrd3NldC0+dHJhbnNbVSgqcyldID09IGt3c2V0LT5nYzEpCiAgICAgICByZXR1 cm4gczsKLSAgcmV0dXJuIE5VTEw7CisgIG4gLT0gbnRyYW5zOworICByZXR1cm4gbiA9PSAw ID8gTlVMTCA6IG1lbWNocjIgKHMsIGt3c2V0LT5nYzEsIGt3c2V0LT5nYzFoZWxwLCBuKTsK IH0KIAogLyogRmFzdCBib3llci1tb29yZSBzZWFyY2guICovCkBAIC01NTUsMTUgKzYwMCwx NSBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBz aXplKQogICAgIHJldHVybiAtMTsKICAgaWYgKGxlbiA9PSAxKQogICAgIHsKLSAgICAgIHRw ID0gbWVtY2hyX3RyYW5zICh0ZXh0LCBrd3NldC0+dGFyZ2V0WzBdLCBzaXplLCB0cmFucyk7 CisgICAgICB0cCA9IG1lbWNocl9rd3NldCAodGV4dCwgc2l6ZSwga3dzZXQpOwogICAgICAg cmV0dXJuIHRwID8gdHAgLSB0ZXh0IDogLTE7CiAgICAgfQogCiAgIGQxID0ga3dzZXQtPmRl bHRhOwogICBzcCA9IGt3c2V0LT50YXJnZXQgKyBsZW47CiAgIHRwID0gdGV4dCArIGxlbjsK LSAgY2hhciBnYzEgPSB0ciAodHJhbnMsIHNwWy0xXSk7Ci0gIGNoYXIgZ2MyID0gdHIgKHRy YW5zLCBzcFstMl0pOworICBjaGFyIGdjMSA9IGt3c2V0LT5nYzE7CisgIGNoYXIgZ2MyID0g a3dzZXQtPmdjMjsKIAogICAvKiBTaWduaWZpY2FuY2Ugb2YgMTI6IDEgKGluaXRpYWwgb2Zm c2V0KSArIDEwIChza2lwIGxvb3ApICsgMSAobWQyKS4gKi8KICAgaWYgKHNpemUgPiAxMiAq IGxlbikKQEAgLTU5MCwxMSArNjM1LDExIEBAIGJtZXhlYyAoa3dzZXRfdCBrd3NldCwgY2hh ciBjb25zdCAqdGV4dCwgc2l6ZV90IHNpemUpCiAKICAgICAgICAgICAgICAgICAgICAgLyog QXMgYSBoZXVyaXN0aWMsIHByZWZlciBtZW1jaHIgdG8gc2Vla2luZyBieQogICAgICAgICAg ICAgICAgICAgICAgICBkZWx0YTEgd2hlbiB0aGUgbGF0dGVyIGRvZXNuJ3QgYWR2YW5jZSBt dWNoLiAgKi8KLSAgICAgICAgICAgICAgICAgICAgaW50IGFkdmFuY2VfaGV1cmlzdGljID0g NCAqIHNpemVvZiAobG9uZyk7CisgICAgICAgICAgICAgICAgICAgIGludCBhZHZhbmNlX2hl dXJpc3RpYyA9IDE2ICogc2l6ZW9mIChsb25nKTsKICAgICAgICAgICAgICAgICAgICAgaWYg KGFkdmFuY2VfaGV1cmlzdGljIDw9IHRwIC0gdHAwKQogICAgICAgICAgICAgICAgICAgICAg IGdvdG8gYmlnX2FkdmFuY2U7CiAgICAgICAgICAgICAgICAgICAgIHRwLS07Ci0gICAgICAg ICAgICAgICAgICAgIHRwID0gbWVtY2hyX3RyYW5zICh0cCwgZ2MxLCB0ZXh0ICsgc2l6ZSAt IHRwLCB0cmFucyk7CisgICAgICAgICAgICAgICAgICAgIHRwID0gbWVtY2hyX2t3c2V0ICh0 cCwgdGV4dCArIHNpemUgLSB0cCwga3dzZXQpOwogICAgICAgICAgICAgICAgICAgICBpZiAo ISB0cCkKICAgICAgICAgICAgICAgICAgICAgICByZXR1cm4gLTE7CiAgICAgICAgICAgICAg ICAgICAgIHRwKys7Ci0tIAoxLjkuMAoK --------------020204080802070809080202-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 27 Apr 2014 07:58:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139858544920108 (code B ref 17229); Sun, 27 Apr 2014 07:58:01 +0000 Received: (at 17229) by debbugs.gnu.org; 27 Apr 2014 07:57:29 +0000 Received: from localhost ([127.0.0.1]:59501 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WeJxn-0005EF-Vf for submit@debbugs.gnu.org; Sun, 27 Apr 2014 03:57:28 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:50820) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WeJxj-0005E4-UP for 17229@debbugs.gnu.org; Sun, 27 Apr 2014 03:57:25 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id BD7178027A for <17229@debbugs.gnu.org>; Sun, 27 Apr 2014 16:57:21 +0900 (JST) Received: from mail08.kcn.ne.jp ([61.86.6.187]) by imp02 with bizsmtp id uvxM1n00B426eXR01vxMP1; Sun, 27 Apr 2014 16:57:21 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.53] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail08.kcn.ne.jp (Postfix) with ESMTPA id 5B53312B8098; Sun, 27 Apr 2014 16:57:21 +0900 (JST) Date: Sun, 27 Apr 2014 16:57:21 +0900 From: Norihiro Tanaka In-Reply-To: <535C81BD.7070200@cs.ucla.edu> References: <20140426174357.94D5.27F6AC2D@kcn.ne.jp> <535C81BD.7070200@cs.ucla.edu> Message-Id: <20140427165308.5C46.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_535CB499000000005C94_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --------_535CB499000000005C94_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Thanks for the improvement with memchr2(). It's very great! Paul Eggert wrote: > If there is a reason to use macros I'd like to see a patch that simply > changes functions to macros without changing the algorithm, so that we > can measure this effect separately from the algorithm change. I tested with below. yes jjjjjjjjjjjjjjjjjjjj | head -10000000 >k env LC_ALL=C time -p src/grep kjjjjjjjjjjjjjjjjjjj k The performance with the current master is as following. real 2.95 user 2.44 sys 0.45 After changes function to macros. real 1.09 user 0.59 sys 0.43 Could you try above cases? Norihiro --------_535CB499000000005C94_MULTIPART_MIXED_ Content-Type: text/plain; charset="UTF-8"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA5ZTIxYWNhZjI2OGY2ODBlYmU0OTY5ZjhiNDg5ZWUyMzQ4YmM1ZTk1IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBTdW4sIDI3IEFwciAyMDE0IDE2OjM4OjUzICswOTAwClN1YmplY3Q6IFtQQVRDSF0ga3dz ZXQ6IGNoYW5nZXMgYm1fZGVsdGEyX3NlYXJjaCBmdW5jdGlvbiB0byBtYWNyb3MKCiogc3JjL2t3 c2V0LmMgKGJtX2RlbHRhMl9zZWFyY2gpOiBSZW1vdmUgaXQuCihCTV9ERUxUQTJfU0VBUkNILCBU UkFOUyBMQVNUX1NISUZUKTogRGVmaW5lIG5ldyBtYWNyb3MuCihibWV4ZWMpOiBSZXBsYWNlIGJt X2RlbHRhMl9zZWFyY2ggZnVuY3Rpb24gdG8gbWFjcm9zLgotLS0KIHNyYy9rd3NldC5jIHwgMTUz ICsrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrLS0tLS0tLS0tLS0tLS0tLS0t LS0tLQogMSBmaWxlIGNoYW5nZWQsIDk3IGluc2VydGlvbnMoKyksIDU2IGRlbGV0aW9ucygtKQoK ZGlmZiAtLWdpdCBhL3NyYy9rd3NldC5jIGIvc3JjL2t3c2V0LmMKaW5kZXggODI3MGYwNS4uMmFj YWQxYSAxMDA2NDQKLS0tIGEvc3JjL2t3c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMKQEAgLTUxMiw1 OSArNTEyLDc2IEBAIGt3c3ByZXAgKGt3c2V0X3Qga3dzZXQpCiAgICAgICBrd3NldC0+ZGVsdGFb aV0gPSBkZWx0YVtVKHRyYW5zW2ldKV07CiB9CiAKLS8qIERlbHRhMiBwb3J0aW9uIG9mIGEgQm95 ZXItTW9vcmUgc2VhcmNoLiAgKlRQIGlzIHRoZSBzdHJpbmcgdGV4dAotICAgcG9pbnRlcjsgaXQg aXMgdXBkYXRlZCBpbiBwbGFjZS4gIEVQIGlzIHRoZSBlbmQgb2YgdGhlIHN0cmluZyB0ZXh0LAot ICAgYW5kIFNQIHRoZSBlbmQgb2YgdGhlIHBhdHRlcm4uICBMRU4gaXMgdGhlIHBhdHRlcm4gbGVu Z3RoOyBpdCBtdXN0Ci0gICBiZSBhdCBsZWFzdCAyLiAgVFJBTlMsIGlmIG5vbm51bGwsIGlzIHRo ZSBpbnB1dCB0cmFuc2xhdGlvbiB0YWJsZS4KLSAgIEdDMSBhbmQgR0MyIGFyZSB0aGUgbGFzdCBh bmQgc2Vjb25kLWZyb20gbGFzdCBieXRlcyBvZiB0aGUgcGF0dGVybiwKLSAgIHRyYW5zbGl0ZXJh dGVkIGJ5IFRSQU5TOyB0aGUgY2FsbGVyIHByZWNvbXB1dGVzIHRoZW0gZm9yCi0gICBlZmZpY2ll bmN5LiAgSWYgRDEgaXMgbm9ubnVsbCwgaXQgaXMgYSBkZWx0YTEgdGFibGUgZm9yIHNoaWZ0aW5n ICpUUAotICAgd2hlbiBmYWlsaW5nLiAgS1dTRVQtPnNoaWZ0IHNheXMgaG93IG11Y2ggdG8gc2hp ZnQuICAqLwotc3RhdGljIGlubGluZSBib29sCi1ibV9kZWx0YTJfc2VhcmNoIChjaGFyIGNvbnN0 ICoqdHBwLCBjaGFyIGNvbnN0ICplcCwgY2hhciBjb25zdCAqc3AsIGludCBsZW4sCi0gICAgICAg ICAgICAgICAgICBjaGFyIGNvbnN0ICp0cmFucywgY2hhciBnYzEsIGNoYXIgZ2MyLAotICAgICAg ICAgICAgICAgICAgdW5zaWduZWQgY2hhciBjb25zdCAqZDEsIGt3c2V0X3Qga3dzZXQpCi17Ci0g IGNoYXIgY29uc3QgKnRwID0gKnRwcDsKLSAgaW50IGQgPSBsZW4sIHNraXAgPSAwOwotCi0gIHdo aWxlICh0cnVlKQotICAgIHsKLSAgICAgIGludCBpID0gMjsKLSAgICAgIGlmICh0ciAodHJhbnMs IHRwWy0yXSkgPT0gZ2MyKQotICAgICAgICB7Ci0gICAgICAgICAgd2hpbGUgKCsraSA8PSBkKQot ICAgICAgICAgICAgaWYgKHRyICh0cmFucywgdHBbLWldKSAhPSB0ciAodHJhbnMsIHNwWy1pXSkp Ci0gICAgICAgICAgICAgIGJyZWFrOwotICAgICAgICAgIGlmIChpID4gZCkKLSAgICAgICAgICAg IHsKLSAgICAgICAgICAgICAgZm9yIChpID0gZCArIHNraXAgKyAxOyBpIDw9IGxlbjsgKytpKQot ICAgICAgICAgICAgICAgIGlmICh0ciAodHJhbnMsIHRwWy1pXSkgIT0gdHIgKHRyYW5zLCBzcFst aV0pKQotICAgICAgICAgICAgICAgICAgYnJlYWs7Ci0gICAgICAgICAgICAgIGlmIChpID4gbGVu KQotICAgICAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgICAgICp0cHAgPSB0cCAtIGxlbjsK LSAgICAgICAgICAgICAgICAgIHJldHVybiB0cnVlOwotICAgICAgICAgICAgICAgIH0KLSAgICAg ICAgICAgIH0KLSAgICAgICAgfQotCi0gICAgICB0cCArPSBkID0ga3dzZXQtPnNoaWZ0W2kgLSAy XTsKLSAgICAgIGlmICh0cCA+IGVwKQotICAgICAgICBicmVhazsKLSAgICAgIGlmICh0ciAodHJh bnMsIHRwWy0xXSkgIT0gZ2MxKQotICAgICAgICB7Ci0gICAgICAgICAgaWYgKGQxKQotICAgICAg ICAgICAgdHAgKz0gZDFbVSh0cFstMV0pXTsKLSAgICAgICAgICBicmVhazsKLSAgICAgICAgfQot ICAgICAgc2tpcCA9IGkgLSAxOworLyogRGVsdGEyIHBvcnRpb24gb2YgYSBCb3llci1Nb29yZSBz ZWFyY2guICAqLworI2RlZmluZSBCTV9ERUxUQTJfU0VBUkNIIAkJCQlcCisgIGlmIChUUkFOUyh0 cFstMl0pID09IGdjMikJCQkJXAorICAgIHsJCQkJCQkJXAorICAgICAgZm9yIChpID0gMzsgaSA8 PSBsZW47ICsraSkJCQlcCisgICAgICAgIGlmIChUUkFOUyh0cFstaV0pICE9IFRSQU5TKHNwWy1p XSkpCQlcCisgICAgICAgICAgYnJlYWs7CQkJCQlcCisgICAgICBpZiAoaSA+IGxlbikJCQkJCVwK KyAgICAgICAgcmV0dXJuIHRwIC0gbGVuIC0gdGV4dDsJCQkJXAorICAgICAgZCA9IGt3c2V0LT5z aGlmdFtpIC0gMl07IHRwICs9IGQ7CQkJXAorICAgICAgaWYgKHRwID4gZXApCQkJCQlcCisgICAg ICAgIGJyZWFrOwkJCQkJCVwKKyAgICAgIGlmIChUUkFOUyh0cFstMV0pICE9IGdjMSkJCQkJXAor ICAgICAgICB7CQkJCQkJXAorICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldOyBMQVNUX1NISUZU OwkJXAorICAgICAgICAgIGNvbnRpbnVlOwkJCQkJXAorICAgICAgICB9CQkJCQkJXAorICAgICAg c2tpcCA9IGkgLSAxOwkJCQkJXAorICAgIH0JCQkJCQkJXAorICBlbHNlCQkJCQkJCVwKKyAgICB7 CQkJCQkJCVwKKyAgICAgIGQgPSBrd3NldC0+c2hpZnRbMF07IHRwICs9IGQ7CQkJXAorICAgICAg aWYgKHRwID4gZXApCQkJCQlcCisgICAgICAgIGJyZWFrOwkJCQkJCVwKKyAgICAgIGlmIChUUkFO Uyh0cFstMV0pICE9IGdjMSkJCQkJXAorICAgICAgICB7CQkJCQkJXAorICAgICAgICAgIGQgPSBk MVtVKHRwWy0xXSldOyBMQVNUX1NISUZUOwkJXAorICAgICAgICAgIGNvbnRpbnVlOwkJCQkJXAor ICAgICAgICB9CQkJCQkJXAorICAgICAgc2tpcCA9IDE7CQkJCQkJXAorICAgIH0JCQkJCQkJXAor ICB3aGlsZSAodHJ1ZSkJCQkJCQlcCisgICAgewkJCQkJCQlcCisgICAgICBpZiAoVFJBTlModHBb LTJdKSA9PSBnYzIpCQkJCVwKKyAgICAgICAgewkJCQkJCVwKKyAgICAgICAgICBmb3IgKGkgPSAz OyBpIDw9IGQ7ICsraSkJCQlcCisgICAgICAgICAgICBpZiAoVFJBTlModHBbLWldKSAhPSBUUkFO UyhzcFstaV0pKQkJXAorICAgICAgICAgICAgICBicmVhazsJCQkJCVwKKyAgICAgICAgICBpZiAo aSA+IGQpCQkJCQlcCisgICAgICAgICAgICB7CQkJCQkJXAorICAgICAgICAgICAgICBmb3IgKGkg PSBkICsgc2tpcCArIDE7IGkgPD0gbGVuOyArK2kpCVwKKyAgICAgICAgICAgICAgICBpZiAoVFJB TlModHBbLWldKSAhPSBUUkFOUyhzcFstaV0pKQlcCisgICAgICAgICAgICAgICAgICBicmVhazsJ CQkJXAorICAgICAgICAgICAgICBpZiAoaSA+IGxlbikJCQkJXAorICAgICAgICAgICAgICAgIHJl dHVybiB0cCAtIGxlbiAtIHRleHQ7CQkJXAorICAgICAgICAgICAgfQkJCQkJCVwKKyAgICAgICAg ICBkID0ga3dzZXQtPnNoaWZ0W2kgLSAyXTsgdHAgKz0gZDsJCVwKKyAgICAgICAgICBpZiAodHAg PiBlcCkJCQkJCVwKKyAgICAgICAgICAgIGJyZWFrOwkJCQkJXAorICAgICAgICAgIGlmIChUUkFO Uyh0cFstMV0pICE9IGdjMSkJCQlcCisgICAgICAgICAgICB7CQkJCQkJXAorICAgICAgICAgICAg ICBkID0gZDFbVSh0cFstMV0pXTsgTEFTVF9TSElGVDsJCVwKKyAgICAgICAgICAgICAgYnJlYWs7 CQkJCQlcCisgICAgICAgICAgICB9CQkJCQkJXAorICAgICAgICAgIHNraXAgPSBpIC0gMTsJCQkJ CVwKKyAgICAgICAgfQkJCQkJCVwKKyAgICAgIGVsc2UJCQkJCQlcCisgICAgICAgIHsJCQkJCQlc CisgICAgICAgICAgZCA9IGt3c2V0LT5zaGlmdFswXTsgdHAgKz0gZDsJCQlcCisgICAgICAgICAg aWYgKHRwID4gZXApCQkJCQlcCisgICAgICAgICAgICBicmVhazsJCQkJCVwKKyAgICAgICAgICBp ZiAoVFJBTlModHBbLTFdKSAhPSBnYzEpCQkJXAorICAgICAgICAgICAgewkJCQkJCVwKKyAgICAg ICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV07IExBU1RfU0hJRlQ7CQlcCisgICAgICAgICAgICAg IGJyZWFrOwkJCQkJXAorICAgICAgICAgICAgfQkJCQkJCVwKKyAgICAgICAgICBza2lwID0gMTsJ CQkJCVwKKyAgICAgICAgfQkJCQkJCVwKICAgICB9CiAKLSAgKnRwcCA9IHRwOwotICByZXR1cm4g ZmFsc2U7Ci19Ci0KIC8qIFJldHVybiB0aGUgYWRkcmVzcyBvZiB0aGUgZmlyc3QgYnl0ZSBpbiB0 aGUgYnVmZmVyIFMgKG9mIHNpemUgTikKICAgIHRoYXQgbWF0Y2hlcyB0aGUgbGFzdCBieXRlIHNw ZWNpZmllZCBieSBLV1NFVCwgYSBzaW5nbGV0b24uICAqLwogc3RhdGljIGNoYXIgY29uc3QgKgpA QCAtNTkwLDcgKzYwNyw3IEBAIGJtZXhlYyAoa3dzZXRfdCBrd3NldCwgY2hhciBjb25zdCAqdGV4 dCwgc2l6ZV90IHNpemUpCiB7CiAgIHVuc2lnbmVkIGNoYXIgY29uc3QgKmQxOwogICBjaGFyIGNv bnN0ICplcCwgKnNwLCAqdHA7Ci0gIGludCBkOworICBpbnQgZCwgaSwgc2tpcDsKICAgaW50IGxl biA9IGt3c2V0LT5taW5kOwogICBjaGFyIGNvbnN0ICp0cmFucyA9IGt3c2V0LT50cmFuczsKIApA QCAtNjQ2LDggKzY2MywyMCBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRl eHQsIHNpemVfdCBzaXplKQogICAgICAgICAgICAgICAgICAgfQogICAgICAgICAgICAgICB9CiAg ICAgICAgICAgfQotICAgICAgICBpZiAoYm1fZGVsdGEyX3NlYXJjaCAoJnRwLCBlcCwgc3AsIGxl biwgdHJhbnMsIGdjMSwgZ2MyLCBkMSwga3dzZXQpKQotICAgICAgICAgIHJldHVybiB0cCAtIHRl eHQ7CisjdW5kZWYgTEFTVF9TSElGVAorI2RlZmluZSBMQVNUX1NISUZUIHRwICs9IGQKKyAgICAg ICAgaWYgKHRyYW5zKQorICAgICAgICAgIHsKKyN1bmRlZiBUUkFOUworI2RlZmluZSBUUkFOUyhj aCkgdHJhbnNbVShjaCldCisgICAgICAgICAgICBCTV9ERUxUQTJfU0VBUkNIOworICAgICAgICAg IH0KKyAgICAgICAgZWxzZQorICAgICAgICAgIHsKKyN1bmRlZiBUUkFOUworI2RlZmluZSBUUkFO UyhjaCkgY2gKKyAgICAgICAgICAgIEJNX0RFTFRBMl9TRUFSQ0g7CisgICAgICAgICAgfQogICAg ICAgYmlnX2FkdmFuY2U6OwogICAgICAgfQogCkBAIC02NjAsOCArNjg5LDIwIEBAIGJtZXhlYyAo a3dzZXRfdCBrd3NldCwgY2hhciBjb25zdCAqdGV4dCwgc2l6ZV90IHNpemUpCiAgICAgICBkID0g ZDFbVSgodHAgKz0gZClbLTFdKV07CiAgICAgICBpZiAoZCAhPSAwKQogICAgICAgICBjb250aW51 ZTsKLSAgICAgIGlmIChibV9kZWx0YTJfc2VhcmNoICgmdHAsIGVwLCBzcCwgbGVuLCB0cmFucywg Z2MxLCBnYzIsIE5VTEwsIGt3c2V0KSkKLSAgICAgICAgcmV0dXJuIHRwIC0gdGV4dDsKKyN1bmRl ZiBMQVNUX1NISUZUCisjZGVmaW5lIExBU1RfU0hJRlQKKyAgICAgIGlmICh0cmFucykKKyAgICAg ICAgeworI3VuZGVmIFRSQU5TCisjZGVmaW5lIFRSQU5TKGNoKSB0cmFuc1tVKGNoKV0KKyAgICAg ICAgICBCTV9ERUxUQTJfU0VBUkNIOworICAgICAgICB9CisgICAgICBlbHNlCisgICAgICAgIHsK KyN1bmRlZiBUUkFOUworI2RlZmluZSBUUkFOUyhjaCkgY2gKKyAgICAgICAgICBCTV9ERUxUQTJf U0VBUkNIOworICAgICAgICB9CiAgICAgfQogCiAgIHJldHVybiAtMTsKLS0gCjEuOS4yCgo= --------_535CB499000000005C94_MULTIPART_MIXED_-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 27 Apr 2014 20:34:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Norihiro Tanaka Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13986307964105 (code B ref 17229); Sun, 27 Apr 2014 20:34:02 +0000 Received: (at 17229) by debbugs.gnu.org; 27 Apr 2014 20:33:16 +0000 Received: from localhost ([127.0.0.1]:43287 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WeVlD-000144-Lk for submit@debbugs.gnu.org; Sun, 27 Apr 2014 16:33:16 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:35038) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WeVl8-00013c-N4 for 17229@debbugs.gnu.org; Sun, 27 Apr 2014 16:33:11 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id C3078A60002; Sun, 27 Apr 2014 13:33:04 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ijyT2EPraF3T; Sun, 27 Apr 2014 13:33:00 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 975C839E801F; Sun, 27 Apr 2014 13:33:00 -0700 (PDT) Message-ID: <535D6979.3060805@cs.ucla.edu> Date: Sun, 27 Apr 2014 13:32:57 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140426174357.94D5.27F6AC2D@kcn.ne.jp> <535C81BD.7070200@cs.ucla.edu> <20140427165308.5C46.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140427165308.5C46.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------090809030401010701010504" X-Spam-Score: -3.0 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) This is a multi-part message in MIME format. --------------090809030401010701010504 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Norihiro Tanaka wrote: > Could you try above cases? Thanks, you're observing a 2.7x performance speedup with macros on your platform and your benchmark. With the same patch, I observed only a 1.18x speedup on the same benchmark. As usual, I'm testing with AMD Phenom II X4 910e + GCC 4.9.0 + Fedora 20 + default (-O2) optimization. I'm curious about why you're observing a much bigger performance difference with macros. What platform are you using? Anyway, an 18% speedup is still a speedup, so I looked into it. GCC 4.9.0 misses a non-obvious opportunity for function inlining. I installed a tweak (attached) that should make the inlining opportunity obvious to compilers nowadays. On my platform this gave a 28% speedup, i.e., a bit better than the macro-using patch would have. --------------090809030401010701010504 Content-Type: text/plain; charset=UTF-8; name="0001-kwset-improve-performance-by-inlining-more.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="0001-kwset-improve-performance-by-inlining-more.patch" RnJvbSA2MTQ5N2ZiNWNjZGFkOTk3M2E3MWEwNGY3M2Y5ZDQyNTI2MDkzOTViIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTdW4sIDI3IEFwciAyMDE0IDEzOjAxOjE3IC0wNzAwClN1YmplY3Q6IFtQQVRD SF0ga3dzZXQ6IGltcHJvdmUgcGVyZm9ybWFuY2UgYnkgaW5saW5pbmcgbW9yZQoKUHJvYmxl bSByZXBvcnRlZCBieSBOb3JpaGlybyBUYW5ha2EgaW4gPGh0dHA6Ly9idWdzLmdudS5vcmcv MTcyMjkjNTU+LgoqIHNyYy9rd3NldC5jIChibWV4ZWNfdHJhbnMpOiBSZW5hbWUgZnJvbSBi bWV4ZWMsIGFuZCBtYWtlIGl0IGlubGluZS4KKGJtZXhlYyk6IE5ldyBpbXBsZW1lbnRhdGlv biwgd2hpY2ggY2FsbHMgYm1leGVjX3RyYW5zLiAgVGhpcyBoZWxwcwpHQ0MgaW5saW5lIG1v cmUgYWdncmVzc2l2ZWx5IHdpdGggdGhlIGRlZmF1bHQgb3B0aW1pemF0aW9uLCBhbmQKaW1w cm92ZXMgcGVyZm9ybWFuY2UgMjUlIHdpdGggdGhlIHJlcG9ydGVkIGJlbmNobWFyayBvbiBt eSBob3N0LgotLS0KIHNyYy9rd3NldC5jIHwgMTcgKysrKysrKysrKysrKystLS0KIDEgZmls ZSBjaGFuZ2VkLCAxNCBpbnNlcnRpb25zKCspLCAzIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdp dCBhL3NyYy9rd3NldC5jIGIvc3JjL2t3c2V0LmMKaW5kZXggODI3MGYwNS4uOGU5YjUxMCAx MDA2NDQKLS0tIGEvc3JjL2t3c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMKQEAgLTU4NCw5ICs1 ODQsOSBAQCBtZW1jaHJfa3dzZXQgKGNoYXIgY29uc3QgKnMsIHNpemVfdCBuLCBrd3NldF90 IGt3c2V0KQogICByZXR1cm4gbiA9PSAwID8gTlVMTCA6IG1lbWNocjIgKHMsIGt3c2V0LT5n YzEsIGt3c2V0LT5nYzFoZWxwLCBuKTsKIH0KIAotLyogRmFzdCBib3llci1tb29yZSBzZWFy Y2guICovCi1zdGF0aWMgc2l6ZV90IF9HTF9BVFRSSUJVVEVfUFVSRQotYm1leGVjIChrd3Nl dF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXplX3Qgc2l6ZSkKKy8qIEZhc3QgQm95 ZXItTW9vcmUgc2VhcmNoIChpbmxpbmFibGUgdmVyc2lvbikuICAqLworc3RhdGljIGlubGlu ZSBzaXplX3QgX0dMX0FUVFJJQlVURV9QVVJFCitibWV4ZWNfdHJhbnMgKGt3c2V0X3Qga3dz ZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVfdCBzaXplKQogewogICB1bnNpZ25lZCBjaGFy IGNvbnN0ICpkMTsKICAgY2hhciBjb25zdCAqZXAsICpzcCwgKnRwOwpAQCAtNjY3LDYgKzY2 NywxNyBAQCBibWV4ZWMgKGt3c2V0X3Qga3dzZXQsIGNoYXIgY29uc3QgKnRleHQsIHNpemVf dCBzaXplKQogICByZXR1cm4gLTE7CiB9CiAKKy8qIEZhc3QgQm95ZXItTW9vcmUgc2VhcmNo LiAgKi8KK3N0YXRpYyBzaXplX3QKK2JtZXhlYyAoa3dzZXRfdCBrd3NldCwgY2hhciBjb25z dCAqdGV4dCwgc2l6ZV90IHNpemUpCit7CisgIC8qIEhlbHAgdGhlIGNvbXBpbGVyIGlubGlu ZSBibWV4ZWNfdHJhbnMgaW4gdHdvIHdheXMsIGRlcGVuZGluZyBvbgorICAgICB3aGV0aGVy IGt3c2V0LT50cmFucyBpcyBudWxsLiAgKi8KKyAgcmV0dXJuIChrd3NldC0+dHJhbnMKKyAg ICAgICAgICA/IGJtZXhlY190cmFucyAoa3dzZXQsIHRleHQsIHNpemUpCisgICAgICAgICAg OiBibWV4ZWNfdHJhbnMgKGt3c2V0LCB0ZXh0LCBzaXplKSk7Cit9CisKIC8qIEhhaXJ5IG11 bHRpcGxlIHN0cmluZyBzZWFyY2guICovCiBzdGF0aWMgc2l6ZV90IF9HTF9BUkdfTk9OTlVM TCAoKDQpKQogY3dleGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0ZXh0LCBzaXpl X3QgbGVuLCBzdHJ1Y3Qga3dzbWF0Y2ggKmt3c21hdGNoKQotLSAKMS45LjAKCg== --------------090809030401010701010504-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Jim Meyering Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 28 Apr 2014 04:07:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Paul Eggert Cc: Norihiro Tanaka , 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139865796227848 (code B ref 17229); Mon, 28 Apr 2014 04:07:02 +0000 Received: (at 17229) by debbugs.gnu.org; 28 Apr 2014 04:06:02 +0000 Received: from localhost ([127.0.0.1]:43428 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WecpN-0007Ez-J0 for submit@debbugs.gnu.org; Mon, 28 Apr 2014 00:06:02 -0400 Received: from mail-yk0-f174.google.com ([209.85.160.174]:44281) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WecpK-0007Ek-Iv for 17229@debbugs.gnu.org; Mon, 28 Apr 2014 00:05:59 -0400 Received: by mail-yk0-f174.google.com with SMTP id 20so5277445yks.5 for <17229@debbugs.gnu.org>; Sun, 27 Apr 2014 21:05:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=CY2eYPxT8eMj9ABqurr5eZRouvoaYwtmt3MkDWwHfe0=; b=IpRQEOUsazroQTIqXzsXRHeh3sV2GjxKdgUBmaX+x5Kv5z960e0YzyEjWpLGL1wjRr /1AgxGZi7Nm0EQAfopZ/FeuUIkLK5re2l3RiZherr1ijGWrxDtIBpboL+BpJPbxa/3nr rMLpUvq+MoCPIFN1aeKT64TNDrVLtol/w9PJgYhmT+0nL/3gvKs4OHzaKPRkL1C8ZR/n mT4OzmxASKUnan5D0E0uz0PEJjObWZNovqpTMLzZy1Xdf5/vOCVHHY0gBlhBnRRi4+Jw qWvBwxeVtrcrUE5TR/fnLOpistqI0y5BDYlAEYNA8J2sX1/gecERu9BgAr0MA7NvSIYP Ur8Q== X-Received: by 10.236.220.72 with SMTP id n68mr357203yhp.102.1398657952789; Sun, 27 Apr 2014 21:05:52 -0700 (PDT) MIME-Version: 1.0 Received: by 10.170.127.18 with HTTP; Sun, 27 Apr 2014 21:05:32 -0700 (PDT) In-Reply-To: <535D6979.3060805@cs.ucla.edu> References: <20140426174357.94D5.27F6AC2D@kcn.ne.jp> <535C81BD.7070200@cs.ucla.edu> <20140427165308.5C46.27F6AC2D@kcn.ne.jp> <535D6979.3060805@cs.ucla.edu> From: Jim Meyering Date: Sun, 27 Apr 2014 21:05:32 -0700 X-Google-Sender-Auth: 9KCJLQfmWULjCdFCPUxDGhOEDtY Message-ID: Content-Type: text/plain; charset=ISO-8859-1 X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Sun, Apr 27, 2014 at 1:32 PM, Paul Eggert wrote: > Anyway, an 18% speedup is still a speedup, so I looked into it. GCC 4.9.0 > misses a non-obvious opportunity for function inlining. I installed a tweak > (attached) that should make the inlining opportunity obvious to compilers > nowadays. On my platform this gave a 28% speedup, i.e., a bit better than > the macro-using patch would have. Very nice. I measured a 34% improvement on an i7-3615QM with -O2 and gcc built from git (4.10.0 20140424) From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Norihiro Tanaka Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Mon, 28 Apr 2014 12:47:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Paul Eggert Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.13986891661119 (code B ref 17229); Mon, 28 Apr 2014 12:47:01 +0000 Received: (at 17229) by debbugs.gnu.org; 28 Apr 2014 12:46:06 +0000 Received: from localhost ([127.0.0.1]:43648 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wekwg-0000Hy-78 for submit@debbugs.gnu.org; Mon, 28 Apr 2014 08:46:06 -0400 Received: from mailgw05.kcn.ne.jp ([61.86.7.212]:61076) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wekwd-0000Di-Kv for 17229@debbugs.gnu.org; Mon, 28 Apr 2014 08:46:04 -0400 Received: from imp03 (mailgw7.kcn.ne.jp [61.86.15.238]) by mailgw05.kcn.ne.jp (Postfix) with ESMTP id AD86F67A16 for <17229@debbugs.gnu.org>; Mon, 28 Apr 2014 21:45:56 +0900 (JST) Received: from mail08.kcn.ne.jp ([61.86.6.187]) by imp03 with bizsmtp id vQlw1n00D426eXR01QlwH1; Mon, 28 Apr 2014 21:45:56 +0900 X-OrgRCPT: 17229@debbugs.gnu.org Received: from [10.120.1.47] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail08.kcn.ne.jp (Postfix) with ESMTPA id 35EAC12B8099; Mon, 28 Apr 2014 21:45:56 +0900 (JST) Date: Mon, 28 Apr 2014 21:45:55 +0900 From: Norihiro Tanaka In-Reply-To: <535D6979.3060805@cs.ucla.edu> References: <20140427165308.5C46.27F6AC2D@kcn.ne.jp> <535D6979.3060805@cs.ucla.edu> Message-Id: <20140428214553.7447.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_535E4A01000000007442_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --------_535E4A01000000007442_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Paul Eggert wrote: > Anyway, an 18% speedup is still a speedup, so I looked into it. > GCC 4.9.0 misses a non-obvious opportunity for function inlining. I > installed a tweak (attached) that should make the inlining opportunity > obvious to compilers nowadays. On my platform this gave a 28% speedup, > i.e., a bit better than the macro-using patch would have. You are right. My compiler was too old. It was GCC 4.1.2 on CentOS 5.10. I retried it with GCC 4.4.7, and got the good performance. # Although I tried to build GCC 4.9.0, it hasn't carried out well yet. By the way, I examined the reason why it was slow on GCC 4.1.2, and I found that tr() isn't inlining without `-finline-loops' option, because `-finline-small-functions' option can be used from GCC 4.3. Although I submit the patch, it mayn't be so important. Thanks, Norihiro --------_535E4A01000000007442_MULTIPART_MIXED_ Content-Type: text/plain; charset="UTF-8"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA0Y2UwNjdiNGQ3NWJjNjVjMjU0MzFhYTlkZWZiYjJhMDdmYzNkZjIzIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBNb24sIDI4IEFwciAyMDE0IDIxOjI4OjUxICswOTAwClN1YmplY3Q6IFtQQVRDSF0ga3dz ZXQ6IGltcHJvdmUgdGhlIHBlcmZvcm1hbmNlIGJ5IGlubGluaW5nIHRyCgoqIHNyYy9rd3NldC5j ICh0cik6IE1ha2UgaXQgaW5saW5lLgotLS0KIHNyYy9rd3NldC5jIHwgMiArLQogMSBmaWxlIGNo YW5nZWQsIDEgaW5zZXJ0aW9uKCspLCAxIGRlbGV0aW9uKC0pCgpkaWZmIC0tZ2l0IGEvc3JjL2t3 c2V0LmMgYi9zcmMva3dzZXQuYwppbmRleCA4ZTliNTEwLi42ZDIxODkzIDEwMDY0NAotLS0gYS9z cmMva3dzZXQuYworKysgYi9zcmMva3dzZXQuYwpAQCAtMTE0LDcgKzExNCw3IEBAIHN0cnVjdCBr d3NldAogfTsKIAogLyogVXNlIFRSQU5TIHRvIHRyYW5zbGl0ZXJhdGUgQy4gIEEgbnVsbCBUUkFO UyBkb2VzIG5vIHRyYW5zbGl0ZXJhdGlvbi4gICovCi1zdGF0aWMgY2hhcgorc3RhdGljIGlubGlu ZSBjaGFyCiB0ciAoY2hhciBjb25zdCAqdHJhbnMsIGNoYXIgYykKIHsKICAgcmV0dXJuIHRyYW5z ID8gdHJhbnNbVShjKV0gOiBjOwotLSAKMS45LjIKCg== --------_535E4A01000000007442_MULTIPART_MIXED_-- From unknown Tue Sep 09 00:44:17 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Wed, 30 Apr 2014 01:39:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17229 X-GNU-PR-Package: grep X-GNU-PR-Keywords: moreinfo patch To: Norihiro Tanaka Cc: 17229@debbugs.gnu.org Received: via spool by 17229-submit@debbugs.gnu.org id=B17229.139882190314820 (code B ref 17229); Wed, 30 Apr 2014 01:39:01 +0000 Received: (at 17229) by debbugs.gnu.org; 30 Apr 2014 01:38:23 +0000 Received: from localhost ([127.0.0.1]:45816 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfJTb-0003qy-4X for submit@debbugs.gnu.org; Tue, 29 Apr 2014 21:38:23 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:54514) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfJTY-0003qi-9s for 17229@debbugs.gnu.org; Tue, 29 Apr 2014 21:38:21 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 675E9A60038; Tue, 29 Apr 2014 18:38:14 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9+V49vcNx2Tf; Tue, 29 Apr 2014 18:38:05 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id C73E1A60001; Tue, 29 Apr 2014 18:38:05 -0700 (PDT) Message-ID: <536053FD.20404@cs.ucla.edu> Date: Tue, 29 Apr 2014 18:38:05 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 References: <20140427165308.5C46.27F6AC2D@kcn.ne.jp> <535D6979.3060805@cs.ucla.edu> <20140428214553.7447.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140428214553.7447.27F6AC2D@kcn.ne.jp> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) > Subject: [PATCH] kwset: improve the performance by inlining tr Thanks. Normally we don't add 'inline' to functions, under the theory that compilers inline well-enough nowadays that we'd rather trust their judgment, but I suppose CentOS 5 is still relevant-enough and the performance effects great enough that we can make an exception here. Plus, we're already using 'inline' elsewhere in this module for performance reasons, and it's unlikely that a non-inlined 'tr' would perform better on any current platform. So, I installed that patch. From unknown Tue Sep 09 00:44:17 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.503 (Entity 5.503) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Norihiro Tanaka Subject: bug#17229: closed (bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching) Message-ID: References: <20140501081053.5D0A.27F6AC2D@kcn.ne.jp> <20140409225442.7863.27F6AC2D@kcn.ne.jp> X-Gnu-PR-Message: they-closed 17229 X-Gnu-PR-Package: grep X-Gnu-PR-Keywords: moreinfo patch Reply-To: 17229@debbugs.gnu.org Date: Wed, 30 Apr 2014 23:12:03 +0000 Content-Type: multipart/mixed; boundary="----------=_1398899523-11829-1" This is a multi-part message in MIME format... ------------=_1398899523-11829-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searchi= ng which was filed against the grep package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 17229@debbugs.gnu.org. --=20 17229: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D17229 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1398899523-11829-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 17229-done) by debbugs.gnu.org; 30 Apr 2014 23:11:05 +0000 Received: from localhost ([127.0.0.1]:46891 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wfdea-000331-Ny for submit@debbugs.gnu.org; Wed, 30 Apr 2014 19:11:05 -0400 Received: from mailgw04.kcn.ne.jp ([61.86.7.211]:43610) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WfdeX-00032H-Ln for 17229-done@debbugs.gnu.org; Wed, 30 Apr 2014 19:11:02 -0400 Received: from imp01 (mailgw5.kcn.ne.jp [61.86.15.231]) by mailgw04.kcn.ne.jp (Postfix) with ESMTP id 508696C1E83 for <17229-done@debbugs.gnu.org>; Thu, 1 May 2014 08:10:54 +0900 (JST) Received: from mail06.kcn.ne.jp ([61.86.6.185]) by imp01 with bizsmtp id wPAu1n0083zXHqt01PAuay; Thu, 01 May 2014 08:10:54 +0900 X-OrgRCPT: 17229-done@debbugs.gnu.org Received: from [10.120.1.58] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail06.kcn.ne.jp (Postfix) with ESMTPA id 0EC8C1BF0094; Thu, 1 May 2014 08:10:54 +0900 (JST) Date: Thu, 01 May 2014 08:10:53 +0900 From: Norihiro Tanaka To: Paul Eggert Subject: bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching In-Reply-To: <536053FD.20404@cs.ucla.edu> References: <20140428214553.7447.27F6AC2D@kcn.ne.jp> <536053FD.20404@cs.ucla.edu> Message-Id: <20140501081053.5D0A.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17229-done Cc: 17229-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Thanks. Closing, all requests suggested in this bug was considered. ------------=_1398899523-11829-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 9 Apr 2014 13:55:30 +0000 Received: from localhost ([127.0.0.1]:38696 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WXsyO-0002hI-P6 for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:29 -0400 Received: from eggs.gnu.org ([208.118.235.92]:45960) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WXsyH-0002ge-K1 for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WXsy4-0006Rt-Vr for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:16 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:46956) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WXsy4-0006Rp-Sf for submit@debbugs.gnu.org; Wed, 09 Apr 2014 09:55:08 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47632) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WXsxy-0000by-Jp for bug-grep@gnu.org; Wed, 09 Apr 2014 09:55:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WXsxr-00067l-Ou for bug-grep@gnu.org; Wed, 09 Apr 2014 09:55:02 -0400 Received: from pbsg500.nifty.com ([202.248.238.70]:34749) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WXsxr-00066W-9F for bug-grep@gnu.org; Wed, 09 Apr 2014 09:54:55 -0400 Received: from [10.120.1.47] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) (authenticated) by pbsg500.nifty.com with ESMTP id s39Dsfv5017066 for ; Wed, 9 Apr 2014 22:54:42 +0900 X-Nifty-SrcIP: [118.21.128.66] Date: Wed, 09 Apr 2014 22:54:43 +0900 From: Norihiro Tanaka To: bug-grep@gnu.org Subject: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching Message-Id: <20140409225442.7863.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_5343FA290000000010DE_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) --------_5343FA290000000010DE_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit memchr() of glibc is faster than seeking by delta1 on some platforms. So, when there is no chance to match for a while, use it on them. Speed-up about 3x in best case, 10% in normal case by this patch. It's still available only for x86 and x86-64 platform. Norihiro --------_5343FA290000000010DE_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA5MGFjMjU5NzUxMWRkOGVkNDRlZWU2MThkZjM5YzFlZjk0N2VhMTI5IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBUaHUsIDMgQXByIDIwMTQgMjI6NTc6MDkgKzA5MDAKU3ViamVjdDogW1BBVENIIDIvMl0g Z3JlcDogc3BlZWQtdXAgYnkgdXNpbmcgbWVtY2hyKCkgaW4gQm95ZXItTW9vcmUgc2VhcmNoaW5n CgptZW1jaHIoKSBvZiBnbGliYyBpcyBmYXN0ZXIgdGhhbiBzZWVraW5nIGJ5IGRlbHRhMSBvbiBz b21lIHBsYXRmb3Jtcy4KV2hlbiB0aGVyZSBpcyBubyBjaGFuY2UgdG8gbWF0Y2ggZm9yIGEgd2hp bGUsIHVzZSBpdCBvbiB0aGVtLgoqIHNyYy9rd3NldC5jIChibWV4ZWMpOiBVc2UgbWVtY2hyKCkg aW4gQm95ZXItTW9vcmUgc2VhcmNoaW5nLgotLS0KIHNyYy9rd3NldC5jIHwgMjMgKysrKysrKysr KysrKysrKysrKysrLS0KIDEgZmlsZSBjaGFuZ2VkLCAyMSBpbnNlcnRpb25zKCspLCAyIGRlbGV0 aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9rd3NldC5jIGIvc3JjL2t3c2V0LmMKaW5kZXggOGQ4 YWEwNy4uYzk1YmM1NiAxMDA2NDQKLS0tIGEvc3JjL2t3c2V0LmMKKysrIGIvc3JjL2t3c2V0LmMK QEAgLTUzMSw4ICs1MzEsMjcgQEAgYm1leGVjIChrd3NldF90IGt3c2V0LCBjaGFyIGNvbnN0ICp0 ZXh0LCBzaXplX3Qgc2l6ZSkKICAgICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBk OwogICAgICAgICAgICAgaWYgKGQgPT0gMCkKICAgICAgICAgICAgICAgZ290byBmb3VuZDsKLSAg ICAgICAgICAgIGQgPSBkMVtVKHRwWy0xXSldLCB0cCArPSBkOwotICAgICAgICAgICAgZCA9IGQx W1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICAvKiBtZW1jaGFyKCkgb2YgZ2xpYmMg aXMgZmFzdGVyIHRoYW4gc2Vla2luZyBieSBkZWx0YTEgb24KKyAgICAgICAgICAgICAgIHNvbWUg cGxhdGZvcm1zLiAgV2hlbiB0aGVyZSBpcyBubyBjaGFuY2UgdG8gbWF0Y2ggZm9yIGEKKyAgICAg ICAgICAgICAgIHdoaWxlLCB1c2UgaXQgb24gdGhlbS4gICovCisjaWYgZGVmaW5lZChfX0dMSUJD X18pICYmIChkZWZpbmVkKF9faTM4Nl9fKSB8fCBkZWZpbmVkKF9feDg2XzY0X18pKQorICAgICAg ICAgICAgaWYgKCF0cmFucykKKyAgICAgICAgICAgICAgeworICAgICAgICAgICAgICAgIHRwID0g bWVtY2hyICh0cCAtIDEsIGdjMSwgc2l6ZSArIHRleHQgLSB0cCArIDEpOworICAgICAgICAgICAg ICAgIGlmICh0cCkKKyAgICAgICAgICAgICAgICAgIHsKKyAgICAgICAgICAgICAgICAgICAgKyt0 cDsKKyAgICAgICAgICAgICAgICAgICAgZ290byBmb3VuZDsKKyAgICAgICAgICAgICAgICAgIH0K KyAgICAgICAgICAgICAgICBlbHNlCisgICAgICAgICAgICAgICAgICByZXR1cm4gLTE7CisgICAg ICAgICAgICAgIH0KKyAgICAgICAgICAgIGVsc2UKKyNlbmRpZgorICAgICAgICAgICAgICB7Cisg ICAgICAgICAgICAgICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICAg ICAgZCA9IGQxW1UodHBbLTFdKV0sIHRwICs9IGQ7CisgICAgICAgICAgICAgIH0KICAgICAgICAg ICB9CiAgICAgICAgIGJyZWFrOwogICAgICAgZm91bmQ6Ci0tIAoxLjkuMQoK --------_5343FA290000000010DE_MULTIPART_MIXED_-- ------------=_1398899523-11829-1--