From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 07:40:30 2014 Received: (at submit) by debbugs.gnu.org; 5 Jun 2014 11:40:30 +0000 Received: from localhost ([127.0.0.1]:44821 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsW21-0000ye-Tx for submit@debbugs.gnu.org; Thu, 05 Jun 2014 07:40:30 -0400 Received: from eggs.gnu.org ([208.118.235.92]:37884) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsW1z-0000yO-Gq for submit@debbugs.gnu.org; Thu, 05 Jun 2014 07:40:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WsW1k-0004ym-Os for submit@debbugs.gnu.org; Thu, 05 Jun 2014 07:40:22 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:43419) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WsW1k-0004xV-LX for submit@debbugs.gnu.org; Thu, 05 Jun 2014 07:40:12 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:34701) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WsVuk-0007Mn-5b for bug-grep@gnu.org; Thu, 05 Jun 2014 07:33:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WsVuc-00023h-Gq for bug-grep@gnu.org; Thu, 05 Jun 2014 07:32:58 -0400 Received: from mailgw04.kcn.ne.jp ([61.86.7.211]:47429) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WsVub-00021w-W6 for bug-grep@gnu.org; Thu, 05 Jun 2014 07:32:50 -0400 Received: from imp01 (mailgw5.kcn.ne.jp [61.86.15.231]) by mailgw04.kcn.ne.jp (Postfix) with ESMTP id DF76F6C17DC for ; Thu, 5 Jun 2014 20:32:44 +0900 (JST) Received: from mail05.kcn.ne.jp ([61.86.6.184]) by imp01 with bizsmtp id AbYk1o01F3yDdWd01bYk1L; Thu, 05 Jun 2014 20:32:44 +0900 X-OrgRCPT: bug-grep@gnu.org Received: from [10.120.1.47] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail05.kcn.ne.jp (Postfix) with ESMTPA id 92F967D009F for ; Thu, 5 Jun 2014 20:32:44 +0900 (JST) Date: Thu, 05 Jun 2014 20:32:44 +0900 From: Norihiro Tanaka To: bug-grep@gnu.org Subject: [PATCH] dfa: speed-up for a pattern that many atoms are catenated Message-Id: <20140605203227.5BEF.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------_5390548D000000005BE4_MULTIPART_MIXED_" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) --------_5390548D000000005BE4_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit When many atoms are catenated, dfamust() is very slow in order that pushing a string into `in' list is slow. This change fixes it. I tested below to confirm the effect. $ printf '%08192d\n' 0 | time -p src/grep -f - /dev/null --------_5390548D000000005BE4_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII"; name="0001-dfa-speed-up-for-a-pattern-that-many-atoms-are-caten.patch" Content-Disposition: attachment; filename="0001-dfa-speed-up-for-a-pattern-that-many-atoms-are-caten.patch" Content-Transfer-Encoding: base64 RnJvbSBkN2ZmNDUzYTJhYWM1NTdmYjEzMzEwMGI5NGUyZmM3YzAwN2ZiZTI2IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBUdWUsIDIwIE1heSAyMDE0IDEwOjE4OjQyICswOTAwClN1YmplY3Q6IFtQQVRDSF0gZGZh OiBzcGVlZC11cCBmb3IgYSBwYXR0ZXJuIHRoYXQgbWFueSBhdG9tcyBhcmUgY2F0ZW5hdGVkCgpX aGVuIG1hbnkgYXRvbXMgYXJlIGNhdGVuYXRlZCwgZGZhbXVzdCgpIGlzIHZlcnkgc2xvdyBpbiBv cmRlciB0aGF0CnB1c2hpbmcgYSBzdHJpbmcgaW50byBgaW4nIGxpc3QgaXMgc2xvdy4gIFRoaXMg Y2hhbmdlIGZpeGVzIGl0LgoKKiBzcmMvZGZhLmMgKGlzdHJzdHIpOiBVc2UgbWVtY2hyKCkgaW4g b3JkZXIgdG8gZmluZCBhIHBvdGVudGlhbCBtYXRjaC4KKGVubGlzdCk6IElmIHRoZXJlIGlzIGFs cmVhZHkgc29tZXRoaW5nIGluIHRoZSBsaXN0LCBkb24ndCBhbGxvY2F0ZQptZW1vcnkuCihkZmFt dXN0KTogSWYgZG9uJ3QgaGF2ZSB0byBjaGVjayBpbiB0aGUgbGlzdCwgZW5saXN0IGEgbmV3IGVu dHJ5CmRpcmVjdGx5LgotLS0KIHNyYy9kZmEuYyB8IDU3ICsrKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrKysrKystLS0tLS0tLS0tLS0tLS0tLS0tLQogMSBmaWxlIGNoYW5nZWQsIDM3IGlu c2VydGlvbnMoKyksIDIwIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9kZmEuYyBiL3Ny Yy9kZmEuYwppbmRleCA0OGE4M2NkLi45ZjZkNTg4IDEwMDY0NAotLS0gYS9zcmMvZGZhLmMKKysr IGIvc3JjL2RmYS5jCkBAIC0zNzUzLDE1ICszNzUzLDI0IEBAIGljYXRhbGxvYyAoY2hhciAqb2xk LCBjaGFyIGNvbnN0ICpuZXcpCiB9CiAKIHN0YXRpYyBjaGFyICpfR0xfQVRUUklCVVRFX1BVUkUK LWlzdHJzdHIgKGNoYXIgY29uc3QgKmxvb2tpbiwgY2hhciBjb25zdCAqbG9va2ZvcikKK2lzdHJz dHIgKGNoYXIgY29uc3QgKmxvb2tpbiwgY2hhciBjb25zdCAqbG9va2Zvciwgc2l6ZV90IGNvbnN0 IGxlbmZvcikKIHsKICAgY2hhciBjb25zdCAqY3A7Ci0gIHNpemVfdCBsZW47CisgIHNpemVfdCBs ZW5pbjsKIAotICBsZW4gPSBzdHJsZW4gKGxvb2tmb3IpOwotICBmb3IgKGNwID0gbG9va2luOyAq Y3AgIT0gJ1wwJzsgKytjcCkKLSAgICBpZiAoc3RybmNtcCAoY3AsIGxvb2tmb3IsIGxlbikgPT0g MCkKLSAgICAgIHJldHVybiAoY2hhciAqKSBjcDsKKyAgbGVuaW4gPSBzdHJsZW4gKGxvb2tpbik7 CisgIGlmIChsZW5mb3IgPT0gMSkKKyAgICByZXR1cm4gbWVtY2hyIChsb29raW4sICpsb29rZm9y LCBsZW5pbik7CisKKyAgY3AgPSBsb29raW47CisgIGZvciAoY3AgPSBsb29raW47IGNwIDwgbG9v a2luICsgbGVuaW4gLSBsZW5mb3I7ICsrY3ApCisgICAgeworICAgICAgbWVtY2hyIChjcCwgKmxv b2tmb3IsIGxlbmluIC0gKGNwIC0gbG9va2luKSk7CisgICAgICBpZiAoIWNwKQorICAgICAgICBy ZXR1cm4gTlVMTDsKKyAgICAgIGlmIChzdHJuY21wIChjcCArIDEsIGxvb2tmb3IgKyAxLCBsZW5m b3IgLSAxKSA9PSAwKQorICAgICAgICByZXR1cm4gKGNoYXIgKikgY3A7CisgICAgfQogICByZXR1 cm4gTlVMTDsKIH0KIApAQCAtMzc3NiwxOSArMzc4NSwxNiBAQCBzdGF0aWMgY2hhciAqKgogZW5s aXN0IChjaGFyICoqY3BwLCBjaGFyICpuZXcsIHNpemVfdCBsZW4pCiB7CiAgIHNpemVfdCBpLCBq OwotICBuZXcgPSBtZW1jcHkgKHhtYWxsb2MgKGxlbiArIDEpLCBuZXcsIGxlbik7Ci0gIG5ld1ts ZW5dID0gJ1wwJzsKICAgLyogSXMgdGhlcmUgYWxyZWFkeSBzb21ldGhpbmcgaW4gdGhlIGxpc3Qg dGhhdCdzIG5ldyAob3IgbG9uZ2VyKT8gICovCiAgIGZvciAoaSA9IDA7IGNwcFtpXSAhPSBOVUxM OyArK2kpCi0gICAgaWYgKGlzdHJzdHIgKGNwcFtpXSwgbmV3KSAhPSBOVUxMKQotICAgICAgewot ICAgICAgICBmcmVlIChuZXcpOwotICAgICAgICByZXR1cm4gY3BwOwotICAgICAgfQorICAgIGlm IChpc3Ryc3RyIChjcHBbaV0sIG5ldywgbGVuKSAhPSBOVUxMKQorICAgICAgcmV0dXJuIGNwcDsK KyAgbmV3ID0gbWVtY3B5ICh4bWFsbG9jIChsZW4gKyAxKSwgbmV3LCBsZW4pOworICBuZXdbbGVu XSA9ICdcMCc7CiAgIC8qIEVsaW1pbmF0ZSBhbnkgb2Jzb2xldGVkIHN0cmluZ3MuICAqLwogICBq ID0gMDsKICAgd2hpbGUgKGNwcFtqXSAhPSBOVUxMKQotICAgIGlmIChpc3Ryc3RyIChuZXcsIGNw cFtqXSkgPT0gTlVMTCkKKyAgICBpZiAoaXN0cnN0ciAobmV3LCBjcHBbal0sIHN0cmxlbiAoY3Bw W2pdKSkgPT0gTlVMTCkKICAgICAgICsrajsKICAgICBlbHNlCiAgICAgICB7CkBAIC00MDIzLDE3 ICs0MDI5LDI4IEBAIGRmYW11c3QgKHN0cnVjdCBkZmEgKmQpCiAgICAgICAgICAgICAvKiBJbi4g IEV2ZXJ5dGhpbmcgaW4gbGVmdCwgcGx1cyBldmVyeXRoaW5nIGluCiAgICAgICAgICAgICAgICBy aWdodCwgcGx1cyBjb25jYXRlbmF0aW9uIG9mCiAgICAgICAgICAgICAgICBsZWZ0J3MgcmlnaHQg YW5kIHJpZ2h0J3MgbGVmdC4gICovCi0gICAgICAgICAgICBsbXAtPmluID0gYWRkbGlzdHMgKGxt cC0+aW4sIHJtcC0+aW4pOwogICAgICAgICAgICAgaWYgKGxtcC0+cmlnaHRbMF0gIT0gJ1wwJyAm JiBybXAtPmxlZnRbMF0gIT0gJ1wwJykKICAgICAgICAgICAgICAgewogICAgICAgICAgICAgICAg IHNpemVfdCBscmxlbiA9IHN0cmxlbiAobG1wLT5yaWdodCk7CiAgICAgICAgICAgICAgICAgc2l6 ZV90IHJsbGVuID0gc3RybGVuIChybXAtPmxlZnQpOwotICAgICAgICAgICAgICAgIGNoYXIgKnRw ID0geG1hbGxvYyAobHJsZW4gKyBybGxlbik7Ci0gICAgICAgICAgICAgICAgbWVtY3B5ICh0cCwg bG1wLT5yaWdodCwgbHJsZW4pOwotICAgICAgICAgICAgICAgIG1lbWNweSAodHAgKyBscmxlbiwg cm1wLT5sZWZ0LCBybGxlbik7Ci0gICAgICAgICAgICAgICAgbG1wLT5pbiA9IGVubGlzdCAobG1w LT5pbiwgdHAsIGxybGVuICsgcmxsZW4pOwotICAgICAgICAgICAgICAgIGZyZWUgKHRwKTsKKyAg ICAgICAgICAgICAgICBpZiAoIWxtcC0+aW5bMV0gJiYgIXJtcC0+aW5bMV0pCisgICAgICAgICAg ICAgICAgICB7CisgICAgICAgICAgICAgICAgICAgIGxtcC0+aW5bMF0gPSB4bnJlYWxsb2MgKGxt cC0+aW5bMF0sIGxybGVuICsgcmxsZW4gKyAxLAorICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICBzaXplb2YgKmxtcC0+aW5bMF0pOworICAgICAgICAgICAgICAgICAg ICBtZW1jcHkgKGxtcC0+aW5bMF0gKyBscmxlbiwgcm1wLT5pblswXSwgcmxsZW4gKyAxKTsKKyAg ICAgICAgICAgICAgICAgIH0KKyAgICAgICAgICAgICAgICBlbHNlCisgICAgICAgICAgICAgICAg ICB7CisgICAgICAgICAgICAgICAgICAgIGNoYXIgKnRwID0geG1hbGxvYyAobHJsZW4gKyBybGxl bik7CisgICAgICAgICAgICAgICAgICAgIG1lbWNweSAodHAsIGxtcC0+cmlnaHQsIGxybGVuKTsK KyAgICAgICAgICAgICAgICAgICAgbWVtY3B5ICh0cCArIGxybGVuLCBybXAtPmxlZnQsIHJsbGVu KTsKKyAgICAgICAgICAgICAgICAgICAgbG1wLT5pbiA9IGFkZGxpc3RzIChsbXAtPmluLCBybXAt PmluKTsKKyAgICAgICAgICAgICAgICAgICAgbG1wLT5pbiA9IGVubGlzdCAobG1wLT5pbiwgdHAs IGxybGVuICsgcmxsZW4pOworICAgICAgICAgICAgICAgICAgICBmcmVlICh0cCk7CisgICAgICAg ICAgICAgICAgICB9CiAgICAgICAgICAgICAgIH0KKyAgICAgICAgICAgIGVsc2UKKyAgICAgICAg ICAgICAgbG1wLT5pbiA9IGFkZGxpc3RzIChsbXAtPmluLCBybXAtPmluKTsKICAgICAgICAgICAg IC8qIExlZnQtaGFuZCAqLwogICAgICAgICAgICAgaWYgKGxtcC0+aXNbMF0gIT0gJ1wwJykKICAg ICAgICAgICAgICAgbG1wLT5sZWZ0ID0gaWNhdGFsbG9jIChsbXAtPmxlZnQsIHJtcC0+bGVmdCk7 Ci0tIAoyLjAuMAoK --------_5390548D000000005BE4_MULTIPART_MIXED_-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 12:49:09 2014 Received: (at 17700) by debbugs.gnu.org; 5 Jun 2014 16:49:09 +0000 Received: from localhost ([127.0.0.1]:45987 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsaqe-0004gv-HH for submit@debbugs.gnu.org; Thu, 05 Jun 2014 12:49:09 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:37925) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsaqW-0004g3-TW for 17700@debbugs.gnu.org; Thu, 05 Jun 2014 12:49:02 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id BE97D39E8013; Thu, 5 Jun 2014 09:48:50 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id yOCljYoOt-32; Thu, 5 Jun 2014 09:48:42 -0700 (PDT) Received: from penguin.cs.ucla.edu (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 2AC6139E8008; Thu, 5 Jun 2014 09:48:42 -0700 (PDT) Message-ID: <53909F69.8070601@cs.ucla.edu> Date: Thu, 05 Jun 2014 09:48:41 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: Norihiro Tanaka , 17700@debbugs.gnu.org Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated References: <20140605203227.5BEF.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140605203227.5BEF.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------080703040509050408010305" X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 17700 Cc: Aharon Robbins X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) This is a multi-part message in MIME format. --------------080703040509050408010305 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 06/05/2014 04:32 AM, Norihiro Tanaka wrote: > + memchr (cp, *lookfor, lenin - (cp - lookin)); > + if (!cp) Thanks, but this part can't be right, as memchr's result is discarded. It seems to me that much of the performance benefit comes from using a faster implementation of strstr, and that the DFA code will be better off if it simply uses the system strstr rather than rolling its own. (The DFA code dates back before strstr was standardized, which is why it has its own implementation.) I installed the attached patch to do that and got a big speedup: $ printf '%08192d\n' 0 | time -p src/old/grep -f - /dev/null real 16.14 user 16.13 sys 0.00 $ printf '%08192d\n' 0 | time -p src/grep -f - /dev/null real 0.79 user 0.79 sys 0.00 Could you please look at the remaining part of your patch and see whether it's a win if it's merged to what's now installed? Thanks. PS. Aharon, I assume this'll affect Gawk, in that you'll need to provide a strstr if you want to be portable to ancient systems that lack it. strstr was standardized in C89 so it'd have to be a pretty ancient system, and it may be better just to let this slide. --------------080703040509050408010305 Content-Type: text/x-patch; name="diff.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="diff.patch" diff --git a/NEWS b/NEWS index fdad83c..1af3def 100644 --- a/NEWS +++ b/NEWS @@ -2,6 +2,10 @@ GNU grep NEWS -*- outline -*- * Noteworthy changes in release ?.? (????-??-??) [?] +** Improvements + + Performance has improved for very long strings in patterns. + * Noteworthy changes in release 2.20 (2014-06-03) [stable] diff --git a/bootstrap.conf b/bootstrap.conf index 9f3457d..1ab730d 100644 --- a/bootstrap.conf +++ b/bootstrap.conf @@ -77,6 +77,7 @@ stdlib stpcpy strerror string +strstr strtoull strtoumax sys_stat diff --git a/src/dfa.c b/src/dfa.c index 48a83cd..522a027 100644 --- a/src/dfa.c +++ b/src/dfa.c @@ -3752,19 +3752,6 @@ icatalloc (char *old, char const *new) return result; } -static char *_GL_ATTRIBUTE_PURE -istrstr (char const *lookin, char const *lookfor) -{ - char const *cp; - size_t len; - - len = strlen (lookfor); - for (cp = lookin; *cp != '\0'; ++cp) - if (strncmp (cp, lookfor, len) == 0) - return (char *) cp; - return NULL; -} - static void freelist (char **cpp) { @@ -3780,7 +3767,7 @@ enlist (char **cpp, char *new, size_t len) new[len] = '\0'; /* Is there already something in the list that's new (or longer)? */ for (i = 0; cpp[i] != NULL; ++i) - if (istrstr (cpp[i], new) != NULL) + if (strstr (cpp[i], new) != NULL) { free (new); return cpp; @@ -3788,7 +3775,7 @@ enlist (char **cpp, char *new, size_t len) /* Eliminate any obsoleted strings. */ j = 0; while (cpp[j] != NULL) - if (istrstr (new, cpp[j]) == NULL) + if (strstr (new, cpp[j]) == NULL) ++j; else { --------------080703040509050408010305-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 18:49:34 2014 Received: (at 17700) by debbugs.gnu.org; 5 Jun 2014 22:49:34 +0000 Received: from localhost ([127.0.0.1]:46237 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsgTV-0000Ub-5B for submit@debbugs.gnu.org; Thu, 05 Jun 2014 18:49:33 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:35202) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsgTS-0000UE-4K for 17700@debbugs.gnu.org; Thu, 05 Jun 2014 18:49:31 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id 01CC17FF86 for <17700@debbugs.gnu.org>; Fri, 6 Jun 2014 07:49:23 +0900 (JST) Received: from mail05.kcn.ne.jp ([61.86.6.184]) by imp02 with bizsmtp id AmpN1o00T3yDdWd01mpNHJ; Fri, 06 Jun 2014 07:49:22 +0900 X-OrgRCPT: 17700@debbugs.gnu.org Received: from [10.120.1.33] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail05.kcn.ne.jp (Postfix) with ESMTPA id B07AA7D009F; Fri, 6 Jun 2014 07:49:22 +0900 (JST) Date: Fri, 06 Jun 2014 07:49:22 +0900 From: Norihiro Tanaka To: Paul Eggert Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated In-Reply-To: <53909F69.8070601@cs.ucla.edu> References: <20140605203227.5BEF.27F6AC2D@kcn.ne.jp> <53909F69.8070601@cs.ucla.edu> Message-Id: <20140606074922.3D69.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17700 Cc: Aharon Robbins , 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Thanks. Paul Eggert wrote: > It seems to me that much of the performance benefit comes from using a > faster implementation of strstr, and that the DFA code will be better > off if it simply uses the system strstr rather than rolling its own. `lookfor' isn't terminated by `\-'\0' in istrstr(), but strstr() requires it. So I wasn't replace it to strstr(); after `grep: undo part of previous change', no longer faster. $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null master : real 5.74 user 5.40 sys 0.17 my patch: real 0.08 user 0.04 sys 0.04 From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 20:49:46 2014 Received: (at 17700) by debbugs.gnu.org; 6 Jun 2014 00:49:46 +0000 Received: from localhost ([127.0.0.1]:46286 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsiLp-0004Yd-P0 for submit@debbugs.gnu.org; Thu, 05 Jun 2014 20:49:46 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:34364) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsiLl-0004YI-Qr for 17700@debbugs.gnu.org; Thu, 05 Jun 2014 20:49:42 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 89AEEA6000E; Thu, 5 Jun 2014 17:49:35 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CLv2+q-cqK6O; Thu, 5 Jun 2014 17:49:27 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id E5C19A60001; Thu, 5 Jun 2014 17:49:26 -0700 (PDT) Message-ID: <53911013.7070604@cs.ucla.edu> Date: Thu, 05 Jun 2014 17:49:23 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: Norihiro Tanaka Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated References: <20140605203227.5BEF.27F6AC2D@kcn.ne.jp> <53909F69.8070601@cs.ucla.edu> <20140606074922.3D69.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140606074922.3D69.27F6AC2D@kcn.ne.jp> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 17700 Cc: Aharon Robbins , 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) Norihiro Tanaka wrote: > after `grep: undo part of previous change', no longer faster. > > $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null > > master : real 5.74 user 5.40 sys 0.17 > my patch: real 0.08 user 0.04 sys 0.04 This isn't matching the results I get on my platform. If src/d5dfa69/grep is the old version, src/709e7e5/grep the current master (i.e., just use system strstr), and src/grep is the old version with your patch, I get: $ printf '%02048d\n' 0 | time -p src/d5dfa69/grep -f - /dev/null real 0.33 user 0.33 sys 0.00 $ printf '%02048d\n' 0 | time -p src/709e7e5/grep -f - /dev/null real 0.04 user 0.04 sys 0.00 $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null real 0.03 user 0.03 sys 0.00 So it looks like your patch confers some advantage, but on my platform almost all the speedup is achieved simply by switching to the system strstr. > `lookfor' isn't terminated by `\-'\0' in istrstr(), but strstr() requires it. Yes, that's why I needed to make the most recent change in the current master; it arranges for strstr's 2nd arg to be null-terminated. From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 02:43:09 2014 Received: (at 17700) by debbugs.gnu.org; 6 Jun 2014 06:43:09 +0000 Received: from localhost ([127.0.0.1]:46428 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsnro-0002Nr-4N for submit@debbugs.gnu.org; Fri, 06 Jun 2014 02:43:08 -0400 Received: from frenzy.freefriends.org ([66.54.153.139]:44136 helo=freefriends.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsnrj-0002Nb-Pd for 17700@debbugs.gnu.org; Fri, 06 Jun 2014 02:43:04 -0400 X-Envelope-From: arnold@skeeve.com Received: from freefriends.org (localhost [127.0.0.1]) by freefriends.org (8.14.8/8.14.8) with ESMTP id s566gteC028369; Fri, 6 Jun 2014 00:42:55 -0600 Received: (from arnold@localhost) by freefriends.org (8.14.8/8.14.8/submit) id s566gkFl028365; Fri, 6 Jun 2014 06:42:46 GMT From: arnold@skeeve.com Message-Id: <201406060642.s566gkFl028365@freefriends.org> X-Authentication-Warning: frenzy.freefriends.org: arnold set sender to arnold@skeeve.com using -f Date: Fri, 06 Jun 2014 00:42:45 -0600 To: noritnk@kcn.ne.jp, eggert@cs.ucla.edu Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated References: <20140605203227.5BEF.27F6AC2D@kcn.ne.jp> <53909F69.8070601@cs.ucla.edu> <20140606074922.3D69.27F6AC2D@kcn.ne.jp> <53911013.7070604@cs.ucla.edu> In-Reply-To: <53911013.7070604@cs.ucla.edu> User-Agent: Heirloom mailx 12.4 7/29/08 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 17700 Cc: 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) Is strstr() even a good idea? dfa needs to be able to match NUL bytes in the data. If this prevents that, then it's a problem. I'm merely asking - I didn't look hard at where the change was made. If it matching NUL bytes isn't affected then, no problem. Thanks, Arnold Paul Eggert wrote: > Norihiro Tanaka wrote: > > after `grep: undo part of previous change', no longer faster. > > > > $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null > > > > master : real 5.74 user 5.40 sys 0.17 > > my patch: real 0.08 user 0.04 sys 0.04 > > This isn't matching the results I get on my platform. If > src/d5dfa69/grep is the old version, src/709e7e5/grep the current master > (i.e., just use system strstr), and src/grep is the old version with > your patch, I get: > > $ printf '%02048d\n' 0 | time -p src/d5dfa69/grep -f - /dev/null > real 0.33 > user 0.33 > sys 0.00 > $ printf '%02048d\n' 0 | time -p src/709e7e5/grep -f - /dev/null > real 0.04 > user 0.04 > sys 0.00 > $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null > real 0.03 > user 0.03 > sys 0.00 > > So it looks like your patch confers some advantage, but on my platform > almost all the speedup is achieved simply by switching to the system strstr. > > > `lookfor' isn't terminated by `\-'\0' in istrstr(), but strstr() requires it. > > Yes, that's why I needed to make the most recent change in the current > master; it arranges for strstr's 2nd arg to be null-terminated. From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 09:19:24 2014 Received: (at 17700-done) by debbugs.gnu.org; 6 Jun 2014 13:19:24 +0000 Received: from localhost ([127.0.0.1]:46628 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsu3I-0001ag-7T for submit@debbugs.gnu.org; Fri, 06 Jun 2014 09:19:24 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:49696) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsu3E-0001aM-QK for 17700-done@debbugs.gnu.org; Fri, 06 Jun 2014 09:19:22 -0400 Received: from imp03 (mailgw7.kcn.ne.jp [61.86.15.238]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id 8EB9B8023F for <17700-done@debbugs.gnu.org>; Fri, 6 Jun 2014 22:19:13 +0900 (JST) Received: from mail04.kcn.ne.jp ([61.86.6.183]) by imp03 with bizsmtp id B1KD1o00N3wvxAM011KDSz; Fri, 06 Jun 2014 22:19:13 +0900 X-Direction: outbound X-OrgRCPT: 17700-done@debbugs.gnu.org Received: from [10.120.1.33] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail04.kcn.ne.jp (Postfix) with ESMTPA id 3A2D81290022; Fri, 6 Jun 2014 22:19:13 +0900 (JST) Date: Fri, 06 Jun 2014 22:19:13 +0900 From: Norihiro Tanaka To: Paul Eggert Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated In-Reply-To: <53911013.7070604@cs.ucla.edu> References: <20140606074922.3D69.27F6AC2D@kcn.ne.jp> <53911013.7070604@cs.ucla.edu> Message-Id: <20140606221912.5F47.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17700-done Cc: Aharon Robbins , 17700-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Paul Eggert wrote: > So it looks like your patch confers some advantage, but on my platform > almost all the speedup is achieved simply by switching to the system strstr. First, I tested on CentOS 5.10. Next, I tested on RHEL 6.5, and get result as same as you. strstr() on CentOS 5.10 may be too old. I'm sure that this bug has already been fixed. So closing. Thanks, Norihiro From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 09:43:39 2014 Received: (at 17700) by debbugs.gnu.org; 6 Jun 2014 13:43:39 +0000 Received: from localhost ([127.0.0.1]:46663 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsuQk-0002Oh-KL for submit@debbugs.gnu.org; Fri, 06 Jun 2014 09:43:38 -0400 Received: from mailgw05.kcn.ne.jp ([61.86.7.212]:50507) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WsuQi-0002OI-6M for 17700@debbugs.gnu.org; Fri, 06 Jun 2014 09:43:37 -0400 Received: from imp03 (mailgw7.kcn.ne.jp [61.86.15.238]) by mailgw05.kcn.ne.jp (Postfix) with ESMTP id 2DC3B67D76 for <17700@debbugs.gnu.org>; Fri, 6 Jun 2014 22:43:29 +0900 (JST) Received: from mail08.kcn.ne.jp ([61.86.6.187]) by imp03 with bizsmtp id B1jV1o006426eXR011jVjB; Fri, 06 Jun 2014 22:43:29 +0900 X-Direction: outbound X-OrgRCPT: 17700@debbugs.gnu.org Received: from [10.120.1.33] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail08.kcn.ne.jp (Postfix) with ESMTPA id AF89412B802E; Fri, 6 Jun 2014 22:43:28 +0900 (JST) Date: Fri, 06 Jun 2014 22:43:28 +0900 From: Norihiro Tanaka To: arnold@skeeve.com Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated In-Reply-To: <201406060642.s566gkFl028365@freefriends.org> References: <53911013.7070604@cs.ucla.edu> <201406060642.s566gkFl028365@freefriends.org> Message-Id: <20140606224327.5F59.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17700 Cc: eggert@cs.ucla.edu, 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Arnold Robbins wrote: > Is strstr() even a good idea? dfa needs to be able to match NUL > bytes in the data. If this prevents that, then it's a problem. dfamust() doesn't change a result, so no problem. BTW, even before make this change, dfamusts is terminated by NUL byte. Thanks, Norihiro From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 10:20:06 2014 Received: (at 17700) by debbugs.gnu.org; 6 Jun 2014 14:20:06 +0000 Received: from localhost ([127.0.0.1]:47338 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsuzx-0003lj-DJ for submit@debbugs.gnu.org; Fri, 06 Jun 2014 10:20:06 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:60438) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsuzq-0003lH-Su for 17700@debbugs.gnu.org; Fri, 06 Jun 2014 10:19:59 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 86AE8A6000A; Fri, 6 Jun 2014 07:19:48 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id P5GT6B-9fPrq; Fri, 6 Jun 2014 07:19:40 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id E7334A60001; Fri, 6 Jun 2014 07:19:39 -0700 (PDT) Message-ID: <5391CDF8.9070000@cs.ucla.edu> Date: Fri, 06 Jun 2014 07:19:36 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: Norihiro Tanaka Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated References: <20140606074922.3D69.27F6AC2D@kcn.ne.jp> <53911013.7070604@cs.ucla.edu> <20140606221912.5F47.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140606221912.5F47.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------070605020402000204070505" X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 17700 Cc: 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) This is a multi-part message in MIME format. --------------070605020402000204070505 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Norihiro Tanaka wrote: > strstr() on CentOS 5.10 may be too old. Yes it is. But 'configure' is supposed to detect this. config.log should say something like this: configure:26283: checking whether strstr works in linear time configure:26357: gcc -std=gnu99 -o conftest -g -O2 conftest.c >&5 configure:26357: $? = 0 configure:26357: ./conftest configure:26357: $? = 142 configure: program exited with status 142 This is how it behaves for me, on RHEL 6.5 and on Solaris 11.1. Could you please investigate why it is not occurring on CentOS 5.10? For example, why does the attached program work? It should fail. --------------070605020402000204070505 Content-Type: text/x-csrc; name="strstr-test.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="strstr-test.c" #include /* for signal */ #include /* for strstr */ #include /* for malloc */ #include /* for alarm */ static void quit (int sig) { exit (sig + 128); } int main () { int result = 0; size_t m = 1000000; char *haystack = (char *) malloc (2 * m + 2); char *needle = (char *) malloc (m + 2); /* Failure to compile this test due to missing alarm is okay, since all such platforms (mingw) also have quadratic strstr. */ signal (SIGALRM, quit); alarm (5); /* Check for quadratic performance. */ if (haystack && needle) { memset (haystack, 'A', 2 * m); haystack[2 * m] = 'B'; haystack[2 * m + 1] = 0; memset (needle, 'A', m); needle[m] = 'B'; needle[m + 1] = 0; if (!strstr (haystack, needle)) result |= 1; } return result; ; return 0; } --------------070605020402000204070505-- From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 10:26:05 2014 Received: (at 17700) by debbugs.gnu.org; 6 Jun 2014 14:26:05 +0000 Received: from localhost ([127.0.0.1]:47342 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsv5k-0003xu-Vs for submit@debbugs.gnu.org; Fri, 06 Jun 2014 10:26:05 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:60691) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wsv5e-0003xY-RW for 17700@debbugs.gnu.org; Fri, 06 Jun 2014 10:25:59 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id F0031A6000A; Fri, 6 Jun 2014 07:25:48 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id oLvaMe7tcTjS; Fri, 6 Jun 2014 07:25:40 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 616ADA60001; Fri, 6 Jun 2014 07:25:40 -0700 (PDT) Message-ID: <5391CF64.6000400@cs.ucla.edu> Date: Fri, 06 Jun 2014 07:25:40 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: Norihiro Tanaka , arnold@skeeve.com Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated References: <53911013.7070604@cs.ucla.edu> <201406060642.s566gkFl028365@freefriends.org> <20140606224327.5F59.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140606224327.5F59.27F6AC2D@kcn.ne.jp> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 17700 Cc: 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) Norihiro Tanaka wrote: > BTW, even before make > this change, dfamusts is terminated by NUL byte. Yes, that's right, this patch doesn't affect whether the DFA code works correctly on NUL bytes. There is a performance issue: if the pattern contains NUL bytes the DFA code doesn't operate as efficiently as it will does the pattern contains only non-NUL bytes. But this performance issue is unaffected by the recent change. It's low priority to fix the performance issue, but if we wanted to fix it among many other things we should use memmem instead of strstr. From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 18:46:17 2014 Received: (at 17700) by debbugs.gnu.org; 6 Jun 2014 22:46:17 +0000 Received: from localhost ([127.0.0.1]:39938 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wt2ts-0005EV-Qk for submit@debbugs.gnu.org; Fri, 06 Jun 2014 18:46:17 -0400 Received: from mailgw06.kcn.ne.jp ([61.86.7.213]:49068) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wt2tp-0005E9-1i for 17700@debbugs.gnu.org; Fri, 06 Jun 2014 18:46:14 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw06.kcn.ne.jp (Postfix) with ESMTP id E576CE80026 for <17700@debbugs.gnu.org>; Sat, 7 Jun 2014 07:46:00 +0900 (JST) Received: from mail03.kcn.ne.jp ([61.86.6.182]) by imp02 with bizsmtp id BAm01o00g3veGq501Am0Sm; Sat, 07 Jun 2014 07:46:00 +0900 X-OrgRCPT: 17700@debbugs.gnu.org Received: from [10.120.1.49] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail03.kcn.ne.jp (Postfix) with ESMTPA id B280B1410025; Sat, 7 Jun 2014 07:46:00 +0900 (JST) Date: Sat, 07 Jun 2014 07:46:00 +0900 From: Norihiro Tanaka To: Paul Eggert Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated In-Reply-To: <5391CDF8.9070000@cs.ucla.edu> References: <20140606221912.5F47.27F6AC2D@kcn.ne.jp> <5391CDF8.9070000@cs.ucla.edu> Message-Id: <20140607074600.744E.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.65.07 [ja] X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17700 Cc: 17700@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Paul Eggert wrote: > Could you please investigate why it is not occurring on CentOS 5.10? > For example, why does the attached program work? It should fail. Sorry, I don't update gnulib. After update, it was returned immediately. I have also confirmed speed-up in grep. Thanks, Norihiro From unknown Sun Jun 22 15:27:01 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 05 Jul 2014 11:24:03 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator