From unknown Tue Jun 17 20:18:21 2025
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.509 (Entity 5.509)
Content-Type: text/plain; charset=utf-8
From: bug#16842 <16842@debbugs.gnu.org>
To: bug#16842 <16842@debbugs.gnu.org>
Subject: Status: [PATCH] Use mbrtowc_cache in DFA engine
Reply-To: bug#16842 <16842@debbugs.gnu.org>
Date: Wed, 18 Jun 2025 03:18:21 +0000

retitle 16842 [PATCH] Use mbrtowc_cache in DFA engine
reassign 16842 grep
submitter 16842 Norihiro Tanaka
severity 16842 normal
tag 16842 patch
thanks

From debbugs-submit-bounces@debbugs.gnu.org Sat Feb 22 10:46:42 2014
Received: (at submit) by debbugs.gnu.org; 22 Feb 2014 15:46:42 +0000
Received: from localhost ([127.0.0.1]:35829 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from )
	id 1WHEmn-000059-V6
	for submit@debbugs.gnu.org; Sat, 22 Feb 2014 10:46:42 -0500
Received: from pbsg501.nifty.com ([202.248.238.71]:27113)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from )
	id 1WHEmh-00004s-Kh
	for submit@debbugs.gnu.org; Sat, 22 Feb 2014 10:46:39 -0500
Received: from [10.120.1.23] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66])
	(authenticated)
	by pbsg501.nifty.com with ESMTP id s1MFkQjk022308
	for ; Sun, 23 Feb 2014 00:46:26 +0900
X-Nifty-SrcIP: [118.21.128.66]
Date: Sun, 23 Feb 2014 00:46:27 +0900
From: Norihiro Tanaka
To: submit@debbugs.gnu.org
Subject: [PATCH] Use mbrtowc_cache in DFA engine
Message-Id: <20140223004626.6B29.27F6AC2D@kcn.ne.jp>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------_52D2369D000000001525_MULTIPART_MIXED_"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.64.06 [ja]
X-Spam-Score: 3.0 (+++)
X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email.
If you have any questions, see the administrator of that system for details. Content preview: Package: grep Tags: patch The patch is DFA version of patch#16544 "Optimazation for is_mb_middle". It will improve performance for non-UTF8 locales in DFA engine. I tested below. In both case, Speed-up 3-3.5x. [...]
Content analysis details: (3.0 points, 10.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
-0.0 SPF_PASS               SPF: sender matches SPF record
-0.5 RP_MATCHES_RCVD        Envelope sender domain matches handover relay domain
 3.5 OBFU_TEXT_ATTACH       BODY: Text attachment with non-text MIME type

--------_52D2369D000000001525_MULTIPART_MIXED_
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

Package: grep
Tags: patch

This patch is the DFA version of patch#16544 "Optimazation for is_mb_middle". It improves performance for non-UTF-8 locales in the DFA engine.

I tested it with the two commands below. In both cases the speed-up was 3-3.5x.

$ yes $(printf '%078dm' 0)|head -1000000 > in
$ for i in `seq 5`; do env LC_ALL=ja_JP.eucJP time src/grep n in; done

$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -1000000 > k
$ for i in `seq 5`; do env LC_ALL=ja_JP.eucJP time src/grep -i foobar k; done

Norihiro

--------_52D2369D000000001525_MULTIPART_MIXED_
Content-Type: application/octet-stream; name="use_mb_cache_in_dfa.txt"
Content-Disposition: attachment; filename="use_mb_cache_in_dfa.txt"
Content-Transfer-Encoding: base64

RnJvbSA0ODk0ZGU0OTk2ZTNiODI4MWQ2ZmMzYzY4ZDBjYWYyMzRjYmQxNzkwIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBUaHUsIDIwIEZlYiAyMDE0IDIxOjU4OjQ5ICswOTAwClN1YmplY3Q6IFtQQVRDSF0gVXNl IG1icnRvd2NfY2FjaGUgaW4gREZBIGVuZ2luZS4KCi0tLQogc3JjL2RmYS5jIHwgNTEgKysrKysr KysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrLS0tCiAxIGZpbGUgY2hh bmdlZCwgNDggaW5zZXJ0aW9ucygrKSwgMyBkZWxldGlvbnMoLSkKCmRpZmYgLS1naXQgYS9zcmMv ZGZhLmMgYi9zcmMvZGZhLmMKaW5kZXggODkwNmVkMy4uYzA3M2NmMyAxMDA2NDQKLS0tIGEvc3Jj L2RmYS5jCisrKyBiL3NyYy9kZmEuYwpAQCAtNDE5LDYgKzQxOSwxMyBAQCBzdHJ1Y3QgZGZhCiAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHRoZSBkZmEuICAqLwogfTsKIAorc3Ry dWN0IG1icnRvd2NfY2FjaGUgeworICAgIHNpemVfdCBsZW5ndGg7CisgICAgd2NoYXJfdCB3Y2hh
cjsKK307CisKK3N0YXRpYyBzdHJ1Y3QgbWJydG93Y19jYWNoZSBtYnJ0b3djX2NhY2hlW05PVENI QVJdOworCiAvKiBTb21lIG1hY3JvcyBmb3IgdXNlciBhY2Nlc3MgdG8gZGZhIGludGVybmFscy4g ICovCiAKIC8qIEFDQ0VQVElORyByZXR1cm5zIHRydWUgaWYgcyBjb3VsZCBwb3NzaWJseSBiZSBh biBhY2NlcHRpbmcgc3RhdGUgb2Ygci4gICovCkBAIC04MjIsMTIgKzgyOSwyMiBAQCBzdGF0aWMg dW5zaWduZWQgY2hhciBjb25zdCAqYnVmX2VuZDsgICAgLyogcmVmZXJlbmNlIHRvIGVuZCBpbiBk ZmFleGVjLiAgKi8KICAgICBlbHNlCQkJCQlcCiAgICAgICB7CQkJCQkJXAogICAgICAgICB3Y2hh cl90IF93YzsJCQkJXAotICAgICAgICBjdXJfbWJfbGVuID0gbWJydG93YyAoJl93YywgbGV4cHRy LCBsZXhsZWZ0LCAmbWJzKTsgXAorICAgICAgICBib29sIHVzZV9jYWNoZSA9IGZhbHNlOwkJCVwK KyAgICAgICAgY3VyX21iX2xlbiA9IG1icnRvd2NfY2FjaGVbdG9fdWNoYXIgKCpsZXhwdHIpXS5s ZW5ndGg7IFwKKyAgICAgICAgaWYgKGN1cl9tYl9sZW4gIT0gKHNpemVfdCkgLTIpCQlcCisgICAg ICAgICAgewkJCQkJXAorICAgICAgICAgICAgX3djID0gbWJydG93Y19jYWNoZVt0b191Y2hhciAo KmxleHB0cildLndjaGFyOyBcCisgICAgICAgICAgICB1c2VfY2FjaGUgPSB0cnVlOyAgICAgICAg ICAgICAgICAgICBcCisgICAgICAgICAgfQkJCQkJXAorICAgICAgICBlbHNlCQkJCQlcCisgICAg ICAgICAgY3VyX21iX2xlbiA9IG1icnRvd2MgKCZfd2MsIGxleHB0ciwgbGV4bGVmdCwgJm1icyk7 IFwKICAgICAgICAgaWYgKGN1cl9tYl9sZW4gPD0gMCkJCQlcCiAgICAgICAgICAgewkJCQkJXAog ICAgICAgICAgICAgY3VyX21iX2xlbiA9IDE7CQkJXAogICAgICAgICAgICAgLS1sZXhsZWZ0OwkJ CQlcCiAgICAgICAgICAgICAod2MpID0gKGMpID0gdG9fdWNoYXIgKCpsZXhwdHIrKyk7ICBcCisg ICAgICAgICAgICBpZiAoIXVzZV9jYWNoZSkJCQlcCisgICAgICAgICAgICAgIG1lbXNldCAoJm1i cywgMCwgc2l6ZW9mIG1icyk7CVwKICAgICAgICAgICB9CQkJCQlcCiAgICAgICAgIGVsc2UJCQkJ CVwKICAgICAgICAgICB7CQkJCQlcCkBAIC0zMjcxLDggKzMyODgsMTkgQEAgcHJlcGFyZV93Y19i dWYgKGNvbnN0IGNoYXIgKmJlZ2luLCBjb25zdCBjaGFyICplbmQpCiAgICAgewogICAgICAgaWYg KHJlbWFpbl9ieXRlcyA9PSAwKQogICAgICAgICB7Ci0gICAgICAgICAgcmVtYWluX2J5dGVzCi0g ICAgICAgICAgICA9IG1icnRvd2MgKGlucHV0d2NzICsgaSwgYmVnaW4gKyBpLCBlbmQgLSBiZWdp biAtIGkgKyAxLCAmbWJzKTsKKyAgICAgICAgICBib29sIHVzZV9jYWNoZSA9IGZhbHNlOworICAg ICAgICAgIHJlbWFpbl9ieXRlcyA9IG1icnRvd2NfY2FjaGVbdG9fdWNoYXIgKGJlZ2luW2ldKV0u 
bGVuZ3RoOworICAgICAgICAgIGlmIChyZW1haW5fYnl0ZXMgIT0gKHNpemVfdCkgLTIpCisgICAg ICAgICAgICB7CisgICAgICAgICAgICAgIGlucHV0d2NzW2ldID0gbWJydG93Y19jYWNoZVt0b191 Y2hhciAoYmVnaW5baV0pXS53Y2hhcjsKKyAgICAgICAgICAgICAgdXNlX2NhY2hlID0gdHJ1ZTsK KyAgICAgICAgICAgIH0KKyAgICAgICAgICBlbHNlCisgICAgICAgICAgICB7CisgICAgICAgICAg ICAgIHJlbWFpbl9ieXRlcworICAgICAgICAgICAgICAgID0gbWJydG93YyAoaW5wdXR3Y3MgKyBp LCBiZWdpbiArIGksIGVuZCAtIGJlZ2luIC0gaSArIDEsICZtYnMpOworICAgICAgICAgICAgfQor CiAgICAgICAgICAgaWYgKHJlbWFpbl9ieXRlcyA8IDEKICAgICAgICAgICAgICAgfHwgcmVtYWlu X2J5dGVzID09IChzaXplX3QpIC0xCiAgICAgICAgICAgICAgIHx8IHJlbWFpbl9ieXRlcyA9PSAo c2l6ZV90KSAtMgpAQCAtMzI4MSw2ICszMzA5LDggQEAgcHJlcGFyZV93Y19idWYgKGNvbnN0IGNo YXIgKmJlZ2luLCBjb25zdCBjaGFyICplbmQpCiAgICAgICAgICAgICAgIHJlbWFpbl9ieXRlcyA9 IDA7CiAgICAgICAgICAgICAgIGlucHV0d2NzW2ldID0gKHdjaGFyX3QpIGJlZ2luW2ldOwogICAg ICAgICAgICAgICBtYmxlbl9idWZbaV0gPSAwOworICAgICAgICAgICAgICBpZiAoIXVzZV9jYWNo ZSkKKyAgICAgICAgICAgICAgICBtZW1zZXQgKCZtYnMsIDAsIHNpemVvZiBtYnMpOwogICAgICAg ICAgICAgICBpZiAoYmVnaW5baV0gPT0gZW9sKQogICAgICAgICAgICAgICAgIGJyZWFrOwogICAg ICAgICAgICAgfQpAQCAtMzU0NSwxMSArMzU3NSwyNiBAQCBkZmFvcHRpbWl6ZSAoc3RydWN0IGRm YSAqZCkKICAgZC0+bWJfY3VyX21heCA9IDE7CiB9CiAKK3ZvaWQKK2J1aWxkX21icnRvd2NfY2Fj aGUgKHZvaWQpCit7CisgIGludCBpOworCisgIGZvciAoaSA9IENIQVJfTUlOOyBpIDw9IENIQVJf TUFYOyArK2kpCisgICAgeworICAgICAgY2hhciBjID0gaTsKKyAgICAgIHVuc2lnbmVkIGNoYXIg dWMgPSBpOworICAgICAgbWJzdGF0ZV90IHMgPSB7IDAgfTsKKyAgICAgIG1icnRvd2NfY2FjaGVb dWNdLmxlbmd0aCA9IG1icnRvd2MgKCZtYnJ0b3djX2NhY2hlW3VjXS53Y2hhciwgJmMsIDEsICZz KTsKKyAgICB9Cit9CisKIC8qIFBhcnNlIGFuZCBhbmFseXplIGEgc2luZ2xlIHN0cmluZyBvZiB0 aGUgZ2l2ZW4gbGVuZ3RoLiAgKi8KIHZvaWQKIGRmYWNvbXAgKGNoYXIgY29uc3QgKnMsIHNpemVf dCBsZW4sIHN0cnVjdCBkZmEgKmQsIGludCBzZWFyY2hmbGFnKQogewogICBkZmFpbml0IChkKTsK KyAgYnVpbGRfbWJydG93Y19jYWNoZSAoKTsKICAgZGZhcGFyc2UgKHMsIGxlbiwgZCk7CiAgIGRm YW11c3QgKGQpOwogICBkZmFvcHRpbWl6ZSAoZCk7Ci0tIAoxLjguNS4yCgo= --------_52D2369D000000001525_MULTIPART_MIXED_ Content-Type: application/octet-stream; 
name="tests.txt" Content-Disposition: attachment; filename="tests.txt" Content-Transfer-Encoding: base64 JCB5ZXMgJChwcmludGYgJyUwNzhkbScgMCl8aGVhZCAtMTAwMDAwMCA+IGluDQokIGZvciBpIGlu IGBzZXEgNWA7IGRvIGVudiBMQ19BTEw9amFfSlAuZXVjSlAgdGltZSBzcmMvZ3JlcCBuIGluOyBk b25lDQoNCkJlZm9yZSB0aGUgcGF0Y2g6DQpDb21tYW5kIGV4aXRlZCB3aXRoIG5vbi16ZXJvIHN0 YXR1cyAxDQozLjgydXNlciAxLjIwc3lzdGVtIDA6MDUuMzBlbGFwc2VkIDk0JUNQVSAoMGF2Z3Rl eHQrMGF2Z2RhdGEgMzA0MG1heHJlc2lkZW50KWsNCjBpbnB1dHMrMG91dHB1dHMgKDBtYWpvcisy MTZtaW5vcilwYWdlZmF1bHRzIDBzd2Fwcw0KQ29tbWFuZCBleGl0ZWQgd2l0aCBub24temVybyBz dGF0dXMgMQ0KMy44MHVzZXIgMS4wN3N5c3RlbSAwOjA1LjA0ZWxhcHNlZCA5NiVDUFUgKDBhdmd0 ZXh0KzBhdmdkYXRhIDMwNTZtYXhyZXNpZGVudClrDQowaW5wdXRzKzBvdXRwdXRzICgwbWFqb3Ir MjE3bWlub3IpcGFnZWZhdWx0cyAwc3dhcHMNCkNvbW1hbmQgZXhpdGVkIHdpdGggbm9uLXplcm8g c3RhdHVzIDENCjQuNDJ1c2VyIDAuNzdzeXN0ZW0gMDowNS40MWVsYXBzZWQgOTYlQ1BVICgwYXZn dGV4dCswYXZnZGF0YSAzMDU2bWF4cmVzaWRlbnQpaw0KMGlucHV0cyswb3V0cHV0cyAoMG1ham9y KzIxN21pbm9yKXBhZ2VmYXVsdHMgMHN3YXBzDQpDb21tYW5kIGV4aXRlZCB3aXRoIG5vbi16ZXJv IHN0YXR1cyAxDQozLjYzdXNlciAxLjE4c3lzdGVtIDA6MDUuMDdlbGFwc2VkIDk0JUNQVSAoMGF2 Z3RleHQrMGF2Z2RhdGEgMzA3Mm1heHJlc2lkZW50KWsNCjBpbnB1dHMrMG91dHB1dHMgKDBtYWpv cisyMThtaW5vcilwYWdlZmF1bHRzIDBzd2Fwcw0KQ29tbWFuZCBleGl0ZWQgd2l0aCBub24temVy byBzdGF0dXMgMQ0KNC4wMHVzZXIgMC44MnN5c3RlbSAwOjA1LjEzZWxhcHNlZCA5NCVDUFUgKDBh dmd0ZXh0KzBhdmdkYXRhIDMwNTZtYXhyZXNpZGVudClrDQowaW5wdXRzKzBvdXRwdXRzICgwbWFq b3IrMjE3bWlub3IpcGFnZWZhdWx0cyAwc3dhcHMNCg0KQWZ0ZXIgdGhlIHBhdGNoOg0KQ29tbWFu ZCBleGl0ZWQgd2l0aCBub24temVybyBzdGF0dXMgMQ0KMC45MnVzZXIgMC41MnN5c3RlbSAwOjAx LjUzZWxhcHNlZCA5NCVDUFUgKDBhdmd0ZXh0KzBhdmdkYXRhIDMwODhtYXhyZXNpZGVudClrDQow aW5wdXRzKzBvdXRwdXRzICgwbWFqb3IrMjE5bWlub3IpcGFnZWZhdWx0cyAwc3dhcHMNCkNvbW1h bmQgZXhpdGVkIHdpdGggbm9uLXplcm8gc3RhdHVzIDENCjEuMjJ1c2VyIDAuMjBzeXN0ZW0gMDow MS40OGVsYXBzZWQgOTYlQ1BVICgwYXZndGV4dCswYXZnZGF0YSAzMDcybWF4cmVzaWRlbnQpaw0K MGlucHV0cyswb3V0cHV0cyAoMG1ham9yKzIxOG1pbm9yKXBhZ2VmYXVsdHMgMHN3YXBzDQpDb21t 
YW5kIGV4aXRlZCB3aXRoIG5vbi16ZXJvIHN0YXR1cyAxDQoxLjA1dXNlciAwLjQxc3lzdGVtIDA6 MDEuNTFlbGFwc2VkIDk3JUNQVSAoMGF2Z3RleHQrMGF2Z2RhdGEgMzA3Mm1heHJlc2lkZW50KWsN CjBpbnB1dHMrMG91dHB1dHMgKDBtYWpvcisyMThtaW5vcilwYWdlZmF1bHRzIDBzd2Fwcw0KQ29t bWFuZCBleGl0ZWQgd2l0aCBub24temVybyBzdGF0dXMgMQ0KMS4xN3VzZXIgMC4yM3N5c3RlbSAw OjAxLjQ3ZWxhcHNlZCA5NiVDUFUgKDBhdmd0ZXh0KzBhdmdkYXRhIDMwODhtYXhyZXNpZGVudClr DQowaW5wdXRzKzBvdXRwdXRzICgwbWFqb3IrMjE5bWlub3IpcGFnZWZhdWx0cyAwc3dhcHMNCkNv bW1hbmQgZXhpdGVkIHdpdGggbm9uLXplcm8gc3RhdHVzIDENCjEuMjJ1c2VyIDAuNDNzeXN0ZW0g MDowMS43MmVsYXBzZWQgOTYlQ1BVICgwYXZndGV4dCswYXZnZGF0YSAzMDcybWF4cmVzaWRlbnQp aw0KMGlucHV0cyswb3V0cHV0cyAoMG1ham9yKzIxOG1pbm9yKXBhZ2VmYXVsdHMgMHN3YXBzDQoN Cg0KJCB5ZXMgampqampqampqampqampqampqampqampqampqampqampqampqampqaiB8IGhlYWQg LTEwMDAwMDAgPiBrDQokIGZvciBpIGluIGBzZXEgNWA7IGRvIGVudiBMQ19BTEw9amFfSlAuZXVj SlAgdGltZSBzcmMvZ3JlcCAtaSBmb29iYXIgazsgZG9uZQ0KDQpCZWZvcmUgdGhlIHBhdGNoOg0K Q29tbWFuZCBleGl0ZWQgd2l0aCBub24temVybyBzdGF0dXMgMQ0KMi4zM3VzZXIgMC40MHN5c3Rl bSAwOjAyLjg2ZWxhcHNlZCA5NSVDUFUgKDBhdmd0ZXh0KzBhdmdkYXRhIDMwNzJtYXhyZXNpZGVu dClrDQowaW5wdXRzKzBvdXRwdXRzICgwbWFqb3IrMjE4bWlub3IpcGFnZWZhdWx0cyAwc3dhcHMN CkNvbW1hbmQgZXhpdGVkIHdpdGggbm9uLXplcm8gc3RhdHVzIDENCjIuMTB1c2VyIDAuNTBzeXN0 ZW0gMDowMi43NGVsYXBzZWQgOTUlQ1BVICgwYXZndGV4dCswYXZnZGF0YSAzMDU2bWF4cmVzaWRl bnQpaw0KMGlucHV0cyswb3V0cHV0cyAoMG1ham9yKzIzMW1pbm9yKXBhZ2VmYXVsdHMgMHN3YXBz DQpDb21tYW5kIGV4aXRlZCB3aXRoIG5vbi16ZXJvIHN0YXR1cyAxDQoyLjE5dXNlciAwLjQ1c3lz dGVtIDA6MDIuNzRlbGFwc2VkIDk2JUNQVSAoMGF2Z3RleHQrMGF2Z2RhdGEgMzA1Nm1heHJlc2lk ZW50KWsNCjBpbnB1dHMrMG91dHB1dHMgKDBtYWpvcisyMzFtaW5vcilwYWdlZmF1bHRzIDBzd2Fw cw0KQ29tbWFuZCBleGl0ZWQgd2l0aCBub24temVybyBzdGF0dXMgMQ0KMi4yOHVzZXIgMC4yNnN5 c3RlbSAwOjAyLjY0ZWxhcHNlZCA5NiVDUFUgKDBhdmd0ZXh0KzBhdmdkYXRhIDMwNTZtYXhyZXNp ZGVudClrDQowaW5wdXRzKzBvdXRwdXRzICgwbWFqb3IrMjE3bWlub3IpcGFnZWZhdWx0cyAwc3dh cHMNCkNvbW1hbmQgZXhpdGVkIHdpdGggbm9uLXplcm8gc3RhdHVzIDENCjIuMTh1c2VyIDAuNTFz 
eXN0ZW0gMDowMi43N2VsYXBzZWQgOTclQ1BVICgwYXZndGV4dCswYXZnZGF0YSAzMDQwbWF4cmVz aWRlbnQpaw0KMGlucHV0cyswb3V0cHV0cyAoMG1ham9yKzIxNm1pbm9yKXBhZ2VmYXVsdHMgMHN3 YXBzDQoNCkFmdGVyIHRoZSBwYXRjaDoNCkNvbW1hbmQgZXhpdGVkIHdpdGggbm9uLXplcm8gc3Rh dHVzIDENCjAuNzN1c2VyIDAuMThzeXN0ZW0gMDowMC45NmVsYXBzZWQgOTQlQ1BVICgwYXZndGV4 dCswYXZnZGF0YSAzMDcybWF4cmVzaWRlbnQpaw0KMGlucHV0cyswb3V0cHV0cyAoMG1ham9yKzIx OG1pbm9yKXBhZ2VmYXVsdHMgMHN3YXBzDQpDb21tYW5kIGV4aXRlZCB3aXRoIG5vbi16ZXJvIHN0 YXR1cyAxDQowLjczdXNlciAwLjEyc3lzdGVtIDA6MDAuOTBlbGFwc2VkIDk1JUNQVSAoMGF2Z3Rl eHQrMGF2Z2RhdGEgMzA3Mm1heHJlc2lkZW50KWsNCjBpbnB1dHMrMG91dHB1dHMgKDBtYWpvcisy MThtaW5vcilwYWdlZmF1bHRzIDBzd2Fwcw0KQ29tbWFuZCBleGl0ZWQgd2l0aCBub24temVybyBz dGF0dXMgMQ0KMC44M3VzZXIgMC4wOHN5c3RlbSAwOjAwLjkyZWxhcHNlZCA5OSVDUFUgKDBhdmd0 ZXh0KzBhdmdkYXRhIDMwNzJtYXhyZXNpZGVudClrDQowaW5wdXRzKzBvdXRwdXRzICgwbWFqb3Ir MjE4bWlub3IpcGFnZWZhdWx0cyAwc3dhcHMNCkNvbW1hbmQgZXhpdGVkIHdpdGggbm9uLXplcm8g c3RhdHVzIDENCjAuNzR1c2VyIDAuMTNzeXN0ZW0gMDowMC45MGVsYXBzZWQgOTclQ1BVICgwYXZn dGV4dCswYXZnZGF0YSAzMDU2bWF4cmVzaWRlbnQpaw0KMGlucHV0cyswb3V0cHV0cyAoMG1ham9y KzIxN21pbm9yKXBhZ2VmYXVsdHMgMHN3YXBzDQpDb21tYW5kIGV4aXRlZCB3aXRoIG5vbi16ZXJv IHN0YXR1cyAxDQowLjc4dXNlciAwLjEyc3lzdGVtIDA6MDAuOTdlbGFwc2VkIDkzJUNQVSAoMGF2 Z3RleHQrMGF2Z2RhdGEgMzA4OG1heHJlc2lkZW50KWsNCjBpbnB1dHMrMG91dHB1dHMgKDBtYWpv cisyMTltaW5vcilwYWdlZmF1bHRzIDBzd2Fwcw0K --------_52D2369D000000001525_MULTIPART_MIXED_-- From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 28 03:28:03 2014 Received: (at 16842) by debbugs.gnu.org; 28 Mar 2014 07:28:03 +0000 Received: from localhost ([127.0.0.1]:53565 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WTRCr-0001Em-Vu for submit@debbugs.gnu.org; Fri, 28 Mar 2014 03:28:02 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:39292) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WTRCn-0001ES-9D for 16842@debbugs.gnu.org; Fri, 28 Mar 2014 03:27:58 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by 
smtp.cs.ucla.edu (Postfix) with ESMTP id 02C2E39E801B;
	Fri, 28 Mar 2014 00:27:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu
Received: from smtp.cs.ucla.edu ([127.0.0.1])
	by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id BLPQW8rCBpB9; Fri, 28 Mar 2014 00:27:55 -0700 (PDT)
Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62])
	by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 32E8439E8011;
	Fri, 28 Mar 2014 00:27:55 -0700 (PDT)
Message-ID: <53352477.9050201@cs.ucla.edu>
Date: Fri, 28 Mar 2014 00:27:51 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
MIME-Version: 1.0
To: Norihiro Tanaka , 16842@debbugs.gnu.org
Subject: Re: bug#16842: [PATCH] Use mbrtowc_cache in DFA engine
References: <20140223004626.6B29.27F6AC2D@kcn.ne.jp>
In-Reply-To: <20140223004626.6B29.27F6AC2D@kcn.ne.jp>
Content-Type: multipart/mixed; boundary="------------020908070202040906030703"
X-Spam-Score: -2.7 (--)
X-Debbugs-Envelope-To: 16842
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"

This is a multi-part message in MIME format.
--------------020908070202040906030703
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Thanks very much. I read through that patch and think we can come up with a simpler cache that need not store lengths, but reserves WEOF to represent an incomplete multibyte character. This approach simplifies the code and avoids some glitches when mbrtowc returns special values not in the range 1..N.

How about the attached patch instead?
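
For readers without the decoded attachment at hand, the WEOF-based cache described above can be sketched roughly as follows. This is a minimal standalone illustration, not the attached patch: the names byte_cache, build_byte_cache and cached_mbs_to_wchar are assumptions made for this example, and the sketch assumes a stateless encoding (no shift sequences) and a table built after the locale has been set.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical table, one entry per byte value.  WEOF marks a byte that
   only begins a multibyte character; any other entry is the wide
   character for that single byte.  Invalid bytes map to themselves.  */
static wint_t byte_cache[UCHAR_MAX + 1];

/* Fill the table for the current locale.  Call again if the locale
   changes.  */
static void
build_byte_cache (void)
{
  for (int i = 0; i <= UCHAR_MAX; i++)
    {
      unsigned char uc = (unsigned char) i;
      char c = (char) uc;
      wchar_t wc = 0;
      mbstate_t s;
      memset (&s, 0, sizeof s);
      size_t r = mbrtowc (&wc, &c, 1, &s);
      if (r == (size_t) -2)
        byte_cache[i] = WEOF;           /* incomplete: needs more bytes */
      else if (r == (size_t) -1)
        byte_cache[i] = (wint_t) uc;    /* invalid: treat byte as itself */
      else
        byte_cache[i] = (wint_t) wc;    /* complete single-byte character */
    }
}

/* Convert the character at the start of S (N bytes available, N >= 1)
   into *PWC and return the number of bytes consumed, always in 1..N.
   mbrtowc is called only when the first byte is marked WEOF.  */
static size_t
cached_mbs_to_wchar (wchar_t *pwc, char const *s, size_t n, mbstate_t *mbs)
{
  assert (n >= 1);
  unsigned char uc = (unsigned char) s[0];
  wint_t wc = byte_cache[uc];

  if (wc == WEOF)
    {
      size_t nbytes = mbrtowc (pwc, s, n, mbs);
      if (0 < nbytes && nbytes < (size_t) -2)
        return nbytes;
      /* Error or truncated sequence: reset the state and fall back to
         passing the single byte through unchanged.  */
      memset (mbs, 0, sizeof *mbs);
      wc = uc;
    }

  *pwc = (wchar_t) wc;
  return 1;
}
```

A caller would run build_byte_cache once after setlocale and then use cached_mbs_to_wchar in place of mbrtowc in its scanning loop. Because the return value is never 0, (size_t) -1 or (size_t) -2, the caller can advance by it without checking for special cases, which is the simplification the message above refers to.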
--------------020908070202040906030703 Content-Type: text/plain; charset=UTF-8; name="0001-dfa-cache-results-of-mbrtowc-for-speed.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="0001-dfa-cache-results-of-mbrtowc-for-speed.patch" RnJvbSBjZGU2ODkyYTZkOWNmZGM0NzhlYjAxZDMwYTE3MTY0ZmVjN2UzYzVhIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBGcmksIDI4IE1hciAyMDE0IDAwOjExOjUyIC0wNzAwClN1YmplY3Q6IFtQQVRD SF0gZGZhOiBjYWNoZSByZXN1bHRzIG9mIG1icnRvd2MgZm9yIHNwZWVkCgpJZGVhIHN1Z2dl c3RlZCBieSBOb3JpaGlybyBUYW5ha2EgaW4gQnVnIzE2ODQyLgoqIHNyYy9kZmEuYyAobWJy dG93Y19jYWNoZSk6IE5ldyBzdGF0aWMgdmFyLgooYnVpbGRfbWJydG93Y19jYWNoZSwgbWJz X3RvX3djaGFyKTogTmV3IGZ1bmN0aW9ucy4KKEZFVENIX1dDKSBbTUJTX1NVUFBPUlRdOiBT cGVlZCB1cCBieSB1c2luZyBtYnNfdG9fd2NoYXIKaW5zdGVhZCBvZiBtYnJ0b3djIGFuZCB3 Y3RvYi4KKEZFVENIX1dDKSBbIU1CU19TVVBQT1JUXTogUmV3cml0ZSBpbiB0ZXJtcyBvZiBv bGQgRkVUQ0ggbWFjcm8uCihGRVRDSCk6IFJlbW92ZTsgbm8gbG9uZ2VyIHVzZWQuCihsZXgp OiBTaW1wbGlmeSBieSBhdm9pZGluZyB0aGUgbmVlZCBmb3IgRkVUQ0guCihwcmVwYXJlX3dj X2J1ZikgW01CU19TVVBQT1JUXTogU3BlZWQgdXAgYnkgdXNpbmcgbWJzX3RvX3djaGFyLgpT aW1wbGlmeSB0aGUgbG9vcC4KKGRmYWNvbXApOiBJbml0aWFsaXplIHRoZSBjYWNoZS4KLS0t CiBzcmMvZGZhLmMgfCAxMzggKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKyst LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0KIDEgZmlsZSBjaGFuZ2VkLCA3NyBpbnNlcnRp b25zKCspLCA2MSBkZWxldGlvbnMoLSkKCmRpZmYgLS1naXQgYS9zcmMvZGZhLmMgYi9zcmMv ZGZhLmMKaW5kZXggZjg4ZmYyYS4uNjI2MDg3ZSAxMDA2NDQKLS0tIGEvc3JjL2RmYS5jCisr KyBiL3NyYy9kZmEuYwpAQCAtNDMwLDYgKzQzMCw2MiBAQCBzdHJ1Y3QgZGZhCiAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgIHRoZSBkZmEuICAqLwogfTsKIAorLyogQSB0 YWJsZSBpbmRleGVkIGJ5IGJ5dGUgdmFsdWVzIHRoYXQgY29udGFpbnMgdGhlIGNvcnJlc3Bv bmRpbmcgd2lkZQorICAgY2hhcmFjdGVyIChpZiBhbnkpIGZvciB0aGF0IGJ5dGUuICBXRU9G IG1lYW5zIHRoZSBieXRlIGlzIHRoZQorICAgbGVhZGluZyBieXRlIG9mIGEgbXVsdGlieXRl IGNoYXJhY3Rlci4gIEludmFsaWQgYW5kIG51bGwgYnl0ZXMgYXJlCisgICBtYXBwZWQgdG8g 
dGhlbXNlbHZlcy4gICovCitzdGF0aWMgd2ludF90IG1icnRvd2NfY2FjaGVbTk9UQ0hBUl07 CisKK3N0YXRpYyB2b2lkCitidWlsZF9tYnJ0b3djX2NhY2hlICh2b2lkKQoreworICBpbnQg aTsKKyAgZm9yIChpID0gQ0hBUl9NSU47IGkgPD0gQ0hBUl9NQVg7ICsraSkKKyAgICB7Cisg ICAgICBjaGFyIGMgPSBpOworICAgICAgdW5zaWduZWQgY2hhciB1YyA9IGk7CisgICAgICBt YnN0YXRlX3QgcyA9IHsgMCB9OworICAgICAgc3dpdGNoIChtYnJ0b3djICgmbWJydG93Y19j YWNoZVt1Y10sICZjLCAxLCAmcykpCisgICAgICAgIHsKKyAgICAgICAgY2FzZSAoc2l6ZV90 KSAtMjogbWJydG93Y19jYWNoZVt1Y10gPSBXRU9GOyBicmVhazsKKyAgICAgICAgY2FzZSAo c2l6ZV90KSAtMTogbWJydG93Y19jYWNoZVt1Y10gPSB1YzsgYnJlYWs7CisgICAgICAgIH0K KyAgICB9Cit9CisKKy8qIFN0b3JlIGludG8gKlBXQyB0aGUgcmVzdWx0IG9mIGNvbnZlcnRp bmcgdGhlIGxlYWRpbmcgYnl0ZXMgb2YgdGhlCisgICBtdWx0aWJ5dGUgYnVmZmVyIFMgb2Yg bGVuZ3RoIE4gYnl0ZXMsIHVwZGF0aW5nIHRoZSBjb252ZXJzaW9uIHN0YXRlCisgICBpbiAq TUJTLiAgT24gY29udmVyc2lvbiBlcnJvciwgY29udmVydCBqdXN0IGEgc2luZ2xlIGJ5dGUg YXMtaXMuCisgICBSZXR1cm4gdGhlIG51bWJlciBvZiBieXRlcyBjb252ZXJ0ZWQuCisKKyAg IFRoaXMgZGlmZmVycyBmcm9tIG1icnRvd2MgKFBXQywgUywgTiwgTUJTKSBhcyBmb2xsb3dz OgorCisgICAqIE4gbXVzdCBiZSBhdCBsZWFzdCAxLgorICAgKiBTW04gLSAxXSBtdXN0IGJl IGEgc2VudGluZWwgYnl0ZS4KKyAgICogU2hpZnQgZW5jb2RpbmdzIGFyZSBub3Qgc3VwcG9y dGVkLgorICAgKiBUaGUgcmV0dXJuIHZhbHVlIGlzIGFsd2F5cyBpbiB0aGUgcmFuZ2UgMS4u Ti4KKyAgICogKk1CUyBpcyBhbHdheXMgdmFsaWQgYWZ0ZXJ3YXJkcy4KKyAgICogKlBXQyBp cyBhbHdheXMgc2V0IHRvIHNvbWV0aGluZy4KKyAgICogVGhpcyB1c2VzIG1icnRvd2NfY2Fj aGUgZm9yIHNwZWVkIGluIHRoZSB0eXBpY2FsIGNhc2UuICAqLworc3RhdGljIHNpemVfdAor bWJzX3RvX3djaGFyICh3Y2hhcl90ICpwd2MsIGNoYXIgY29uc3QgKnMsIHNpemVfdCBuLCBt YnN0YXRlX3QgKm1icykKK3sKKyAgdW5zaWduZWQgY2hhciB1YyA9IHNbMF07CisgIHdpbnRf dCB3YyA9IG1icnRvd2NfY2FjaGVbdWNdOworCisgIGlmICh3YyA9PSBXRU9GKQorICAgIHsK KyAgICAgIHNpemVfdCBuYnl0ZXMgPSBtYnJ0b3djIChwd2MsIHMsIG4sIG1icyk7CisgICAg ICBpZiAoMCA8IG5ieXRlcyAmJiBuYnl0ZXMgPCAoc2l6ZV90KSAtMikKKyAgICAgICAgcmV0 dXJuIG5ieXRlczsKKyAgICAgIG1lbXNldCAobWJzLCAwLCBzaXplb2YgKm1icyk7CisgICAg ICB3YyA9IHVjOworICAgIH0KKworICAqcHdjID0gd2M7CisgIHJldHVybiAxOworfQorCiAv 
KiBTb21lIG1hY3JvcyBmb3IgdXNlciBhY2Nlc3MgdG8gZGZhIGludGVybmFscy4gICovCiAK IC8qIEFDQ0VQVElORyByZXR1cm5zIHRydWUgaWYgcyBjb3VsZCBwb3NzaWJseSBiZSBhbiBh Y2NlcHRpbmcgc3RhdGUgb2Ygci4gICovCkBAIC04NDQsMzUgKzkwMCwxOCBAQCBzdGF0aWMg dW5zaWduZWQgY2hhciBjb25zdCAqYnVmX2VuZDsgICAgLyogcmVmZXJlbmNlIHRvIGVuZCBp biBkZmFleGVjLiAgKi8KICAgICBlbHNlCQkJCQlcCiAgICAgICB7CQkJCQkJXAogICAgICAg ICB3Y2hhcl90IF93YzsJCQkJXAotICAgICAgICBzaXplX3QgbmJ5dGVzID0gbWJydG93YyAo Jl93YywgbGV4cHRyLCBsZXhsZWZ0LCAmbWJzKTsgXAotICAgICAgICBib29sIHZhbGlkX2No YXIgPSAxIDw9IG5ieXRlcyAmJiBuYnl0ZXMgPCAoc2l6ZV90KSAtMjsgXAotICAgICAgICBp ZiAoISB2YWxpZF9jaGFyKQkJCVwKLSAgICAgICAgICB7CQkJCQlcCi0gICAgICAgICAgICBt ZW1zZXQgKCZtYnMsIDAsIHNpemVvZiBtYnMpOwlcCi0gICAgICAgICAgICBjdXJfbWJfbGVu ID0gMTsJCQlcCi0gICAgICAgICAgICAtLWxleGxlZnQ7CQkJCVwKLSAgICAgICAgICAgICh3 YykgPSAoYykgPSB0b191Y2hhciAoKmxleHB0cisrKTsgIFwKLSAgICAgICAgICB9CQkJCQlc Ci0gICAgICAgIGVsc2UJCQkJCVwKLSAgICAgICAgICB7CQkJCQlcCi0gICAgICAgICAgICBj dXJfbWJfbGVuID0gbmJ5dGVzOwkJXAotICAgICAgICAgICAgbGV4cHRyICs9IGN1cl9tYl9s ZW47CQlcCi0gICAgICAgICAgICBsZXhsZWZ0IC09IGN1cl9tYl9sZW47CQlcCi0gICAgICAg ICAgICAod2MpID0gX3djOwkJCQlcCi0gICAgICAgICAgICAoYykgPSB3Y3RvYiAod2MpOwkJ CVwKLSAgICAgICAgICB9CQkJCQlcCisgICAgICAgIHNpemVfdCBuYnl0ZXMgPSBtYnNfdG9f d2NoYXIgKCZfd2MsIGxleHB0ciwgbGV4bGVmdCwgJm1icyk7IFwKKyAgICAgICAgY3VyX21i X2xlbiA9IG5ieXRlczsJCQlcCisgICAgICAgICh3YykgPSBfd2M7CQkJCVwKKyAgICAgICAg KGMpID0gbmJ5dGVzID09IDEgPyB0b191Y2hhciAoKmxleHB0cikgOiBFT0Y7ICAgIFwKKyAg ICAgICAgbGV4cHRyICs9IG5ieXRlczsJCQlcCisgICAgICAgIGxleGxlZnQgLT0gbmJ5dGVz OwkJCVwKICAgICAgIH0JCQkJCQlcCiAgIH0gd2hpbGUgKDApCiAKLSMgZGVmaW5lIEZFVENI KGMsIGVvZmVycikJCQlcCi0gIGRvIHsJCQkJCQlcCi0gICAgd2ludF90IHdjOwkJCQkJXAot ICAgIEZFVENIX1dDIChjLCB3YywgZW9mZXJyKTsJCQlcCi0gIH0gd2hpbGUgKDApCi0KICNl bHNlCiAvKiBOb3RlIHRoYXQgY2hhcmFjdGVycyBiZWNvbWUgdW5zaWduZWQgaGVyZS4gICov Ci0jIGRlZmluZSBGRVRDSChjLCBlb2ZlcnIpCSAgICAgIFwKKyMgZGVmaW5lIEZFVENIX1dD KGMsIHVudXNlZCwgZW9mZXJyKSAgXAogICBkbyB7CQkJCSAgICAgIFwKICAgICBpZiAoISBs 
ZXhsZWZ0KQkJICAgICAgXAogICAgICAgewkJCQkgICAgICBcCkBAIC04ODUsOCArOTI0LDYg QEAgc3RhdGljIHVuc2lnbmVkIGNoYXIgY29uc3QgKmJ1Zl9lbmQ7ICAgIC8qIHJlZmVyZW5j ZSB0byBlbmQgaW4gZGZhZXhlYy4gICovCiAgICAgLS1sZXhsZWZ0OwkJCSAgICAgIFwKICAg fSB3aGlsZSAoMCkKIAotIyBkZWZpbmUgRkVUQ0hfV0MoYywgdW51c2VkLCBlb2ZlcnIpIEZF VENIIChjLCBlb2ZlcnIpCi0KICNlbmRpZiAvKiBNQlNfU1VQUE9SVCAqLwogCiAjaWZuZGVm IE1JTgpAQCAtMTI2NCwxNCArMTMwMSw5IEBAIGxleCAodm9pZCkKICAgICAgImlmIChiYWNr c2xhc2gpIC4uLiIuICAqLwogICBmb3IgKGkgPSAwOyBpIDwgMjsgKytpKQogICAgIHsKLSAg ICAgIGlmIChNQl9DVVJfTUFYID4gMSkKLSAgICAgICAgewotICAgICAgICAgIEZFVENIX1dD IChjLCB3Y3RvaywgTlVMTCk7Ci0gICAgICAgICAgaWYgKChpbnQpIGMgPT0gRU9GKQotICAg ICAgICAgICAgZ290byBub3JtYWxfY2hhcjsKLSAgICAgICAgfQotICAgICAgZWxzZQotICAg ICAgICBGRVRDSCAoYywgTlVMTCk7CisgICAgICBGRVRDSF9XQyAoYywgd2N0b2ssIE5VTEwp OworICAgICAgaWYgKGMgPT0gKHVuc2lnbmVkIGludCkgRU9GKQorICAgICAgICBnb3RvIG5v cm1hbF9jaGFyOwogCiAgICAgICBzd2l0Y2ggKGMpCiAgICAgICAgIHsKQEAgLTMzMjUsMzkg KzMzNTcsMjIgQEAgcHJlcGFyZV93Y19idWYgKGNvbnN0IGNoYXIgKmJlZ2luLCBjb25zdCBj aGFyICplbmQpCiB7CiAjaWYgTUJTX1NVUFBPUlQKICAgdW5zaWduZWQgY2hhciBlb2wgPSBl b2xieXRlOwotICBzaXplX3QgcmVtYWluX2J5dGVzLCBpOworICBzaXplX3QgaTsKKyAgc2l6 ZV90IGlsaW0gPSBlbmQgLSBiZWdpbiArIDE7CiAKICAgYnVmX2JlZ2luID0gKHVuc2lnbmVk IGNoYXIgKikgYmVnaW47CiAKLSAgcmVtYWluX2J5dGVzID0gMDsKLSAgZm9yIChpID0gMDsg aSA8IGVuZCAtIGJlZ2luICsgMTsgaSsrKQorICBmb3IgKGkgPSAwOyBpIDwgaWxpbTsgaSsr KQogICAgIHsKLSAgICAgIGlmIChyZW1haW5fYnl0ZXMgPT0gMCkKLSAgICAgICAgewotICAg ICAgICAgIHNpemVfdCBuYnl0ZXMKLSAgICAgICAgICAgID0gbWJydG93YyAoaW5wdXR3Y3Mg KyBpLCBiZWdpbiArIGksIGVuZCAtIGJlZ2luIC0gaSArIDEsICZtYnMpOwotICAgICAgICAg IGlmICghICgxIDw9IG5ieXRlcyAmJiBuYnl0ZXMgPCAoc2l6ZV90KSAtMikKLSAgICAgICAg ICAgICAgfHwgKG5ieXRlcyA9PSAxICYmIGlucHV0d2NzW2ldID09ICh3Y2hhcl90KSBiZWdp bltpXSkpCi0gICAgICAgICAgICB7Ci0gICAgICAgICAgICAgIGlmICgoc2l6ZV90KSAtMiA8 PSBuYnl0ZXMpCi0gICAgICAgICAgICAgICAgbWVtc2V0ICgmbWJzLCAwLCBzaXplb2YgbWJz KTsKLSAgICAgICAgICAgICAgcmVtYWluX2J5dGVzID0gMDsKLSAgICAgICAgICAgICAgaW5w 
dXR3Y3NbaV0gPSAod2NoYXJfdCkgYmVnaW5baV07Ci0gICAgICAgICAgICAgIG1ibGVuX2J1 ZltpXSA9IDA7Ci0gICAgICAgICAgICAgIGlmIChiZWdpbltpXSA9PSBlb2wpCi0gICAgICAg ICAgICAgICAgYnJlYWs7Ci0gICAgICAgICAgICB9Ci0gICAgICAgICAgZWxzZQotICAgICAg ICAgICAgewotICAgICAgICAgICAgICBtYmxlbl9idWZbaV0gPSBuYnl0ZXM7Ci0gICAgICAg ICAgICAgIHJlbWFpbl9ieXRlcyA9IG5ieXRlcyAtIDE7Ci0gICAgICAgICAgICB9Ci0gICAg ICAgIH0KLSAgICAgIGVsc2UKKyAgICAgIHNpemVfdCBuYnl0ZXMgPSBtYnNfdG9fd2NoYXIg KGlucHV0d2NzICsgaSwgYmVnaW4gKyBpLCBpbGltIC0gaSwgJm1icyk7CisgICAgICBtYmxl bl9idWZbaV0gPSBuYnl0ZXMgLSAobmJ5dGVzID09IDEpOworICAgICAgaWYgKGJlZ2luW2ld ID09IGVvbCkKKyAgICAgICAgYnJlYWs7CisgICAgICB3aGlsZSAoLS1uYnl0ZXMgIT0gMCkK ICAgICAgICAgewotICAgICAgICAgIG1ibGVuX2J1ZltpXSA9IHJlbWFpbl9ieXRlczsKKyAg ICAgICAgICBpKys7CisgICAgICAgICAgbWJsZW5fYnVmW2ldID0gbmJ5dGVzOwogICAgICAg ICAgIGlucHV0d2NzW2ldID0gMDsKLSAgICAgICAgICByZW1haW5fYnl0ZXMtLTsKICAgICAg ICAgfQogICAgIH0KIApAQCAtMzYxMyw2ICszNjI4LDcgQEAgdm9pZAogZGZhY29tcCAoY2hh ciBjb25zdCAqcywgc2l6ZV90IGxlbiwgc3RydWN0IGRmYSAqZCwgaW50IHNlYXJjaGZsYWcp CiB7CiAgIGRmYWluaXQgKGQpOworICBidWlsZF9tYnJ0b3djX2NhY2hlICgpOwogICBkZmFw YXJzZSAocywgbGVuLCBkKTsKICAgZGZhbXVzdCAoZCk7CiAgIGRmYW9wdGltaXplIChkKTsK LS0gCjEuOS4wCgo= --------------020908070202040906030703-- From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 28 12:05:35 2014 Received: (at 16842) by debbugs.gnu.org; 28 Mar 2014 16:05:35 +0000 Received: from localhost ([127.0.0.1]:55029 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WTZHh-0008L9-M4 for submit@debbugs.gnu.org; Fri, 28 Mar 2014 12:05:34 -0400 Received: from pbsg500.nifty.com ([202.248.238.70]:41628) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WTZHc-0008Kw-H8 for 16842@debbugs.gnu.org; Fri, 28 Mar 2014 12:05:32 -0400 Received: from [10.120.1.56] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) (authenticated) by pbsg500.nifty.com with ESMTP id s2SG5Lvh023497; Sat, 29 Mar 2014 01:05:22 +0900 X-Nifty-SrcIP: [118.21.128.66] Date: Sat, 29 Mar 
2014 01:05:23 +0900
From: Norihiro Tanaka
To: Paul Eggert
Subject: bug#16842: [PATCH] Use mbrtowc_cache in DFA engine
In-Reply-To: <53352477.9050201@cs.ucla.edu>
References: <20140223004626.6B29.27F6AC2D@kcn.ne.jp> <53352477.9050201@cs.ucla.edu>
Message-Id: <20140329010522.062C.27F6AC2D@kcn.ne.jp>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------_53359AA6000000000627_MULTIPART_MIXED_"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.65.07 [ja]
X-Spam-Score: 2.3 (++)
X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details.
	Content preview: Paul, Thanks very much. I checked the patch and added the following fixes to it. 1. Fixed a warning. [...]
	Content analysis details: (2.3 points, 10.0 required)
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
	 2.7 RCVD_IN_PSBL           RBL: Received via a relay in PSBL
	                            [202.248.238.70 listed in psbl.surriel.com]
	-0.0 SPF_HELO_PASS          SPF: HELO matches SPF record
	-0.0 SPF_PASS               SPF: sender matches SPF record
	-0.4 RP_MATCHES_RCVD        Envelope sender domain matches handover relay domain
X-Debbugs-Envelope-To: 16842
Cc: 16842@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"

--------_53359AA6000000000627_MULTIPART_MIXED_
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

Paul,

Thanks very much. I checked the patch and added the following fixes to it.

1. Fixed a warning.

dfa.c: In function 'build_mbrtowc_cache':
dfa.c:448: warning: pointer targets in passing argument 1 of 'mbrtowc' differ in signedness

2. Moved mbrtowc_cache into a new member of struct dfa.

When more than one struct dfa is in use at the same time, their mbrtowc caches may conflict. So the patch makes mbrtowc_cache a new member of struct dfa, so that each struct dfa has its own cache.

Norihiro --------_53359AA6000000000627_MULTIPART_MIXED_ Content-Type: text/plain; charset="US-ASCII" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 RnJvbSA0MWJmZDJmNjZhNDhlZmMwY2RmMWI4NjVjMmNjNGNkYjQ4ZDk4Y2UwIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOb3JpaGlybyBUYW5ha2EgPG5vcml0bmtAa2NuLm5lLmpwPgpE YXRlOiBTYXQsIDI5IE1hciAyMDE0IDAwOjI4OjU2ICswOTAwClN1YmplY3Q6IFtQQVRDSF0gZ3Jl cDogdGFrZSBtYnJ0b3djX2NhY2hlIGludG8gbmV3IG1lbWJlciBvZiBzdHJ1Y3QgZGZhCgpXaGVu IHN0cnVjdCBkZmEgbW9yZSB0aGFuIG9uZSBhcmUgdXNlZCBhdCB0aGUgc2FtZSB0aW1lLCBtYnJ0 b3djIGNhY2hlCm1heSBiZSBjb25mbGljdC4gIFNvLCB0YWtlIG1icnRvd2NfY2FjaGUgaW50byBu ZXcgbWVtYmVyIG9mIHN0cnVjdCBkZmEsCmFuZCBkZWZpbmUgZWFjaCBtYnJ0b3djIGNhY2hlIGZv ciB0aGVtLgoKKiBzcmMvZGZhLmMgKHN0cnVjdCBkZmEpOiBOZXcgbWVtYmVyIGBtYnJ0b3djX2Nh Y2hlJy4KKGRmYW1iY2FjaGUpOiBSZW5hbWUgZnJvbSBidWlsZF9tYnJ0b3djX2NhY2hlLiAgQWRk IGRlcGVuZGVuY3kgb24gc3RydWN0IGRmYS4KKG1ic190b193Y2hhcik6IEFkZCBkZXBlbmRlbmN5 IG9uIHN0cnVjdCBkZmEuCihGRVRDSF9XQyk6IFVzZSBpdC4KKHByZXBhcmVfd2NfYnVmKTogVXNl IGl0LiAgQWRkIGRlcGVuZGVuY3kgb24gc3RydWN0IGRmYS4KKGRmYWNvbXApOiBDYWxsIGl0Lgoo ZGZhZnJlZSk6IFJlbGVhc2UgaXQuCi0tLQogc3JjL2RmYS5jIHwgMTMzICsrKysrKysrKysrKysr KysrKysrKysrKysrKysrKysrKy0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tCiAxIGZpbGUg Y2hhbmdlZCwgNzEgaW5zZXJ0aW9ucygrKSwgNjIgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEv c3JjL2RmYS5jIGIvc3JjL2RmYS5jCmluZGV4IDYyNjA4N2UuLjFjYTdmMzggMTAwNjQ0Ci0tLSBh L3NyYy9kZmEuYworKysgYi9zcmMvZGZhLmMKQEAgLTM3Niw2ICszNzYsMTQgQEAgc3RydWN0IGRm YQogICBzaXplX3Qgbm11bHRpYnl0ZV9wcm9wOwogICBpbnQgKm11bHRpYnl0ZV9wcm9wOwogCisj aWYgTUJTX1NVUFBPUlQKKyAgLyogQSB0YWJsZSBpbmRleGVkIGJ5IGJ5dGUgdmFsdWVzIHRoYXQg Y29udGFpbnMgdGhlIGNvcnJlc3BvbmRpbmcgd2lkZQorICAgICBjaGFyYWN0ZXIgKGlmIGFueSkg Zm9yIHRoYXQgYnl0ZS4gIFdFT0YgbWVhbnMgdGhlIGJ5dGUgaXMgdGhlCisgICAgIGxlYWRpbmcg Ynl0ZSBvZiBhIG11bHRpYnl0ZSBjaGFyYWN0ZXIuICBJbnZhbGlkIGFuZCBudWxsIGJ5dGVzIGFy ZQorICAgICBtYXBwZWQgdG8gdGhlbXNlbHZlcy4gICovCisgIHdpbnRfdCAqbWJydG93Y19jYWNo 
ZTsKKyNlbmRpZgorCiAgIC8qIEFycmF5IG9mIHRoZSBicmFja2V0IGV4cHJlc3Npb24gaW4gdGhl IERGQS4gICovCiAgIHN0cnVjdCBtYl9jaGFyX2NsYXNzZXMgKm1iY3NldHM7CiAgIHNpemVfdCBu bWJjc2V0czsKQEAgLTQzMCw2MiArNDM4LDYgQEAgc3RydWN0IGRmYQogICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICB0aGUgZGZhLiAgKi8KIH07CiAKLS8qIEEgdGFibGUgaW5kZXhl ZCBieSBieXRlIHZhbHVlcyB0aGF0IGNvbnRhaW5zIHRoZSBjb3JyZXNwb25kaW5nIHdpZGUKLSAg IGNoYXJhY3RlciAoaWYgYW55KSBmb3IgdGhhdCBieXRlLiAgV0VPRiBtZWFucyB0aGUgYnl0ZSBp cyB0aGUKLSAgIGxlYWRpbmcgYnl0ZSBvZiBhIG11bHRpYnl0ZSBjaGFyYWN0ZXIuICBJbnZhbGlk IGFuZCBudWxsIGJ5dGVzIGFyZQotICAgbWFwcGVkIHRvIHRoZW1zZWx2ZXMuICAqLwotc3RhdGlj IHdpbnRfdCBtYnJ0b3djX2NhY2hlW05PVENIQVJdOwotCi1zdGF0aWMgdm9pZAotYnVpbGRfbWJy dG93Y19jYWNoZSAodm9pZCkKLXsKLSAgaW50IGk7Ci0gIGZvciAoaSA9IENIQVJfTUlOOyBpIDw9 IENIQVJfTUFYOyArK2kpCi0gICAgewotICAgICAgY2hhciBjID0gaTsKLSAgICAgIHVuc2lnbmVk IGNoYXIgdWMgPSBpOwotICAgICAgbWJzdGF0ZV90IHMgPSB7IDAgfTsKLSAgICAgIHN3aXRjaCAo bWJydG93YyAoJm1icnRvd2NfY2FjaGVbdWNdLCAmYywgMSwgJnMpKQotICAgICAgICB7Ci0gICAg ICAgIGNhc2UgKHNpemVfdCkgLTI6IG1icnRvd2NfY2FjaGVbdWNdID0gV0VPRjsgYnJlYWs7Ci0g ICAgICAgIGNhc2UgKHNpemVfdCkgLTE6IG1icnRvd2NfY2FjaGVbdWNdID0gdWM7IGJyZWFrOwot ICAgICAgICB9Ci0gICAgfQotfQotCi0vKiBTdG9yZSBpbnRvICpQV0MgdGhlIHJlc3VsdCBvZiBj b252ZXJ0aW5nIHRoZSBsZWFkaW5nIGJ5dGVzIG9mIHRoZQotICAgbXVsdGlieXRlIGJ1ZmZlciBT IG9mIGxlbmd0aCBOIGJ5dGVzLCB1cGRhdGluZyB0aGUgY29udmVyc2lvbiBzdGF0ZQotICAgaW4g Kk1CUy4gIE9uIGNvbnZlcnNpb24gZXJyb3IsIGNvbnZlcnQganVzdCBhIHNpbmdsZSBieXRlIGFz LWlzLgotICAgUmV0dXJuIHRoZSBudW1iZXIgb2YgYnl0ZXMgY29udmVydGVkLgotCi0gICBUaGlz IGRpZmZlcnMgZnJvbSBtYnJ0b3djIChQV0MsIFMsIE4sIE1CUykgYXMgZm9sbG93czoKLQotICAg KiBOIG11c3QgYmUgYXQgbGVhc3QgMS4KLSAgICogU1tOIC0gMV0gbXVzdCBiZSBhIHNlbnRpbmVs IGJ5dGUuCi0gICAqIFNoaWZ0IGVuY29kaW5ncyBhcmUgbm90IHN1cHBvcnRlZC4KLSAgICogVGhl IHJldHVybiB2YWx1ZSBpcyBhbHdheXMgaW4gdGhlIHJhbmdlIDEuLk4uCi0gICAqICpNQlMgaXMg YWx3YXlzIHZhbGlkIGFmdGVyd2FyZHMuCi0gICAqICpQV0MgaXMgYWx3YXlzIHNldCB0byBzb21l 
dGhpbmcuCi0gICAqIFRoaXMgdXNlcyBtYnJ0b3djX2NhY2hlIGZvciBzcGVlZCBpbiB0aGUgdHlw aWNhbCBjYXNlLiAgKi8KLXN0YXRpYyBzaXplX3QKLW1ic190b193Y2hhciAod2NoYXJfdCAqcHdj LCBjaGFyIGNvbnN0ICpzLCBzaXplX3QgbiwgbWJzdGF0ZV90ICptYnMpCi17Ci0gIHVuc2lnbmVk IGNoYXIgdWMgPSBzWzBdOwotICB3aW50X3Qgd2MgPSBtYnJ0b3djX2NhY2hlW3VjXTsKLQotICBp ZiAod2MgPT0gV0VPRikKLSAgICB7Ci0gICAgICBzaXplX3QgbmJ5dGVzID0gbWJydG93YyAocHdj LCBzLCBuLCBtYnMpOwotICAgICAgaWYgKDAgPCBuYnl0ZXMgJiYgbmJ5dGVzIDwgKHNpemVfdCkg LTIpCi0gICAgICAgIHJldHVybiBuYnl0ZXM7Ci0gICAgICBtZW1zZXQgKG1icywgMCwgc2l6ZW9m ICptYnMpOwotICAgICAgd2MgPSB1YzsKLSAgICB9Ci0KLSAgKnB3YyA9IHdjOwotICByZXR1cm4g MTsKLX0KLQogLyogU29tZSBtYWNyb3MgZm9yIHVzZXIgYWNjZXNzIHRvIGRmYSBpbnRlcm5hbHMu ICAqLwogCiAvKiBBQ0NFUFRJTkcgcmV0dXJucyB0cnVlIGlmIHMgY291bGQgcG9zc2libHkgYmUg YW4gYWNjZXB0aW5nIHN0YXRlIG9mIHIuICAqLwpAQCAtNTMzLDYgKzQ4NSw2MCBAQCBzdGF0aWMg dm9pZCByZWdleHAgKHZvaWQpOwogICAgIH0JCQkJCQkJCVwKICAgd2hpbGUgKGZhbHNlKQogCitz dGF0aWMgdm9pZAorZGZhbWJjYWNoZSAoc3RydWN0IGRmYSAqZCkKK3sKKyNpZiBNQlNfU1VQUE9S VAorICBpbnQgaTsKKyAgTUFMTE9DIChkLT5tYnJ0b3djX2NhY2hlLCBOT1RDSEFSKTsKKyAgZm9y IChpID0gQ0hBUl9NSU47IGkgPD0gQ0hBUl9NQVg7ICsraSkKKyAgICB7CisgICAgICBjaGFyIGMg PSBpOworICAgICAgdW5zaWduZWQgY2hhciB1YyA9IGk7CisgICAgICBtYnN0YXRlX3QgcyA9IHsg MCB9OworICAgICAgc3dpdGNoIChtYnJ0b3djICgod2NoYXJfdCAqKSAmZC0+bWJydG93Y19jYWNo ZVt1Y10sICZjLCAxLCAmcykpCisgICAgICAgIHsKKyAgICAgICAgY2FzZSAoc2l6ZV90KSAtMjog ZC0+bWJydG93Y19jYWNoZVt1Y10gPSBXRU9GOyBicmVhazsKKyAgICAgICAgY2FzZSAoc2l6ZV90 KSAtMTogZC0+bWJydG93Y19jYWNoZVt1Y10gPSB1YzsgYnJlYWs7CisgICAgICAgIH0KKyAgICB9 CisjZW5kaWYKK30KKworI2lmIE1CU19TVVBQT1JUCisvKiBTdG9yZSBpbnRvICpQV0MgdGhlIHJl c3VsdCBvZiBjb252ZXJ0aW5nIHRoZSBsZWFkaW5nIGJ5dGVzIG9mIHRoZQorICAgbXVsdGlieXRl IGJ1ZmZlciBTIG9mIGxlbmd0aCBOIGJ5dGVzLCB1cGRhdGluZyB0aGUgY29udmVyc2lvbiBzdGF0 ZQorICAgaW4gKk1CUy4gIE9uIGNvbnZlcnNpb24gZXJyb3IsIGNvbnZlcnQganVzdCBhIHNpbmds ZSBieXRlIGFzLWlzLgorICAgUmV0dXJuIHRoZSBudW1iZXIgb2YgYnl0ZXMgY29udmVydGVkLgor 
CisgICBUaGlzIGRpZmZlcnMgZnJvbSBtYnJ0b3djIChQV0MsIFMsIE4sIE1CUykgYXMgZm9sbG93 czoKKworICAgKiBOIG11c3QgYmUgYXQgbGVhc3QgMS4KKyAgICogU1tOIC0gMV0gbXVzdCBiZSBh IHNlbnRpbmVsIGJ5dGUuCisgICAqIFNoaWZ0IGVuY29kaW5ncyBhcmUgbm90IHN1cHBvcnRlZC4K KyAgICogVGhlIHJldHVybiB2YWx1ZSBpcyBhbHdheXMgaW4gdGhlIHJhbmdlIDEuLk4uCisgICAq ICpNQlMgaXMgYWx3YXlzIHZhbGlkIGFmdGVyd2FyZHMuCisgICAqICpQV0MgaXMgYWx3YXlzIHNl dCB0byBzb21ldGhpbmcuCisgICAqIFRoaXMgdXNlcyBtYnJ0b3djX2NhY2hlIGZvciBzcGVlZCBp biB0aGUgdHlwaWNhbCBjYXNlLiAgKi8KK3N0YXRpYyBzaXplX3QKK21ic190b193Y2hhciAoc3Ry dWN0IGRmYSAqZCwgd2NoYXJfdCAqcHdjLCBjaGFyIGNvbnN0ICpzLCBzaXplX3QgbiwgbWJzdGF0 ZV90ICptYnMpCit7CisgIHVuc2lnbmVkIGNoYXIgdWMgPSBzWzBdOworICB3aW50X3Qgd2MgPSBk LT5tYnJ0b3djX2NhY2hlW3VjXTsKKworICBpZiAod2MgPT0gV0VPRikKKyAgICB7CisgICAgICBz aXplX3QgbmJ5dGVzID0gbWJydG93YyAocHdjLCBzLCBuLCBtYnMpOworICAgICAgaWYgKDAgPCBu Ynl0ZXMgJiYgbmJ5dGVzIDwgKHNpemVfdCkgLTIpCisgICAgICAgIHJldHVybiBuYnl0ZXM7Cisg ICAgICBtZW1zZXQgKG1icywgMCwgc2l6ZW9mICptYnMpOworICAgICAgd2MgPSB1YzsKKyAgICB9 CisKKyAgKnB3YyA9IHdjOworICByZXR1cm4gMTsKK30KKyNlbmRpZgogCiAjaWZkZWYgREVCVUcK IApAQCAtOTAwLDcgKzkwNiw3IEBAIHN0YXRpYyB1bnNpZ25lZCBjaGFyIGNvbnN0ICpidWZfZW5k OyAgICAvKiByZWZlcmVuY2UgdG8gZW5kIGluIGRmYWV4ZWMuICAqLwogICAgIGVsc2UJCQkJCVwK ICAgICAgIHsJCQkJCQlcCiAgICAgICAgIHdjaGFyX3QgX3djOwkJCQlcCi0gICAgICAgIHNpemVf dCBuYnl0ZXMgPSBtYnNfdG9fd2NoYXIgKCZfd2MsIGxleHB0ciwgbGV4bGVmdCwgJm1icyk7IFwK KyAgICAgICAgc2l6ZV90IG5ieXRlcyA9IG1ic190b193Y2hhciAoZGZhLCAmX3djLCBsZXhwdHIs IGxleGxlZnQsICZtYnMpOyBcCiAgICAgICAgIGN1cl9tYl9sZW4gPSBuYnl0ZXM7CQkJXAogICAg ICAgICAod2MpID0gX3djOwkJCQlcCiAgICAgICAgIChjKSA9IG5ieXRlcyA9PSAxID8gdG9fdWNo YXIgKCpsZXhwdHIpIDogRU9GOyAgICBcCkBAIC0zMzUzLDcgKzMzNTksNyBAQCB0cmFuc2l0X3N0 YXRlIChzdHJ1Y3QgZGZhICpkLCBzdGF0ZV9udW0gcywgdW5zaWduZWQgY2hhciBjb25zdCAqKnBw KQogLyogSW5pdGlhbGl6ZSBtYmxlbl9idWYgYW5kIGlucHV0d2NzIHdpdGggZGF0YSBmcm9tIHRo ZSBuZXh0IGxpbmUuICAqLwogCiBzdGF0aWMgdm9pZAotcHJlcGFyZV93Y19idWYgKGNvbnN0IGNo 
YXIgKmJlZ2luLCBjb25zdCBjaGFyICplbmQpCitwcmVwYXJlX3djX2J1ZiAoc3RydWN0IGRmYSAq ZCwgY29uc3QgY2hhciAqYmVnaW4sIGNvbnN0IGNoYXIgKmVuZCkKIHsKICNpZiBNQlNfU1VQUE9S VAogICB1bnNpZ25lZCBjaGFyIGVvbCA9IGVvbGJ5dGU7CkBAIC0zMzY0LDcgKzMzNzAsNyBAQCBw cmVwYXJlX3djX2J1ZiAoY29uc3QgY2hhciAqYmVnaW4sIGNvbnN0IGNoYXIgKmVuZCkKIAogICBm b3IgKGkgPSAwOyBpIDwgaWxpbTsgaSsrKQogICAgIHsKLSAgICAgIHNpemVfdCBuYnl0ZXMgPSBt YnNfdG9fd2NoYXIgKGlucHV0d2NzICsgaSwgYmVnaW4gKyBpLCBpbGltIC0gaSwgJm1icyk7Cisg ICAgICBzaXplX3QgbmJ5dGVzID0gbWJzX3RvX3djaGFyIChkLCBpbnB1dHdjcyArIGksIGJlZ2lu ICsgaSwgaWxpbSAtIGksICZtYnMpOwogICAgICAgbWJsZW5fYnVmW2ldID0gbmJ5dGVzIC0gKG5i eXRlcyA9PSAxKTsKICAgICAgIGlmIChiZWdpbltpXSA9PSBlb2wpCiAgICAgICAgIGJyZWFrOwpA QCAtMzQxOSw3ICszNDI1LDcgQEAgZGZhZXhlYyAoc3RydWN0IGRmYSAqZCwgY2hhciBjb25zdCAq YmVnaW4sIGNoYXIgKmVuZCwKICAgICAgIE1BTExPQyAobWJsZW5fYnVmLCBlbmQgLSBiZWdpbiAr IDIpOwogICAgICAgTUFMTE9DIChpbnB1dHdjcywgZW5kIC0gYmVnaW4gKyAyKTsKICAgICAgIG1l bXNldCAoJm1icywgMCwgc2l6ZW9mIChtYnN0YXRlX3QpKTsKLSAgICAgIHByZXBhcmVfd2NfYnVm ICgoY29uc3QgY2hhciAqKSBwLCBlbmQpOworICAgICAgcHJlcGFyZV93Y19idWYgKGQsIChjb25z dCBjaGFyICopIHAsIGVuZCk7CiAgICAgfQogCiAgIGZvciAoOzspCkBAIC0zNTA5LDcgKzM1MTUs NyBAQCBkZmFleGVjIChzdHJ1Y3QgZGZhICpkLCBjaGFyIGNvbnN0ICpiZWdpbiwgY2hhciAqZW5k LAogICAgICAgICAgICAgKysqY291bnQ7CiAKICAgICAgICAgICBpZiAoZC0+bWJfY3VyX21heCA+ IDEpCi0gICAgICAgICAgICBwcmVwYXJlX3djX2J1ZiAoKGNvbnN0IGNoYXIgKikgcCwgZW5kKTsK KyAgICAgICAgICAgIHByZXBhcmVfd2NfYnVmIChkLCAoY29uc3QgY2hhciAqKSBwLCBlbmQpOwog ICAgICAgICB9CiAKICAgICAgIC8qIENoZWNrIGlmIHdlJ3ZlIHJ1biBvZmYgdGhlIGVuZCBvZiB0 aGUgYnVmZmVyLiAgKi8KQEAgLTM2MjgsNyArMzYzNCw3IEBAIHZvaWQKIGRmYWNvbXAgKGNoYXIg Y29uc3QgKnMsIHNpemVfdCBsZW4sIHN0cnVjdCBkZmEgKmQsIGludCBzZWFyY2hmbGFnKQogewog ICBkZmFpbml0IChkKTsKLSAgYnVpbGRfbWJydG93Y19jYWNoZSAoKTsKKyAgZGZhbWJjYWNoZSAo ZCk7CiAgIGRmYXBhcnNlIChzLCBsZW4sIGQpOwogICBkZmFtdXN0IChkKTsKICAgZGZhb3B0aW1p emUgKGQpOwpAQCAtMzY0Nyw2ICszNjUzLDkgQEAgZGZhZnJlZSAoc3RydWN0IGRmYSAqZCkKIAog 
ICBpZiAoZC0+bWJfY3VyX21heCA+IDEpCiAgICAgZnJlZV9tYmRhdGEgKGQpOworI2lmIE1CU19T VVBQT1JUCisgIGZyZWUgKGQtPm1icnRvd2NfY2FjaGUpOworI2VuZGlmCiAKICAgZm9yIChpID0g MDsgaSA8IGQtPnNpbmRleDsgKytpKQogICAgIHsKLS0gCjEuOS4xCgo= --------_53359AA6000000000627_MULTIPART_MIXED_-- From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 28 12:36:24 2014 Received: (at 16842-done) by debbugs.gnu.org; 28 Mar 2014 16:36:24 +0000 Received: from localhost ([127.0.0.1]:55045 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WTZlX-0000gr-Jq for submit@debbugs.gnu.org; Fri, 28 Mar 2014 12:36:24 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:58904) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WTZlU-0000gg-D0 for 16842-done@debbugs.gnu.org; Fri, 28 Mar 2014 12:36:21 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 3E8C239E801C; Fri, 28 Mar 2014 09:36:19 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id sxVdqdgeWbJh; Fri, 28 Mar 2014 09:36:18 -0700 (PDT) Received: from penguin.cs.ucla.edu (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id A275439E801A; Fri, 28 Mar 2014 09:36:18 -0700 (PDT) Message-ID: <5335A4FB.2000302@cs.ucla.edu> Date: Fri, 28 Mar 2014 09:36:11 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: Norihiro Tanaka Subject: Re: bug#16842: [PATCH] Use mbrtowc_cache in DFA engine References: <20140223004626.6B29.27F6AC2D@kcn.ne.jp> <53352477.9050201@cs.ucla.edu> <20140329010522.062C.27F6AC2D@kcn.ne.jp> In-Reply-To: <20140329010522.062C.27F6AC2D@kcn.ne.jp> Content-Type: multipart/mixed; boundary="------------090000000007070106020002" X-Spam-Score: -2.7 (--) 
X-Debbugs-Envelope-To: 16842-done
Cc: 16842-done@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -2.7 (--)

This is a multi-part message in MIME format.
--------------090000000007070106020002
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Thanks for the review and the fixes.  I found a couple more things.
First, it's not portable to cast wint_t * to wchar_t *, since the
pointed-to types might be different sizes or representations.  Second,
we can put the cache directly in the struct dfa, saving the overhead of
doing a separate malloc.  The attached further patch should address
these problems.

I pushed this, along with the earlier two patches in this sequence, and
am marking this as done.

--------------090000000007070106020002
Content-Type: text/x-patch;
 name="0003-dfa-avoid-an-indirection-and-port-wint_t-usage.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0003-dfa-avoid-an-indirection-and-port-wint_t-usage.patch"

>From c51be43c00c8ae05a1a19d503865ef5f97ea6612 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Fri, 28 Mar 2014 09:32:29 -0700
Subject: [PATCH 3/3] dfa: avoid an indirection and port wint_t usage

* src/dfa.c (struct dfa): Put mbrtowc_cache directly into struct dfa
rather than having a pointer; this saves a malloc and an indirection.
All uses changed.
(dfambcache): Port to hosts where wint_t * can't be cast to wchar_t *.
---
 src/dfa.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/src/dfa.c b/src/dfa.c
index 1ca7f38..4ed2189 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -381,7 +381,7 @@ struct dfa
      character (if any) for that byte.  WEOF means the byte is the
      leading byte of a multibyte character.  Invalid and null bytes are
      mapped to themselves.  */
-  wint_t *mbrtowc_cache;
+  wint_t mbrtowc_cache[NOTCHAR];
 #endif
 
   /* Array of the bracket expression in the DFA.  */
@@ -490,38 +490,42 @@ dfambcache (struct dfa *d)
 {
 #if MBS_SUPPORT
   int i;
-  MALLOC (d->mbrtowc_cache, NOTCHAR);
   for (i = CHAR_MIN; i <= CHAR_MAX; ++i)
     {
       char c = i;
       unsigned char uc = i;
       mbstate_t s = { 0 };
-      switch (mbrtowc ((wchar_t *) &d->mbrtowc_cache[uc], &c, 1, &s))
+      wchar_t wc;
+      wint_t wi;
+      switch (mbrtowc (&wc, &c, 1, &s))
         {
-        case (size_t) -2: d->mbrtowc_cache[uc] = WEOF; break;
-        case (size_t) -1: d->mbrtowc_cache[uc] = uc; break;
+        default: wi = wc; break;
+        case (size_t) -2: wi = WEOF; break;
+        case (size_t) -1: wi = uc; break;
         }
+      d->mbrtowc_cache[uc] = wi;
     }
 #endif
 }
 
 #if MBS_SUPPORT
-/* Store into *PWC the result of converting the leading bytes of the
-   multibyte buffer S of length N bytes, updating the conversion state
-   in *MBS.  On conversion error, convert just a single byte as-is.
-   Return the number of bytes converted.
+/* Given the dfa D, store into *PWC the result of converting the
+   leading bytes of the multibyte buffer S of length N bytes, updating
+   the conversion state in *MBS.  On conversion error, convert just a
+   single byte as-is.  Return the number of bytes converted.
 
    This differs from mbrtowc (PWC, S, N, MBS) as follows:
 
+   * Extra arg D, containing an mbrtowc_cache for speed.
    * N must be at least 1.
    * S[N - 1] must be a sentinel byte.
    * Shift encodings are not supported.
    * The return value is always in the range 1..N.
    * *MBS is always valid afterwards.
-   * *PWC is always set to something.
-   * This uses mbrtowc_cache for speed in the typical case.  */
+   * *PWC is always set to something.  */
 static size_t
-mbs_to_wchar (struct dfa *d, wchar_t *pwc, char const *s, size_t n, mbstate_t *mbs)
+mbs_to_wchar (struct dfa *d, wchar_t *pwc, char const *s, size_t n,
+              mbstate_t *mbs)
 {
   unsigned char uc = s[0];
   wint_t wc = d->mbrtowc_cache[uc];
@@ -3653,9 +3657,6 @@ dfafree (struct dfa *d)
 
   if (d->mb_cur_max > 1)
     free_mbdata (d);
-#if MBS_SUPPORT
-  free (d->mbrtowc_cache);
-#endif
 
   for (i = 0; i < d->sindex; ++i)
     {
-- 
1.9.0

--------------090000000007070106020002--

From unknown Tue Jun 17 20:18:21 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 26 Apr 2014 11:24:03 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
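
Note on the thread above: the two ideas it settles on (one byte-to-wide-character
cache per matcher object rather than one global table, and filling that cache
through a wchar_t temporary instead of casting wint_t * to wchar_t *) can be
sketched in isolation as follows.  This is only a minimal illustration and not
the code that was pushed to grep.  The names struct matcher, NBYTES,
matcher_build_cache and matcher_to_wchar are hypothetical stand-ins for
struct dfa, NOTCHAR, dfambcache and mbs_to_wchar.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for NOTCHAR: one slot per byte value.  */
enum { NBYTES = UCHAR_MAX + 1 };

/* Hypothetical, cut-down stand-in for struct dfa.  The cache is an
   array embedded in the object, so each matcher has its own copy and
   no separate allocation is needed.  */
struct matcher
{
  wint_t cache[NBYTES];
};

/* Fill M->cache for the current locale.  WEOF marks a byte that
   starts a multibyte character; invalid bytes map to themselves.  */
static void
matcher_build_cache (struct matcher *m)
{
  for (int i = CHAR_MIN; i <= CHAR_MAX; ++i)
    {
      char c = (char) i;
      unsigned char uc = (unsigned char) i;
      mbstate_t s;
      wchar_t wc;   /* mbrtowc writes here, never into a wint_t slot */
      wint_t wi;
      memset (&s, 0, sizeof s);
      switch (mbrtowc (&wc, &c, 1, &s))
        {
        case (size_t) -2: wi = WEOF; break;  /* incomplete sequence */
        case (size_t) -1: wi = uc; break;    /* invalid byte */
        default: wi = wc; break;             /* complete character */
        }
      m->cache[uc] = wi;
    }
}

/* Convert the leading character of S (N >= 1 bytes) into *PWC using
   M's cache, falling back to mbrtowc for multibyte lead bytes.  On a
   conversion error, reset *MBS and convert one byte as-is.  Return
   the number of bytes consumed.  */
static size_t
matcher_to_wchar (struct matcher *m, wchar_t *pwc, char const *s, size_t n,
                  mbstate_t *mbs)
{
  unsigned char uc = (unsigned char) s[0];
  wint_t wc = m->cache[uc];

  assert (n >= 1);
  if (wc == WEOF)
    {
      size_t nbytes = mbrtowc (pwc, s, n, mbs);
      if (0 < nbytes && nbytes < (size_t) -2)
        return nbytes;
      memset (mbs, 0, sizeof *mbs);
      wc = uc;
    }

  *pwc = (wchar_t) wc;
  return 1;
}
```

A caller would declare one struct matcher per compiled pattern, call
matcher_build_cache once after the locale is set, and then call
matcher_to_wchar in the scanning loop.  Because the table lives inside the
object, two matchers cannot overwrite each other's entries.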