From unknown Sat Jun 14 18:41:44 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#44861 <44861@debbugs.gnu.org> To: bug#44861 <44861@debbugs.gnu.org> Subject: Status: 27.1; [PATCH] signal in `replace-regexp-in-string' Reply-To: bug#44861 <44861@debbugs.gnu.org> Date: Sun, 15 Jun 2025 01:41:44 +0000 retitle 44861 27.1; [PATCH] signal in `replace-regexp-in-string' reassign 44861 emacs submitter 44861 Shigeru Fukaya severity 44861 normal tag 44861 patch confirmed thanks From debbugs-submit-bounces@debbugs.gnu.org Tue Nov 24 23:02:21 2020 Received: (at submit) by debbugs.gnu.org; 25 Nov 2020 04:02:22 +0000 Received: from localhost ([127.0.0.1]:33794 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1khm0b-0002WL-Fy for submit@debbugs.gnu.org; Tue, 24 Nov 2020 23:02:21 -0500 Received: from lists.gnu.org ([209.51.188.17]:42890) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1khm0Z-0002WE-Oj for submit@debbugs.gnu.org; Tue, 24 Nov 2020 23:02:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:35120) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1khm0Z-0003uv-He for bug-gnu-emacs@gnu.org; Tue, 24 Nov 2020 23:02:19 -0500 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]:44987) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1khm0W-0004Cn-St for bug-gnu-emacs@gnu.org; Tue, 24 Nov 2020 23:02:19 -0500 Received: by mail-pl1-x636.google.com with SMTP id b23so436945pls.11 for ; Tue, 24 Nov 2020 20:02:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:mime-version:message-id; bh=x0V4W/hNEI87AsaRR5nNH12k7pbfEZwvK5yLli2w26E=; b=c+W9eOFtOgK1eq4d6SRTjJqXmDLUH+ERXshmFpcsAUHN87yMRXK3w0eh5kGTDGfcjK 
Z1m5e1N8G1LcmdqcIpKaGkvq7ULQeEqUFRTa24ynH3MIWWuOyXdXJAnIXwMx3KBiu17C QLUprBWEYWtnLkjWSgqNrhWTyA2KvscWAmhBmCFyiQRlNq8qjv+xB15RRS3hAlOHAK+c SXoRMgov9TB4PjmqFkcL/CZSFYmypEw2toEcbOZGo5U9B29sdPDt/0/0XgsUsTl8d3mN tnmepiVwy48Am9AOu6g0Z0xEqomMhUf26gejfhDU8wk+XFCijIZUWls/GUQsNFvOFCwx da1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:mime-version:message-id; bh=x0V4W/hNEI87AsaRR5nNH12k7pbfEZwvK5yLli2w26E=; b=Wap3OWKSTNheNk2BZj8xZXCceQJYz0t6CeCH17HR7QLMpwbK8lGPbYnSgA7yY5H0hC dZtok4otndBr97EPFJFAioVz6GVqX7TwoB9EtuZTyTDLGp7TbIE+A01wAs0mmMlDSZz8 GtTT6Lf7S0k3EvJCzmD906zsr1ztKjZXcAuh0t7x+WJ/w9li+CzHKXwm1hG24G+rCchW bOwR6yG2jhDCy+w7/K6v400dpO5iVexMw9BeCHcEEVbldA1AcsY1ufpg6HlKWvuYOqpe M97DHLvxOrY1Q8C4SCyVPWrNPOAj/SwYuYdWvyQBA/k1RYtp7psC5PkEebGON3mVkTCq rijA== X-Gm-Message-State: AOAM532fSsD9HCbkrPWxZkaCvloxuaQDLzgq8gE6YVH/Pn038LG1k9aP Va7ZpF/5sifvowyO8cSkDbSkhgopPu8= X-Google-Smtp-Source: ABdhPJyTUhcMAGrqWn6GyZ2XfdfsRCMpeCA/UvIVlLzJqoKYNU7RMcbySgw2VL1FzhELvLNTFJcmhA== X-Received: by 2002:a17:902:8bcb:b029:d9:d765:d7f3 with SMTP id r11-20020a1709028bcbb02900d9d765d7f3mr1458834plo.69.1606276934907; Tue, 24 Nov 2020 20:02:14 -0800 (PST) Received: from gmail.com (softbank126177221059.bbtec.net. 
[126.177.221.59]) by smtp.gmail.com with ESMTPSA id 204sm454163pfy.59.2020.11.24.20.02.12 for (version=TLS1 cipher=ECDHE-ECDSA-AES128-SHA bits=128/128); Tue, 24 Nov 2020 20:02:14 -0800 (PST) From: Shigeru Fukaya To: bug-gnu-emacs@gnu.org Subject: 27.1; [PATCH] signal in `replace-regexp-in-string' Date: Wed, 25 Nov 2020 13:02:11 +0900 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="Boundary-bwbrSPTtCXXORHdVPuDtb" X-Mailer: HidemaruMail 6.75 (WinNT,A00) Message-Id: <4ED6C2DFC045A8C31D0B86@gmail.com> Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=shigeru.fukaya@gmail.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --Boundary-bwbrSPTtCXXORHdVPuDtb Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit

`replace-regexp-in-string' sometimes signals an error when REGEXP contains a boundary construct such as `\B'. The error occurs because the search against the original string and the search against the extracted substring can give different results.
(replace-regexp-in-string-simple "a\\B" "A" "a aaaa") error --> cons: Args out of range: 2, 3 expected ==> "a AAAa" (replace-regexp-in-string "\\Ba" "A" "a aaaa") error --> cons: Args out of range: 3, 4 expected ==> "a aAAA" -- Shigeru --Boundary-bwbrSPTtCXXORHdVPuDtb Content-Type: application/octet-stream; name="subr.diff" Content-Disposition: attachment; filename="subr.diff" Content-Transfer-Encoding: base64 ZGlmZiAtdSAvdXNyL2xvY2FsL3NoYXJlL2VtYWNzLzI3LjEvbGlzcC9zdWJyLmVsIC91c3Iv bG9jYWwvc2hhcmUvZW1hY3MvMjcuMS9maXgvc3Vici1yZXBsYWNlUmVnZXhwSW5TdHJpbmcu ZWwKLS0tIC91c3IvbG9jYWwvc2hhcmUvZW1hY3MvMjcuMS9saXNwL3N1YnIuZWwJMjAyMC0w Ny0zMCAwNjo0MDo0MS4wMDAwMDAwMDAgKzA5MDAKKysrIC91c3IvbG9jYWwvc2hhcmUvZW1h Y3MvMjcuMS9maXgvc3Vici1yZXBsYWNlUmVnZXhwSW5TdHJpbmcuZWwJMjAyMC0xMS0yNSAx MjoyNTo1My45NTQ1MTU1MDAgKzA5MDAKQEAgLTQ0MjYsOSArNDQyNiw5IEBACiAJOzsgR2Vu ZXJhdGUgYSByZXBsYWNlbWVudCBmb3IgdGhlIG1hdGNoZWQgc3Vic3RyaW5nLgogCTs7IE9w ZXJhdGUgb24gb25seSB0aGUgc3Vic3RyaW5nIHRvIG1pbmltaXplIHN0cmluZyBjb25zaW5n LgogCTs7IFNldCB1cCBtYXRjaCBkYXRhIGZvciB0aGUgc3Vic3RyaW5nIGZvciByZXBsYWNl bWVudDsKLQk7OyBwcmVzdW1hYmx5IHRoaXMgaXMgbGlrZWx5IHRvIGJlIGZhc3RlciB0aGFu IG11bmdpbmcgdGhlCi0JOzsgbWF0Y2ggZGF0YSBkaXJlY3RseSBpbiBMaXNwLgotCShzdHJp bmctbWF0Y2ggcmVnZXhwIChzZXRxIHN0ciAoc3Vic3RyaW5nIHN0cmluZyBtYiBtZSkpKQor CTs7IHJlc2V0IG1hdGNoLWRhdGEgZm9yIHN1YnN0cmluZyBieSBzdWJ0cmFjdGluZyBvZmZz ZXRzLgorCShzZXQtbWF0Y2gtZGF0YSAobWFwY2FyIChsYW1iZGEgKG4pIChhbmQgbiAoLSBu IG1iKSkpIChtYXRjaC1kYXRhKSkpCisJKHNldHEgc3RyIChzdWJzdHJpbmcgc3RyaW5nIG1i IG1lKSkKIAkoc2V0cSBtYXRjaGVzCiAJICAgICAgKGNvbnMgKHJlcGxhY2UtbWF0Y2ggKGlm IChzdHJpbmdwIHJlcCkKIAkJCQkgICAgICAgcmVwCgpEaWZmIGZpbmlzaGVkLiAgV2VkIE5v diAyNSAxMjoyOToyOCAyMDIwCg== --Boundary-bwbrSPTtCXXORHdVPuDtb-- From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 25 05:58:20 2020 Received: (at 44861) by debbugs.gnu.org; 25 Nov 2020 10:58:20 +0000 Received: from localhost ([127.0.0.1]:34553 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 
1khsVA-0006fs-J5 for submit@debbugs.gnu.org; Wed, 25 Nov 2020 05:58:20 -0500 Received: from mail1440c50.megamailservers.eu ([91.136.14.40]:57280 helo=mail264c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1khsV8-0006fd-9O for 44861@debbugs.gnu.org; Wed, 25 Nov 2020 05:58:19 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1606301891; bh=hrv+pdoZ/rz2pHbGV4u+8+vJBSaN9a39dwQ2evQHc/o=; h=From:Subject:Date:Cc:To:From; b=qk0AKJSvc1Wu5dZ9aKWR58t+zJloQS56i/Zf+iXlTRAom9iKkAIkatIGpc8+E8YCm hrqlam7jhrSQbJ96VM/KCbuCargq5xjikUC6jEl8xRTnyUFi9LbJrRsQtsFknusOTu 7EDp0OUPYqHTHzSPwUUfsPirsuFWoqEivVq/VuAQ= Feedback-ID: mattiase@acm.or Received: from stanniol.lan (c-064ae655.032-75-73746f71.bbcust.telenor.se [85.230.74.6]) (authenticated bits=0) by mail264c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0APAw9Fl004266; Wed, 25 Nov 2020 10:58:10 +0000 From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' Message-Id: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> Date: Wed, 25 Nov 2020 11:58:08 +0100 To: Shigeru Fukaya X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F18.5FBE38C3.001A, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=FoV7AFjq c=1 sm=1 tr=0 a=Ni+dBsiEfW2GqKMPYZim9A==:117 a=Ni+dBsiEfW2GqKMPYZim9A==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=fMa-jFpCfTFcEY-L9v4A:9 a=CjuIK1q_8ugA:10 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. 
The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Thank you, and I agree, we probably want to do something like your suggested patch. We would need to write a test suite, of course, but it looks like the general approach would solve bug#15107 as well [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: megamailservers.eu] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 44861 Cc: 44861@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

Thank you, and I agree, we probably want to do something like your suggested patch.

We would need to write a test suite, of course, but it looks like the general approach would solve bug#15107 as well, which looks like the same bug. Some benchmarking would also be in order.
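A rough way to do such a benchmark is sketched below. It assumes the standard `benchmark-run' macro from benchmark.el; the function name `demo-time-replace' and the sample input are made up for illustration and come from neither patch. It times one pattern without a submatch and one with a submatch on the same input, which are the two cases where a fix could plausibly differ in cost.

```elisp
;; Rough timing sketch.  `demo-time-replace' is a hypothetical name,
;; not an Emacs function.
(require 'benchmark)

(defun demo-time-replace (regexp rep string)
  "Return seconds taken by 1000 replacements of REGEXP by REP in STRING."
  (car (benchmark-run 1000
         (replace-regexp-in-string regexp rep string))))

;; Compare a pattern without a submatch against one with a submatch.
(let ((input (make-string 200 ?a)))
  (list (demo-time-replace "aa" "b" input)
        (demo-time-replace "\\(a\\)a" "\\1b" input)))
```

The absolute numbers depend on the machine and on garbage collection, so only the ratio between two builds of Emacs on the same input is meaningful.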
From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 25 09:58:33 2020 Received: (at 44861) by debbugs.gnu.org; 25 Nov 2020 14:58:33 +0000 Received: from localhost ([127.0.0.1]:36648 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1khwFc-0006gD-Vu for submit@debbugs.gnu.org; Wed, 25 Nov 2020 09:58:33 -0500 Received: from mail33c50.megamailservers.eu ([91.136.10.43]:38316) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1khwFX-0006fr-2u; Wed, 25 Nov 2020 09:58:28 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1606316305; bh=KgQ7vsrvrRieaUxw0hGKexXZ5ZQtp+ZgIogmG8OeuEw=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=coRSS68n7BPlmU+oT4ltbhydtaCQ3KXPgtWdHIUivYHz/rG8soVoaYwaYiITyYnaN wXVHk6atVA+qfv7TTiHWAIH4k8QMEl/nG61VGMVHGkX0jnaIq96mdFxoOCp8ASwFdF a0UGWvbvNiBoNviww8VgDW03T56PxGrVk0LDle2c= Feedback-ID: mattiase@acm.or Received: from stanniol.lan (c-064ae655.032-75-73746f71.bbcust.telenor.se [85.230.74.6]) (authenticated bits=0) by mail33c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0APEwNbf004969; Wed, 25 Nov 2020 14:58:24 +0000 Content-Type: multipart/mixed; boundary="Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> Date: Wed, 25 Nov 2020 15:58:22 +0100 Message-Id: <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> To: Shigeru Fukaya X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F23.5FBE7111.004D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=C6KXNjH+ c=1 sm=1 
tr=0 a=Ni+dBsiEfW2GqKMPYZim9A==:117 a=Ni+dBsiEfW2GqKMPYZim9A==:17 a=M51BFTxLslgA:10 a=4GVXjkHO3qP_L8fHQkQA:9 a=CjuIK1q_8ugA:10 a=neCVTB8LpOVk-CJV-C0A:9 a=B2y7HmGcmWMA:10 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 44861 Cc: 44861@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

forcemerge 15107 44861
stop

Suggested patch attached. A small test suite for replace-regexp-in-string has already been pushed to master -- very rudimentary, but better than nothing -- and the patch amends it with some new relevant cases that didn't work before.

It is basically your patch but slightly optimised; it turned out that the function call and allocation overhead of the original patch made it a tad too expensive (a pity, because it was very neat). Now performance is about the same as before when the pattern contains no submatches, and slightly above (< 10% slower) with one submatch. It seems worth the correctness.
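To illustrate the idea of translating the match data instead of matching a second time, here is a minimal sketch. It is not the attached patch: the names `demo-shift-match-data' and `demo-replace-first' are hypothetical, and it replaces only the first match. With "a\\B" on "a aaaa" the first match is at positions 2 to 3; matching the same regexp again on the one-character substring "a" finds nothing, because `\B' cannot match at the end of a string, so the old match data would be left pointing outside the substring.

```elisp
;; Hypothetical helpers for illustration; not part of Emacs.

(defun demo-shift-match-data (offset)
  "Subtract OFFSET from every position in the current string match data."
  (let* ((md (match-data t))
         (m md))
    ;; Entries for unmatched groups are nil and must stay nil.
    (while m
      (when (car m)
        (setcar m (- (car m) offset)))
      (setq m (cdr m)))
    (set-match-data md)))

(defun demo-replace-first (regexp rep string)
  "Return STRING with its first match of REGEXP replaced by REP."
  (save-match-data
    (if (not (string-match regexp string))
        string
      (let ((mb (match-beginning 0))
            (me (match-end 0)))
        ;; Make the match data describe the substring, rather than
        ;; matching REGEXP against the substring a second time.
        (demo-shift-match-data mb)
        (concat (substring string 0 mb)
                (replace-match rep t nil (substring string mb me))
                (substring string me))))))

;; Re-matching on the extracted substring fails here:
(string-match "a\\B" "a")
;; Translating the match data works:
(demo-replace-first "a\\B" "A" "a aaaa")
```

The first form returns nil and the second returns "a Aaaa". Submatches are shifted along with the whole match, so a replacement such as "<\\1>" still refers to the right text.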
--Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583 Content-Disposition: attachment; filename=0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch" Content-Transfer-Encoding: quoted-printable =46rom=209bc8dc80be5cee517fa53e6b8f37881d4220f162=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20=3D?UTF-8?q?Mattias=3D20Engdeg=3DC3=3DA5rd?=3D=20= =0ADate:=20Wed,=2025=20Nov=202020=2015:32:08=20+0100=0A= Subject:=20[PATCH]=20Fix=20replace-regexp-in-string=20substring=20match=20= data=20translation=0A=0AFor=20certain=20patterns,=20re-matching=20the=20= same=20regexp=20on=20the=20matched=0Asubstring=20does=20not=20produce=20= correctly=20translated=20match=20data=0A(bug#15107=20and=20bug#44861).=0A= =0AReported=20by=20Kevin=20Ryde=20and=20Shigeru=20Fukaya.=0A=0A*=20= lisp/subr.el=20(replace-regexp-in-string):=20Translate=20the=20match=20= data=0Aby=20explicit=20manipulation=20instead=20of=20trusting=20a=20call=20= to=20string-match=20on=0Athe=20matched=20string=20to=20do=20the=20job.=0A= *=20test/lisp/subr-tests.el=20(subr-replace-regexp-in-string):=0AAdd=20= test=20cases.=0A---=0A=20lisp/subr.el=20=20=20=20=20=20=20=20=20=20=20=20= |=2017=20++++++++++++-----=0A=20test/lisp/subr-tests.el=20|=20=206=20= +++++-=0A=202=20files=20changed,=2017=20insertions(+),=206=20= deletions(-)=0A=0Adiff=20--git=20a/lisp/subr.el=20b/lisp/subr.el=0Aindex=20= 1fb0f9ab7e..0ee2199933=20100644=0A---=20a/lisp/subr.el=0A+++=20= b/lisp/subr.el=0A@@=20-4537,7=20+4537,7=20@@=20replace-regexp-in-string=0A= =20=20=20;;=20might=20be=20reasonable=20to=20do=20so=20for=20long=20= enough=20STRING.]=0A=20=20=20(let=20((l=20(length=20string))=0A=20=09= (start=20(or=20start=200))=0A-=09matches=20str=20mb=20me)=0A+=09matches=20= str=20mb=20me=20md)=0A=20=20=20=20=20(save-match-data=0A=20=20=20=20=20=20= =20(while=20(and=20(<=20start=20l)=20(string-match=20regexp=20string=20= 
start))=0A=20=09(setq=20mb=20(match-beginning=200)=0A@@=20-4546,10=20= +4546,17=20@@=20replace-regexp-in-string=0A=20=09(when=20(=3D=20me=20mb)=20= (setq=20me=20(min=20l=20(1+=20mb))))=0A=20=09;;=20Generate=20a=20= replacement=20for=20the=20matched=20substring.=0A=20=09;;=20Operate=20on=20= only=20the=20substring=20to=20minimize=20string=20consing.=0A-=09;;=20= Set=20up=20match=20data=20for=20the=20substring=20for=20replacement;=0A-=09= ;;=20presumably=20this=20is=20likely=20to=20be=20faster=20than=20munging=20= the=0A-=09;;=20match=20data=20directly=20in=20Lisp.=0A-=09(string-match=20= regexp=20(setq=20str=20(substring=20string=20mb=20me)))=0A+=0A+=20=20=20=20= =20=20=20=20;;=20Translate=20the=20match=20data=20so=20that=20it=20= applies=20to=20the=20matched=20substring.=0A+=20=20=20=20=20=20=20=20= (setq=20md=20(match-data=20nil=20md=20t))=20=20;=20Reuse=20list=20from=20= previous=20match.=0A+=20=20=20=20=20=20=20=20(let=20((m=20md))=0A+=20=20=20= =20=20=20=20=20=20=20(while=20m=0A+=20=20=20=20=20=20=20=20=20=20=20=20= (when=20(car=20m)=0A+=20=20=20=20=20=20=20=20=20=20=20=20=20=20(setcar=20= m=20(-=20(car=20m)=20mb)))=0A+=20=20=20=20=20=20=20=20=20=20=20=20(setq=20= m=20(cdr=20m)))=0A+=20=20=20=20=20=20=20=20=20=20(set-match-data=20md))=0A= +=0A+=20=20=20=20=20=20=20=20(setq=20str=20(substring=20string=20mb=20= me))=0A=20=09(setq=20matches=0A=20=09=20=20=20=20=20=20(cons=20= (replace-match=20(if=20(stringp=20rep)=0A=20=09=09=09=09=20=20=20=20=20=20= =20rep=0Adiff=20--git=20a/test/lisp/subr-tests.el=20= b/test/lisp/subr-tests.el=0Aindex=20c77be511dc..67f7fc9749=20100644=0A= ---=20a/test/lisp/subr-tests.el=0A+++=20b/test/lisp/subr-tests.el=0A@@=20= -545,7=20+545,11=20@@=20subr-replace-regexp-in-string=0A=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= (match-beginning=201)=20(match-end=201)))=0A=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20"babbcaacabc")=0A=20=20=20=20=20=20=20=20=20=20= 
=20=20=20=20=20=20=20=20"ba"))=0A= -=20=20)=0A+=20=20;;=20anchors=20(bug#15107,=20bug#44861)=0A+=20=20= (should=20(equal=20(replace-regexp-in-string=20"a\\B"=20"b"=20"a=20= aaaa")=0A+=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20"a=20= bbba"))=0A+=20=20(should=20(equal=20(replace-regexp-in-string=20= "\\`\\|x"=20"z"=20"--xx--")=0A+=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20"z--zz--")))=0A=20=0A=20(provide=20'subr-tests)=0A=20;;;=20= subr-tests.el=20ends=20here=0A--=20=0A2.21.1=20(Apple=20Git-122.3)=0A=0A= --Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583-- From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 25 16:39:19 2020 Received: (at 44861) by debbugs.gnu.org; 25 Nov 2020 21:39:19 +0000 Received: from localhost ([127.0.0.1]:37330 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ki2VT-0003pd-5S for submit@debbugs.gnu.org; Wed, 25 Nov 2020 16:39:19 -0500 Received: from mail-ej1-f43.google.com ([209.85.218.43]:45915) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ki2VR-0003pO-Fl for 44861@debbugs.gnu.org; Wed, 25 Nov 2020 16:39:18 -0500 Received: by mail-ej1-f43.google.com with SMTP id lv15so5044500ejb.12 for <44861@debbugs.gnu.org>; Wed, 25 Nov 2020 13:39:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:in-reply-to:references:mime-version:date:message-id:subject:to :cc:content-transfer-encoding; bh=OWaipFu4bAJe1wHaEkytAfsTkNDpbpm6fvH4omjdzkk=; b=agRcITfuJmOanKXudiNfyZeC23l9LijcLYufZZIqzOeOECLRcF7aa9bmw/ix7a7WUG AZiKB8EtSaZgP5Na7hXavWMvsEzqJ1lq7AMNfQ1hIrXYz+DwJwvh94Dn9N7t2BbyLRXA gQt7BXXLzpI4GxCY43H/IYo0NK+p7Xwj0mqy4uSzVyLxT832jffgi8vCbher0KSgOrsH fDrUrZ6Z/TSVIPqjeMStEj/L4IoOb/yXk5bLkhV6rsIpc3pWpK4qvsNXRmz9Upph/1Bs jj7VDv88t+sIRd27ZrTgccaJl9Evm/Sp0RhIDdjIP/L+tcvslALLl2INt1ZUk+9VNHAD t/Kg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:in-reply-to:references:mime-version:date 
:message-id:subject:to:cc:content-transfer-encoding; bh=OWaipFu4bAJe1wHaEkytAfsTkNDpbpm6fvH4omjdzkk=; b=IFR/PcaeqnLFhclxP5EpvdIxeAe3p2RN4+2K/cwGK/yPP0GrzrBdNVUWZFqHkcYaGy Gd1hITBpayr3MNw4UulUCLeYoAy0P0pludjnPdLz6p3eFasrxMfc/sD3vMlzqf3nAcIE r15tkqPnDZIv129ShnvQP0ZCoC+j3BrNEHqLnEnsQeRZJ9sFzyURhOPtNjNlJZ7lDdon RArW4tF1rhBslaov2i+EjUz9FPl9OSxz4wA9mvNogB/tSNXISymC5DdrnxxgneTYhBs/ OTc3xXmnQSXjxeeUfzKp2inU9hZxIocmivO3ENcEYhSIIxNz/idY03aP3qrGUxAv9dP4 0mMQ== X-Gm-Message-State: AOAM533gK+b5ptYyezD/I/m2jX5iGMDzco3f71Z6GhnKnBuPf0v8E3Qd mrK9Q8EpURA9dVOe9IDQ1xDd/glwyFskRECRmlc= X-Google-Smtp-Source: ABdhPJzAC24+N/yebfEaqZXuB/xvkJERWz03q12yMla/4NPH8RU7k18MlqcW3kW/3JisokGb9ycyH6melPNcqalc2kw= X-Received: by 2002:a17:906:eb50:: with SMTP id mc16mr4289825ejb.420.1606340351521; Wed, 25 Nov 2020 13:39:11 -0800 (PST) Received: from 753933720722 named unknown by gmailapi.google.com with HTTPREST; Wed, 25 Nov 2020 16:39:10 -0500 From: Stefan Kangas In-Reply-To: <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> MIME-Version: 1.0 Date: Wed, 25 Nov 2020 16:39:10 -0500 Message-ID: Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' To: =?UTF-8?Q?Mattias_Engdeg=C3=A5rd?= , Shigeru Fukaya Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 44861 Cc: 44861@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Mattias Engdeg=C3=A5rd writes: > It is basically your patch but slightly optimised; it turned out that > the function call and allocation overhead of the original patch made > it a tad too expensive (a pity, because it was very neat). 
Now > performance is about the same as before when the pattern contains no > submatches, and slightly above (< 10% slower) with one submatch. It > seems worth the correctness. Presumably this hasn't worked correctly for a long time, if ever. Is that correct? I personally worry about the performance here. Since we use regexps heavily all over, it is not clear (to me) that 10 % overall performance drop with subexpressions is worth it to work correctly in these rare edge-cases. I suppose we do have to fix the bug here, but is it feasible to solve this in a way that has less performance impact? From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 07:58:11 2020 Received: (at 44861) by debbugs.gnu.org; 26 Nov 2020 12:58:11 +0000 Received: from localhost ([127.0.0.1]:40092 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiGqh-0003x0-LA for submit@debbugs.gnu.org; Thu, 26 Nov 2020 07:58:11 -0500 Received: from mail1480c50.megamailservers.eu ([91.136.14.80]:42712 helo=mail118c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiGqf-0003wl-6q for 44861@debbugs.gnu.org; Thu, 26 Nov 2020 07:58:10 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1606395482; bh=cToqMShpQcqaOWJZi1bL6K2MAuJPpdJqRf1yLy7t/T0=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=hiErruedloPurEgb4LCqjZIM1WRnBnGhqVNxwh8Q4zQssmlMzNao3Ky89Ycip5Ctz GEPdEpd+neg3irh35XHUIn6EpNsdUIEa1gzZk1ddx6JzZDQkBis0FSRXK9SK1kt7B3 haDU8XxfrbLML99qqkM7yBZtVVzhyrPgMFlecfR8= Feedback-ID: mattiase@acm.or Received: from stanniol.lan (c-064ae655.032-75-73746f71.bbcust.telenor.se [85.230.74.6]) (authenticated bits=0) by mail118c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0AQCvxJf028278; Thu, 26 Nov 2020 12:58:01 +0000 From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= Message-Id: <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> Content-Type: multipart/mixed; 
boundary="Apple-Mail=_FB3BA8E5-4848-4CB2-BF78-96ACA7A98898" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' Date: Thu, 26 Nov 2020 13:57:59 +0100 In-Reply-To: To: Stefan Kangas References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F1E.5FBFA65A.009A, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=U/Ps8tju c=1 sm=1 tr=0 a=Ni+dBsiEfW2GqKMPYZim9A==:117 a=Ni+dBsiEfW2GqKMPYZim9A==:17 a=M51BFTxLslgA:10 a=pGLkceISAAAA:8 a=8Umpk8v5Mqs1vKtccoYA:9 a=CjuIK1q_8ugA:10 a=useeaeFYC0j9yllu8zAA:9 a=B2y7HmGcmWMA:10 X-Origin-Country: SE X-Spam-Score: 1.2 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 25 nov. 2020 kl. 22.39 skrev Stefan Kangas : > I personally worry about the performance here. Since we use regexps > heavily all over, it is not clear (to me) that 10 % overall performance > drop with subexpressions is worth it to work correctly [...] 
Content analysis details: (1.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.3 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 44861 Cc: 44861@debbugs.gnu.org, Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --Apple-Mail=_FB3BA8E5-4848-4CB2-BF78-96ACA7A98898 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

On 25 Nov 2020, at 22:39, Stefan Kangas wrote:

> I personally worry about the performance here. Since we use regexps
> heavily all over, it is not clear (to me) that 10 % overall performance
> drop with subexpressions is worth it to work correctly in these rare
> edge-cases. I suppose we do have to fix the bug here, but is it
> feasible to solve this in a way that has less performance impact?

We can't really let it remain buggy, especially as the consequence can be an error or silently wrong results. Also remember that one man's edge case is another's reasonable use.

However, unlike Boris we can eat our cake and have it! The attached patch performs the match-data translation in a C function, which obviously is much faster and indeed speeds up replace-regexp-in-string in all cases (as long as there is any match at all). The new primitive is a bit ad-hoc, but does one well-defined thing and isn't intended for use by the general public anyway.
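For readers who prefer Lisp to C, the behaviour of the new primitive can be modelled roughly as below. This is only a sketch under assumptions: the `demo-' names are hypothetical, the model allocates a fresh list where the C function updates the registers in place, and it does not check that the last match was against a string. It mirrors the two details visible in the patch: positions of unmatched groups are left alone, and shifted positions are clamped at zero.

```elisp
;; Lisp model of the translation step; the `demo-' names are
;; hypothetical and are not part of Emacs.

(defun demo-translate-positions (positions n)
  "Return a copy of POSITIONS with N added to each non-nil entry.
Results are clamped at zero; nil entries (unmatched groups) stay nil."
  (mapcar (lambda (pos) (and pos (max 0 (+ pos n))))
          positions))

(defun demo-match-data-translate (n)
  "Add N to all positions of the current string match data."
  (set-match-data (demo-translate-positions (match-data t) n)))

;; Whole match at 1..3 and group 1 at 1..2, shifted so that the match
;; data applies to the substring starting at the match beginning.
(string-match "\\(b\\)c" "abcd")
(demo-match-data-translate (- (match-beginning 0)))
(match-data t)
```

The last form returns (0 2 0 1), which is the match data one would expect from a successful match against the substring "bc".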
--Apple-Mail=_FB3BA8E5-4848-4CB2-BF78-96ACA7A98898 Content-Disposition: attachment; filename=0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch" Content-Transfer-Encoding: quoted-printable =46rom=2088d5a8d847045e23c2ab39786dc6e5a9a5412a32=20Mon=20Sep=2017=20= 00:00:00=202001=0AFrom:=20=3D?UTF-8?q?Mattias=3D20Engdeg=3DC3=3DA5rd?=3D=20= =0ADate:=20Wed,=2025=20Nov=202020=2015:32:08=20+0100=0A= Subject:=20[PATCH]=20Fix=20replace-regexp-in-string=20substring=20match=20= data=20translation=0A=0AFor=20certain=20patterns,=20re-matching=20the=20= same=20regexp=20on=20the=20matched=0Asubstring=20does=20not=20produce=20= correctly=20translated=20match=20data=0A(bug#15107=20and=20bug#44861).=0A= =0AUsing=20a=20new=20builtin=20function=20also=20improves=20performance=20= since=20the=0Anumber=20of=20calls=20to=20string-match=20is=20halved.=0A=0A= Reported=20by=20Kevin=20Ryde=20and=20Shigeru=20Fukaya.=0A=0A*=20= lisp/subr.el=20(replace-regexp-in-string):=20Translate=20the=20match=20= data=0Ausing=20match-data--translate=20instead=20of=20trusting=20a=20= call=20to=20string-match=0Aon=20the=20matched=20string=20to=20do=20the=20= job.=0A*=20test/lisp/subr-tests.el=20(subr-replace-regexp-in-string):=0A= Add=20test=20cases.=0A*=20src/search.c=20(Fmatch_data__translate):=20New=20= internal=20function.=0A(syms_of_search):=20Register=20it=20as=20a=20= subroutine.=0A---=0A=20lisp/subr.el=20=20=20=20=20=20=20=20=20=20=20=20|=20= =207=20+++----=0A=20src/search.c=20=20=20=20=20=20=20=20=20=20=20=20|=20= 18=20++++++++++++++++++=0A=20test/lisp/subr-tests.el=20|=20=206=20+++++-=0A= =203=20files=20changed,=2026=20insertions(+),=205=20deletions(-)=0A=0A= diff=20--git=20a/lisp/subr.el=20b/lisp/subr.el=0Aindex=20= 1fb0f9ab7e..e009dcc2b9=20100644=0A---=20a/lisp/subr.el=0A+++=20= b/lisp/subr.el=0A@@=20-4546,10=20+4546,9=20@@=20replace-regexp-in-string=0A= 
=20=09(when=20(=3D=20me=20mb)=20(setq=20me=20(min=20l=20(1+=20mb))))=0A=20= =09;;=20Generate=20a=20replacement=20for=20the=20matched=20substring.=0A=20= =09;;=20Operate=20on=20only=20the=20substring=20to=20minimize=20string=20= consing.=0A-=09;;=20Set=20up=20match=20data=20for=20the=20substring=20= for=20replacement;=0A-=09;;=20presumably=20this=20is=20likely=20to=20be=20= faster=20than=20munging=20the=0A-=09;;=20match=20data=20directly=20in=20= Lisp.=0A-=09(string-match=20regexp=20(setq=20str=20(substring=20string=20= mb=20me)))=0A+=20=20=20=20=20=20=20=20;;=20Translate=20the=20match=20= data=20so=20that=20it=20applies=20to=20the=20matched=20substring.=0A+=20=20= =20=20=20=20=20=20(match-data--translate=20(-=20mb))=0A+=20=20=20=20=20=20= =20=20(setq=20str=20(substring=20string=20mb=20me))=0A=20=09(setq=20= matches=0A=20=09=20=20=20=20=20=20(cons=20(replace-match=20(if=20= (stringp=20rep)=0A=20=09=09=09=09=20=20=20=20=20=20=20rep=0Adiff=20--git=20= a/src/search.c=20b/src/search.c=0Aindex=20e7f9094946..4eb634a3c0=20= 100644=0A---=20a/src/search.c=0A+++=20b/src/search.c=0A@@=20-3031,6=20= +3031,23=20@@=20DEFUN=20("set-match-data",=20Fset_match_data,=20= Sset_match_data,=201,=202,=200,=0A=20=20=20return=20Qnil;=0A=20}=0A=20=0A= +DEFUN=20("match-data--translate",=20Fmatch_data__translate,=20= Smatch_data__translate,=0A+=20=20=20=20=20=20=201,=201,=200,=0A+=20=20=20= =20=20=20=20doc:=20/*=20Add=20N=20to=20all=20string=20positions=20in=20= the=20match=20data.=20=20Internal.=20=20*/)=0A+=20=20(Lisp_Object=20n)=0A= +{=0A+=20=20CHECK_FIXNUM=20(n);=0A+=20=20EMACS_INT=20delta=20=3D=20= XFIXNUM=20(n);=0A+=20=20if=20(EQ=20(last_thing_searched,=20Qt))=20=20=20= /*=20String=20match=20data=20only.=20=20*/=0A+=20=20=20=20for=20= (ptrdiff_t=20i=20=3D=200;=20i=20<=20search_regs.num_regs;=20i++)=0A+=20=20= =20=20=20=20if=20(search_regs.start[i]=20>=3D=200)=0A+=20=20=20=20=20=20=20= =20{=0A+=20=20=20=20=20=20=20=20=20=20search_regs.start[i]=20=3D=20max=20= 
(0,=20search_regs.start[i]=20+=20delta);=0A+=20=20=20=20=20=20=20=20=20=20= search_regs.end[i]=20=3D=20max=20(0,=20search_regs.end[i]=20+=20delta);=0A= +=20=20=20=20=20=20=20=20}=0A+=20=20return=20Qnil;=0A+}=0A+=0A=20/*=20= Called=20from=20Flooking_at,=20Fstring_match,=20search_buffer,=20= Fstore_match_data=0A=20=20=20=20if=20asynchronous=20code=20(filter=20or=20= sentinel)=20is=20running.=20*/=0A=20static=20void=0A@@=20-3388,6=20= +3405,7=20@@=20syms_of_search=20(void)=0A=20=20=20defsubr=20= (&Smatch_end);=0A=20=20=20defsubr=20(&Smatch_data);=0A=20=20=20defsubr=20= (&Sset_match_data);=0A+=20=20defsubr=20(&Smatch_data__translate);=0A=20=20= =20defsubr=20(&Sregexp_quote);=0A=20=20=20defsubr=20= (&Snewline_cache_check);=0A=20=0Adiff=20--git=20= a/test/lisp/subr-tests.el=20b/test/lisp/subr-tests.el=0Aindex=20= c77be511dc..67f7fc9749=20100644=0A---=20a/test/lisp/subr-tests.el=0A+++=20= b/test/lisp/subr-tests.el=0A@@=20-545,7=20+545,11=20@@=20= subr-replace-regexp-in-string=0A=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20(match-beginning=201)=20= (match-end=201)))=0A=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20"babbcaacabc")=0A=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= "ba"))=0A-=20=20)=0A+=20=20;;=20= anchors=20(bug#15107,=20bug#44861)=0A+=20=20(should=20(equal=20= (replace-regexp-in-string=20"a\\B"=20"b"=20"a=20aaaa")=0A+=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20"a=20bbba"))=0A+=20=20(should=20= (equal=20(replace-regexp-in-string=20"\\`\\|x"=20"z"=20"--xx--")=0A+=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20"z--zz--")))=0A=20=0A=20= (provide=20'subr-tests)=0A=20;;;=20subr-tests.el=20ends=20here=0A--=20=0A= 2.21.1=20(Apple=20Git-122.3)=0A=0A= --Apple-Mail=_FB3BA8E5-4848-4CB2-BF78-96ACA7A98898-- From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 08:12:50 2020 Received: (at 44861) by debbugs.gnu.org; 26 Nov 2020 13:12:51 +0000 Received: from localhost ([127.0.0.1]:40142 
helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiH4s-0006R3-Mu for submit@debbugs.gnu.org; Thu, 26 Nov 2020 08:12:50 -0500 Received: from quimby.gnus.org ([95.216.78.240]:50280) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiH4q-0006Qp-Lh for 44861@debbugs.gnu.org; Thu, 26 Nov 2020 08:12:49 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-ID :In-Reply-To:Date:References:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=Y8XdtTgFDsBwS2vPW61FBBYfynArKwO1Lsv0YRZLSX8=; b=VVs+qD1sAFhbBYZf2QNchu2dyB WYwMaVRpPF1V8xUHIzpS1JdPJsltrBkW9ApEtHo7mPbFmtZn8H4eAMFv262y5GkamytqAcG/SMq00 kJ0OE7B/wd3uE0swNFHb2yu5OCbKOojjhzpwo0wNMk7X7mSvVpkmThxFmLSVYsPD7zh8=; Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=xo) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kiH4e-0001a7-Vv; Thu, 26 Nov 2020 14:12:42 +0100 From: Lars Ingebrigtsen To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAABGdBTUEAALGPC/xhBQAAACBj SFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAAGFBMVEVvja90l8Y6ZZ49 OziJg3iXprT38dn////PMCpNAAAAAWJLR0QHFmGI6wAAAAd0SU1FB+QLGgwxNgmIHpEAAAFISURB VDjLpZTBboQgEIYHsvYMdvfu0qQvIH0Cxr23yp43jfL+j9ABQRfUtEn/xBjmY+aHCQDwPzEpxX6Y JFSQD8hMr23QMpuzmn6CM56AAq64uhJQ/t+oBHio1qhrrfzgyt9SKXngUcTl2URw+czBZYhAqEbW qhELsD6uAeoIWJaxgrVUb/aBQIwe1CPq3FLqbLA1Ybl3lzQtGdpkwPmVXzz42AFzRg/Uu3f3oG8F 
Yesz+H4CcASqsFr0+9gAEzJEYV4NsZTcAB83WwDt7FGVgKXj8VWYzyJyCOrfQB9MO9O5UaM3SuYR DLcJtSmAaXVnraG0EiBir9HkYESk6XbsBsIsN9cab6OmGS0TK6AxItjJTlCdnoHXelAjeKgXN7m7 W6LhKj41McuAorte3gLEFsxXegdEsgHpDSjB8iDYVULC3pPxJ51+ACBzvgDPBT2aAAAAJXRFWHRk YXRlOmNyZWF0ZQAyMDIwLTExLTI2VDEyOjQ5OjU0KzAwOjAwNIQCzQAAACV0RVh0ZGF0ZTptb2Rp ZnkAMjAyMC0xMS0yNlQxMjo0OTo1NCswMDowMEXZunEAAAAASUVORK5CYII= X-Now-Playing: Simple Minds's _Sister Feelings Call_: "Careful in Career" Date: Thu, 26 Nov 2020 14:12:35 +0100 In-Reply-To: <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> ("Mattias =?utf-8?Q?Engdeg=C3=A5rd=22's?= message of "Thu, 26 Nov 2020 13:57:59 +0100") Message-ID: <871rggs1mk.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Mattias EngdegÄrd writes: > However, unlike Boris we can eat our cake and have it! The attached > patch performs the match-data translation in a C function, which > obviously is much faster and indeed speeds up replace-regexp- [...] 
Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 44861 Cc: 44861@debbugs.gnu.org, Stefan Kangas , Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Mattias Engdeg=C3=A5rd writes: > However, unlike Boris we can eat our cake and have it! The attached > patch performs the match-data translation in a C function, which > obviously is much faster and indeed speeds up replace-regexp-in-string > in all cases (as long as there is any match at all). I'm all for speeding up replace-regexp-in-string (which is used all over the place), so your change looks reasonable to me. But I wonder -- would it make sense to move the entire replace-regexp-in-string function to C? --=20 (domestic pets only, the antidote for overdose, milk.) 
bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 08:39:09 2020 Received: (at 44861-done) by debbugs.gnu.org; 26 Nov 2020 13:39:09 +0000 Received: from localhost ([127.0.0.1]:40212 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHUL-0002td-68 for submit@debbugs.gnu.org; Thu, 26 Nov 2020 08:39:09 -0500 Received: from mail18c50.megamailservers.eu ([91.136.10.28]:35314) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHUI-0002tU-AF for 44861-done@debbugs.gnu.org; Thu, 26 Nov 2020 08:39:07 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1606397944; bh=qRtUDaLl5H17p0RDB4FfvX9bEjX+S3TmznMxIhHbgxE=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=KET9zSevJRb7BWv7KCOGosvyl0QLQ74BOTruefHc0UwmF1MsYU9W1WtXYkS8HZVRJ UWHBMNGCwd+hwVDmmN4rDhjgRvoSobqN349KkR/QLXkSiuag+EGXW9yND+Tx8X31G3 tlUcY20i5huTZWpddD2COSJbX0WwzBSKFw2G49uM= Feedback-ID: mattiase@acm.or Received: from stanniol.lan (c-064ae655.032-75-73746f71.bbcust.telenor.se [85.230.74.6]) (authenticated bits=0) by mail18c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0AQDd2TT030149; Thu, 26 Nov 2020 13:39:03 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <871rggs1mk.fsf@gnus.org> Date: Thu, 26 Nov 2020 14:39:01 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> To: Lars Ingebrigtsen X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F1C.5FBFAFF8.006F, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown 
X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=c8jVvi1l c=1 sm=1 tr=0 a=Ni+dBsiEfW2GqKMPYZim9A==:117 a=Ni+dBsiEfW2GqKMPYZim9A==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=OocQHUDgAAAA:8 a=7AurVBKHxq3X-xEwr_MA:9 a=CjuIK1q_8ugA:10 a=xUZTl98r3Qw_uB5NK3jt:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 44861-done Cc: Dmitry Gutov , 44861-done@debbugs.gnu.org, Stefan Kangas , Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) On 26 Nov 2020, at 14:12, Lars Ingebrigtsen wrote: > I'm all for speeding up replace-regexp-in-string (which is used all over > the place), so your change looks reasonable to me. Thank you! Pushed to master. > But I wonder -- would it make sense to move the entire > replace-regexp-in-string function to C? Probably, but that would be a pure performance improvement. Most of the time is currently consumed in primitives (string-match, replace-match, substring, concat) so don't expect huge savings unless a substantially different approach is taken. (Dmitry Gutov asked for a C implementation in bug#20273 for improving the speed of json encoding; is that still relevant?) A bigger saving yet would be to use the much faster string-replace wherever possible. A little sweeping refactoring project perhaps? It would also improve readability -- no regexp quoting, fewer mysterious arguments like LITERAL and FIXEDCASE to worry about, etc.
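A minimal sketch of the string-replace suggestion above, assuming Emacs 28 or later (where `string-replace' is available). The two function names are hypothetical and exist only for illustration; they do not come from the patch.

```elisp
;; Sketch only: assumes Emacs 28 or later, where `string-replace' exists.
;; `my-dots-to-dashes-regexp' and `my-dots-to-dashes-literal' are
;; hypothetical names used for illustration.

(defun my-dots-to-dashes-regexp (s)
  "Replace every literal dot in S using the regexp machinery."
  ;; The dot has to be quoted, and FIXEDCASE and LITERAL are both t so
  ;; that the replacement text is inserted exactly as written.
  (replace-regexp-in-string (regexp-quote ".") "-" s t t))

(defun my-dots-to-dashes-literal (s)
  "Replace every literal dot in S using plain string replacement."
  ;; No quoting and no extra arguments are needed here.
  (string-replace "." "-" s))

(my-dots-to-dashes-regexp "a.b.c")   ; => "a-b-c"
(my-dots-to-dashes-literal "a.b.c")  ; => "a-b-c"
```

Both forms give the same result for a literal pattern; the second simply has fewer things to get wrong.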
From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 08:43:39 2020 Received: (at 44861) by debbugs.gnu.org; 26 Nov 2020 13:43:40 +0000 Received: from localhost ([127.0.0.1]:40237 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHYh-00031I-Ld for submit@debbugs.gnu.org; Thu, 26 Nov 2020 08:43:39 -0500 Received: from mail-ed1-f52.google.com ([209.85.208.52]:37671) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHYd-000311-Rz for 44861@debbugs.gnu.org; Thu, 26 Nov 2020 08:43:38 -0500 Received: by mail-ed1-f52.google.com with SMTP id n24so215409edb.4 for <44861@debbugs.gnu.org>; Thu, 26 Nov 2020 05:43:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:in-reply-to:references:mime-version:date:message-id:subject:to :cc; bh=CCrmB8sR2Q+h1DbWyIJz767k/ATyqb0YUy/FKrgiUS4=; b=qAhrla37HiSX2HeBYgCBQKFwnZxG34RsyzsZkSlzk71cqJHiU3LA7kEeUABn0XI8s/ xlUcuAep1vQyrYuUm8auC8Jauqu+Vnhhytr31EqSEX+RT6AI4XNrvucyHoPB/vGW38Ye cdA1zRwVVnaYAPLECNoNoxDhCMQVQIgnh0El9rjPmeWyIxplJ2NcY9DW2Y7QC1/UR/xN yiohVF5V3aSdgZUSjZXwiYNPB/x45OP58MWWVJPZ+n7leJLo75r/I9oQYBzry8WKAdmB 9W5uZUlVVOBHmJ/yIzUAvVuL8B8DcI/dXm/eQ1IJSJVW5d9Ym/l/10dxlcQTcqsmgVGT P5nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:in-reply-to:references:mime-version:date :message-id:subject:to:cc; bh=CCrmB8sR2Q+h1DbWyIJz767k/ATyqb0YUy/FKrgiUS4=; b=jTT5yZaDkCw9yVkgCe8DTeO4w1pABmqliJ8dg2TIISfOsNatuqDznO+GjIu7P3T0BW W7/w93pHLa/tk6lwXHEU+ohoeb367R2ilePKdlESPVv7URZzNg5ETFYCrB0WVadcHyjM lcwka02HUIYrMDTvSAHA/dtRk7JjA1lT/r86pyScQBUZAgGde8Tf+zD7leJcgOFzSQle e/QCVZV0XMkfv81e08ehpkQYs8xJYNwwKO1k44lFIe/C8UAEsWf3Y6FLeaFAHmBpoWf+ fGKhfjkz0drYasj23PyP9VaSI/Tay8L1Mnn1zleOCD59AeWPGPpKq4vhaVd+8N4kX0ce ayWg== X-Gm-Message-State: AOAM532+EJcTaU4uPOfbMQZQfYDSOKueGwgiSt+n4e2XiFZ5dp2qY73S zPMUWZyVIkhiLx33fq5wtrqC562gp1Cycj5dFjM= X-Google-Smtp-Source: 
ABdhPJyRCL+2GKFwBU48XD/snycmumF3sbfDIBUhCCCBJ0BYDLeMvO5WGE2p926I7ZCNEv9yoT/cN/ZGqIY2xxal+yM= X-Received: by 2002:a05:6402:3089:: with SMTP id de9mr2602367edb.100.1606398209859; Thu, 26 Nov 2020 05:43:29 -0800 (PST) Received: from 753933720722 named unknown by gmailapi.google.com with HTTPREST; Thu, 26 Nov 2020 05:43:29 -0800 From: Stefan Kangas In-Reply-To: <871rggs1mk.fsf@gnus.org> References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> MIME-Version: 1.0 Date: Thu, 26 Nov 2020 05:43:29 -0800 Message-ID: Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' To: Lars Ingebrigtsen , =?UTF-8?Q?Mattias_Engdeg=C3=A5rd?= Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 44861 Cc: 44861@debbugs.gnu.org, Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Lars Ingebrigtsen writes: > But I wonder -- would it make sense to move the entire > replace-regexp-in-string function to C? Before we undertake any major changes in that direction, perhaps we should benchmark the relevant functions on the native-comp branch? It changes the benchmarks by quite a lot in some cases. 
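One way such a comparison might be run, as a rough sketch: it uses `benchmark-run' from benchmark.el and assumes Emacs 28 or later for `string-replace'. The function name, the sample string and the repetition count are arbitrary choices made for illustration, and no timings are claimed here.

```elisp
;; Sketch only: `my-compare-replacers' is a hypothetical helper.
(require 'benchmark)

(defun my-compare-replacers ()
  "Time both replacement functions on a 1000-character sample string.
Return a list of two `benchmark-run' results, each of the form
\(ELAPSED-SECONDS GC-COUNT GC-SECONDS)."
  (let ((s (make-string 1000 ?:)))
    (list
     ;; Regexp-based replacement, 1000 repetitions.
     (benchmark-run 1000 (replace-regexp-in-string ":" ";" s))
     ;; Literal replacement, 1000 repetitions.
     (benchmark-run 1000 (string-replace ":" ";" s)))))
```

Evaluating (my-compare-replacers) once with byte-compiled code and once in a native-comp build would show how much the gap depends on the compilation method.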
From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 09:03:42 2020 Received: (at 44861-done) by debbugs.gnu.org; 26 Nov 2020 14:03:43 +0000 Received: from localhost ([127.0.0.1]:40338 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHs6-0005e8-KZ for submit@debbugs.gnu.org; Thu, 26 Nov 2020 09:03:42 -0500 Received: from quimby.gnus.org ([95.216.78.240]:50850) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHs5-0005du-08 for 44861-done@debbugs.gnu.org; Thu, 26 Nov 2020 09:03:41 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-ID :In-Reply-To:Date:References:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=RqMNGeuHtY0Scs8J/YM6lfVAOfAQM0rtPzWlln31aOU=; b=uTZq2ugfPNDvZwT0nJaFvaM3xP hNjlKZ+f3Tbf3s4N6T8dYK46F71+IQ/c/q1OtliAhqP9u4XvPEg/NyTX9zozH3lDu8L2qcdwLEifP hbhcUpzJeuCqd0hZhDgL0dzTpm/kTLubcZJDsq6f3c0hfGpuGtyHyzvmdR0wtnizmYdo=; Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=xo) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kiHrv-0002DX-MP; Thu, 26 Nov 2020 15:03:34 +0100 From: Lars Ingebrigtsen To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwAgMAAAAqbBEUAAAABGdBTUEAALGPC/xhBQAAACBj SFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAADFBMVEXW0sqJgHMmJST/ //8NscnUAAAAAWJLR0QDEQxM8gAAAAd0SU1FB+QLGg03OyChr50AAAF6SURBVCjPRZHBTuMwEIZ/ 
R3FEc2KlBlGfYSX6FO4KOKeotrQ5bxHkKbrSqmezUnPgFFCK8DwlM25TRoqcb/7MzB8PkKIGlnKe 8VO4WyiBKRznXfpA33lXm72FYs4kX9Ie98QKq8oQxYYsqgVQEQ2eIis1ZjsiCopqKOtdyRDROujK oSX6oP2zRT7Ni6YjCZk51cZWrwm4da6htVRJtAEPkahPhuh+VazX7S4p1PBgFyfy7iie89ELVE1c s9dsk7PJztPqt8VEVO081UWwf6SiuvH0fxGykGDX0IC6tAK+M/QJXKYh5b+frCBPSrFaUmDzh/s7 38x7NcLSGnZiD7BA2fU4hkX5FnCKi74ea9jjW/2tKP6F1Qh5i+sfp7Kn/CrbHZvjBk+/tiPM+8/i /QTD0Axjx/nLYGI1wvbF8KquE5htKCnoR5lpTfeXV2RbcRPMx63nu56JspmTbmSTcj9o48QftoVn FWk2Qsxia457VJSyCZSUUpSd7r8ADi2IHHCSwcQAAAAldEVYdGRhdGU6Y3JlYXRlADIwMjAtMTEt MjZUMTM6NTU6NTgrMDA6MDD1AI5yAAAAJXRFWHRkYXRlOm1vZGlmeQAyMDIwLTExLTI2VDEzOjU1 OjU4KzAwOjAwhF02zgAAAABJRU5ErkJggg== X-Now-Playing: Liaisons Dangereuses's _Liaisons Dangereuses_: "Dupont" Date: Thu, 26 Nov 2020 15:03:30 +0100 In-Reply-To: ("Mattias =?utf-8?Q?Engdeg=C3=A5rd=22's?= message of "Thu, 26 Nov 2020 14:39:01 +0100") Message-ID: <87zh34p64t.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Mattias EngdegÄrd writes: > Probably, but that would be a pure performance improvement. Most of > the time is currently consumed in primitives (string-match, > replace-match, substring, concat) so don't expect huge savings unl [...] 
Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 44861-done Cc: Dmitry Gutov , 44861-done@debbugs.gnu.org, Stefan Kangas , Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Mattias Engdegård writes: > Probably, but that would be a pure performance improvement. Most of > the time is currently consumed in primitives (string-match, > replace-match, substring, concat) so don't expect huge savings unless > a substantially different approach is taken. Yeah, perhaps there isn't a lot to be gained there, unless a lot of the re-checking of all the arguments (etc.) (which is unnecessary once we've ascertained that everything is, indeed, a string) can be done by refactoring some of the underlying primitives. > (Dmitry Gutov asked for a C implementation in bug#20273 for improving > the speed of json encoding; is that still relevant?) No, probably not, since it's now done by Jansson? So I'm closing that one. > A bigger saving yet would be to use the much faster string-replace > wherever possible. A little sweeping refactoring project perhaps? It > would also improve readability -- no regexp quoting, fewer mysterious > arguments like LITERAL and FIXEDCASE to worry about, etc. I started looking at that, and there's a huge pile of calls like (replace-regexp-in-string ":" ";" string) that can be rewritten to use string-replace. But! Every single case requires careful analysis, exactly because replace-regexp-in-string sets the match data.
Perhaps five lines later, there's a reference to (match-string 0 string)? Perhaps the reference is in the function that called this function? So most changes are fraught with possible unforeseen breakages, unless the code is super-duper straightforward like (setq string (replace-regexp-in-string ":" ";" string)) (setq string (replace-regexp-in-string "a" "b" string)) Then you know that you can replace the first one without any danger. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 09:04:06 2020 Received: (at 44861) by debbugs.gnu.org; 26 Nov 2020 14:04:06 +0000 Received: from localhost ([127.0.0.1]:40342 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHsT-0005f6-Uu for submit@debbugs.gnu.org; Thu, 26 Nov 2020 09:04:06 -0500 Received: from quimby.gnus.org ([95.216.78.240]:50866) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiHsR-0005eV-Sk for 44861@debbugs.gnu.org; Thu, 26 Nov 2020 09:04:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Type:MIME-Version:Message-ID:In-Reply-To:Date: References:Subject:Cc:To:From:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=THZtwDN41s+iDmFkkxzHGp2EPHKf9mSE53CUQREPsQw=; b=uxOqJBRnNeHeA5y2LfO44j00zw a1QEawP5Ms0YBGdlv9aJtdETZK2ZTGZOchPr0EVKshm9AyobjcwRbpIIejfj4yq2VKNF/XQFMfdEh OYaydemsgRz/+D41bMloPGUmuZHwXYtZtm6AUqqBZeNH4UNZ3M0uM0jzH7p81RUwUGDI=; Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=xo) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kiHsJ-0002Dm-LB; Thu, 26 Nov 2020 15:03:58 +0100 From: Lars Ingebrigtsen To: Stefan Kangas Subject: Re: 
bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwAgMAAAAqbBEUAAAABGdBTUEAALGPC/xhBQAAACBj SFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAADFBMVEVMMEUaDxmZXmX/ ///z/V5DAAAAAWJLR0QDEQxM8gAAAAd0SU1FB+QLGg4DL4MXNk4AAAGDSURBVCjPlZLBipxAEIZ/ RUE8OcHOYU8eQpB+inbBZeJJxQrBR9incGQbJnPKQubeKxto6ilT7Qwh1/1P/9dV9R+qGgk7V787 RfT9iqSuYqwvRDQpLBtfOAcFIWaRX9QOwfP21rY30ELOtv8qvF5vbTrULjITCWwywqU8I723XUOu 2tuYVdtDtBdOpu/rqga7QJFqoDX4HCLsJJ7BmXi2/YJug699SHu07ZHh4CXQ5CcZAszqcq7Z0hcG 3FrHPubEJzE+Ab7Q7D/7B+lCkRU5u/L8gLsqlNwZoBBvjovlXABxUmhsqT4aHDoUebVAPddAbyjO pB6NSwA1JU5loD7CD9OS0cNoFRTmlExTN1OpIGtIjD1UROW1EYj6P8PQqvhVdpKks++/JWl62LfV +bL5HbWDQG+ffikqPAUoTo9nmsZwOtnjqdNlOX/dr6DfOi6mbk33K7x3nNHMhx2WkX/SfLEBUD5t rwJ7AFL5ADT6/XLU0qhosjcYpozuCr+h/R/og/AXXEe7+JoeW/oAAAAldEVYdGRhdGU6Y3JlYXRl ADIwMjAtMTEtMjZUMTQ6MDM6NDcrMDA6MDDO2CRYAAAAJXRFWHRkYXRlOm1vZGlmeQAyMDIwLTEx LTI2VDE0OjAzOjQ3KzAwOjAwv4Wc5AAAAABJRU5ErkJggg== X-Now-Playing: The Slits's _Return Of The Giant Slits_: "Earthbeat (Daichi No Oto)" Date: Thu, 26 Nov 2020 15:03:54 +0100 In-Reply-To: (Stefan Kangas's message of "Thu, 26 Nov 2020 05:43:29 -0800") Message-ID: <87v9dsp645.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Stefan Kangas writes: > Before we undertake any major changes in that direction, perhaps we > should benchmark the relevant functions on the native-comp branch? 
It > changes the benchmarks by quite a lot in some cases. Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 44861 Cc: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= , 44861@debbugs.gnu.org, Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Stefan Kangas writes: > Before we undertake any major changes in that direction, perhaps we > should benchmark the relevant functions on the native-comp branch? It > changes the benchmarks by quite a lot in some cases. Yes, that's true. -- (domestic pets only, the antidote for overdose, milk.) 
bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 09:41:59 2020 Received: (at 44861) by debbugs.gnu.org; 26 Nov 2020 14:41:59 +0000 Received: from localhost ([127.0.0.1]:40455 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiIT9-0000J1-I0 for submit@debbugs.gnu.org; Thu, 26 Nov 2020 09:41:59 -0500 Received: from eggs.gnu.org ([209.51.188.92]:46542) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiIT8-0000Io-0e for 44861@debbugs.gnu.org; Thu, 26 Nov 2020 09:41:58 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:43368) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kiIT2-0004AA-A3; Thu, 26 Nov 2020 09:41:52 -0500 Received: from [176.228.60.248] (port=4869 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kiIT1-0004gM-DN; Thu, 26 Nov 2020 09:41:52 -0500 Date: Thu, 26 Nov 2020 16:41:32 +0200 Message-Id: <83d000qixv.fsf@gnu.org> From: Eli Zaretskii To: Stefan Kangas In-Reply-To: (message from Stefan Kangas on Thu, 26 Nov 2020 05:43:29 -0800) Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 44861 Cc: mattiase@acm.org, larsi@gnus.org, 44861@debbugs.gnu.org, shigeru.fukaya@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Stefan Kangas > Date: Thu, 26 Nov 2020 05:43:29 -0800 > Cc: 44861@debbugs.gnu.org, Shigeru Fukaya > > Lars Ingebrigtsen writes: > > > But I wonder -- would it make sense to 
move the entire > > replace-regexp-in-string function to C? > > Before we undertake any major changes in that direction, perhaps we > should benchmark the relevant functions on the native-comp branch? It > changes the benchmarks by quite a lot in some cases. Benchmarking is always welcome, but I don't think we should dismiss performance improvements by assuming everyone will use the natively-compiled Lisp code VSN. I'm quite sure *.elc files will be used in the observable future by many people. I also expect C code to run faster than natively-compiled Lisp, when significant implementation changes, such as those suggested by Mattias, are involved. From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 26 09:55:09 2020 Received: (at 44861) by debbugs.gnu.org; 26 Nov 2020 14:55:09 +0000 Received: from localhost ([127.0.0.1]:40577 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiIft-0000f7-Di for submit@debbugs.gnu.org; Thu, 26 Nov 2020 09:55:09 -0500 Received: from mail212c50.megamailservers.eu ([91.136.10.222]:47902 helo=mail194c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kiIfp-0000d9-K9 for 44861@debbugs.gnu.org; Thu, 26 Nov 2020 09:55:08 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1606402498; bh=vyTD+lbvWeD5fDuanX6M5AfjCbK5IxiXtXf/nRx31Vc=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=SaRuQquDaqAKtVM0vTtRBr+oprbSevnG2ROwqmexVFhjuYTmp02rHaDV+JD7L/Jyh BTJ2ja9LAnHTxwBxwqP99lS+fNzTx6qlp2fo9p7s0Rt+MtHvq0cDEQ51vl2YHOeKyh Q+XWVGcKxn4rKUMEAzQRDXMIWLd24SRrs82Z1Yp0= Feedback-ID: mattiase@acm.or Received: from stanniol.lan (c-064ae655.032-75-73746f71.bbcust.telenor.se [85.230.74.6]) (authenticated bits=0) by mail194c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0AQEsuol031597; Thu, 26 Nov 2020 14:54:58 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 
12.4 \(3445.104.17\)) Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <87zh34p64t.fsf@gnus.org> Date: Thu, 26 Nov 2020 15:54:56 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <5722D619-98C0-421B-84B5-F30B858FBDF8@acm.org> References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> <87zh34p64t.fsf@gnus.org> To: Lars Ingebrigtsen X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F24.5FBFC1C2.00A6, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=CeB2G4jl c=1 sm=1 tr=0 a=Ni+dBsiEfW2GqKMPYZim9A==:117 a=Ni+dBsiEfW2GqKMPYZim9A==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=OocQHUDgAAAA:8 a=f5oXje33BE8G-T2eEq0A:9 a=CjuIK1q_8ugA:10 a=xUZTl98r3Qw_uB5NK3jt:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 44861 Cc: Dmitry Gutov , 44861@debbugs.gnu.org, Stefan Kangas , Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) On 26 Nov 2020, at 15:03, Lars Ingebrigtsen wrote: > I started looking at that, and there's a huge pile of calls like > > (replace-regexp-in-string ":" ";" string) > > that can be rewritten to use string-replace. But! Every single case > requires careful analysis, exactly because replace-regexp-in-string sets > the match data. No it doesn't; the entire body is wrapped in save-match-data. It does set the match data locally for use in the replacement function, if any, but then string-replace cannot be used anyway.
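Both halves of that point can be checked with a small sketch. The function names below are hypothetical and used only for illustration. The first function checks that an outer match survives a nested call; the second shows a replacement function reading the locally set match data.

```elisp
;; Sketch only: both function names are hypothetical.

(defun my-match-data-survives-p ()
  "Return non-nil if an outer match is intact after a nested replacement."
  (string-match "\\([a-z]+\\)=" "key=value")
  (let ((before (match-data)))
    ;; The whole body of `replace-regexp-in-string' runs inside
    ;; `save-match-data', so the caller's match data should be restored.
    (replace-regexp-in-string ":" ";" "a:b:c")
    (equal before (match-data))))

(defun my-swap-pairs (s)
  "Rewrite each NAME=NUMBER pair in S as NUMBER=NAME."
  (replace-regexp-in-string
   "\\([a-z]+\\)=\\([0-9]+\\)"
   ;; Inside the replacement function the match data refers to the
   ;; matched substring M, so subexpressions are read from M.
   (lambda (m) (concat (match-string 2 m) "=" (match-string 1 m)))
   s t t))

(my-match-data-survives-p)   ; => t
(my-swap-pairs "a=1 b=2")    ; => "1=a 2=b"
```

A call like the one in `my-swap-pairs' is exactly the kind that cannot be converted to string-replace, since it depends on the match data.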
There are other things that may need investigating for a switch to = string-replace: whether case-folding is relevant, and whether a = nil-or-absent argument to FIXEDCASE is intended, an oversight, or = irrelevant. In my experience, most code does not take case-folding into account at = all or tacitly assume it does not apply. (Having a global variable = controlling it is a terrible interface, by the way.) Similarly, most = calls that omit FIXEDCASE do it without any thought that the replacement = would be anything but literal. Thus, the risk isn't very big for either = but these are still issues requiring some consideration. From debbugs-submit-bounces@debbugs.gnu.org Sun Nov 29 08:29:06 2020 Received: (at 44861) by debbugs.gnu.org; 29 Nov 2020 13:29:07 +0000 Received: from localhost ([127.0.0.1]:50043 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kjMlG-0008Jw-KQ for submit@debbugs.gnu.org; Sun, 29 Nov 2020 08:29:06 -0500 Received: from mail-wm1-f49.google.com ([209.85.128.49]:39405) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kjMlE-0008JQ-E2 for 44861@debbugs.gnu.org; Sun, 29 Nov 2020 08:29:05 -0500 Received: by mail-wm1-f49.google.com with SMTP id 3so13294913wmg.4 for <44861@debbugs.gnu.org>; Sun, 29 Nov 2020 05:29:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tcd-ie.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version:content-transfer-encoding; bh=qwXVxmZAJKlGEUYyEZOQ2Ti35B652cRSDHwHMC1X6Zw=; b=LLcvOXhHHMCdFHf64FS6ejIRdAcgW0SY0mlyZJtfUQH8hhFkja1MRebDTpqJvlelEB uWFUbTqQj7IDq6f2vZdnSKPvviflHA39VGOlv+xYgfVz0FO89ThTbADNbjicL4Xp1nc+ XY7OxVZEWjDk46+wNnjLd+2LzF/1/h2zpbp/xUEwkoEn7TRrF+1TcR6MXRr11mzJRAZA z337Dohm50ghP1LZ0sAeBlKTiW3xn6b2hqRh0KC9fE57l4Vp+WXuV4yQBvn65i7+7qET w3RNOhM1yu7mN4f6MmqtvrvYEOBJOe7D1L6M/R1Nau77UB6S1bjP+z2jTeOlwEP9Igj9 YX7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version:content-transfer-encoding; bh=qwXVxmZAJKlGEUYyEZOQ2Ti35B652cRSDHwHMC1X6Zw=; b=kDhiiJ1eAfGYLTsOlOR7vQwJ4TMolBd/4HIHLGGNHeTkXWFXk+SZZnWJBRmX/3YdlM BobcMwt8p0F2YnaJ7GiCMtnsFBHIIUQZP46BlICHHuYRPlogUpiBdxGbGz0co29i0GyI Bo2bD2x4oN4n2Ux70C2LehQXjsyeuh3veAmtgRYtS/PTBFrgz+VApdQTSaC51aw5h0Jd ME6/+JmAYf1pAY/RiAO95tRXFKXT+kWpLritaSQlfGCEEAZu7Kx6S7t+9c5bNcy0toE/ gvutxvc/VDKbdoTW4vtN873B7oXwzYqVXTPM7E4ZiaipkEnqfM2MBDDl1gjc2g/XPKdV bd2Q== X-Gm-Message-State: AOAM530xL28kEfOM76M0dfYoSAhkjMzGmThy6Kv4XwDx6xWnLht+E9II wHEwa6cL98JENRAzYdprvKL8fg== X-Google-Smtp-Source: ABdhPJwaR+HRBosDh9i7U/vUPjSFPO6jrs9V9Zn5H0rIl58NfNc0fC7ou5KfigGJnSlpFD9sY69RIA== X-Received: by 2002:a7b:c3d5:: with SMTP id t21mr18399768wmj.37.1606656538609; Sun, 29 Nov 2020 05:28:58 -0800 (PST) Received: from localhost ([2a02:8084:20e2:c380:1f68:7ff5:120d:64e]) by smtp.gmail.com with ESMTPSA id c190sm15812843wme.19.2020.11.29.05.28.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Nov 2020 05:28:57 -0800 (PST) From: "Basil L. 
Contovounesios" To: Lars Ingebrigtsen Subject: Re: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org> <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org> <83EC926B-DE9E-48BC-8FD2-C7CB3617AD50@acm.org> <871rggs1mk.fsf@gnus.org> <87zh34p64t.fsf@gnus.org> Date: Sun, 29 Nov 2020 13:28:55 +0000 In-Reply-To: <87zh34p64t.fsf@gnus.org> (Lars Ingebrigtsen's message of "Thu, 26 Nov 2020 15:03:30 +0100") Message-ID: <87ft4sl2aw.fsf@tcd.ie> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 44861 Cc: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= , Dmitry Gutov , 44861@debbugs.gnu.org, Stefan Kangas , Shigeru Fukaya X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Lars Ingebrigtsen writes: > Mattias Engdeg=C3=A5rd writes: > >> (Dmitry Gutov asked for a C implementation in bug#20273 for improving >> the speed of json encoding; is that still relevant?) > > No, probably not, since it's now done by Jansson? So I'm closing that > one. Also, bug#20273 was prompted by bug#20154, which was addressed by avoiding replace-regexp-in-string, and further optimised in bug#40693. --=20 Basil From unknown Sat Jun 14 18:41:44 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Mon, 28 Dec 2020 12:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator