From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 03 Apr 2020 16:11:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: 40407@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.158593024120868 (code B ref -1); Fri, 03 Apr 2020 16:11:01 +0000 Received: (at submit) by debbugs.gnu.org; 3 Apr 2020 16:10:41 +0000 Received: from localhost ([127.0.0.1]:43066 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKOu1-0005QV-JR for submit@debbugs.gnu.org; Fri, 03 Apr 2020 12:10:41 -0400 Received: from lists.gnu.org ([209.51.188.17]:50276) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKOty-0005QD-4r for submit@debbugs.gnu.org; Fri, 03 Apr 2020 12:10:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:34168) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jKOtw-0005PP-2g for bug-gnu-emacs@gnu.org; Fri, 03 Apr 2020 12:10:38 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: * X-Spam-Status: No, score=1.1 required=5.0 tests=BAYES_50,KHOP_HELO_FCRDNS, URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jKOtu-0000dC-TK for bug-gnu-emacs@gnu.org; Fri, 03 Apr 2020 12:10:35 -0400 Received: from mail1447c50.megamailservers.eu ([91.136.14.47]:37348 helo=mail265c50.megamailservers.eu) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1jKOtu-0000Mj-CX for bug-gnu-emacs@gnu.org; Fri, 03 Apr 2020 12:10:34 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1585923525; bh=cmmkHNgz5EDJE04LBORiH5MIRDrgZwXY9fIVJzfO3n4=; h=From:Subject:Date:To:From; b=bFc72WnjN+V4z9VqCidsNLjVMlz5jcSqXEPv3oVVQ2VaR8D0YO20MftrKHdQr47Hd WshkA2jSx6M6KDRDRrlxofuaxS9Tzgf9SgtJ6JN0Y4TuqYKNuZZdSeyawhXJzjPimK /DHagAqlJYTzcQYQMusLLqtzzGes3caQ11D1qD6s= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail265c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 033EIhIi027027 for ; Fri, 3 Apr 2020 14:18:45 +0000 From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Content-Type: multipart/mixed; boundary="Apple-Mail=_DA8A3D29-F208-4082-ACC1-88BC9E8B1B54" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) Message-Id: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> Date: Fri, 3 Apr 2020 16:18:43 +0200 X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F18.5E874595.006F, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=D5w51cZj c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=M51BFTxLslgA:10 a=KYCzAwbNy5S3BsPW-U0A:9 a=CjuIK1q_8ugA:10 a=TqocWJiu5xKD2_rvUcIA:9 a=B2y7HmGcmWMA:10 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x (no timestamps) [generic] X-Received-From: 91.136.14.47 X-Spam-Score: 0.3 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --Apple-Mail=_DA8A3D29-F208-4082-ACC1-88BC9E8B1B54 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

ENCODE_FILE and DECODE_FILE turn out to be surprisingly slow, and allocate copious amounts of memory, to the
point that they often turn up in both memory and cpu profiles. (This is on macOS; I haven't checked the situation elsewhere.)

For instance, a single call to file-relative-name, with ASCII-only arguments, manages to allocate 140 KiB. There are several conversion steps each involving creating temporary buffers as well as the compilation and execution of very large "quick-check" regexps. Example:

(progn
  (require 'profiler)
  (profiler-reset)
  (garbage-collect)
  (profiler-start 'mem)
  (file-relative-name "abc")
  (profiler-stop)
  (profiler-report))

This applies to just about every function dealing with files or file names.

The attached patch is somewhat conservatively written but at least a starting point. It reduces the memory consumption by file-relative-name in the example above to zero. Perhaps we can assume that file names codings are always ASCII-compatible; if so, the shortcut can be taken in encode_file_name and decode_file_name directly.

There is already a hack in encode_file_name that assumes that no unibyte string ever needs encoding; if so, the shortcut could perhaps be extended to decode_file_name and simplified.
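The shortcut described above can be sketched outside Emacs as follows. This is only an illustration of the idea under simplifying assumptions, not the code from the patch: `struct coding_sketch`, `bytes_ascii_p` and `conversion_is_identity_p` are hypothetical names, and a coding system is reduced to a single "ASCII-compatible" flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a coding system.  Only the property that
   matters for the shortcut is modelled here.  */
struct coding_sketch
{
  bool ascii_compatible;
};

/* Return true if the NBYTES bytes at S are all in the 0..127 range.  */
static bool
bytes_ascii_p (const unsigned char *s, size_t nbytes)
{
  for (size_t i = 0; i < nbytes; i++)
    if (s[i] > 127)
      return false;
  return true;
}

/* Return true if converting S with CODING would leave the bytes
   unchanged, so that the expensive general conversion can be skipped.  */
static bool
conversion_is_identity_p (const struct coding_sketch *coding,
                          const unsigned char *s, size_t nbytes)
{
  return coding->ascii_compatible && bytes_ascii_p (s, nbytes);
}
```

The scan is linear in the string length and allocates nothing, which is the point of taking it before any temporary buffers are set up.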
--Apple-Mail=_DA8A3D29-F208-4082-ACC1-88BC9E8B1B54 Content-Disposition: attachment; filename=0001-Avoid-expensive-recoding-for-ASCII-identity-cases.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Avoid-expensive-recoding-for-ASCII-identity-cases.patch" Content-Transfer-Encoding: quoted-printable

From dca8b997d3e7c36667e12f1c77fc6ffed7d8f555 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Fri, 3 Apr 2020 16:01:01 +0200
Subject: [PATCH] Avoid expensive recoding for ASCII identity cases

Optimise for the common case of encoding or decoding an ASCII-only
string using an ASCII-compatible coding, for file names in particular.

* src/coding.c (string_ascii_p): New function.
(code_convert_string): Return the input string for ASCII-only inputs
and ASCII-compatible codings.
---
 src/coding.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/coding.c b/src/coding.c
index 0bea2a0c2b..9a17fafb05 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9471,6 +9471,17 @@ used (which may be different from CODING-SYSTEM if CODING-SYSTEM is
   return code_convert_region (start, end, coding_system, destination, 1, 0);
 }
 
+/* Whether a (unibyte) string only contains chars in the 0..127 range.  */
+static bool
+string_ascii_p (Lisp_Object str)
+{
+  ptrdiff_t nbytes = SBYTES (str);
+  for (ptrdiff_t i = 0; i < nbytes; i++)
+    if (SREF (str, i) > 127)
+      return false;
+  return true;
+}
+
 Lisp_Object
 code_convert_string (Lisp_Object string, Lisp_Object coding_system,
 		     Lisp_Object dst_object, bool encodep, bool nocopy,
@@ -9502,7 +9513,17 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
   chars = SCHARS (string);
   bytes = SBYTES (string);
 
-  if (BUFFERP (dst_object))
+  if (EQ (dst_object, Qt))
+    {
+      /* Fast path for ASCII-only input and an ASCII-compatible coding:
+         act as identity.  */
+      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
+      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
+          && (STRING_MULTIBYTE (string)
+              ? (chars == bytes) : string_ascii_p (string)))
+        return string;
+    }
+  else if (BUFFERP (dst_object))
     {
       struct buffer *buf = XBUFFER (dst_object);
       ptrdiff_t buf_pt = BUF_PT (buf);
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_DA8A3D29-F208-4082-ACC1-88BC9E8B1B54--

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 03 Apr 2020 16:25:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158593107224113 (code B ref 40407); Fri, 03 Apr 2020 16:25:01 +0000 Received: (at 40407) by debbugs.gnu.org; 3 Apr 2020 16:24:32 +0000 Received: from localhost
([127.0.0.1]:43086 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKP7P-0006Go-NY for submit@debbugs.gnu.org; Fri, 03 Apr 2020 12:24:32 -0400 Received: from eggs.gnu.org ([209.51.188.92]:49748) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKP7N-0006GL-Jl for 40407@debbugs.gnu.org; Fri, 03 Apr 2020 12:24:30 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:52552) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKP7I-0008CG-CF; Fri, 03 Apr 2020 12:24:24 -0400 Received: from [176.228.60.248] (port=1905 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKP7I-0003FO-0H; Fri, 03 Apr 2020 12:24:24 -0400 Date: Fri, 03 Apr 2020 19:24:09 +0300 Message-Id: <835zegwn9y.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Fri, 3 Apr 2020 16:18:43 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: Mattias Engdegård
> Date: Fri, 3 Apr 2020 16:18:43 +0200
>
> ENCODE_FILE and DECODE_FILE turn out to be surprisingly slow, and allocate copious amounts of memory, to the point that they often turn up in both memory and cpu profiles. (This is on macOS; I haven't checked the situation elsewhere.)

AFAIR, on macOS the situation is worse than elsewhere, because of the normalization thing.

> For instance, a single call to file-relative-name, with ASCII-only arguments, manages to allocate 140 KiB. There are several conversion steps each involving creating temporary buffers as well as the compilation and execution of very large "quick-check" regexps. Example:
>
> (progn
>   (require 'profiler)
>   (profiler-reset)
>   (garbage-collect)
>   (profiler-start 'mem)
>   (file-relative-name "abc")
>   (profiler-stop)
>   (profiler-report))

Can you tell more about the conversion steps and the memory each one allocates?

> Perhaps we can assume that file names codings are always ASCII-compatible

I don't think every encoding is ASCII compatible, so I don't see how we can assume that in general. But the check whether an encoding is ASCII-compatible takes a negligible amount of time, so why bother with such an assumption?

> There is already a hack in encode_file_name that assumes that no unibyte string ever needs encoding; if so, the shortcut could perhaps be extended to decode_file_name and simplified.

I'm not sure I understand what you mean by extending the shortcut to decode_file_name. Please elaborate.

> -  if (BUFFERP (dst_object))
> +  if (EQ (dst_object, Qt))
> +    {
> +      /* Fast path for ASCII-only input and an ASCII-compatible coding:
> +         act as identity.  */
> +      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
> +      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
> +          && (STRING_MULTIBYTE (string)
> +              ? (chars == bytes) : string_ascii_p (string)))
> +        return string;

I don't think we can return the same string if NOCOPY is non-zero. The callers might not expect that, and you might inadvertently cause the original string be modified behind the caller's back.

But if NOCOPY is 'false', I think this change is OK. Just make sure the test suite doesn't start failing, maybe there's something else we are missing.

Thanks.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 03 Apr 2020 22:33:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158595314921103 (code B ref 40407); Fri, 03 Apr 2020 22:33:01 +0000 Received: (at 40407) by debbugs.gnu.org; 3 Apr 2020 22:32:29 +0000 Received: from localhost ([127.0.0.1]:43283 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKUrU-0005UH-UY for submit@debbugs.gnu.org; Fri, 03 Apr 2020 18:32:29 -0400 Received: from mail155c50.megamailservers.eu ([91.136.10.165]:48798 helo=mail51c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKUrS-0005Tr-N4 for 40407@debbugs.gnu.org; Fri, 03 Apr 2020 18:32:27 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1585953144; bh=eFO0qFdUSxytWZY/DJFszDJ6veF99F1xaUxDptQLHNI=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=prfM62IMccL6xTSGDXXVkbc5AhSs3Hehhd655b6255vUJGK3PsfPijh6Jcj4bVAtk WjADttbckSgKIGt9A7CmSWVpq4RHVkzQTtFugNAPcFgtAx4pnSBFB+Rnwtt6y5C9Gm +5HowzHZgvw0+SSYnv6dX97EL7LPzh5lSVzzfkVo= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail51c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 033MWMYv021405; Fri, 3 Apr 2020 22:32:24 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <835zegwn9y.fsf@gnu.org> Date: Sat, 4 Apr 2020 00:32:21 +0200 
Content-Transfer-Encoding: quoted-printable Message-Id: References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F16.5E87B956.0011, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=MOMeZ/Rl c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=QotmkJJByNf4bd_E1JAA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 3 Apr 2020, at 18:24, Eli Zaretskii wrote:

> AFAIR, on macOS the situation is worse than elsewhere, because of the
> normalization thing.

Very likely. It's just what I had in my lap.

> Can you tell more about the conversion steps and the memory each one
> allocates?

Courtesy the memory profiler:

- file-relative-name                                 141,551  15%
  - file-name-case-insensitive-p                     100,613  11%
    - ucs-normalize-hfs-nfd-pre-write-conversion     100,613  11%
      - ucs-normalize-HFS-NFD-region                 100,613  11%
          ucs-normalize-region                       100,613  11%
  - expand-file-name                                  40,828   4%
    - ucs-normalize-hfs-nfd-post-read-conversion      40,828   4%
      - ucs-normalize-HFS-NFC-region                  40,828   4%
          ucs-normalize-region                        40,828   4%

where file_name_case_insensitive_p calls ENCODE_FILE and expand_file_name calls DECODE_FILE. I'm not sure how much each part of ucs-normalize-region actually consumes, but I think we can agree that we don't want it called on any platform unless strictly necessary.

> I don't think every encoding is ASCII compatible, so I don't see how
> we can assume that in general. But the check whether an encoding is
> ASCII-compatible takes a negligible amount of time, so why bother with
> such an assumption?

Quite, I just thought I'd ask in case there were some unwritten invariant that you knew about.

> I'm not sure I understand what you mean by extending the shortcut to
> decode_file_name. Please elaborate.

Never mind, it was an under-thought idea. The existing bootstrap hack making encode_file_name identity for any unibyte string does not seem to need or allow any symmetry in decode_file_name.

> I don't think we can return the same string if NOCOPY is non-zero.
> The callers might not expect that, and you might inadvertently cause
> the original string be modified behind the caller's back.

You are no doubt correct, but doesn't it look like the sense of NOCOPY has been inverted here? It runs contrary to the intuitive meaning and to the doc string of {encode,decode}-coding-string. In fact:

(let* ((nocopy nil)
       (x "abc")
       (y (decode-coding-string x nil nocopy nil)))
  (eq x y))
=> t

Looks like we suddenly got more work on our hands. What a surprise.

Since string mutation is so rare, I doubt it has caused any real trouble. Now, do we fix it by inverting the sense of the argument, or by renaming it to COPY? I'm fairly neutral, but there are arguments in either way, both in terms of performance and correctness. And what about internal calls to code_convert_string?

There are 193 calls to {encode, decode}-coding-string in the Emacs tree, and only 14 of them pass a non-nil value to NOCOPY. I'd be inclined to keep the semantics but rename the argument to COPY, on the grounds that no-copy is a better default; then change those 14 calls to pass nil instead, since that obviously was the intent.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 09:27:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158599239115270 (code B ref 40407); Sat, 04 Apr 2020 09:27:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 09:26:31 +0000 Received: from localhost ([127.0.0.1]:43458 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKf4Q-0003yE-Ld for submit@debbugs.gnu.org; Sat, 04 Apr 2020 05:26:30 -0400 Received: from eggs.gnu.org ([209.51.188.92]:39362) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKf4O-0003xl-6u for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 05:26:28 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:40720) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKf4J-0005mD-0m; Sat, 04 Apr 2020 05:26:23 -0400 Received: from [176.228.60.248] (port=4456 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKf4I-0004Iy-BL; Sat, 04 Apr 2020 05:26:22 -0400 Date: Sat, 04 Apr 2020 12:26:11 +0300 Message-Id: <83mu7rvbyk.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sat, 4 Apr 2020 00:32:21 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: 
list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: Mattias Engdegård
> Date: Sat, 4 Apr 2020 00:32:21 +0200
> Cc: 40407@debbugs.gnu.org
>
> - file-relative-name                                 141,551  15%
>   - file-name-case-insensitive-p                     100,613  11%
>     - ucs-normalize-hfs-nfd-pre-write-conversion     100,613  11%
>       - ucs-normalize-HFS-NFD-region                 100,613  11%
>           ucs-normalize-region                       100,613  11%
>   - expand-file-name                                  40,828   4%
>     - ucs-normalize-hfs-nfd-post-read-conversion      40,828   4%
>       - ucs-normalize-HFS-NFC-region                  40,828   4%
>           ucs-normalize-region                        40,828   4%
>
> where file_name_case_insensitive_p calls ENCODE_FILE and expand_file_name calls DECODE_FILE.

DECODE_FILE is called because the file name in question starts with a "~"? Otherwise, I don't think I understand why would expand-file-name need to decode a file name.

> I'm not sure how much each part of ucs-normalize-region actually consumes, but I think we can agree that we don't want it called on any platform unless strictly necessary.

Any expensive code should be avoided if it isn't necessary, so yes, I agree. And yes, Unicode normalization is expensive. If we consider the macOS filesystem idiosyncrasies important to support efficiently, perhaps we should rewrite the normalization code in C.

> > I don't think every encoding is ASCII compatible, so I don't see how
> > we can assume that in general. But the check whether an encoding is
> > ASCII-compatible takes a negligible amount of time, so why bother with
> > such an assumption?
>
> Quite, I just thought I'd ask in case there were some unwritten invariant that you knew about.

Whether a coding-system is ASCII-compatible is determined by the definition of that coding-system. Look in mule-conf.el, and you will see there several that aren't ASCII-compatible. UTF-16 is one example, but there are others.

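To illustrate why UTF-16 does not qualify, the following self-contained sketch encodes an ASCII-only string as UTF-16LE and compares the result with the input bytes. It does not use any Emacs internals; `ascii_to_utf16le` and `same_bytes_p` are hypothetical helper names, and only the BOM-less little-endian variant is modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Encode the N ASCII bytes at SRC into DST as UTF-16LE without a BOM.
   DST must have room for 2 * N bytes.  Return the number of bytes
   written.  */
static size_t
ascii_to_utf16le (const char *src, size_t n, unsigned char *dst)
{
  for (size_t i = 0; i < n; i++)
    {
      dst[2 * i] = (unsigned char) src[i];  /* low byte: the ASCII code */
      dst[2 * i + 1] = 0;                   /* high byte: always zero */
    }
  return 2 * n;
}

/* Return true if the ENC_LEN bytes at ENC are identical to the N bytes
   at SRC, i.e. if encoding acted as the identity.  */
static bool
same_bytes_p (const char *src, size_t n,
              const unsigned char *enc, size_t enc_len)
{
  return n == enc_len && memcmp (src, enc, n) == 0;
}
```

For the input "abc" the encoded form is six bytes long (61 00 62 00 63 00 in hex), so returning the input string unchanged would be wrong for such a coding system even though every character is ASCII.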
> > I don't think we can return the same string if NOCOPY is non-zero.
> > The callers might not expect that, and you might inadvertently cause
> > the original string be modified behind the caller's back.
>
> You are no doubt correct, but doesn't it look like the sense of NOCOPY has been inverted here?

That ship has sailed long ago (I could explain how this "inverted" meaning could make sense, but I don't think it's relevant to the issue at hand), and there are several other internal functions that use a similar argument in the same "inverted" sense. This is a separate issue, anyway.

> Since string mutation is so rare, I doubt it has caused any real trouble.

You are wrong here, it can happen very easily, especially when you manipulate the encoded string in C. The simplest use case is that you encode a file name, and then make some change to the encoded string, like change the letter-case or remove the trailing slash. Suddenly the original string is changed as well, and the Lisp caller of the high-level function might be mightily surprised by the result.

IME, the cases where we can safely assume it's OK to return the same string are actually very rare. It is no accident that you saw so few calls of these functions where we use that optional behavior.

> Now, do we fix it by inverting the sense of the argument, or by renaming it to COPY?

Neither, IMO. Again, it's a separate problem, and let's keep our sights squarely on the original issue you wanted to fix. Let's tackle the NOCOPY issue in a separate discussion, OK?

Thanks.

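The aliasing hazard described in this message can be reproduced with plain C strings. The sketch below is an assumption-laden illustration, not Emacs code: `identity_convert` and `upcase_in_place` are hypothetical, and the flag is deliberately named `share` rather than NOCOPY to sidestep the question of which sense that argument has.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical conversion whose result is byte-identical to its input.
   If SHARE is true, return the input buffer itself; otherwise return a
   freshly allocated copy that the caller must free (NULL if allocation
   fails).  */
static char *
identity_convert (char *s, bool share)
{
  if (share)
    return s;
  size_t n = strlen (s) + 1;
  char *copy = malloc (n);
  if (copy)
    memcpy (copy, s, n);
  return copy;
}

/* Convert the NUL-terminated string S to upper case in place.  */
static void
upcase_in_place (char *s)
{
  for (; *s; s++)
    *s = (char) toupper ((unsigned char) *s);
}
```

When the shared result is upcased, the caller's original buffer changes with it; when a copy is returned, the original stays intact. That is why a fast path that hands back its argument is only safe if the caller has agreed to sharing.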
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 10:27:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158599600228261 (code B ref 40407); Sat, 04 Apr 2020 10:27:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 10:26:42 +0000 Received: from localhost ([127.0.0.1]:43499 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKg0f-0007Lj-V5 for submit@debbugs.gnu.org; Sat, 04 Apr 2020 06:26:42 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50705) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKg0e-0007L9-AY for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 06:26:40 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:41119) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKg0X-0001pb-P1; Sat, 04 Apr 2020 06:26:33 -0400 Received: from [176.228.60.248] (port=4133 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKg0X-0005SM-2H; Sat, 04 Apr 2020 06:26:33 -0400 Date: Sat, 04 Apr 2020 13:26:20 +0300 Message-Id: <83imifv96b.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sat, 4 Apr 2020 00:32:21 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: 
list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: Mattias Engdegård
> Date: Sat, 4 Apr 2020 00:32:21 +0200
> Cc: 40407@debbugs.gnu.org
>
> Since string mutation is so rare, I doubt it has caused any real trouble. Now, do we fix it by inverting the sense of the argument, or by renaming it to COPY? I'm fairly neutral, but there are arguments in either way, both in terms of performance and correctness. And what about internal calls to code_convert_string?
>
> There are 193 calls to {encode, decode}-coding-string in the Emacs tree, and only 14 of them pass a non-nil value to NOCOPY. I'd be inclined to keep the semantics but rename the argument to COPY, on the grounds that no-copy is a better default; then change those 14 calls to pass nil instead, since that obviously was the intent.

After looking at this for some time, I think the problem is rarely if ever seen. The only function which has the NOCOPY sense inverted is code_convert_string, and it only does that when the CODING_SYSTEM argument is nil, which should almost never happen. So I think it's OK to change code_convert_string on master to use NOCOPY in its correct sense.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 16:42:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15860185131257 (code B ref 40407); Sat, 04 Apr 2020 16:42:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 16:41:53 +0000 Received: from localhost ([127.0.0.1]:44659 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKlrk-0000KA-Pl for submit@debbugs.gnu.org; Sat, 04 Apr 2020 12:41:53 -0400 Received: from mail1459c50.megamailservers.eu ([91.136.14.59]:40852 helo=mail267c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKlrh-0000Jd-NZ for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 12:41:50 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586018502; bh=AN8rfhemmHefjybxTqE89KrzMh+7EErEXtB2qGwC480=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=mH9TeUhEXGY4gocX9O9aky1o6316rgboekFaW52a9DNY0EuiHqh4jHILm0i1JKNSA 2ly6XjTOv/H4CSI3nJ3SE/S7RI655RiMWpJmBGxjh2OpirrmbOVbuarWF6YrGRQe2D jULrQEQ8QDakqi/Dzida+2gt13lPMT3Fzb2xcICU= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail267c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 034GfesK028093; Sat, 4 Apr 2020 16:41:42 +0000 From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Message-Id: <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> Content-Type: multipart/mixed; boundary="Apple-Mail=_9B9AC2A3-5983-46DF-AE16-BAACF6DF753B" Mime-Version: 1.0 (Mac 
OS X Mail 12.4 \(3445.104.14\)) Date: Sat, 4 Apr 2020 18:41:39 +0200 In-Reply-To: <83mu7rvbyk.fsf@gnu.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F20.5E88B887.009F, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=Cf92G4jl c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=zcUfEPlord_Q0e7CtzMA:9 a=CjuIK1q_8ugA:10 a=bdXfzROkdeKqxy9yWSUA:9 a=B2y7HmGcmWMA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.2 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --Apple-Mail=_9B9AC2A3-5983-46DF-AE16-BAACF6DF753B Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

On 4 Apr 2020, at 11:26, Eli Zaretskii wrote:

> DECODE_FILE is called because the file name in question starts with a
> "~"? Otherwise, I don't think I understand why would expand-file-name
> need to decode a file name.

Maybe it's because default-directory started with a tilde. It doesn't really matter; it's a common case, and the profiler tells us as much.

> IME, the cases where we can safely assume it's OK to return the same
> string are actually very rare. It is no accident that you saw so few
> calls of these functions where we use that optional behavior.

This does not mean that the remaining 179 calls require a copy; they just use the default value of the parameter.

> Neither, IMO. Again, it's a separate problem, and let's keep our
> sights squarely on the original issue you wanted to fix. Let's tackle
> the NOCOPY issue in a separate discussion, OK?

Thank you, a separate bug for it is fine.

Here is a revised patch which takes the nocopy parameter into account (in its inverted sense). Obviously it needs to be adapted if the nocopy inversion is dealt with first; the two bugs do not commute.

--Apple-Mail=_9B9AC2A3-5983-46DF-AE16-BAACF6DF753B Content-Disposition: attachment; filename=0001-Avoid-expensive-recoding-for-ASCII-identity-cases-bu.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Avoid-expensive-recoding-for-ASCII-identity-cases-bu.patch" Content-Transfer-Encoding: quoted-printable

From 0c6139ab490733f3c1257665535fc4ed2ad0dbe7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Fri, 3 Apr 2020 16:01:01 +0200
Subject: [PATCH] Avoid expensive recoding for ASCII identity cases (bug#40407)

Optimise for the common case of encoding or decoding an ASCII-only
string using an ASCII-compatible coding, for file names in particular.

* src/coding.c (string_ascii_p): New function.
(code_convert_string): Return the input string for ASCII-only inputs
and ASCII-compatible codings.
---
 src/coding.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/coding.c b/src/coding.c
index 0bea2a0c2b..0fdbc95939 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9471,6 +9471,17 @@ used (which may be different from CODING-SYSTEM if CODING-SYSTEM is
   return code_convert_region (start, end, coding_system, destination, 1, 0);
 }
 
+/* Whether a (unibyte) string only contains chars in the 0..127 range.  */
+static bool
+string_ascii_p (Lisp_Object str)
+{
+  ptrdiff_t nbytes = SBYTES (str);
+  for (ptrdiff_t i = 0; i < nbytes; i++)
+    if (SREF (str, i) > 127)
+      return false;
+  return true;
+}
+
 Lisp_Object
 code_convert_string (Lisp_Object string, Lisp_Object coding_system,
 		     Lisp_Object dst_object, bool encodep, bool nocopy,
@@ -9502,7 +9513,17 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
   chars = SCHARS (string);
   bytes = SBYTES (string);
 
-  if (BUFFERP (dst_object))
+  if (EQ (dst_object, Qt))
+    {
+      /* Fast path for ASCII-only input and an ASCII-compatible coding:
+         act as identity.  */
+      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
+      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
+          && (STRING_MULTIBYTE (string)
+              ? (chars == bytes) : string_ascii_p (string)))
+	return nocopy ? Fcopy_sequence (string) : string;
+    }
+  else if (BUFFERP (dst_object))
     {
       struct buffer *buf = XBUFFER (dst_object);
       ptrdiff_t buf_pt = BUF_PT (buf);
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_9B9AC2A3-5983-46DF-AE16-BAACF6DF753B--
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 16:56:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15860193394212 (code B ref 40407); Sat, 04 Apr 2020 16:56:02 +0000 Received: (at
40407) by debbugs.gnu.org; 4 Apr 2020 16:55:39 +0000 Received: from localhost ([127.0.0.1]:44682 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKm55-00015r-G1 for submit@debbugs.gnu.org; Sat, 04 Apr 2020 12:55:39 -0400 Received: from mail1463c50.megamailservers.eu ([91.136.14.63]:51544 helo=mail268c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKm51-00015C-Lk for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 12:55:38 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586019329; bh=f8WOWVRR7VVapvJlmMMjSfhmJnUA7p0EgA4XqrNVJT4=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=fX5dkJT9b2nhlM/Vc2NQkwUHH4AueW7y4hjbsQBBA/pPXcZ7qAlYo4sKfgEQWSwYQ icgxU29XCZqJVYc3Xlf0YWX4qmSyknBi0ClkA8iNzjwnJZWzV93qf1L48tENW3wSXA M6z62v0Du6ZxSx/YsQlMh9DqNr5djbTMVle8JNT0= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail268c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 034GtQRo003717; Sat, 4 Apr 2020 16:55:28 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <83imifv96b.fsf@gnu.org> Date: Sat, 4 Apr 2020 18:55:26 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F29.5E88BBC2.000D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=BZ+mLYl2 c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 
a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=HO5i985fcbljDZ1IHx0A:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.2 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 4 apr. 2020 kl. 12.26 skrev Eli Zaretskii : > After looking at this for some time, I think the problem is rarely if > ever seen. The only function which has the NOCOPY sense inverted is > code_convert_string, and it only does that when the CODI [...] Content analysis details: (1.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.3 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

4 apr. 2020 kl. 12.26 skrev Eli Zaretskii :

> After looking at this for some time, I think the problem is rarely if
> ever seen. The only function which has the NOCOPY sense inverted is
> code_convert_string, and it only does that when the CODING_SYSTEM
> argument is nil, which should almost never happen.

Oh, that's the easy part. It's the ASCII optimisation that makes it interesting.
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 17:05:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158601988713976 (code B ref 40407); Sat, 04 Apr 2020 17:05:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 17:04:47 +0000 Received: from localhost ([127.0.0.1]:44696 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKmDv-0003dL-E9 for submit@debbugs.gnu.org; Sat, 04 Apr 2020 13:04:47 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44525) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKmDu-0003cr-2L for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 13:04:46 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:46378) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKmDo-000279-WE; Sat, 04 Apr 2020 13:04:41 -0400 Received: from [176.228.60.248] (port=1160 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKmDm-0006IU-Md; Sat, 04 Apr 2020 13:04:40 -0400 Date: Sat, 04 Apr 2020 20:04:27 +0300 Message-Id: <83r1x3tc6c.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sat, 4 Apr 2020 18:55:26 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 
2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Sat, 4 Apr 2020 18:55:26 +0200 > Cc: 40407@debbugs.gnu.org > > 4 apr. 2020 kl. 12.26 skrev Eli Zaretskii : > > > After looking at this for some time, I think the problem is rarely if > > ever seen. The only function which has the NOCOPY sense inverted is > > code_convert_string, and it only does that when the CODING_SYSTEM > > argument is nil, which should almost never happen. > > Oh, that's the easy part. It's the ASCII optimisation that makes it interesting. How so? From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 17:23:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158602097917745 (code B ref 40407); Sat, 04 Apr 2020 17:23:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 17:22:59 +0000 Received: from localhost ([127.0.0.1]:44701 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKmVW-0004c8-U8 for submit@debbugs.gnu.org; Sat, 04 Apr 2020 13:22:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:46184) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKmVV-0004bk-0j for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 13:22:57 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:46533) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) 
id 1jKmVP-0006TS-OM; Sat, 04 Apr 2020 13:22:51 -0400 Received: from [176.228.60.248] (port=2258 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKmVO-00069W-QX; Sat, 04 Apr 2020 13:22:51 -0400 Date: Sat, 04 Apr 2020 20:22:37 +0300 Message-Id: <83pncntbc2.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sat, 4 Apr 2020 18:41:39 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Sat, 4 Apr 2020 18:41:39 +0200 > Cc: 40407@debbugs.gnu.org > > > DECODE_FILE is called because the file name in question starts with a > > "~"? Otherwise, I don't think I understand why would expand-file-name > > need to decode a file name. > > Maybe it's because default-directory started with a tilde. It doesn't really matter; it's a common case, and the profiler tells us as much. I think it's important that we understand what happens here to the last detail, but okay. > > IME, the cases where we can safely assume it's OK to return the same > > string are actually very rare. It is no accident that you saw so few > > calls of these functions where we use that optional behavior. > > This does not mean that the remaining 179 calls require a copy; they just use the default value of the parameter. 
And IMO the default must stay that a copy is returned, except when the caller says otherwise. > + if (EQ (dst_object, Qt)) > + { > + /* Fast path for ASCII-only input and an ASCII-compatible coding: > + act as identity. */ > + Lisp_Object attrs = CODING_ID_ATTRS (coding.id); > + if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs)) > + && (STRING_MULTIBYTE (string) > + ? (chars == bytes) : string_ascii_p (string))) > + return nocopy ? Fcopy_sequence (string) : string; I think in the use case where we return a copy, we should make sure the return value is unibyte when encoding and multibyte when decoding. Otherwise, I think this is OK (for the master branch, obviously). Thanks. From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 17:39:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: mattiase@acm.org Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158602188521032 (code B ref 40407); Sat, 04 Apr 2020 17:39:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 17:38:05 +0000 Received: from localhost ([127.0.0.1]:44726 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKmk9-0005T8-CE for submit@debbugs.gnu.org; Sat, 04 Apr 2020 13:38:05 -0400 Received: from eggs.gnu.org ([209.51.188.92]:47767) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKmk7-0005SL-Ba for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 13:38:03 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:46785) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKmk2-0006ma-1w; Sat, 04 Apr 2020 13:37:58 -0400 Received: from [176.228.60.248] (port=3177 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa 
(TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKmk1-0001Bo-1E; Sat, 04 Apr 2020 13:37:57 -0400 Date: Sat, 04 Apr 2020 20:37:47 +0300 Message-Id: <83o8s7tams.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <83pncntbc2.fsf@gnu.org> (message from Eli Zaretskii on Sat, 04 Apr 2020 20:22:37 +0300) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > Date: Sat, 04 Apr 2020 20:22:37 +0300 > From: Eli Zaretskii > Cc: 40407@debbugs.gnu.org > > > + if (EQ (dst_object, Qt)) > > + { > > + /* Fast path for ASCII-only input and an ASCII-compatible coding: > > + act as identity. */ > > + Lisp_Object attrs = CODING_ID_ATTRS (coding.id); > > + if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs)) > > + && (STRING_MULTIBYTE (string) > > + ? (chars == bytes) : string_ascii_p (string))) > > + return nocopy ? Fcopy_sequence (string) : string; > > I think in the use case where we return a copy, we should make sure > the return value is unibyte when encoding and multibyte when decoding. > Otherwise, I think this is OK (for the master branch, obviously). Btw, if we want this particular use case to be as fast as possible, then Fcopy_sequence is not the best way, because it is not optimized for the case of copying a single string. We could do better by calling make_uninit_multibyte/unibyte_string and memcpy directly. 
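The "allocate uninitialised storage, then memcpy" pattern suggested above can be illustrated without any Emacs internals. The sketch below is a standalone model, not Emacs code: `copy_bytes` is a hypothetical helper, and plain `malloc` stands in for the Emacs string allocators named in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "make an uninitialised string of the right
   size and memcpy the payload into it".  Returns a freshly allocated
   copy of SRC[0..NBYTES), or NULL if allocation fails.  The caller
   owns the result and must free it.  */
static unsigned char *
copy_bytes (const unsigned char *src, size_t nbytes)
{
  assert (src != NULL);
  /* One extra byte so the copy is also usable as a C string.  */
  unsigned char *dst = malloc (nbytes + 1);
  if (dst == NULL)
    return NULL;
  memcpy (dst, src, nbytes);
  dst[nbytes] = '\0';
  return dst;
}
```

Because the size is known up front, the copy is a single allocation plus a single `memcpy`, with no generic sequence dispatch in between.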
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 18:02:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158602327926403 (code B ref 40407); Sat, 04 Apr 2020 18:02:01 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 18:01:19 +0000 Received: from localhost ([127.0.0.1]:44789 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKn6d-0006rk-IW for submit@debbugs.gnu.org; Sat, 04 Apr 2020 14:01:19 -0400 Received: from mail179c50.megamailservers.eu ([91.136.10.189]:35410 helo=mail18c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKn6a-0006rR-QE for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 14:01:18 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586023275; bh=wQRSuqnnSjNkdznrggL8JPf/dcAQWCPB7zxN8S62bxc=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=XJQA/3vbeiSH0f43/pshOsc3KNTv/MD+H2/+22G0appMEz6JKQ+jRO7m+oOlvieFK xNgFnowiXvN0ODoeqqo4Q0kMZ4QG4MA31oSqpdooSNUoIqin1LV/aEITC0NjW6EIGh 6fGjMg3D4SsPRJZW3j1KAi0OXsI8JFp9aoc03SdI= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail18c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 034I1C6D024588; Sat, 4 Apr 2020 18:01:14 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <83r1x3tc6c.fsf@gnu.org> Date: Sat, 4 Apr 2020 20:01:12 +0200 
Content-Transfer-Encoding: quoted-printable Message-Id: <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F1F.5E88CB47.0074, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=K8Zc4BeI c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=ah6caREQ2aC89ADGwaIA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

4 apr. 2020 kl. 19.04 skrev Eli Zaretskii :

> How so?

Because then NOCOPY suddenly matters for almost all coding systems, not just nil. After all, an all-ASCII input string and ASCII-compatible coding is not an unusual combination. This forces us to be careful when correcting the NOCOPY sense, and may expose latent bugs.

But you are right: we should trust calls to {en,de}code-coding-system with NOCOPY=t, and the rest can also remain as they are until someone cares enough to change them.
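Why NOCOPY becomes observable once an identity fast path exists can be seen in a small standalone model. The sketch below is an assumption-laden illustration, not the Emacs implementation: `identity_fast_path` and `all_ascii` are hypothetical names, byte buffers stand in for Lisp strings, and `nocopy` is used in its documented sense (true means the caller accepts the input object itself).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* True if every byte in BUF[0..N) is in the range 0..127.  */
static bool
all_ascii (const unsigned char *buf, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (buf[i] > 127)
      return false;
  return true;
}

/* Hypothetical model of the identity fast path.  If the input is
   all-ASCII and the coding is ASCII-compatible, conversion is a no-op,
   so NOCOPY alone decides whether the caller receives IN itself or a
   fresh copy (which the caller must free).  Returns NULL when the fast
   path does not apply or when allocation fails.  */
static const unsigned char *
identity_fast_path (const unsigned char *in, size_t n,
                    bool ascii_compat, bool nocopy)
{
  assert (in != NULL);
  if (!ascii_compat || !all_ascii (in, n))
    return NULL;                /* The slow path would run here.  */
  if (nocopy)
    return in;                  /* Same object: must not be mutated.  */
  unsigned char *copy = malloc (n + 1);
  if (copy == NULL)
    return NULL;
  memcpy (copy, in, n);
  copy[n] = '\0';
  return copy;
}
```

In this model, a caller that mutates the result is safe only when it asked for a copy. That is the latent-bug risk raised above: with the fast path in place, the flag changes behaviour for every ASCII-only input under an ASCII-compatible coding, not just when the coding system is nil.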
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 18:07:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158602359427536 (code B ref 40407); Sat, 04 Apr 2020 18:07:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 18:06:34 +0000 Received: from localhost ([127.0.0.1]:44793 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKnBi-0007A4-6L for submit@debbugs.gnu.org; Sat, 04 Apr 2020 14:06:34 -0400 Received: from mail1434c50.megamailservers.eu ([91.136.14.34]:35582 helo=mail263c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKnBg-00079Y-KS for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 14:06:33 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586023586; bh=sHhbXxhZIimD2pgSrApfqDBXE2fWNMqt9ssJtycePso=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=gYDqTXQR9C2COLBcF/Z2W0lW25Rp7ViixNFCdBzBsOtuqgJE5xGpSzUONiMEroLlj AQQsbaDGqhF80PGTYSgqNy02hJvne5sUn9OQtDD1n5SbEVVCEdAn+fOMwQNRi9a+SO hDBz0hISXSTmql26Pr2sAKFUcIYksMJo26KY44HM= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail263c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 034I6OTD019478; Sat, 4 Apr 2020 18:06:25 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <83o8s7tams.fsf@gnu.org> Date: Sat, 4 Apr 2020 20:06:23 
+0200 Content-Transfer-Encoding: quoted-printable Message-Id: <0D7F6CFF-89DB-4A0F-8F4C-3F0C7E7235D8@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> <83o8s7tams.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F25.5E88CC70.0074, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=e6d4tph/ c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=TomUGQFZFQpKtidUOcoA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.2 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 4 apr. 2020 kl. 19.37 skrev Eli Zaretskii : > Btw, if we want this particular use case to be as fast as possible, > then Fcopy_sequence is not the best way, because it is not optimized > for the case of copying a single string. We could do bett [...] Content analysis details: (1.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. 
[URIs: megamailservers.eu] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.3 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

4 apr. 2020 kl. 19.37 skrev Eli Zaretskii :

> Btw, if we want this particular use case to be as fast as possible,
> then Fcopy_sequence is not the best way, because it is not optimized
> for the case of copying a single string. We could do better by
> calling make_uninit_multibyte/unibyte_string and memcpy directly.

Yes, if that would provide a benefit. The pattern should probably be encapsulated in copy_string or similar, if it doesn't already exist. (Should it copy properties? Probably not.)

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Apr 2020 18:26:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158602474631646 (code B ref 40407); Sat, 04 Apr 2020 18:26:02 +0000 Received: (at 40407) by debbugs.gnu.org; 4 Apr 2020 18:25:46 +0000 Received: from localhost ([127.0.0.1]:44818 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKnUI-0008EL-5A for submit@debbugs.gnu.org; Sat, 04 Apr 2020 14:25:46 -0400 Received: from eggs.gnu.org ([209.51.188.92]:54584) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKnUG-0008Dw-PN for
40407@debbugs.gnu.org; Sat, 04 Apr 2020 14:25:45 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:47595) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKnUB-0007mL-E8; Sat, 04 Apr 2020 14:25:39 -0400 Received: from [176.228.60.248] (port=2131 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKnU8-0006MY-0E; Sat, 04 Apr 2020 14:25:38 -0400 Date: Sat, 04 Apr 2020 21:25:20 +0300 Message-Id: <83imift8fj.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sat, 4 Apr 2020 20:01:12 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Sat, 4 Apr 2020 20:01:12 +0200 > Cc: 40407@debbugs.gnu.org > > 4 apr. 2020 kl. 19.04 skrev Eli Zaretskii : > > > How so? > > Because then NOCOPY suddenly matters for almost all coding systems, not just nil. After all, an all-ASCII input string and ASCII-compatible coding is not an unusual combination. This forces us to be careful when correcting the NOCOPY sense, and may expose latent bugs. That's true. But as a matter of fact, I don't see any calls to code_convert_string with NOCOPY non-zero, they all pass zero or false to it. 
So none of the existing direct calls from C wants or expects to get the same string. > But you are right: we should trust calls to {en,de}code-coding-system with NOCOPY=t, and the rest can also remain as they are until someone cares enough to change them. Agreed. Btw, this bug was introduced in commit 4031e2b, 18 years ago. From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 02:38:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158605424115538 (code B ref 40407); Sun, 05 Apr 2020 02:38:02 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 02:37:21 +0000 Received: from localhost ([127.0.0.1]:45012 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKvA0-00042Y-Jb for submit@debbugs.gnu.org; Sat, 04 Apr 2020 22:37:20 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44354) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKv9y-00042K-Ju for 40407@debbugs.gnu.org; Sat, 04 Apr 2020 22:37:19 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:54552) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKv9s-0001lP-Bm; Sat, 04 Apr 2020 22:37:12 -0400 Received: from [176.228.60.248] (port=4061 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jKv9r-00086L-QC; Sat, 04 Apr 2020 22:37:12 -0400 Date: Sun, 05 Apr 2020 05:37:03 +0300 Message-Id: <83ftdiu08g.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <0D7F6CFF-89DB-4A0F-8F4C-3F0C7E7235D8@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sat, 4 Apr 2020 
20:06:23 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> <83o8s7tams.fsf@gnu.org> <0D7F6CFF-89DB-4A0F-8F4C-3F0C7E7235D8@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Sat, 4 Apr 2020 20:06:23 +0200 > Cc: 40407@debbugs.gnu.org > > 4 apr. 2020 kl. 19.37 skrev Eli Zaretskii : > > > Btw, if we want this particular use case to be as fast as possible, > > then Fcopy_sequence is not the best way, because it is not optimized > > for the case of copying a single string. We could do better by > > calling make_uninit_multibyte/unibyte_string and memcpy directly. > > Yes, if that would provide a benefit. The pattern should probably be encapsulated in copy_string or similar, if it doesn't already exist. I wouldn't make it a separate function for the benefit of just one caller. Every function call is a slowdown, albeit a small one. > (Should it copy properties? Probably not.) Definitely not. 
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 03:43:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: 40407@debbugs.gnu.org, mattiase@acm.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org, Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= X-Debbugs-Original-Cc: 40407@debbugs.gnu.org Received: via spool by submit@debbugs.gnu.org id=B.158605818029278 (code B ref -1); Sun, 05 Apr 2020 03:43:02 +0000 Received: (at submit) by debbugs.gnu.org; 5 Apr 2020 03:43:00 +0000 Received: from localhost ([127.0.0.1]:45026 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKwBX-0007cA-P4 for submit@debbugs.gnu.org; Sat, 04 Apr 2020 23:43:00 -0400 Received: from lists.gnu.org ([209.51.188.17]:36854) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jKwBV-0007bw-OS for submit@debbugs.gnu.org; Sat, 04 Apr 2020 23:42:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35161) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jKwBU-0004UR-Hj for bug-gnu-emacs@gnu.org; Sat, 04 Apr 2020 23:42:57 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00, URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:470:142:3::e]:55097) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jKwBU-0008Pb-9c; Sat, 04 Apr 2020 23:42:56 -0400 Received: from [176.12.139.238] (port=36347 helo=[10.210.72.17]) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1jKwBT-00015N-Iy; Sat, 04 Apr 2020 23:42:56 -0400 Date: Sun, 05 Apr 2020 06:42:51 +0300 User-Agent: K-9 
Mail for Android In-Reply-To: <83ftdiu08g.fsf@gnu.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> <83o8s7tams.fsf@gnu.org> <0D7F6CFF-89DB-4A0F-8F4C-3F0C7E7235D8@acm.org> <83ftdiu08g.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable From: Eli Zaretskii Message-ID: <0A93250C-0851-4608-B8B6-F36D993494FB@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

On April 5, 2020 5:37:03 AM GMT+03:00, Eli Zaretskii wrote:
> > From: Mattias Engdegård
> > Date: Sat, 4 Apr 2020 20:06:23 +0200
> > Cc: 40407@debbugs.gnu.org
> >
> > On 4 Apr 2020, at 19:37, Eli Zaretskii wrote:
> >
> > > Btw, if we want this particular use case to be as fast as possible,
> > > then Fcopy_sequence is not the best way, because it is not optimized
> > > for the case of copying a single string.  We could do better by
> > > calling make_uninit_multibyte/unibyte_string and memcpy directly.
> >
> > Yes, if that would provide a benefit.  The pattern should probably be encapsulated in copy_string or similar, if it doesn't already exist.
>
> I wouldn't make it a separate function for the benefit of just one
> caller.  Every function call is a slowdown, albeit a small one.

However, we already have those functions ready, see make_unibyte_string and make_multibyte_string.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?=
Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 10:16:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158608171232172 (code B ref 40407); Sun, 05 Apr 2020 10:16:01 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 10:15:12 +0000 Received: from localhost ([127.0.0.1]:45225 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL2J6-0008Mq-7u for submit@debbugs.gnu.org; Sun, 05 Apr 2020 06:15:12 -0400 Received: from mail1442c50.megamailservers.eu ([91.136.14.42]:52684 helo=mail264c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL2J3-0008Lz-CO for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 06:15:10 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586081702; bh=JF3pzlSmqPRs875YiToej9adftjRo90r9EOd+TY/4Bs=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=nj1aGXTKCa/q8N/jDhTRv6cL+smazjopZsi9wSvWOEa+Mtwx4hDUrj9+YZVRQdhnK FeVBRp+iyOjOOjmJDKWFPbwMIXG6UNIEtoYGzPYL+qa1JnF6pv9xu+jsmDLbCezw9i sN489GsIY2Pg8YgeN/aYlYbHtFelDcJD4P7TjqWI= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail264c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 035AExXM030946; Sun, 5 Apr 2020 10:15:01 +0000 From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Message-Id: <038251F3-AAA0-4528-ADB3-6E29F5A51B82@acm.org> Content-Type: multipart/mixed; boundary="Apple-Mail=_D7BDBD8D-5921-40C0-8C17-134642884BAB" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) Date: Sun, 5 Apr 2020 12:14:59 +0200 In-Reply-To: <83pncntbc2.fsf@gnu.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> 
<835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F22.5E89AF82.001D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=PPNxBsiC c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=prhDnmYfxZWMZ_zM81YA:9 a=CjuIK1q_8ugA:10 a=q1fQKG-m5HQ5ZqqctQsA:9 a=B2y7HmGcmWMA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.2 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 4 apr. 2020 kl. 19.22 skrev Eli Zaretskii : >> This does not mean that the remaining 179 calls require a copy; they just use the default value of the parameter. > > And IMO the default must stay that a copy is returned, except when the > caller [...] Content analysis details: (1.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. 
[URIs: gnu.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.3 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --Apple-Mail=_D7BDBD8D-5921-40C0-8C17-134642884BAB Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

On 4 Apr 2020, at 19:22, Eli Zaretskii wrote:

>> This does not mean that the remaining 179 calls require a copy; they just use the default value of the parameter.
>
> And IMO the default must stay that a copy is returned, except when the
> caller says otherwise.

Yes, those can be dealt with piecemeal, and we are in no hurry to do so.

> I think in the use case where we return a copy, we should make sure
> the return value is unibyte when encoding and multibyte when decoding.

I'm not necessarily opposed to the suggestion, but why not return a unibyte string in both cases, simplifying the code? In addition, some operations (aref) are faster on unibyte. Either way, it's nothing that a caller could rely on, is it? (In particular when taking NOCOPY into account.)

> Otherwise, I think this is OK (for the master branch, obviously).

Indeed the intention, thanks.

Here is what I would commit, unless you think the string copy should really be multibyte when decoding.
--Apple-Mail=_D7BDBD8D-5921-40C0-8C17-134642884BAB Content-Disposition: attachment; filename=0001-Avoid-expensive-recoding-for-ASCII-identity-cases-bu.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Avoid-expensive-recoding-for-ASCII-identity-cases-bu.patch" Content-Transfer-Encoding: quoted-printable

From 63400cc62506d2c3d9d5f2f27e7bb3bfe7f8f877 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Fri, 3 Apr 2020 16:01:01 +0200
Subject: [PATCH] Avoid expensive recoding for ASCII identity cases (bug#40407)

Optimise for the common case of encoding or decoding an ASCII-only
string using an ASCII-compatible coding, for file names in particular.

* src/coding.c (string_ascii_p): New function.
(code_convert_string): Return the input string for ASCII-only inputs
and ASCII-compatible codings.
* test/src/coding-tests.el (coding-nocopy-ascii): New test.
---
 src/coding.c             | 23 ++++++++++++++++++++++-
 test/src/coding-tests.el | 11 +++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/coding.c b/src/coding.c
index 1049f1b755..2425f5952f 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9471,6 +9471,17 @@ used (which may be different from CODING-SYSTEM if CODING-SYSTEM is
   return code_convert_region (start, end, coding_system, destination, 1, 0);
 }
 
+/* Whether a (unibyte) string only contains chars in the 0..127 range.  */
+static bool
+string_ascii_p (Lisp_Object str)
+{
+  ptrdiff_t nbytes = SBYTES (str);
+  for (ptrdiff_t i = 0; i < nbytes; i++)
+    if (SREF (str, i) > 127)
+      return false;
+  return true;
+}
+
 Lisp_Object
 code_convert_string (Lisp_Object string, Lisp_Object coding_system,
 		     Lisp_Object dst_object, bool encodep, bool nocopy,
@@ -9502,7 +9513,17 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
   chars = SCHARS (string);
   bytes = SBYTES (string);
 
-  if (BUFFERP (dst_object))
+  if (EQ (dst_object, Qt))
+    {
+      /* Fast path for ASCII-only input and an ASCII-compatible coding:
+         act as identity.  */
+      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
+      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
+          && (STRING_MULTIBYTE (string)
+              ? (chars == bytes) : string_ascii_p (string)))
+	return nocopy ? string : make_unibyte_string (SDATA (string), bytes);
+    }
+  else if (BUFFERP (dst_object))
     {
       struct buffer *buf = XBUFFER (dst_object);
       ptrdiff_t buf_pt = BUF_PT (buf);
diff --git a/test/src/coding-tests.el b/test/src/coding-tests.el
index 110ff12696..93e6709d44 100644
--- a/test/src/coding-tests.el
+++ b/test/src/coding-tests.el
@@ -383,6 +383,17 @@ coding-nocopy-trivial
     (should-not (eq (encode-coding-string s nil nil) s))
     (should (eq (encode-coding-string s nil t) s))))
 
+(ert-deftest coding-nocopy-ascii ()
+  "Check that the NOCOPY parameter works for ASCII-only strings."
+  (let* ((uni (apply #'string (number-sequence 0 127)))
+         (multi (string-to-multibyte uni)))
+    (dolist (s (list uni multi))
+      (dolist (coding '(us-ascii iso-latin-1 utf-8))
+        (should-not (eq (decode-coding-string s coding nil) s))
+        (should-not (eq (encode-coding-string s coding nil) s))
+        (should (eq (decode-coding-string s coding t) s))
+        (should (eq (encode-coding-string s coding t) s))))))
+
 ;; Local Variables:
 ;; byte-compile-warnings: (not obsolete)
 ;; End:
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_D7BDBD8D-5921-40C0-8C17-134642884BAB--

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 10:50:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15860837437208 (code B ref 40407); Sun, 05 Apr 2020 10:50:02 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 10:49:03 +0000 Received: from localhost ([127.0.0.1]:45253 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL2pr-0001sA-Em for submit@debbugs.gnu.org; Sun, 05 Apr 2020 06:49:03 -0400 Received: from mail212c50.megamailservers.eu ([91.136.10.222]:48612 helo=mail194c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL2pp-0001rJ-4H for 40407@debbugs.gnu.org;
Sun, 05 Apr 2020 06:49:02 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586083734; bh=Qsh3rxSsePlA/U21FLZT78lAk8DtoI8V+bqzWoGBMPE=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=lRvSf/NMOGNA880ZOTj0wiwnFn+7FQ1UMt2bR7WpJMDcGskTy1nVjRJpobpQAm7yK 1uu7ukk/1KoMP6u0DyUBdQT85OiNsqI0K5sW1VnjS6nkbsZ3nL4dDdNyQHnI8SCcEE U9LtWhjGp1gGCQPuNsB6Y34VPW7c4Zs1xKozCN2s= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail194c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 035Ampk8019188; Sun, 5 Apr 2020 10:48:53 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <83imift8fj.fsf@gnu.org> Date: Sun, 5 Apr 2020 12:48:51 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F17.5E89B76C.0013, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=KsozJleN c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=wv5BUt5QEtqpqBFRnnYA:9 a=FMZy9d7vl412MyZU:21 a=BBryQZiZ5wwlhcHs:21 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 4 Apr 2020, at 20:25, Eli Zaretskii wrote:

> That's true.  But as a matter of fact, I don't see any calls to
> code_convert_string with NOCOPY non-zero, they all pass zero or false
> to it.  So none of the existing direct calls from C wants or expects
> to get the same string.

Right. However, I did some reading and believe that nocopy=true is actually correct for all uses of {EN,DE}CODE_FILE, and in fact all calls to code_convert_string_norecord. One of the reasons is that the callers need to be careful with mutation wrt GC anyway; any post-recoding mutation is done on copies. (Not being able to change the length of strings also helps.)

I pushed what we agreed on (in part for the pleasure of resolving such a long-standing bug) to master (962562cde4).

Given the limited scope of the change, would you agree to a backport of that to emacs-27?

For the reasons above, I think it's correct and proper to do (on master)

--- a/src/coding.c
+++ b/src/coding.c
@@ -9554,7 +9554,7 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
 code_convert_string_norecord (Lisp_Object string, Lisp_Object coding_system,
 			      bool encodep)
 {
-  return code_convert_string (string, coding_system, Qt, encodep, 0, 1);
+  return code_convert_string (string, coding_system, Qt, encodep, 1, 1);
 }

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 13:29:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158609330916401 (code B ref 40407); Sun, 05 Apr 2020 13:29:02 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 13:28:29 +0000 Received: from localhost ([127.0.0.1]:45366 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL5K9-0004GR-73 for submit@debbugs.gnu.org; Sun, 05 Apr 2020 09:28:29 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40299) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL5K7-0004G2-QD for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 09:28:28 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:36707) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jL5K2-00039B-Ie; Sun, 05 Apr 2020 09:28:22 -0400 Received: from [176.228.60.248] (port=3711 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jL5K0-0006Rd-Rm; Sun, 05 Apr 2020 09:28:21 -0400 Date: Sun, 05 Apr 2020 16:28:13 +0300 Message-Id: <835zeet636.fsf@gnu.org> From: Eli Zaretskii
In-Reply-To: <038251F3-AAA0-4528-ADB3-6E29F5A51B82@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sun, 5 Apr 2020 12:14:59 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> <038251F3-AAA0-4528-ADB3-6E29F5A51B82@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Sun, 5 Apr 2020 12:14:59 +0200 > Cc: 40407@debbugs.gnu.org > > > I think in the use case where we return a copy, we should make sure > > the return value is unibyte when encoding and multibyte when decoding. > > I'm not necessarily opposed to the suggestion, but why not return a unibyte string in both cases, simplifying the code? For compatibility with what happens now: (multibyte-string-p (decode-coding-string "abc" 'utf-8)) => t > In addition, some operations (aref) are faster on unibyte. Either way, it's nothing that a caller could rely on, is there? (In particular when taking NOCOPY into account.) That is true, of course, but many/most of our strings are multibyte nowadays, even if they are ASCII. Suddenly getting a unibyte string instead would be surprising, I think, even if no one should depend on it not happening. (NOCOPY case is different: then it's the caller's responsibility to deal with the issue.) So I'd rather we produced a multibyte string when "decoding" by copying. > +/* Whether a (unibyte) string only contains chars in the 0..127 range. 
*/ One subtle point regarding this comment: I'd remove the "unibyte" part, because (1) you apply this test to multibyte strings as well, and (2) strings encoded in iso-2022 will look "pure-ASCII", but they aren't. The latter subtlety doesn't interfere with the caller, because iso-2022 is not ASCII-compatible, but it's something I'd mention in the comment, lest someone uses this function for some other use case. The patch is OK otherwise. Thanks. From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 13:40:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158609398218831 (code B ref 40407); Sun, 05 Apr 2020 13:40:01 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 13:39:42 +0000 Received: from localhost ([127.0.0.1]:45374 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL5Uz-0004td-OH for submit@debbugs.gnu.org; Sun, 05 Apr 2020 09:39:42 -0400 Received: from eggs.gnu.org ([209.51.188.92]:41529) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL5Uy-0004tC-9D for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 09:39:40 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:36871) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jL5Us-00023L-72; Sun, 05 Apr 2020 09:39:34 -0400 Received: from [176.228.60.248] (port=4387 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jL5Ur-0004ds-Ij; Sun, 05 Apr 2020 09:39:34 -0400 Date: Sun, 05 Apr 2020 16:39:25 +0300 Message-Id: <834ktyt5ki.fsf@gnu.org> From: Eli Zaretskii 
In-Reply-To: <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sun, 5 Apr 2020 12:48:51 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: Mattias Engdegård
> Date: Sun, 5 Apr 2020 12:48:51 +0200
> Cc: 40407@debbugs.gnu.org
>
> On 4 Apr 2020, at 20:25, Eli Zaretskii wrote:
>
> > That's true.  But as a matter of fact, I don't see any calls to
> > code_convert_string with NOCOPY non-zero, they all pass zero or false
> > to it.  So none of the existing direct calls from C wants or expects
> > to get the same string.
>
> Right. However, I did some reading and believe that nocopy=true is actually correct for all uses of {EN,DE}CODE_FILE, and in fact all calls to code_convert_string_norecord.

I don't think I follow.  We call code_convert_string_norecord, which
invokes code_convert_string with NOCOPY set to 'false'.  So all those
users should NOT receive the same string as the argument, and I don't
believe they expect that and can cope with it.

Perhaps what you meant was that NOCOPY = false was actually acting as
if the value were 'true', due to the bug.  But if so, that didn't
affect ENCODE_FILE and DECODE_FILE, because the bug is only visible
when the CODING_SYSTEM argument is nil, and both ENCODE_FILE and
DECODE_FILE never let that happen:

  if (! NILP (Vfile_name_coding_system))
    return code_convert_string_norecord (fname, Vfile_name_coding_system, 1);
  else if (! NILP (Vdefault_file_name_coding_system))
    return code_convert_string_norecord (fname, Vdefault_file_name_coding_system, 1);
  else
    return fname;

So in practice this bug was probably never seen until now.

> One of the reasons is that the callers need to be careful with mutation wrt GC anyway; any post-recoding mutation is done on copies. (Not being able to change the length of strings also helps.)

I don't think I understand your line of reasoning here.  I don't think
GC is relevant, and as long as we are talking about file names, the
first null byte terminates it even though the Lisp string's length
could be larger.

> Given the limited scope of the change, would you agree to a backport of that to emacs-27?

That'd be a mistake, I think.  My reasoning goes like this: If I'm
right that this bug was never seen, fixing it on emacs-27 will have no
visible effect; and if I'm wrong, then we will break the release
branch.  The danger of breakage in the latter case is much more severe
than the gain from the fix in the former case.

> For the reasons above, I think it's correct and proper to do (on master)
>
> --- a/src/coding.c
> +++ b/src/coding.c
> @@ -9554,7 +9554,7 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
>  code_convert_string_norecord (Lisp_Object string, Lisp_Object coding_system,
>  			      bool encodep)
>  {
> -  return code_convert_string (string, coding_system, Qt, encodep, 0, 1);
> +  return code_convert_string (string, coding_system, Qt, encodep, 1, 1);
>  }

I hope you now agree with me that we should not do this.  The default
should stay NOCOPY = false, and any caller that wants otherwise must
explicitly request that by calling code_convert_string.

Thanks.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 13:41:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158609404919064 (code B ref 40407); Sun, 05 Apr 2020 13:41:01 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 13:40:49 +0000 Received: from localhost ([127.0.0.1]:45378 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL5W5-0004xP-7o for submit@debbugs.gnu.org; Sun, 05 Apr 2020 09:40:49 -0400 Received: from mail1445c50.megamailservers.eu ([91.136.14.45]:42560 helo=mail265c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL5W2-0004wu-8Y for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 09:40:47 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586094039; bh=QnIKWgMVVdg8NwFpcmenIHMU0H+/XHB0iKaq1gSnFJI=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=O4LlSVHGbmKTPz0LymQvB0+GH/4iTZg0C4kGhvJZk+3ZDlEq/qHcPGMJfpHC2jWzB LaDJ+ina2sh3hWbd37Zt3ipYrRQgcsUrZYwx0EWuTk6HaDyvpbbF7FSZMTWvFjTXjO HP90qKyFhnsZCEgpCHxzSfgY5u7sqXiuo+T0JwwM= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail265c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 035Debj2022691; Sun, 5 Apr 2020 13:40:38 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac
OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <835zeet636.fsf@gnu.org> Date: Sun, 5 Apr 2020 15:40:36 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83mu7rvbyk.fsf@gnu.org> <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org> <83pncntbc2.fsf@gnu.org> <038251F3-AAA0-4528-ADB3-6E29F5A51B82@acm.org> <835zeet636.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F1A.5E89DFB2.005E, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=D5w51cZj c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=J67hpjz6vsPl-x7vM6sA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 5 apr. 2020 kl. 15.28 skrev Eli Zaretskii : > That is true, of course, but many/most of our strings are multibyte > nowadays, even if they are ASCII. Suddenly getting a unibyte string > instead would be surprising, I think, even if no one shoul [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. 
[URIs: gnu.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 5 Apr 2020, at 15:28, Eli Zaretskii wrote:

> That is true, of course, but many/most of our strings are multibyte
> nowadays, even if they are ASCII.  Suddenly getting a unibyte string
> instead would be surprising, I think, even if no one should depend on
> it not happening.  (NOCOPY case is different: then it's the caller's
> responsibility to deal with the issue.)  So I'd rather we produced a
> multibyte string when "decoding" by copying.

I don't agree fully but it is definitely not a strongly held opinion, so I followed your suggestion.

> One subtle point regarding this comment: I'd remove the "unibyte"
> part

Right, done. Thanks for the reviews!
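[Editorial sketch, not part of the thread.] The review point about the ASCII check can be made concrete with a small standalone analogue. It assumes plain byte buffers instead of Lisp strings; `bytes_ascii_p` and `can_skip_conversion` are hypothetical names for this illustration, and the `coding_ascii_compatible` flag stands in for whatever attribute the coding system carries. The point it tries to show is the one raised in review: an all-ASCII byte sequence is not necessarily ASCII text (ISO-2022 escape sequences are made of bytes below 128), so the identity shortcut is only sound when the coding itself is ASCII-compatible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Whether the NBYTES bytes at DATA are all in the 0..127 range.
   Note: this says nothing about what the bytes mean.  Text encoded
   in ISO-2022 passes this test although it is not plain ASCII.  */
static bool
bytes_ascii_p (const unsigned char *data, size_t nbytes)
{
  for (size_t i = 0; i < nbytes; i++)
    if (data[i] > 127)
      return false;
  return true;
}

/* Hypothetical guard for an identity fast path: conversion may be
   skipped only if the coding is ASCII-compatible AND the input
   consists solely of bytes below 128.  */
static bool
can_skip_conversion (const unsigned char *data, size_t nbytes,
                     bool coding_ascii_compatible)
{
  return coding_ascii_compatible && bytes_ascii_p (data, nbytes);
}
```

With an ISO-2022 style input such as ESC `$` `B`, `bytes_ascii_p` succeeds, yet `can_skip_conversion` correctly refuses the shortcut when the coding is flagged as not ASCII-compatible.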
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 15:04:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158609904112600 (code B ref 40407); Sun, 05 Apr 2020 15:04:03 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 15:04:01 +0000 Received: from localhost ([127.0.0.1]:46484 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL6ob-0003H8-4z for submit@debbugs.gnu.org; Sun, 05 Apr 2020 11:04:01 -0400 Received: from mail1442c50.megamailservers.eu ([91.136.14.42]:36186 helo=mail264c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL6oZ-0003GX-5H for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 11:04:00 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586099032; bh=JVCWi565Kx2poLilgKzXqBBoS2BgsfmT6GfN+46Hnn8=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=eXTIFksXL5z8xCXRbyVkjd9wctRMvrZOCs1gLLL44ow+ZcczzkRgPFXRyxEzsCIzj sLjLgBkvQm1CTH1PVLVY3cPuUhh5oEyyvjF7Nz0HHWtxRuTkKArokv+CWWdHjAhEUW S33ZLCOFI9hRXP5LDkWGM3FU7CinZ6wQS1ZSZ9B4= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail264c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 035F3nIL031366; Sun, 5 Apr 2020 15:03:51 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <834ktyt5ki.fsf@gnu.org> Date: Sun, 5 Apr 2020 17:03:49 
+0200 Content-Transfer-Encoding: quoted-printable Message-Id: <048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> <834ktyt5ki.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F27.5E89F341.0067, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=PPNxBsiC c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=XLVZcQpClIlbVbcS_RYA:9 a=ne_HDnwHIvIjrzZs:21 a=7VchOnPulSiIualH:21 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 5 apr. 2020 kl. 15.39 skrev Eli Zaretskii : > I don't think I follow. We call code_convert_string_norecord, which > invokes code_convert_string with NOCOPY set to 'false'. So all those > users should NOT receive the same string as the argument, [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. 
[URIs: gnu.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 5 Apr 2020, at 15:39, Eli Zaretskii wrote:

> I don't think I follow. We call code_convert_string_norecord, which
> invokes code_convert_string with NOCOPY set to 'false'. So all those
> users should NOT receive the same string as the argument, and I don't
> believe they expect that and can cope with it.

Actually they can, as far as I can tell. Have a look yourself.

> I don't think I understand your line of reasoning here. I don't think
> GC is relevant, and as long as we are talking about file names, the
> first null byte terminates it even though the Lisp string's length
> could be larger.

It is stated as a reason in Fexpand_file_name for working on copies of strings; see comments therein. But that is not really important in itself.

>> Given the limited scope of the change, would you agree to a backport of that to emacs-27?
>
> That'd be a mistake, I think. My reasoning goes like this: If I'm
> right that this bug was never seen, fixing it on emacs-27 will have no
> visible effect; and if I'm wrong, then we will break the release
> branch. The danger of breakage in the latter case is much more severe
> than the gain from the fix in the former case.

We do fix clear bugs on emacs-27 even when nobody complained about them, but you are right that it's not that important in this case. Let's leave it on master.

> I hope you now agree with me that we should not do this. The default
> should stay NOCOPY = false, and any caller that wants otherwise must
> explicitly request that by calling code_convert_string.

I disagree -- if the callers handle the situation safely, there is no reason not to do the change, saving some consing. We do this sort of code improvement all the time; nothing special about this one.

Of course, if you prefer the scenic route, we could add {en,de}code_file_nocopy and replace {EN,DE}CODE_FILE calls one by one until they all are done, and arrive at essentially the same code.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 15:37:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158610097119801 (code B ref 40407); Sun, 05 Apr 2020 15:37:02 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 15:36:11 +0000 Received: from localhost ([127.0.0.1]:46498 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL7Ji-00059J-Ua for submit@debbugs.gnu.org; Sun, 05 Apr 2020 11:36:11 -0400 Received: from mail1458c50.megamailservers.eu ([91.136.14.58]:44160 helo=mail267c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL7Jc-00058E-PC for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 11:36:09 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586100958; bh=cZ7ODgbzuWA0tPcKkbiLvxgX83BarzGFyffDOOSvt4g=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=N4QlyecNFBT7cugEMFASj6MHF7+FeWoXmZi/mhzF9g19uyRTGO8o+xqhJGNAOQJKt
ra5xwQrXIxWL9awzKa3ZB5RusKInjTrdcuCvsViwk9gzSsf/7sTGjH3JqrbW2FGj0a Z7IWpzV/owR6lxm6zeHAF6z20tJDq3rlBuoqujEU= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail267c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 035FZt74001728; Sun, 5 Apr 2020 15:35:57 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> Date: Sun, 5 Apr 2020 17:35:55 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> <834ktyt5ki.fsf@gnu.org> <048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F18.5E89FAAB.0036, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=Cf92G4jl c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=N54-gffFAAAA:8 a=Gw_eop-6_nsMFF-lQkcA:9 a=QEXdDO2ut3YA:10 a=6l0D2HzqY3Epnrm8mE3f:22 X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 5 apr. 2020 kl. 17.03 skrev Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= : > Actually they can, as far as I can tell. 
Have a look yourself. To clarify, calls to {EN,DE}CODE_FILE are probably safe by design since they already return their argument in several cases. Some calls to code_convert_string_norecord are not safe because they are us [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: megamailservers.eu] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 5 Apr 2020, at 17:03, Mattias Engdegård wrote:

> Actually they can, as far as I can tell. Have a look yourself.

To clarify, calls to {EN,DE}CODE_FILE are probably safe by design since they already return their argument in several cases. Some calls to code_convert_string_norecord are not safe because they are used with auto lisp strings; I'm going through them to find out just which ones. This can be done piecemeal.
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 15:57:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158610218824305 (code B ref 40407); Sun, 05 Apr 2020 15:57:02 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 15:56:28 +0000 Received: from localhost ([127.0.0.1]:46506 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL7dL-0006Js-SU for submit@debbugs.gnu.org; Sun, 05 Apr 2020 11:56:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:54433) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL7dK-0006JS-RS for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 11:56:27 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:38835) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jL7dF-0003xM-IK; Sun, 05 Apr 2020 11:56:21 -0400 Received: from [176.228.60.248] (port=4859 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jL7dE-0004wr-D4; Sun, 05 Apr 2020 11:56:21 -0400 Date: Sun, 05 Apr 2020 18:56:13 +0300 Message-Id: <831rp2sz8i.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sun, 5 Apr 2020 17:35:55 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> <834ktyt5ki.fsf@gnu.org> 
<048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: Mattias Engdegård
> Date: Sun, 5 Apr 2020 17:35:55 +0200
> Cc: 40407@debbugs.gnu.org
>
> On 5 Apr 2020, at 17:03, Mattias Engdegård wrote:
>
> > Actually they can, as far as I can tell. Have a look yourself.
>
> To clarify, calls to {EN,DE}CODE_FILE are probably safe by design since they already return their argument in several cases.

In theory, yes. In practice, not really. The cases where they return their argument are those when we didn't yet set up any encoding for file names. This happens only very early into startup, and frankly, we have nothing else to do at that point. Once we do set up default-file-name-coding-system, these macros will never return their argument (unless someone forcefully sets the encoding to nil, in which case they deserve what they get). Do you agree?

> Some calls to code_convert_string_norecord are not safe because they are used with auto lisp strings; I'm going through them to find out just which ones. This can be done piecemeal.

I'm okay with making the safe cases faster, but we'd need to clearly comment each one, because later changes might make them unsafe. Any code that uses the un-encoded string after encoding it, or the un-decoded string after decoding it, could become broken if these two are the same string.

And let's please keep in mind that on most modern platforms in most use cases default-file-name-coding-system is utf-8, so encoding is fast, and thus we don't need to go overboard here. IOW, if there's a doubt, there's no doubt.

Thanks.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 05 Apr 2020 16:01:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158610244025392 (code B ref 40407); Sun, 05 Apr 2020 16:01:01 +0000 Received: (at 40407) by debbugs.gnu.org; 5 Apr 2020 16:00:40 +0000 Received: from localhost ([127.0.0.1]:46514 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL7hP-0006bU-Ud for submit@debbugs.gnu.org; Sun, 05 Apr 2020 12:00:40 -0400 Received: from eggs.gnu.org ([209.51.188.92]:54712) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jL7hO-0006au-Ip for 40407@debbugs.gnu.org; Sun, 05 Apr 2020 12:00:38 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:38979) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jL7hJ-0005FY-Bt; Sun, 05 Apr 2020 12:00:33 -0400 Received: from [176.228.60.248] (port=1130 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jL7hC-00074K-MB; Sun, 05 Apr 2020 12:00:32 -0400 Date: Sun, 05 Apr 2020 19:00:17 +0300 Message-Id: <83zhbpsz1q.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Sun, 5 Apr 2020 17:03:49 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> <834ktyt5ki.fsf@gnu.org> <048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: Mattias Engdegård
> Date: Sun, 5 Apr 2020 17:03:49 +0200
> Cc: 40407@debbugs.gnu.org
>
> > I don't think I understand your line of reasoning here. I don't think
> > GC is relevant, and as long as we are talking about file names, the
> > first null byte terminates it even though the Lisp string's length
> > could be larger.
>
> It is stated as a reason in Fexpand_file_name for working on copies
> of strings; see comments therein.

That refers to code that keeps C pointers into string text. This is not our case here: we are talking about Lisp strings, not C pointers into them.

> > I hope you now agree with me that we should not do this. The default
> > should stay NOCOPY = false, and any caller that wants otherwise must
> > explicitly request that by calling code_convert_string.
>
> I disagree -- if the callers handle the situation safely, there is no reason not to do the change, saving some consing. We do this sort of code improvement all the time; nothing special about this one.

"If the callers handle the situation safely" is not a trivial condition. The programmer will more often than not be unaware of this subtlety, and may not write such safe code. Moreover, the callers will have to handle it safely in the future, or be sure to insist on a copy if not, and these two macros don't give the callers knobs to request that.
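
To make the aliasing concern discussed above concrete, here is a minimal, self-contained C sketch. It does not use any Emacs internals: toy_encode, its nocopy flag, and the upcasing "encoding" are hypothetical stand-ins for code_convert_string and a real coding system, chosen only for illustration. The sketch shows that a no-copy fast path can return the caller's own buffer, so a caller that later modifies the result also modifies the original.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "does this string need encoding?":
   here, only ASCII lowercase letters need conversion (upcasing).  */
static bool
needs_conversion (const char *s)
{
  for (; *s; s++)
    if (islower ((unsigned char) *s))
      return true;
  return false;
}

/* Return the "encoded" form of S.  With NOCOPY true and nothing to
   convert, return S itself (the fast path).  Otherwise return a fresh
   malloc'd string that the caller must free.  */
static char *
toy_encode (char *s, bool nocopy)
{
  assert (s != NULL);
  if (nocopy && !needs_conversion (s))
    return s;                   /* Aliases the argument.  */
  char *copy = malloc (strlen (s) + 1);
  if (copy == NULL)
    abort ();
  strcpy (copy, s);
  for (char *p = copy; *p; p++)
    *p = (char) toupper ((unsigned char) *p);
  return copy;
}

/* A caller that truncates the encoded result in place, much as code
   might strip a trailing slash.  Return true if the original string
   is still intact afterwards.  */
static bool
original_survives (bool nocopy)
{
  char original[] = "/TMP/";
  char *encoded = toy_encode (original, nocopy);
  encoded[strlen (encoded) - 1] = '\0';   /* Destructive use of result.  */
  bool intact = strcmp (original, "/TMP/") == 0;
  if (encoded != original)
    free (encoded);
  return intact;
}
```

With nocopy false the caller works on a private copy and the original is untouched; with nocopy true the same caller silently truncates its own input. This is the sense in which "if the callers handle the situation safely" has to be checked call site by call site.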
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: OGAWA Hirofumi Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 10:11:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org, Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15861678562732 (code B ref 40407); Mon, 06 Apr 2020 10:11:01 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 10:10:56 +0000 Received: from localhost ([127.0.0.1]:47200 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLOiV-0000hz-Mj for submit@debbugs.gnu.org; Mon, 06 Apr 2020 06:10:55 -0400 Received: from mail.parknet.co.jp ([210.171.160.6]:33340) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLOiT-0000hk-00 for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 06:10:54 -0400 Received: from ibmpc.myhome.or.jp (server.parknet.ne.jp [210.171.168.39]) by mail.parknet.co.jp (Postfix) with ESMTPSA id E1A2412F211; Mon, 6 Apr 2020 19:10:50 +0900 (JST) Received: from devron.myhome.or.jp (foobar@devron.myhome.or.jp [192.168.0.3]) by ibmpc.myhome.or.jp (8.15.2/8.15.2/Debian-18) with ESMTPS id 036AAnLO112905 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Mon, 6 Apr 2020 19:10:50 +0900 Received: from devron.myhome.or.jp (foobar@localhost [127.0.0.1]) by devron.myhome.or.jp (8.15.2/8.15.2/Debian-18) with ESMTPS id 036AAn4o554618 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Mon, 6 Apr 2020 19:10:49 +0900 Received: (from hirofumi@localhost) by devron.myhome.or.jp (8.15.2/8.15.2/Submit) id 036AAmvP554617; Mon, 6 Apr 2020 19:10:48 +0900 From: OGAWA Hirofumi References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> 
Date: Mon, 06 Apr 2020 19:10:48 +0900 In-Reply-To: <835zegwn9y.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 03 Apr 2020 19:24:09 +0300") Message-ID: <87blo46i1j.fsf@mail.parknet.co.jp> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

Eli Zaretskii writes:

>> -  if (BUFFERP (dst_object))
>> +  if (EQ (dst_object, Qt))
>> +    {
>> +      /* Fast path for ASCII-only input and an ASCII-compatible coding:
>> +         act as identity. */
>> +      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
>> +      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
>> +          && (STRING_MULTIBYTE (string)
>> +              ? (chars == bytes) : string_ascii_p (string)))
>> +        return string;

While using the latest master branch, I noticed this became the cause of
decoding error.

The simple reproducible test is,

    (decode-coding-string "&abc" 'utf-7-imap)
    => "&abc"

like the above result, decoding utf-7-imap didn't work.

Because (coding-system-get 'utf-7-imap :ascii-compatible-p) => t.

I'm not sure, 'utf-7* should be fixed as non ascii-compatible, or
string_ascii_p() should check more strictly.

[BTW,

(define-coding-system 'utf-7-imap
  "UTF-7 encoding of Unicode, IMAP version (RFC 2060)"
  :coding-type 'utf-8
  :mnemonic ?u
  :charset-list '(unicode)
  :ascii-compatible-p nil ;; <=== added this line
  :pre-write-conversion 'utf-7-imap-pre-write-conversion
  :post-read-conversion 'utf-7-imap-post-read-conversion)

doesn't work. Because define-coding-system-internal overwrites
ascii-compatible-p.]

Thanks.
-- OGAWA Hirofumi From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 14:22:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: OGAWA Hirofumi , Kenichi Handa Cc: 40407@debbugs.gnu.org, mattiase@acm.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15861829082833 (code B ref 40407); Mon, 06 Apr 2020 14:22:01 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 14:21:48 +0000 Received: from localhost ([127.0.0.1]:49019 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLSdI-0000jc-Dt for submit@debbugs.gnu.org; Mon, 06 Apr 2020 10:21:48 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44265) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLSdH-0000jB-22 for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 10:21:47 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:59496) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jLSdB-0007jQ-4P; Mon, 06 Apr 2020 10:21:41 -0400 Received: from [176.228.60.248] (port=3241 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jLSdA-0006cq-D3; Mon, 06 Apr 2020 10:21:40 -0400 Date: Mon, 06 Apr 2020 17:21:34 +0300 Message-Id: <835zecsnip.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <87blo46i1j.fsf@mail.parknet.co.jp> (message from OGAWA Hirofumi on Mon, 06 Apr 2020 19:10:48 +0900) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

> From: OGAWA Hirofumi
> Cc: Mattias Engdegård ,
>  40407@debbugs.gnu.org
> Date: Mon, 06 Apr 2020 19:10:48 +0900
>
> Eli Zaretskii writes:
>
> >> -  if (BUFFERP (dst_object))
> >> +  if (EQ (dst_object, Qt))
> >> +    {
> >> +      /* Fast path for ASCII-only input and an ASCII-compatible coding:
> >> +         act as identity. */
> >> +      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
> >> +      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
> >> +          && (STRING_MULTIBYTE (string)
> >> +              ? (chars == bytes) : string_ascii_p (string)))
> >> +        return string;
>
> While using the latest master branch, I noticed this became the cause of
> decoding error.
>
> The simple reproducible test is,
>
>     (decode-coding-string "&abc" 'utf-7-imap)
>     => "&abc"
>
> like the above result, decoding utf-7-imap didn't work.
>
> Because (coding-system-get 'utf-7-imap :ascii-compatible-p) => t.

Thanks.

> I'm not sure, 'utf-7* should be fixed as non ascii-compatible, or
> string_ascii_p() should check more strictly.

The former, since UTF-7 is definitely *not* ASCII-compatible. Does the patch below produce good results?

Kenichi, why was coding-type of UTF-7 systems set to 'utf-8'? Wouldn't it be better to set it to 'utf-16'? Or is there some subtlety here that we should be aware of? Do you have any comments on the patch below?

Thanks.

diff --git a/src/coding.c b/src/coding.c
index 97a6eb9..71ff93c 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -11301,7 +11301,10 @@ DEFUN ("define-coding-system-internal", Fdefine_coding_system_internal,
           CHECK_CODING_SYSTEM (val);
         }
       ASET (attrs, coding_attr_utf_bom, bom);
-      if (NILP (bom))
+      if (NILP (bom)
+          /* UTF-7 has :coding-type set to 'utf-8' (why not
+             'utf-16'?), but it is definitely NOT ASCII-compatible. */
+          && !EQ (name, Qutf_7) && !EQ (name, Qutf_7_imap))
         ASET (attrs, coding_attr_ascii_compat, Qt);
 
       category = (CONSP (bom) ? coding_category_utf_8_auto
@@ -11673,6 +11676,9 @@ syms_of_coding (void)
   DEFSYM (Qutf_8_unix, "utf-8-unix");
   DEFSYM (Qutf_8_emacs, "utf-8-emacs");
 
+  DEFSYM (Qutf_7, "utf-7");
+  DEFSYM (Qutf_7_imap, "utf-7-imap");
+
 #if defined (WINDOWSNT) || defined (CYGWIN)
   /* No, not utf-16-le: that one has a BOM. */
   DEFSYM (Qutf_16le, "utf-16le");

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 15:57:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org, Kenichi Handa , OGAWA Hirofumi Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158618860822401 (code B ref 40407); Mon, 06 Apr 2020 15:57:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 15:56:48 +0000 Received: from localhost ([127.0.0.1]:49089 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLU7D-0005pD-Ni for submit@debbugs.gnu.org; Mon, 06 Apr 2020 11:56:48 -0400 Received: from mail1433c50.megamailservers.eu ([91.136.14.33]:55012 helo=mail263c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLU78-0005oN-So for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 11:56:44 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586188592; bh=YsiqJOgbtsvkafpr5nbTxMfpeXNm0dgsWcU7IkO9NKM=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=bFQaAYyE9Zvh8a/I/BuiEZ3za4TnJTVT/w3gIxVHyVQXw8IFLHD9P1GOkRXnhMgJc 7Cdj9ZMV7QMjf8xjonqfkrfJpQbTYWeDV4AuXYDvr7dOEyqFvPRXrIjgq5+87RztQY
+0dNXupK9nIRCihO6IFxENXIfzSTBqKoUv5uDHXM= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail263c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 036FuSdr016171; Mon, 6 Apr 2020 15:56:30 +0000 From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Message-Id: <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> Content-Type: multipart/mixed; boundary="Apple-Mail=_230E54E8-1BE2-40F3-9FB2-C2A479E2BA88" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) Date: Mon, 6 Apr 2020 17:56:27 +0200 In-Reply-To: <835zecsnip.fsf@gnu.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F24.5E8B5105.0024, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=e6d4tph/ c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=3pAwX9jsd0u1J1Px0qAA:9 a=CjuIK1q_8ugA:10 a=39F48Vm0bRl1VN-4Pd4A:9 a=De_Ol2h6w80A:10 a=tclcd6dtLQvEqt9_mmAA:9 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 6 apr. 2020 kl. 16.21 skrev Eli Zaretskii : > Kenichi, why was coding-type of UTF-7 systems set to 'utf-8'? > Wouldn't it be better to set it to 'utf-16'? Or is there some > subtlety here that we should be aware of? Do you have any comments on [...] 
Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

--Apple-Mail=_230E54E8-1BE2-40F3-9FB2-C2A479E2BA88 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii

On 6 Apr 2020, at 16:21, Eli Zaretskii wrote:

> Kenichi, why was coding-type of UTF-7 systems set to 'utf-8'?
> Wouldn't it be better to set it to 'utf-16'? Or is there some
> subtlety here that we should be aware of? Do you have any comments on
> the patch below?

There is no reason why utf-7[-imap] should have utf-8 as coding-type, is there? utf-16 is definitely wrong (utf-7* are encoded in ASCII). What about the patch below instead?

By the way, there appears to be another, unrelated bug in utf-7-imap: According to RFC 2060, all C0 controls are base64-encoded, but in Emacs some of them are passed through unchanged (CR, LF and TAB). This is permitted by plain UTF-7 (RFC 1642) but not in the IMAP variant.
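
As a small illustration of why UTF-7-IMAP cannot be treated as ASCII-compatible, here is a self-contained C sketch. It is not the Emacs decoder: toy_utf7_imap_decode, decodes_to and identity_is_correct are hypothetical helper names, and the decoder implements only the single rule that the two bytes "&-" stand for a literal '&'. Base64 runs are deliberately out of scope and are reported as unsupported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if every byte of S is in the ASCII range.  */
static bool
all_ascii (const char *s)
{
  assert (s != NULL);
  for (; *s; s++)
    if ((unsigned char) *s >= 0x80)
      return false;
  return true;
}

/* Toy decoder for one modified-UTF-7 rule: "&-" decodes to '&'.
   Any other '&' starts a base64 run, which this sketch does not
   decode; it returns false instead.  DST must have room for
   strlen (SRC) + 1 bytes.  */
static bool
toy_utf7_imap_decode (const char *src, char *dst)
{
  while (*src)
    {
      if (*src == '&')
        {
          if (src[1] != '-')
            return false;       /* Base64 run: out of scope here.  */
          *dst++ = '&';
          src += 2;
        }
      else
        *dst++ = *src++;
    }
  *dst = '\0';
  return true;
}

/* True if SRC decodes (under the toy rule) to EXPECTED.  */
static bool
decodes_to (const char *src, const char *expected)
{
  char out[64];
  if (strlen (src) >= sizeof out || !toy_utf7_imap_decode (src, out))
    return false;
  return strcmp (out, expected) == 0;
}

/* Would returning the input unchanged have been a correct decoding?  */
static bool
identity_is_correct (const char *src)
{
  return decodes_to (src, src);
}
```

The input "a&-bcd" consists only of ASCII bytes, yet its decoding is "a&bcd", so an identity fast path keyed on "input is ASCII" gives the wrong answer for this coding system. Strings without '&' do decode to themselves, which matches the "invariant ASCII subset" idea.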
--Apple-Mail=_230E54E8-1BE2-40F3-9FB2-C2A479E2BA88 Content-Disposition: attachment; filename=utf-7.diff Content-Type: application/octet-stream; x-unix-mode=0644; name="utf-7.diff" Content-Transfer-Encoding: quoted-printable

diff --git a/lisp/international/mule-conf.el b/lisp/international/mule-conf.el
index e6e6135243..c5cfbaeb87 100644
--- a/lisp/international/mule-conf.el
+++ b/lisp/international/mule-conf.el
@@ -1511,20 +1511,25 @@ 'iso-safe
 
 (define-coding-system 'utf-7
   "UTF-7 encoding of Unicode (RFC 2152)."
-  :coding-type 'utf-8
+  :coding-type 'charset
+  :charset-list '(ascii)
   :mnemonic ?U
   :mime-charset 'utf-7
-  :charset-list '(unicode)
   :pre-write-conversion 'utf-7-pre-write-conversion
   :post-read-conversion 'utf-7-post-read-conversion)
+;; Having `ascii' in :charset-list automatically sets :ascii-compatible-p,
+;; but UTF-7 is not ASCII compatible; disable.
+(coding-system-put 'utf-7 :ascii-compatible-p nil)
 
 (define-coding-system 'utf-7-imap
   "UTF-7 encoding of Unicode, IMAP version (RFC 2060)"
-  :coding-type 'utf-8
+  :coding-type 'charset
+  :charset-list '(ascii)
   :mnemonic ?u
-  :charset-list '(unicode)
   :pre-write-conversion 'utf-7-imap-pre-write-conversion
   :post-read-conversion 'utf-7-imap-post-read-conversion)
+;; See comment for utf-7 above.
+(coding-system-put 'utf-7-imap :ascii-compatible-p nil)
 
 ;; Use us-ascii for terminal output if some other coding system is not
 ;; specified explicitly.
diff --git a/test/lisp/international/mule-tests.el b/test/lisp/international/mule-tests.el
index 91e3c2279f..b5fbb4ab8e 100644
--- a/test/lisp/international/mule-tests.el
+++ b/test/lisp/international/mule-tests.el
@@ -48,6 +48,19 @@ mule-cmds--test-universal-coding-system-argument
                         (append (kbd "C-x RET c u t f - 8 RET C-u C-u c a b RET") nil)))
                    (read-string "prompt:")))))
 
+(ert-deftest mule-utf-7 ()
+  ;; utf-7 and utf-7-imap are not ASCII-compatible.
+  (should-not (coding-system-get 'utf-7 :ascii-compatible-p))
+  (should-not (coding-system-get 'utf-7-imap :ascii-compatible-p))
+  ;; Invariant ASCII subset.
+  (let ((s (apply #'string (append (number-sequence #x20 #x25)
+                                   (number-sequence #x27 #x7e)))))
+    (should (equal (encode-coding-string s 'utf-7-imap) s))
+    (should (equal (decode-coding-string s 'utf-7-imap) s)))
+  ;; Escaped ampersand.
+  (should (equal (encode-coding-string "a&bcd" 'utf-7-imap) "a&-bcd"))
+  (should (equal (decode-coding-string "a&-bcd" 'utf-7-imap) "a&bcd")))
+
 ;; Stop "Local Variables" above causing confusion when visiting this file.
 ^L
 

--Apple-Mail=_230E54E8-1BE2-40F3-9FB2-C2A479E2BA88 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii --Apple-Mail=_230E54E8-1BE2-40F3-9FB2-C2A479E2BA88--
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 16:34:02 +0000 Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org, handa@gnu.org, hirofumi@mail.parknet.co.jp Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158619081130310 (code B ref 40407); Mon, 06 Apr 2020 16:34:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 16:33:31 +0000 Received: from localhost ([127.0.0.1]:49140 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLUgl-0007sn-8x for submit@debbugs.gnu.org; Mon, 06 Apr 2020 12:33:31 -0400 Received: from eggs.gnu.org ([209.51.188.92]:37007) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLUgj-0007sL-Bf for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 12:33:30 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:34634) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jLUgd-00020y-Ej; Mon, 06 Apr 2020 12:33:23 -0400 Received: from [176.228.60.248] (port=3364 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jLUgc-0002TG-9U; Mon, 06 Apr 2020 12:33:22 -0400 Date: Mon, 06 Apr 2020 19:33:16 +0300 Message-Id: <83zhbor2ur.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Mon, 6 Apr 2020 17:56:27 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Mon, 6 Apr 2020 17:56:27 +0200 > Cc: OGAWA Hirofumi , > Kenichi Handa , 40407@debbugs.gnu.org > > > Kenichi, why was coding-type of UTF-7 systems set to 'utf-8'? > > Wouldn't it be better to set it to 'utf-16'? Or is there some > > subtlety here that we should be aware of? Do you have any comments on > > the patch below? > > There is no reason why utf-7[-imap] should have utf-8 as coding-type, is there? I think it might be just some convenience thing: utf-7 and utf-8 have something in common that made it convenient to treat them the same in the internal routines. Or maybe it's just an accident. > utf-16 is definitely wrong (utf-7* are encoded in ASCII). Why do you think the ASCII encoding contradicts the utf-16 coding-type? > What about the patch below instead? I don't think 'charset' is the right type for this encoding (any reason why you've chosen it?), but I will let Handa-san comment. Defining coding-systems is a black art which I don't think I ever mastered. 
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 16:56:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org, handa@gnu.org, hirofumi@mail.parknet.co.jp Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15861921442370 (code B ref 40407); Mon, 06 Apr 2020 16:56:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 16:55:44 +0000 Received: from localhost ([127.0.0.1]:49152 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLV2F-0000c8-FR for submit@debbugs.gnu.org; Mon, 06 Apr 2020 12:55:43 -0400 Received: from mail208c50.megamailservers.eu ([91.136.10.218]:55040 helo=mail194c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLV2C-0000bb-Sq for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 12:55:42 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586192134; bh=tS6xzKtTHnTVrXFZw9+AldNlVwiuIQB4vLC7LgwKWCc=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=anRE5aSEb/0mhaFYjETobvMWJnKD4C35d+TQL1myQ2b/20e2wqzsFYwNV3dNCYY9+ 5OiWfw/m9AnBtd6TBMz0LMBRyUl1QX0FlDonp8Czzlcg5UICe/r89XnF3ub2+UD/1p mef1rtnkAr9ya28lzWV+Y1PN9bXqD6/PG/+YrTTo= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail194c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 036GtV2K021646; Mon, 6 Apr 2020 16:55:33 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: 
<83zhbor2ur.fsf@gnu.org> Date: Mon, 6 Apr 2020 18:55:30 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F26.5E8B5EE0.00B7, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=KsozJleN c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=r8PFI-Of08nG_EPcs_AA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 6 Apr 2020, at 18:33, Eli Zaretskii wrote:

> I think it might be just some convenience thing: utf-7 and utf-8 have
> something in common that made it convenient to treat them the same in
> the internal routines. Or maybe it's just an accident.

There is nothing common between utf-7 and utf-8 at all (apart from a subset of ASCII being encoded in the same way, and the fact that both encode the Unicode repertoire).

> Why do you think the ASCII encoding contradicts the utf-16
> coding-type?

Because :coding-type is the first stage of decoding, or the last stage of encoding. It reflects the low-level structure of the encoded data: using utf-16 as :coding-type implies that utf-7 is encoded into 16-bit parcels, but it's not -- the result of utf-7-imap encoding is a sequence of ASCII bytes. (UTF-16 plays a part in an intermediary step for some values before they are base64-encoded, but that's not visible in the final byte stream.)

> I don't think 'charset' is the right type for this encoding (any
> reason why you've chosen it?), but I will let Handa-san comment.

We could use 'raw-text' as well but that implies that any byte value could be part of a utf-7[-imap] text, which is incorrect.
In fact, utf-7-imap only uses codes 0x20-0x7e (utf-7 is allowed to use a few C0 controls too, as mentioned).

Arguably the heuristics of define-coding-system-internal are somewhat inscrutable. There seem to be leaks between layers -- ascii-compatible-p is an end-to-end property and cannot really be set the way it is by that function. But since it is, fixing it afterwards should be the correct way.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 17:19:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org, handa@gnu.org, hirofumi@mail.parknet.co.jp Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15861934986982 (code B ref 40407); Mon, 06 Apr 2020 17:19:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 17:18:18 +0000 Received: from localhost ([127.0.0.1]:49160 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLVO6-0001oW-GE for submit@debbugs.gnu.org; Mon, 06 Apr 2020 13:18:18 -0400 Received: from eggs.gnu.org ([209.51.188.92]:43175) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLVO5-0001o9-IR for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 13:18:18 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:35426) by
eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jLVNz-0001ic-QI; Mon, 06 Apr 2020 13:18:11 -0400 Received: from [176.228.60.248] (port=2430 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jLVNy-00081Q-KT; Mon, 06 Apr 2020 13:18:11 -0400 Date: Mon, 06 Apr 2020 20:18:05 +0300 Message-Id: <83wo6sr0s2.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Mon, 6 Apr 2020 18:55:30 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Mon, 6 Apr 2020 18:55:30 +0200 > Cc: hirofumi@mail.parknet.co.jp, handa@gnu.org, 40407@debbugs.gnu.org > > 6 apr. 2020 kl. 18.33 skrev Eli Zaretskii : > > > I think it might be just some convenience thing: utf-7 and utf-8 have > > something in common that made it convenient to treat them the same in > > the internal routines. Or maybe it's just an accident. > > There is nothing common between utf-7 and utf-8 at all (apart from a subset of ASCII being encoded in the same way, and the fact that both encode the Unicode repertoire). By "in common" in this context I meant from the POV of internal treating of the two encodings. 
> > I don't think 'charset' is the right type for this encoding (any > > reason why you've chosen it?), but I will let Handa-san comment. > > We could use 'raw-text' as well but that implies that any byte value could be part of an utf-7[-imap] text, which is incorrect. > In fact, utf-7-imap only uses codes 0x20-0x7e (utf-7 is allowed to use a few C0 controls too, as mentioned). > > Arguably the heuristics of define-coding-system-internal are somewhat inscrutable. There seems to be leaks between layers -- ascii-compatible-p is an end-to-end property and cannot really be set the way it is by that function. But since it is, fixing it afterwards should be the correct way. I prefer to wait for Handa-san's response, and meanwhile install the least disruptive change, which just fixes the one aspect that got broken. Call me a coward, if you wish. From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 17:50:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org, handa@gnu.org, hirofumi@mail.parknet.co.jp Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158619537113325 (code B ref 40407); Mon, 06 Apr 2020 17:50:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 17:49:31 +0000 Received: from localhost ([127.0.0.1]:49181 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLVsJ-0003Sp-Ep for submit@debbugs.gnu.org; Mon, 06 Apr 2020 13:49:31 -0400 Received: from mail205c50.megamailservers.eu ([91.136.10.215]:60004 helo=mail193c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLVsG-0003SL-05 for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 
13:49:29 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586195363; bh=BloUKu4FK80JLdDC0UOTmTiTITIxc99CP/1KPcWGqx0=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=mrahF0ovxpSMLbregYFf8OloeWZlXsN+MutE3ctfgBQ7gMhy4vLGsbt6bXhEYstlW jfx2zLRiypcRLCN/62oIESmFqKNO9FfGFMGLglHozCZkqC1GULoY2oqurZuRcm420p MkOnYJcqn0Ff63R2w1jnkW/9Vpy11mdqTzUHUwqE= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail193c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 036HnJga024260; Mon, 6 Apr 2020 17:49:21 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <83wo6sr0s2.fsf@gnu.org> Date: Mon, 6 Apr 2020 19:49:19 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F16.5E8B6B7D.0072, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=cM2eTWWN c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=Hk0bA-YSVASESqetRa8A:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 6 Apr 2020, at 19:18, Eli Zaretskii wrote:

> By "in common" in this context I meant from the POV of internal
> treating of the two encodings.

So did I.

> I prefer to wait for Handa-san's response, and meanwhile install the
> least disruptive change, which just fixes the one aspect that got
> broken. Call me a coward, if you wish.

If so, the least disruptive change by far would be the two calls to coding-system-put in my patch (along with the tests, of course).

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 18:14:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158619683018193 (code B ref 40407); Mon, 06 Apr 2020 18:14:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 18:13:50 +0000 Received: from localhost ([127.0.0.1]:49192 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLWFp-0004jN-NE for submit@debbugs.gnu.org; Mon, 06 Apr 2020 14:13:50 -0400 Received: from mail205c50.megamailservers.eu ([91.136.10.215]:39162 helo=mail193c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLWFn-0004j2-PQ for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 14:13:48 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586196825; bh=XHwEtGUucXjSocJ2hP1rjqkHzZjyUTssDu+JQ102Z3U=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=X2Y9WZiS/9vMKDRcp+WrUP+B6To62gOHvae8k89UHOe+g1hkOEwDzJI/66BRQXFfM
kqBmB5OgcnbtxWewA1QF/aNWVPncm29II8JmT9bobjCd0QyYyk4bB7OEGFDvo1FUHg m52Mq+/Jp+1XzdmV0yko3rX1VYRDLSijKEHkojkU= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail193c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 036IDhTi020096; Mon, 6 Apr 2020 18:13:45 +0000 From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Message-Id: <7A9EBE60-9CA3-4EC7-8B62-E5157A5423FB@acm.org> Content-Type: multipart/mixed; boundary="Apple-Mail=_7CB163B6-025E-42F6-9BF5-B7EAC04B41AD" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) Date: Mon, 6 Apr 2020 20:13:43 +0200 In-Reply-To: <831rp2sz8i.fsf@gnu.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <83imifv96b.fsf@gnu.org> <42EFD1AE-A96E-4613-A31E-C3723382DC6D@acm.org> <83r1x3tc6c.fsf@gnu.org> <1C9D87C7-57DD-4C61-86FE-99A7095E5085@acm.org> <83imift8fj.fsf@gnu.org> <51EFA20B-3F32-4242-82D5-EA2D2FB2FD3E@acm.org> <834ktyt5ki.fsf@gnu.org> <048BDA86-F50A-49CF-872A-2C94D1864181@acm.org> <831rp2sz8i.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F19.5E8B7159.006F, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=cM2eTWWN c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=rstMWfujXf1L_qdg40IA:9 a=CjuIK1q_8ugA:10 a=ND3vsFg5IG7zdj3qtFEA:9 a=B2y7HmGcmWMA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --Apple-Mail=_7CB163B6-025E-42F6-9BF5-B7EAC04B41AD Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset=us-ascii

On 5 Apr 2020, at 17:56, Eli Zaretskii wrote:

> Once we do set up default-file-name-coding-system, these macros will
> never return their argument (unless someone forcefully sets the
> encoding to nil, in which case they deserve what they get). Do you
> agree?

Thank you, and yes, I do agree partly: ENCODE_FILE is the identity for all unibyte strings no matter the coding system in use.

However, my point (which I didn't do a very good job explaining) was that if either ENCODE_FILE or DECODE_FILE is called with the assumption that it returns a new string, that is at least a latent bug.

Thus I went through them all once again, and found a few questionable calls that I'd like to fix. They rely on Fexpand_file_name returning a new string, which may or may not be true now but we would be better off without such assumptions. (I also stumbled on a potential GC-related bug.) Patch attached!

With these fixed, nothing prevents those two functions from using no-copy semantics. I agree this approach is better and safer than going straight for code_convert_string_norecord in one pass.
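The latent bug described above can be sketched in plain C, outside Emacs. This is only an illustration under stated assumptions: `encode_name`, `to_dos_slashes` and `encode_name_for_mutation` are hypothetical names, standing in loosely for a no-copy ENCODE_FILE, an in-place editor such as unixtodos_filename, and the defensive copy that the patch obtains with Fcopy_sequence. None of it is Emacs code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a no-copy encoder (not Emacs's ENCODE_FILE):
   pure-ASCII input is returned unchanged, so the result aliases the
   argument; any other input comes back as a freshly allocated copy with
   each non-ASCII byte replaced by '?'.  */
static char *
encode_name (char *name)
{
  size_t len = strlen (name);
  for (size_t i = 0; i < len; i++)
    if ((unsigned char) name[i] >= 0x80)
      {
        char *copy = malloc (len + 1);
        if (copy == NULL)
          abort ();
        /* j <= len so that the terminating NUL is copied as well.  */
        for (size_t j = 0; j <= len; j++)
          copy[j] = (unsigned char) name[j] >= 0x80 ? '?' : name[j];
        return copy;
      }
  return name;
}

/* In-place edit, in the spirit of unixtodos_filename: '/' becomes '\\'.  */
static void
to_dos_slashes (char *s)
{
  for (; *s != '\0'; s++)
    if (*s == '/')
      *s = '\\';
}

/* Defensive wrapper: always hands back a buffer that the caller owns,
   may mutate and must free, whether or not the encoder copied.  */
static char *
encode_name_for_mutation (char *name)
{
  char *encoded = encode_name (name);
  if (encoded != name)
    return encoded;
  char *copy = malloc (strlen (encoded) + 1);
  if (copy == NULL)
    abort ();
  strcpy (copy, encoded);
  return copy;
}
```

Calling `to_dos_slashes (encode_name (path))` on an ASCII path rewrites `path` itself, which is the surprise a caller gets when it assumed a fresh string. Going through `encode_name_for_mutation` leaves `path` untouched. Unlike Fcopy_sequence, this wrapper copies only when the result aliases its argument; that is a choice made for the sketch, not a description of the patch.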
--Apple-Mail=_7CB163B6-025E-42F6-9BF5-B7EAC04B41AD Content-Disposition: attachment; filename=0001-Don-t-rely-on-copying-in-EN-DE-CODE_FILE.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Don-t-rely-on-copying-in-EN-DE-CODE_FILE.patch" Content-Transfer-Encoding: quoted-printable

From ff62a3874890810823f79dac1273ebdd214ba529 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Mon, 6 Apr 2020 15:20:08 +0200
Subject: [PATCH] Don't rely on copying in {EN,DE}CODE_FILE

Callers of ENCODE_FILE and DECODE_FILE should not assume that these
functions always return a new string (bug#40407).

* src/w32fns.c (Fw32_shell_execute):
* src/w32proc.c (Fw32_application_type):
Sink taking the address of a Lisp string past GC points.
Copy values returned from ENCODE_FILE before mutating them.
---
 src/w32fns.c  | 4 ++--
 src/w32proc.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/w32fns.c b/src/w32fns.c
index 9bb4e27b01..8d714f0b8d 100644
--- a/src/w32fns.c
+++ b/src/w32fns.c
@@ -8258,7 +8258,6 @@ parameters (e.g., \"printto\" requires the printer address).  Otherwise,
   /* Encode filename, current directory and parameters.  */
   current_dir = GUI_ENCODE_FILE (current_dir);
   document = GUI_ENCODE_FILE (document);
-  doc_w = GUI_SDATA (document);
   if (STRINGP (parameters))
     {
       parameters = GUI_ENCODE_SYSTEM (parameters);
@@ -8269,6 +8268,7 @@ parameters (e.g., \"printto\" requires the printer address).  Otherwise,
       operation = GUI_ENCODE_SYSTEM (operation);
       ops_w = GUI_SDATA (operation);
     }
+  doc_w = GUI_SDATA (document);
   result = (intptr_t) ShellExecuteW (NULL, ops_w, doc_w, params_w,
 				     GUI_SDATA (current_dir),
 				     (FIXNUMP (show_flag)
@@ -8353,7 +8353,7 @@ parameters (e.g., \"printto\" requires the printer address).  Otherwise,
   handler = Ffind_file_name_handler (absdoc, Qfile_exists_p);
   if (NILP (handler))
     {
-      Lisp_Object absdoc_encoded = ENCODE_FILE (absdoc);
+      Lisp_Object absdoc_encoded = Fcopy_sequence (ENCODE_FILE (absdoc));
 
       if (faccessat (AT_FDCWD, SSDATA (absdoc_encoded), F_OK, AT_EACCESS) == 0)
 	{
diff --git a/src/w32proc.c b/src/w32proc.c
index de33726905..16e32e4c58 100644
--- a/src/w32proc.c
+++ b/src/w32proc.c
@@ -3231,7 +3231,7 @@ DEFUN ("w32-application-type", Fw32_application_type,
   char *progname, progname_a[MAX_PATH];
 
   program = Fexpand_file_name (program, Qnil);
-  encoded_progname = ENCODE_FILE (program);
+  encoded_progname = Fcopy_sequence (ENCODE_FILE (program));
   progname = SSDATA (encoded_progname);
   unixtodos_filename (progname);
   filename_to_ansi (progname, progname_a);
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_7CB163B6-025E-42F6-9BF5-B7EAC04B41AD--
From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Mon, 06 Apr 2020 18:21:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org, handa@gnu.org, hirofumi@mail.parknet.co.jp Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158619722919592 (code B ref 40407); Mon, 06 Apr 2020 18:21:02 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 18:20:29 +0000 Received: from localhost ([127.0.0.1]:49197 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLWMH-00055u-HV for submit@debbugs.gnu.org; Mon, 06 Apr 2020 14:20:29 -0400 Received: from eggs.gnu.org ([209.51.188.92]:56889) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLWMF-00055S-Ca for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 14:20:27 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:36658) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jLWM9-0001VZ-9i; Mon, 06 Apr 2020 14:20:21 -0400 Received: from [176.228.60.248] (port=2234 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jLWM8-0000UR-Di; Mon, 06 Apr 2020 14:20:21 -0400 Date: Mon, 06 Apr 2020 21:20:14 +0300 Message-Id: <83tv1wqxwh.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Mon, 6 Apr 2020 19:49:19 +0200) References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) > From: Mattias Engdegård > Date: Mon, 6 Apr 2020 19:49:19 +0200 > Cc: hirofumi@mail.parknet.co.jp, handa@gnu.org, 40407@debbugs.gnu.org > > > I prefer to wait for Handa-san's response, and meanwhile install the > > least disruptive change, which just fixes the one aspect that got > > broken. Call me a coward, if you wish. > > If so, the least disruptive change by far would be the two calls to coding-system-put in my patch (along with the tests, of course). Fine with me, but please say something about why we use 'put' instead of specifying :ascii-compatible-p directly in the coding-system's definition, and also add a FIXME there, since I hope this is not the final word on that matter. Thanks. From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: OGAWA Hirofumi Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 18:35:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org, Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= , handa@gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158619808122403 (code B ref 40407); Mon, 06 Apr 2020 18:35:01 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 18:34:41 +0000 Received: from localhost ([127.0.0.1]:49209 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLWa1-0005pG-5d for submit@debbugs.gnu.org; Mon, 06 Apr 2020 14:34:41 -0400 Received: from mail.parknet.co.jp ([210.171.160.6]:33390) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLWZz-0005oz-Ah for 40407@debbugs.gnu.org; 
Mon, 06 Apr 2020 14:34:40 -0400 Received: from ibmpc.myhome.or.jp (server.parknet.ne.jp [210.171.168.39]) by mail.parknet.co.jp (Postfix) with ESMTPSA id 3653B12F211; Tue, 7 Apr 2020 03:34:37 +0900 (JST) Received: from devron.myhome.or.jp (foobar@devron.myhome.or.jp [192.168.0.3]) by ibmpc.myhome.or.jp (8.15.2/8.15.2/Debian-18) with ESMTPS id 036IYZoB128869 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Tue, 7 Apr 2020 03:34:36 +0900 Received: from devron.myhome.or.jp (foobar@localhost [127.0.0.1]) by devron.myhome.or.jp (8.15.2/8.15.2/Debian-18) with ESMTPS id 036IYZUK600819 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Tue, 7 Apr 2020 03:34:35 +0900 Received: (from hirofumi@localhost) by devron.myhome.or.jp (8.15.2/8.15.2/Submit) id 036IYYLw600818; Tue, 7 Apr 2020 03:34:34 +0900 From: OGAWA Hirofumi References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> <83tv1wqxwh.fsf@gnu.org> Date: Tue, 07 Apr 2020 03:34:34 +0900 In-Reply-To: <83tv1wqxwh.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 06 Apr 2020 21:20:14 +0300") Message-ID: <87lfn84g5h.fsf@mail.parknet.co.jp> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) Eli Zaretskii writes: >> > I prefer to wait for Handa-san's response, and meanwhile install the >> > least disruptive change, which just fixes the one aspect that got >> > broken. 
Call me a coward, if you wish.
>>
>> If so, the least disruptive change by far would be the two calls to coding-system-put in my patch (along with the tests, of course).
>
> Fine with me, but please say something about why we use 'put' instead
> of specifying :ascii-compatible-p directly in the coding-system's
> definition, and also add a FIXME there, since I hope this is not the
> final word on that matter.

BTW,

Mattias Engdegård writes:

>  (define-coding-system 'utf-7-imap
>    "UTF-7 encoding of Unicode, IMAP version (RFC 2060)"
> -  :coding-type 'utf-8
> +  :coding-type 'charset
> +  :charset-list '(ascii)
>    :mnemonic ?u
> -  :charset-list '(unicode)
>    :pre-write-conversion 'utf-7-imap-pre-write-conversion
>    :post-read-conversion 'utf-7-imap-post-read-conversion)
> +;; See comment for utf-7 above.
> +(coding-system-put 'utf-7-imap :ascii-compatible-p nil)

(check-coding-systems-region "あ" nil '(utf-7-imap))
=> ((utf-7-imap 0))

It says "cannot encodable by utf-7-imap", so looks like ":charset-list
'(ascii)" doesn't work at least.

Thanks.
--=20 OGAWA Hirofumi From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 06 Apr 2020 21:58:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: OGAWA Hirofumi Cc: 40407@debbugs.gnu.org, handa@gnu.org, Eli Zaretskii Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158621025131165 (code B ref 40407); Mon, 06 Apr 2020 21:58:01 +0000 Received: (at 40407) by debbugs.gnu.org; 6 Apr 2020 21:57:31 +0000 Received: from localhost ([127.0.0.1]:49340 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLZkJ-00086Z-8v for submit@debbugs.gnu.org; Mon, 06 Apr 2020 17:57:31 -0400 Received: from mail1467c50.megamailservers.eu ([91.136.14.67]:37642 helo=mail268c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jLZkD-00085r-1f for 40407@debbugs.gnu.org; Mon, 06 Apr 2020 17:57:29 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586210238; bh=AnUL/X69w6bykXkaCm17XfqY/6Abhr9WeoulR8FOmdY=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=omflxqS8FXINAIQQ4E6mCZdd3uJiYkQ4LGlLzTFXRJMT/+EqYxcYdMWHXW183cVdp crUgrgRnJ8OuXPIb/gVlCdahF5H8SMtX++5V5wmkDY3+wMDAel0HmMcqYI/+UiG512 dcB56GIkM9DhKP4DauWXLApe5cpIPixvsniz3Xhw= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail268c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 036LvEDb019335; Mon, 6 Apr 2020 21:57:16 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: 
<87lfn84g5h.fsf@mail.parknet.co.jp> Date: Mon, 6 Apr 2020 23:57:14 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> <83tv1wqxwh.fsf@gnu.org> <87lfn84g5h.fsf@mail.parknet.co.jp> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F28.5E8BA5A1.0005, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=BZ+mLYl2 c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=Y6QHBM77bHVGD1eRjjcA:9 a=QEXdDO2ut3YA:10 X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 6 apr. 2020 kl. 20.34 skrev OGAWA Hirofumi : > (check-coding-systems-region "=?UTF-8?Q?=E3=81=82?=" nil '(utf-7-imap)) > => ((utf-7-imap 0)) > > It says "cannot encodable by utf-7-imap", so looks like ":charset-list > '(ascii)" doesn't work at least. Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. 
[URIs: megamailservers.eu] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 6 Apr 2020, at 20:34, OGAWA Hirofumi wrote:

> (check-coding-systems-region "あ" nil '(utf-7-imap))
> => ((utf-7-imap 0))
>
> It says "cannot encodable by utf-7-imap", so looks like ":charset-list
> '(ascii)" doesn't work at least.

Thanks for catching that! The documentation doesn't explain the role of :charset-list for a :coding-type of 'utf-8, but a minimal patch seemed to work. Your example has been added to the test.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 09 Apr 2020 11:04:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: OGAWA Hirofumi Cc: 40407@debbugs.gnu.org, handa@gnu.org, Eli Zaretskii Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15864302406227 (code B ref 40407); Thu, 09 Apr 2020 11:04:02 +0000 Received: (at 40407) by debbugs.gnu.org; 9 Apr 2020 11:04:00 +0000 Received: from localhost ([127.0.0.1]:53302 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jMUyW-0001cN-BY for submit@debbugs.gnu.org; Thu, 09 Apr 2020 07:04:00 -0400 Received: from mail85c50.megamailservers.eu ([91.136.10.95]:48606 helo=mail18c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id
1jMUyT-0001c7-Sg for 40407@debbugs.gnu.org; Thu, 09 Apr 2020 07:03:59 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586430235; bh=CfRNZgFthgk4zFKJtbmVrYrpKdrHyCpzT8z4Cvkj5b0=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=jnXLB4o+i3sDAq2DaoXuED4tq7SrlFVn1o/97kLbivIkz1lA+oSySTb0FPQd7pgld fR59hLdu4Nb3XptTfGPQrl2fGkHoIXsQgihMgEnAXQTzObT9A7Zp3AoOrs6FYubldq FXsIK5qGJKj893l+h06HMl9o7Qoqe5El17ITDHz4= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail18c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 039B3pan002789; Thu, 9 Apr 2020 11:03:54 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <87lfn84g5h.fsf@mail.parknet.co.jp> Date: Thu, 9 Apr 2020 13:03:51 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <31072A44-E59A-47A9-83DE-CDD747BC6105@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> <83tv1wqxwh.fsf@gnu.org> <87lfn84g5h.fsf@mail.parknet.co.jp> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F25.5E8F00E3.00B7, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=K8Zc4BeI c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=r4AsZkI_4oUJEcBIHNwA:9 a=CjuIK1q_8ugA:10 X-Spam-Score: 1.2 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Eli, thank you very much for thinking about EOL conversion; obviously I didn't. I fixed a couple of glitches: a variable was used uninitialised, the logic didn't quite work for both unibyte and multib [...] Content analysis details: (1.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: megamailservers.eu] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.3 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

Eli, thank you very much for thinking about EOL conversion; obviously I didn't. I fixed a couple of glitches: a variable was used uninitialised, the logic didn't quite work for both unibyte and multibyte, and unless I'm mistaken it's LF that we should look for when encoding, not CR? Anyway, hope you don't mind.

The alternative would be to skip the no-conversion shortcut whenever EOL conversion applies, but memchr is quite fast and it's all likely to be in D1$ by that time.

I also need to thank Ogawa-san again for drawing my attention to check-coding-systems-region which crashes Emacs (SIGABRT) if given an invalid encoding; fixed.

An exhaustive search for other encodings that erroneously were marked as ASCII compatible found only one (chinese-hz); now fixed.

The calls to {ENCODE,DECODE}_FILE messily mutating the returned string have now also been fixed, along with a potential GC bug.

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Kazuhiro Ito Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 09 Apr 2020 14:10:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15864413537788 (code B ref 40407); Thu, 09 Apr 2020 14:10:01 +0000 Received: (at 40407) by debbugs.gnu.org; 9 Apr 2020 14:09:13 +0000 Received: from localhost ([127.0.0.1]:54277 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jMXrl-00021Y-Lc for submit@debbugs.gnu.org; Thu, 09 Apr 2020 10:09:13 -0400 Received: from snd00010.auone-net.jp ([111.86.247.10]:44897 helo=dmta0007.auone-net.jp) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jMXrj-00021N-6t for 40407@debbugs.gnu.org; Thu, 09 Apr 2020 10:09:13 -0400 Received: from kzhr.d1.dion.ne.jp by dmta0007.auone-net.jp with ESMTP id <20200409140908385.CJCJ.97964.kzhr.d1.dion.ne.jp@dmta0007.auone-net.jp>; Thu, 9 Apr 2020 23:09:08 +0900 Date: Thu, 09 Apr 2020 23:09:06 +0900 Message-ID: <868sj4u4xp.wl--xmue@d1.dion.ne.jp> From: Kazuhiro Ito In-Reply-To: <31072A44-E59A-47A9-83DE-CDD747BC6105@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org>
<83tv1wqxwh.fsf@gnu.org> <87lfn84g5h.fsf@mail.parknet.co.jp> <31072A44-E59A-47A9-83DE-CDD747BC6105@acm.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?Q?Goj=C5=8D?=) APEL/10.8 EasyPG/1.0.0 Emacs/28.0 (x86_64-w64-mingw32) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On Thu, 09 Apr 2020 20:03:51 +0900, Mattias Engdeg=E5rd wrote: >=20 > Eli, thank you very much for thinking about EOL conversion; > obviously I didn't. I fixed a couple of glitches: a variable was > used uninitialised, the logic didn't quite work for both unibyte and > multibyte, and unless I'm mistaken it's LF that we should look for > when encoding, not CR? Anyway, hope you don't mind. I noticed that last-coding-system-used was not set when the fast path was used. 
(let ((string "ABCD\r\nEFGH")
      inhibit-eol-conversion)
  (decode-coding-string string 'raw-text-dos)
  (decode-coding-string string 'raw-text-unix)
  last-coding-system-used)

-> raw-text-dos

-- 
Kazuhiro Ito

From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE Resent-From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 09 Apr 2020 14:23:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Kazuhiro Ito Cc: 40407@debbugs.gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.15864421679038 (code B ref 40407); Thu, 09 Apr 2020 14:23:02 +0000 Received: (at 40407) by debbugs.gnu.org; 9 Apr 2020 14:22:47 +0000 Received: from localhost ([127.0.0.1]:54286 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jMY4s-0002Lh-Rq for submit@debbugs.gnu.org; Thu, 09 Apr 2020 10:22:47 -0400 Received: from mail150c50.megamailservers.eu ([91.136.10.160]:53808 helo=mail50c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jMY4q-0002LY-Vn for 40407@debbugs.gnu.org; Thu, 09 Apr 2020 10:22:45 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586442156; bh=PhFzGT48zJDW8cMZeX+3xd3nQb/jfPcdcTsT9rXTaYU=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=JwlJM+bRPCW+AGABkWOwMWuF9AySKHFePewXSQ1DMFqBQqxxAu9mbxPP1TGyTf2rT EGLa10TK5gWAyUscZ+HM0cOlEFU0X3YT2Zw4noWdQ9y9NoQSp+ZWzt+NMA4lpZlf5v oUJPtfQi0RygTCAWwFHKk+T8VkdAHiekFMxnyJqQ= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail50c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 039EMXj8020544; Thu, 9 Apr 2020 14:22:36 +0000 Content-Type:
text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <868sj4u4xp.wl--xmue@d1.dion.ne.jp> Date: Thu, 9 Apr 2020 16:22:33 +0200 Content-Transfer-Encoding: 7bit Message-Id: <877B7B84-7C9B-476E-8C30-2B2052055A88@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> <83tv1wqxwh.fsf@gnu.org> <87lfn84g5h.fsf@mail.parknet.co.jp> <31072A44-E59A-47A9-83DE-CDD747BC6105@acm.org> <868sj4u4xp.wl--xmue@d1.dion.ne.jp> X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F28.5E8F2F8D.0089, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=NoevjPVJ c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=JXnT0QRYsxyK4pOQXz0A:9 a=CjuIK1q_8ugA:10 a=1Z973rmrBnYA:10 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) 9 apr. 2020 kl. 16.09 skrev Kazuhiro Ito : > I noticed that last-coding-system-used was not set when the fast path > was used. A keen observation -- thank you! Now fixed. 
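
The fast-path conditions discussed in the messages above (ASCII-only input, plus a check that no EOL conversion would change the text) can be sketched in standalone C. This is only an illustration under stated assumptions: `eol_type`, `bytes_ascii_p` and `identity_conversion_p` are hypothetical names, not the identifiers used in src/coding.c, and the coding system is simply assumed to be ASCII-compatible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a coding system's EOL type.  */
enum eol_type { EOL_UNIX, EOL_DOS, EOL_MAC };

/* True if every byte of S[0..N-1] is in the 0..127 range.  */
static bool
bytes_ascii_p (const unsigned char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (s[i] > 127)
      return false;
  return true;
}

/* Whether converting S with an ASCII-compatible coding system
   (assumed) would return the same bytes.  With a non-Unix EOL type,
   encoding rewrites LF and decoding rewrites CR, so the shortcut is
   valid only if the relevant byte is absent.  */
static bool
identity_conversion_p (const unsigned char *s, size_t n,
                       enum eol_type eol, bool encodep)
{
  if (!bytes_ascii_p (s, n))
    return false;
  if (eol == EOL_UNIX)
    return true;
  return memchr (s, encodep ? '\n' : '\r', n) == NULL;
}
```

For example, "a\nb" may take the shortcut when decoding with a DOS EOL type but not when encoding with one. As noted above, a real fast path must also reproduce the side effects of the slow path, such as setting last-coding-system-used; the sketch does not model that.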
From unknown Sun Jun 22 11:38:25 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Subject: bug#40407: closed (Re: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE) Message-ID: References: <115604B7-52EE-455E-BA13-999A861F802D@acm.org> <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> X-Gnu-PR-Message: they-closed 40407 X-Gnu-PR-Package: emacs X-Gnu-PR-Keywords: patch Reply-To: 40407@debbugs.gnu.org Date: Sat, 11 Apr 2020 15:10:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1586617802-13972-1" This is a multi-part message in MIME format... ------------=_1586617802-13972-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #40407: [PATCH] slow ENCODE_FILE and DECODE_FILE which was filed against the emacs package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 40407@debbugs.gnu.org. 
-- 
40407: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=40407
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1586617802-13972-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 40407-done) by debbugs.gnu.org; 11 Apr 2020 15:09:38 +0000 Received: from localhost ([127.0.0.1]:56965 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jNHlK-0003ci-5C for submit@debbugs.gnu.org; Sat, 11 Apr 2020 11:09:38 -0400 Received: from mail178c50.megamailservers.eu ([91.136.10.188]:60392 helo=mail70c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jNHlG-0003cX-4s for 40407-done@debbugs.gnu.org; Sat, 11 Apr 2020 11:09:37 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1586617772; bh=am/6DbN3iTxcizIKRmEFjsxUZe0ZLhnOoOPK6woibyk=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=m79q1WkWVBxEj03Mm4YnZKsQGB1/9gZxiaS6NV2zFPdBGqcxhFkMU+Z3yMzmd+yk+ AzOwHAn3/cHU6WHNU/emQ7bH1BBWD3IelJrCN9QMfYdmxOtFqStKtPsWciXseH7IRp RQJQR0WhNf4qfOT4bbnE4+7Fyk9/a0zNgk+EoHQs= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail70c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 03BF9R2q029554; Sat, 11 Apr 2020 15:09:29 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.14\)) Subject: Re: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <31072A44-E59A-47A9-83DE-CDD747BC6105@acm.org> Date: Sat, 11 Apr 2020 17:09:27 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <115604B7-52EE-455E-BA13-999A861F802D@acm.org> References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> <835zegwn9y.fsf@gnu.org> <87blo46i1j.fsf@mail.parknet.co.jp> <835zecsnip.fsf@gnu.org> <5D4B264A-C43B-4CEE-91DE-760AEBE80671@acm.org> <83zhbor2ur.fsf@gnu.org> <329AEABE-0D33-4324-B697-FBEA9340E6BB@acm.org> <83wo6sr0s2.fsf@gnu.org> <83tv1wqxwh.fsf@gnu.org> <87lfn84g5h.fsf@mail.parknet.co.jp> <31072A44-E59A-47A9-83DE-CDD747BC6105@acm.org> To: 40407-done@debbugs.gnu.org X-Mailer: Apple Mail (2.3445.104.14) X-CTCH-RefID: str=0001.0A782F23.5E91DD82.002C, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=OKBZIhSB c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=-t2cK8KLmQkA1kf4DskA:9 a=CjuIK1q_8ugA:10 X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 40407-done Cc: Kenichi Handa , Eli Zaretskii , OGAWA Hirofumi X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

I think we are done here -- now that all calls to ENCODE_FILE and DECODE_FILE have been checked to be safe for no-copy semantics, there is no need to copy in the ASCII identity case; pushed to master.
------------=_1586617802-13972-1-- From unknown Sun Jun 22 11:38:25 2025 X-Loop: help-debbugs@gnu.org Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE References: <805F9723-8298-4FD7-A47B-1E683721A5B0@acm.org> Resent-From: handa Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 16 Apr 2020 13:12:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 40407 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 40407@debbugs.gnu.org, mattiase@acm.org, hirofumi@mail.parknet.co.jp, handa@gnu.org Received: via spool by 40407-submit@debbugs.gnu.org id=B40407.158704271630040 (code B ref 40407); Thu, 16 Apr 2020 13:12:02 +0000 Received: (at 40407) by debbugs.gnu.org; 16 Apr 2020 13:11:56 +0000 Received: from localhost ([127.0.0.1]:37733 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jP4JA-0007oS-86 for submit@debbugs.gnu.org; Thu, 16 Apr 2020 09:11:56 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44517) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jP4J8-0007oF-B6 for 40407@debbugs.gnu.org; Thu, 16 Apr 2020 09:11:55 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:59578) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1jP4J2-00006n-64; Thu, 16 Apr 2020 09:11:48 -0400 Received: from fl1-60-236-80-213.iba.mesh.ad.jp ([60.236.80.213]:56581 helo=shatin) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jP4Iy-0003K8-9d; Thu, 16 Apr 2020 09:11:47 -0400 Received: from handa by shatin with local (Exim 4.90_1) (envelope-from ) id 1jP4Is-000DDI-02; Thu, 16 Apr 2020 22:11:38 +0900 From: handa In-Reply-To: <835zecsnip.fsf@gnu.org>
 (message from Eli Zaretskii on Mon, 06 Apr 2020 17:21:34 +0300)
Date: Thu, 16 Apr 2020 22:11:37 +0900
Message-ID: <87mu7b60dy.fsf@gnu.org>

In article <83wo6sr0s2.fsf@gnu.org>, Eli Zaretskii writes:

> > > I don't think 'charset' is the right type for this encoding (any
> > > reason why you've chosen it?), but I will let Handa-san comment.
> >
> > We could use 'raw-text' as well but that implies that any byte
> > value could be part of an utf-7[-imap] text, which is incorrect.
> > In fact, utf-7-imap only uses codes 0x20-0x7e (utf-7 is allowed to
> > use a few C0 controls too, as mentioned).

I don't remember why utf-7 has coding type utf-8.  As the main
decoding/encoding routines of utf-7 are written in Lisp (in utf-7.el,
which was not contributed by me), perhaps any other ASCII-transparent
type would have been OK.  It seems that we should introduce a new type
for such a coding system.

> > Arguably the heuristics of define-coding-system-internal are
> > somewhat inscrutable. There seems to be leaks between layers --
> > ascii-compatible-p is an end-to-end property and cannot really be
> > set the way it is by that function. But since it is, fixing it
> > afterwards should be the correct way.

> I prefer to wait for Handa-san's response, and meanwhile install the
> least disruptive change, which just fixes the one aspect that got
> broken.  Call me a coward, if you wish.

I think Mattias' patch is good.

---
K.
Handa
handa@gnu.org

From unknown Sun Jun 22 11:38:25 2025
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
To: handa
Cc: 40407@debbugs.gnu.org, mattiase@acm.org, hirofumi@mail.parknet.co.jp, handa@gnu.org
Date: Thu, 16 Apr 2020 16:44:07 +0300
Message-Id: <83k12feeag.fsf@gnu.org>
From: Eli Zaretskii
In-Reply-To: <87mu7b60dy.fsf@gnu.org> (message from handa on Thu, 16 Apr 2020 22:11:37 +0900)
References: <87mu7b60dy.fsf@gnu.org>

> From: handa
> Cc: hirofumi@mail.parknet.co.jp, mattiase@acm.org, 40407@debbugs.gnu.org,
>     handa@gnu.org
> Date: Thu, 16 Apr 2020 22:11:37 +0900
>
> > I prefer to wait for Handa-san's response, and meanwhile install the
> > least disruptive change, which just fixes the one aspect that got
> > broken.  Call me a coward, if you wish.
>
> I think Mattias' patch is good.

Including making the coding-type of utf-7 'charset'?  I think that
didn't work, see an earlier message in this discussion:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40407#107

And could you please tell more about the conditions for a
coding-system to be a candidate for 'charset' coding-type?  What
exactly is such a coding-system supposed to do/provide/support?  I
don't think this is documented anywhere, and most/all of the
coding-systems that have this type are simple single-byte encodings
that support just 256 codepoints (which is not what utf-7 is).

Thanks.

From unknown Sun Jun 22 11:38:25 2025
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
To: Eli Zaretskii
Cc: 40407@debbugs.gnu.org, handa, hirofumi@mail.parknet.co.jp
From: Mattias Engdegård
In-Reply-To:
 <83k12feeag.fsf@gnu.org>
Date: Thu, 16 Apr 2020 15:59:16 +0200
References: <87mu7b60dy.fsf@gnu.org> <83k12feeag.fsf@gnu.org>

On 16 Apr 2020, at 15:44, Eli Zaretskii wrote:

> Including making the coding-type of utf-7 'charset'?  I think that
> didn't work, see an earlier message in this discussion:
>
>  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40407#107

Indeed, 'charset' was probably not the right choice.

Perhaps we should revisit the decision to set :ascii-compatible-p
automatically, since it is an end-to-end property that depends on the
semantics of all parts in the coding chain, including the provided
conversion functions and translation tables. The caller is in a much
better position to know whether that property should be set.
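
The end-to-end point can be sketched in standalone C. This is only an
illustration under stated assumptions, not code from coding.c: the
names bytes_ascii_p, toy_utf7_encode and encode_with_fast_path are
hypothetical, and the toy encoder models just one UTF-7 rule (a literal
'+' is written as "+-"), which is enough to show that ASCII-only input
is not always left unchanged by such a coding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from coding.c): true if every byte is in
   the 0..127 range.  */
static bool
bytes_ascii_p (const unsigned char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (s[i] > 127)
      return false;
  return true;
}

/* Toy stand-in for a UTF-7-style encoder: ASCII passes through, except
   that '+' is written as "+-".  Non-ASCII input is not handled.
   Output is truncated if OUT is too small.  Returns the number of
   bytes written, excluding the terminating NUL.  */
static size_t
toy_utf7_encode (const char *in, char *out, size_t outsize)
{
  size_t o = 0;
  for (size_t i = 0; in[i] != '\0' && o + 2 < outsize; i++)
    {
      out[o++] = in[i];
      if (in[i] == '+')
        out[o++] = '-';
    }
  out[o] = '\0';
  return o;
}

/* Encode IN into OUT, taking an identity shortcut only when the caller
   has declared the coding ASCII-compatible.  */
static void
encode_with_fast_path (const char *in, bool ascii_compatible,
                       char *out, size_t outsize)
{
  size_t n = strlen (in);
  if (ascii_compatible
      && bytes_ascii_p ((const unsigned char *) in, n)
      && n < outsize)
    {
      memcpy (out, in, n + 1);	/* Fast path: act as identity.  */
      return;
    }
  toy_utf7_encode (in, out, outsize);
}
```

With the flag correctly left unset, encoding "a+b" yields "a+-b". With
the flag wrongly set, the shortcut returns "a+b" unchanged, so the
flag's correctness depends on what the conversion step actually does,
not on the declared coding type alone.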