From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 10:12:09 2020
From: Mattias Engdegård
To: bug-gnu-emacs@gnu.org
Cc: Alan Third
Subject: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 16:11:52 +0200

Setting a frame title that contains non-Unicode characters causes a crash in the NS backend. (Other platforms may or may not deal with it appropriately -- if you have the opportunity to test, please report.)

Since the title is typically derived from the buffer name, this is easily reproduced by

  (rename-buffer "n\351")

The crash occurs in ns_set_name_internal:

  encoded_name = ENCODE_UTF_8 (name);

Here encoded_name is still "n\351" (a 2 byte unibyte string), because the \351 couldn't be encoded.

  str = [NSString stringWithUTF8String: SSDATA (encoded_name)];

Now str is nil since "n\351" isn't valid UTF-8.

  [[view window] setTitle: str];

Here we get an NS crash because nil isn't a valid setTitle: argument.

Proposed patch attached. I didn't find any obvious way to encode an Emacs string into valid UTF-8 (with bad parts replaced) so a new function was written. The corresponding Lisp function was marked internal because it's only there for test purposes, but it could of course be promoted to non-internal if someone wants it.

[Attachment: 0001-Fix-NS-crash-on-invalid-frame-title-string.patch]

From 13b43b826a7f7f539484babc275cd9a19a64da9e Mon Sep 17 00:00:00 2001
From: Mattias Engdegård
Date: Mon, 17 Aug 2020 15:37:33 +0200
Subject: [PATCH] Fix NS crash on invalid frame title string

Instead of blindly assuming that Emacs strings are valid UTF-8, which
they are not, convert them in a more careful way using U+FFFD for
replacing invalid values.

* src/coding.c (string_to_valid_utf_8)
Finternal_string_to_valid_utf_8): New functions.
* src/coding.h: Prototype.
* src/nsfns.m (ns_set_name_internal): Use string_to_valid_utf_8.
* test/src/coding-tests.el (coding-string-to-valid-utf-8): New test.
---
 src/coding.c             | 56 ++++++++++++++++++++++++++++++++++++++++
 src/coding.h             |  2 ++
 src/nsfns.m              |  6 ++---
 test/src/coding-tests.el | 14 ++++++++++
 4 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/src/coding.c b/src/coding.c
index 51bd441de9..65493b07ac 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9564,6 +9564,61 @@ code_convert_string_norecord (Lisp_Object string, Lisp_Object coding_system,
 }
 
 
+/* Convert STRING to a pure Unicode string.
+   Non-Unicode values are substituted with U+FFFD REPLACEMENT CHARACTER.
+   Return a unibyte or multibyte string, possibly STRING itself,
+   whose SDATA is guaranteed to be UTF-8.  */
+Lisp_Object
+string_to_valid_utf_8 (Lisp_Object string)
+{
+  if (string_ascii_p (string))
+    return string;
+  if (!STRING_MULTIBYTE (string))
+    string = string_to_multibyte (string);
+
+  /* Now STRING is multibyte.  */
+  unsigned char *buf = NULL;
+  unsigned char *d = NULL;
+  unsigned char *s = SDATA (string);
+  unsigned char *end = s + SBYTES (string);
+  while (s < end)
+    {
+      int len;
+      int c = string_char_and_length (s, &len);
+      if (c > 0x10ffff || char_surrogate_p (c))
+        {
+          /* Not valid for UTF-8.  */
+          if (!d)
+            {
+              buf = xmalloc (4 * SCHARS (string));
+              ptrdiff_t n = s - SDATA (string);
+              memcpy (buf, SDATA (string), n);
+              d = buf + n;
+            }
+          *d++ = 0357;          /* Use U+FFFD.  */
+          *d++ = 0277;
+          *d++ = 0275;
+          s += len;
+        }
+      else if (d)
+        do *d++ = *s++; while (--len);
+      else
+        s += len;
+    }
+  Lisp_Object ret = buf ? make_multibyte_string (buf, SCHARS (string), d - buf)
+                        : string;
+  xfree (buf);
+  return ret;
+}
+
+DEFUN ("internal-string-to-valid-utf-8", Finternal_string_to_valid_utf_8,
+       Sinternal_string_to_valid_utf_8, 1, 1, 0,
+       doc:  /* Internal use only.  */)
+     (Lisp_Object string)
+{
+  return string_to_valid_utf_8 (string);
+}
+
 /* Return the gap address of BUFFER.  If the gap size is less than
    NBYTES, enlarge the gap in advance.  */
 
@@ -11811,6 +11866,7 @@ syms_of_coding (void)
   defsubr (&Scoding_system_aliases);
   defsubr (&Scoding_system_eol_type);
   defsubr (&Scoding_system_priority_list);
+  defsubr (&Sinternal_string_to_valid_utf_8);
 
   DEFVAR_LISP ("coding-system-list", Vcoding_system_list,
 	       doc: /* List of coding systems.
diff --git a/src/coding.h b/src/coding.h
index c2a7b2a00f..98f00a1731 100644
--- a/src/coding.h
+++ b/src/coding.h
@@ -709,6 +709,8 @@ #define UTF_16_LOW_SURROGATE_P(val) \
 extern void encode_coding_object (struct coding_system *,
                                   Lisp_Object, ptrdiff_t, ptrdiff_t,
                                   ptrdiff_t, ptrdiff_t, Lisp_Object);
+extern Lisp_Object string_to_valid_utf_8 (Lisp_Object);
+
 /* Defined in this file.  */
 INLINE int surrogates_to_codepoint (int, int);
 
diff --git a/src/nsfns.m b/src/nsfns.m
index 628233ea0d..3e84568991 100644
--- a/src/nsfns.m
+++ b/src/nsfns.m
@@ -405,9 +405,7 @@ Turn the input menu (an NSMenu) into a lisp list for tracking on lisp side.
   NSString *str;
   NSView *view = FRAME_NS_VIEW (f);
 
-
-  encoded_name = ENCODE_UTF_8 (name);
-
+  encoded_name = string_to_valid_utf_8 (name);
   str = [NSString stringWithUTF8String: SSDATA (encoded_name)];
 
 
@@ -418,7 +416,7 @@ Turn the input menu (an NSMenu) into a lisp list for tracking on lisp side.
   if (!STRINGP (f->icon_name))
     encoded_icon_name = encoded_name;
   else
-    encoded_icon_name = ENCODE_UTF_8 (f->icon_name);
+    encoded_icon_name = string_to_valid_utf_8 (f->icon_name);
 
   str = [NSString stringWithUTF8String: SSDATA (encoded_icon_name)];
 
diff --git a/test/src/coding-tests.el b/test/src/coding-tests.el
index c438ae22ce..f53f63eb48 100644
--- a/test/src/coding-tests.el
+++ b/test/src/coding-tests.el
@@ -429,6 +429,20 @@ coding-check-coding-systems-region
                  '((iso-latin-1 3) (us-ascii 1 3))))
   (should-error (check-coding-systems-region "å" nil '(bad-coding-system))))
 
+(ert-deftest coding-string-to-valid-utf-8 ()
+  (let ((empty "")
+        (valid-uni "Alpha")
+        (valid-multi "m\001ü∫𝔻"))
+    (should (eq (internal-string-to-valid-utf-8 empty) empty))
+    (should (eq (internal-string-to-valid-utf-8 valid-uni) valid-uni))
+    (should (eq (internal-string-to-valid-utf-8 valid-multi) valid-multi)))
+  (should (equal (internal-string-to-valid-utf-8 "unpaired\ud9a3surrogate")
+                 "unpaired\ufffdsurrogate"))
+  (should (equal (internal-string-to-valid-utf-8 "raw\200\377bytes")
+                 "raw\ufffd\ufffdbytes"))
+  (should (equal (internal-string-to-valid-utf-8 "all§\300at\udffeonce")
+                 "all§\ufffdat\ufffdonce")))
+
 ;; Local Variables:
 ;; byte-compile-warnings: (not obsolete)
 ;; End:
-- 
2.21.1 (Apple Git-122.3)

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 10:54:56 2020
From: Andrii Kolomoiets
To: Mattias Engdegård
Cc: 42904@debbugs.gnu.org, Alan Third
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 17:54:37 +0300

Mattias Engdegård writes:

> Setting a frame title that contains non-Unicode characters causes a crash in the NS backend. (Other platforms may or may not deal with it appropriately -- if you have the opportunity to test, please report.)
>
> Since the title is typically derived from the buffer name, this is easily reproduced by
>
> (rename-buffer "n\351")

Looks like this is related to bug#41184

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 11:55:37 2020
From: Mattias Engdegård
To: Andrii Kolomoiets
Cc: 42904@debbugs.gnu.org, Alan Third
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 17:55:24 +0200
Message-Id: <94BF1D54-A583-4769-861B-A0F61DD46884@acm.org>

merge 42094 41184

On 17 Aug 2020 at 16:54, Andrii Kolomoiets wrote:

> Looks like this is related to bug#41184

Indeed it's the same bug, thank you!

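
Background on the reproduction in the original report: the string "n\351" is the two bytes 0x6E 0xE9. In UTF-8, 0xE9 is the lead byte of a three-byte sequence, and here nothing follows it, which is consistent with the report that the string is rejected as UTF-8. The following is a minimal sketch of a well-formedness check in plain C. It is an illustration only: utf8_is_valid is a hypothetical helper, not an Emacs or Cocoa function, and it follows the byte ranges of the Unicode standard's table of well-formed UTF-8 sequences.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs: return true if the N bytes
   at S are well-formed UTF-8 (no overlong forms, no surrogates,
   nothing above U+10FFFF).  */
static bool
utf8_is_valid (const unsigned char *s, size_t n)
{
  size_t i = 0;
  while (i < n)
    {
      unsigned char b = s[i];
      unsigned char lo = 0x80, hi = 0xBF; /* Allowed range of 2nd byte.  */
      size_t len;

      if (b < 0x80)
        {
          i++;                            /* ASCII.  */
          continue;
        }
      else if (b >= 0xC2 && b <= 0xDF)
        len = 2;
      else if (b >= 0xE0 && b <= 0xEF)
        {
          len = 3;
          if (b == 0xE0) lo = 0xA0;       /* Reject overlong forms.  */
          if (b == 0xED) hi = 0x9F;       /* Reject surrogates.  */
        }
      else if (b >= 0xF0 && b <= 0xF4)
        {
          len = 4;
          if (b == 0xF0) lo = 0x90;       /* Reject overlong forms.  */
          if (b == 0xF4) hi = 0x8F;       /* Reject values > U+10FFFF.  */
        }
      else
        return false;                     /* Stray continuation or bad lead.  */

      if (n - i < len)
        return false;                     /* Truncated sequence.  */
      if (s[i + 1] < lo || s[i + 1] > hi)
        return false;
      for (size_t k = 2; k < len; k++)
        if ((s[i + k] & 0xC0) != 0x80)
          return false;
      i += len;
    }
  return true;
}
```

Called on the two bytes 0x6E 0xE9 this returns false, because the sequence started by 0xE9 is truncated; called on 0x6E 0xC3 0xA9 (the UTF-8 encoding of "né") it returns true.
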
From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 11:56:01 2020
From: Eli Zaretskii
To: Mattias Engdegård
Cc: 42904@debbugs.gnu.org, alan@idiocy.org
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 18:55:36 +0300
Message-Id: <83lfidgtc7.fsf@gnu.org>

> From: Mattias Engdegård
> Date: Mon, 17 Aug 2020 16:11:52 +0200
> Cc: Alan Third
>
> Proposed patch attached. I didn't find any obvious way to encode an
> Emacs string into valid UTF-8 (with bad parts replaced) so a new
> function was written.

Is something wrong with encode_string_utf_8? It has arguments that allow you to replace invalid bytes into the likes of u+FFFD. Or did I misunderstand the problem you are facing?

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 12:11:59 2020
From: Mattias Engdegård
To: Eli Zaretskii
Cc: 42904@debbugs.gnu.org, alan@idiocy.org
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 18:11:50 +0200
In-Reply-To: <83lfidgtc7.fsf@gnu.org>

On 17 Aug 2020 at 17:55, Eli Zaretskii wrote:

> Is something wrong with encode_string_utf_8? It has arguments that
> allow you to replace invalid bytes into the likes of u+FFFD. Or did I
> misunderstand the problem you are facing?

No, that's a valid question. I did try that function first, but it had too many quirks: doesn't accept a unibyte non-ASCII string, sometimes replaces valid characters, doesn't always output UTF-8... It was easier to write a new function which encapsulates the common usage case. In addition, the new function is short and simple enough that it can easily be verified to be correct; encode_string_utf_8 is big and complex.

In addition, it seems likely that the same problem exists elsewhere and it's useful to have a function which solves it right away.

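
The idea discussed here, replacing whatever cannot be represented with U+FFFD, can be illustrated on raw bytes. This is a sketch under stated assumptions, not the patch's code: the patch works on Emacs's internal multibyte representation and replaces characters above 0x10ffff and surrogates, whereas the hypothetical utf8_sanitize below scans an arbitrary byte buffer and replaces each byte that does not start a well-formed sequence. Both emit the same three bytes for U+FFFD, EF BF BD (written 0357 0277 0275 in the patch). The names utf8_seq_len and utf8_sanitize are invented for this example.

```c
#include <stddef.h>

/* Hypothetical helper: length of the well-formed UTF-8 sequence that
   starts at S (at most N bytes available), or 0 if there is none.  */
static size_t
utf8_seq_len (const unsigned char *s, size_t n)
{
  if (n == 0)
    return 0;

  unsigned char b = s[0];
  unsigned char lo = 0x80, hi = 0xBF;   /* Allowed range of 2nd byte.  */
  size_t len;

  if (b < 0x80)
    return 1;
  else if (b >= 0xC2 && b <= 0xDF)
    len = 2;
  else if (b >= 0xE0 && b <= 0xEF)
    {
      len = 3;
      if (b == 0xE0) lo = 0xA0;         /* Reject overlong forms.  */
      if (b == 0xED) hi = 0x9F;         /* Reject surrogates.  */
    }
  else if (b >= 0xF0 && b <= 0xF4)
    {
      len = 4;
      if (b == 0xF0) lo = 0x90;         /* Reject overlong forms.  */
      if (b == 0xF4) hi = 0x8F;         /* Reject values > U+10FFFF.  */
    }
  else
    return 0;

  if (n < len || s[1] < lo || s[1] > hi)
    return 0;
  for (size_t k = 2; k < len; k++)
    if ((s[k] & 0xC0) != 0x80)
      return 0;
  return len;
}

/* Hypothetical sanitizer: copy the N bytes at SRC to DST, replacing
   every byte that does not begin a well-formed sequence with U+FFFD.
   DST must have room for 3 * N bytes.  Return the number of bytes
   written.  */
size_t
utf8_sanitize (const unsigned char *src, size_t n, unsigned char *dst)
{
  size_t i = 0, o = 0;
  while (i < n)
    {
      size_t len = utf8_seq_len (src + i, n - i);
      if (len == 0)
        {
          dst[o++] = 0xEF;              /* U+FFFD REPLACEMENT CHARACTER.  */
          dst[o++] = 0xBF;
          dst[o++] = 0xBD;
          i++;
        }
      else
        while (len--)
          dst[o++] = src[i++];
    }
  return o;
}
```

With the input from the report, the two bytes 6E E9 become the four bytes 6E EF BF BD, which is valid UTF-8 for "n" followed by U+FFFD.
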
From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 12:13:17 2020
From: Mattias Engdegård
To: control@debbugs.gnu.org
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 18:13:07 +0200
In-Reply-To: <94BF1D54-A583-4769-861B-A0F61DD46884@acm.org>

merge 42904 41184

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 13:06:20 2020
From: Eli Zaretskii
To: Mattias Engdegård
Cc: 42904@debbugs.gnu.org, alan@idiocy.org
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Mon, 17 Aug 2020 20:05:58 +0300
Message-Id: <838sedgq2x.fsf@gnu.org>

> From: Mattias Engdegård
> Date: Mon, 17 Aug 2020 18:11:50 +0200
> Cc: 42904@debbugs.gnu.org, alan@idiocy.org
>
> On 17 Aug 2020 at 17:55, Eli Zaretskii wrote:
>
> > Is something wrong with encode_string_utf_8? It has arguments that
> > allow you to replace invalid bytes into the likes of u+FFFD. Or did I
> > misunderstand the problem you are facing?
>
> No, that's a valid question. I did try that function first, but it had too many quirks: doesn't accept a unibyte non-ASCII string, sometimes replaces valid characters, doesn't always output UTF-8... It was easier to write a new function which encapsulates the common usage case. In addition, the new function is short and simple enough that it can easily be verified to be correct; encode_string_utf_8 is big and complex.

Well, it is always easier to special-case some use case, but we have general APIs for a reason. In particular, having several similar but subtly different functions is confusing and causes mistakes.

And you seem to be saying that encode_string_utf_8 doesn't work as advertised, which means it should be fixed.

So I would prefer to use encode_string_utf_8 if reasonably practical.

Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 14:48:22 2020 Received: (at 42904) by debbugs.gnu.org; 17 Aug 2020 18:48:22 +0000 Received: from localhost ([127.0.0.1]:33175 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7kBC-0005Qd-7d for submit@debbugs.gnu.org; Mon, 17 Aug 2020 14:48:22 -0400 Received: from mail1458c50.megamailservers.eu ([91.136.14.58]:58150 helo=mail267c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7kB8-0005QM-7e for 42904@debbugs.gnu.org; Mon, 17 Aug 2020 14:48:20 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1597690091; bh=4TJLZMro8n/fuVR2SzSf74s7lppVgBmdDgUWpTVO+ss=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=eEBboL02DjP9UjGVTuCXKhQX0oxXHe//FJIHWNHq5gGaaVOB9QTY70FKbB/neiKbo dTQdxORE/QQVj1QGEL5ccitXMo973Of799jNMT5ndJkElKwqzcYaum9O/UOFynBH6N 6S60fpv2sQMF2U30N6TpVfGEpCfXbwW7EbufMqOM= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail267c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07HIm9xG006274; Mon, 17 Aug 2020 18:48:10 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <838sedgq2x.fsf@gnu.org> Date: Mon, 17 Aug 2020 20:48:08 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F1D.5F3AD0EB.007A, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 
X-CSC: 0 X-CHA: v=2.3 cv=Cf92G4jl c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=S-69H9ipgdBFCCBJcgIA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 17 aug. 2020 kl. 19.05 skrev Eli Zaretskii : > Well, it is always easier to special-case some use case, but we have > general APIs for a reason. In particular, having several similar but > subtly different functions is confusing and causes mista [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

17 aug. 2020 kl. 19.05 skrev Eli Zaretskii :

> Well, it is always easier to special-case some use case, but we have
> general APIs for a reason. In particular, having several similar but
> subtly different functions is confusing and causes mistakes.

The new function is much simpler and easier to use than encode_string_utf_8 precisely for that reason: to avoid confusion and mistakes, both of which I got in spades when trying to use it.
> And you seem to be saying that encode_string_utf_8 doesn't work as
> advertised, which means it should be fixed.

Actually I don't know exactly how it is supposed to work so I wouldn't even say that. It's probably fine code but it's not for me, not in this case.

> So I would prefer to use encode_string_utf_8 if reasonably practical.

Well, it doesn't seem to be reasonably practical. In order to fix a bug, I prefer not having to fix some unrelated but complex code, especially when it is unclear how and if that code really can and/or should be 'fixed', and exactly what that would entail.

Now if, after the proposed patch has been applied, someone wants to refactor so that string_to_valid_utf_8 disappears or becomes implemented in terms of something else, then that's perfectly fine, as long as the bug remains fixed.

From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 17 15:56:24 2020 Received: (at 42904) by debbugs.gnu.org; 17 Aug 2020 19:56:24 +0000 Received: from localhost ([127.0.0.1]:33226 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7lF2-00073w-GO for submit@debbugs.gnu.org; Mon, 17 Aug 2020 15:56:24 -0400 Received: from mailout-l3b-97.contactoffice.com ([212.3.242.97]:43912) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7lF0-00073i-W9 for 42904@debbugs.gnu.org; Mon, 17 Aug 2020 15:56:23 -0400 Received: from smtpauth1.co-bxl (smtpauth1.co-bxl [10.2.0.15]) by mailout-l3b-97.contactoffice.com (Postfix) with ESMTP id 8349B227; Mon, 17 Aug 2020 21:56:16 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1597694176; s=20200222-6h9o; d=idiocy.org; i=alan@idiocy.org; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:Content-Type:Content-Transfer-Encoding:In-Reply-To; l=1317; bh=xPeYLyIC5/wSRe4CYSHZJ/dlZ7FY8tA3i6HZycfGbHs=; b=jGcNquf18gwzkMirIZpFajVgVYjuWol6NYkiB44VVkXRcDGD4NDq1w1dRjE0uAx1
ouPSdVjBU5WMnYPqridRf4P9Xh3iX6Q2mzv4Js7VO0OjiCkgjunchZGnDPBlYNpjsiw iMTlTdrKW4B/38wNWRTTezWk7L+QoPnuv7O3qziWetxN4ohc6ftydhG5Oo7lh6EWMFF n+TgjKq5JwkS+pcx+z7XcR5lkWYVKGhvIhN58NvKhC/GHMOCEzTdINydOtKEPCYk/hu HKfSo8bUXLfYjclbsgUnZRBnwRmQB1ELyIfDR5yeQyCYOF4EKQJau788ra/KUpDtZ3C qLd5hz7UeQ== Received: by smtp.mailfence.com with ESMTPA ; Mon, 17 Aug 2020 21:56:12 +0200 (CEST) Received: by breton.holly.idiocy.org (Postfix, from userid 501) id BEBF8202496C67; Mon, 17 Aug 2020 20:56:10 +0100 (BST) Date: Mon, 17 Aug 2020 21:56:13 +0200 (CEST) From: Alan Third To: Mattias =?iso-8859-1?Q?Engdeg=E5rd?= Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Message-ID: <20200817195610.GA70682@breton.holly.idiocy.org> Mail-Followup-To: Alan Third , Mattias =?iso-8859-1?Q?Engdeg=E5rd?= , Eli Zaretskii , 42904@debbugs.gnu.org References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> X-Spam-Flag: NO X-Spam-Status: No, hits=-2.9 required=4.7 symbols=ALL_TRUSTED, BAYES_00 device=10.2.0.1 X-ContactOffice-Account: com:241649512 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) On Mon, Aug 17, 2020 at 08:48:08PM +0200, Mattias Engdegård wrote: > 17 aug. 2020 kl. 19.05 skrev Eli Zaretskii : > > > Well, it is always easier to special-case some use case, but we have > > general APIs for a reason. In particular, having several similar but > > subtly different functions is confusing and causes mistakes. 
>
> The new function is much simpler and easier to use than
> encode_string_utf_8 precisely for that reason: to avoid confusion
> and mistakes, both of which I got in spades when trying to use it.

Sorry if this is a stupid question, but would using UTF-16 be easier?
This appears to work (although I'm sure it's not the right way to do
this):

modified src/nsfns.m
@@ -405,11 +405,10 @@ Turn the input menu (an NSMenu) into a lisp list for tracking on lisp side.
   NSString *str;
   NSView *view = FRAME_NS_VIEW (f);
 
+  encoded_name = code_convert_string_norecord (name, Qutf_16le, 1);
 
-  encoded_name = ENCODE_UTF_8 (name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_name)];
-
+  str = [NSString stringWithCharacters: (const unichar *) SDATA (encoded_name)
+                                length: SBYTES (encoded_name) / sizeof (unichar)];
 
   /* Don't change the name if it's already NAME. */
   if (! [[[view window] title] isEqualToString: str])
-- 
Alan Third

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 04:07:41 2020 Received: (at 42904) by debbugs.gnu.org; 18 Aug 2020 08:07:41 +0000 Received: from localhost ([127.0.0.1]:33929 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7wej-0008Ls-KS for submit@debbugs.gnu.org; Tue, 18 Aug 2020 04:07:41 -0400 Received: from mail1435c50.megamailservers.eu ([91.136.14.35]:36464 helo=mail263c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7weg-0008La-T0 for 42904@debbugs.gnu.org; Tue, 18 Aug 2020 04:07:40 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1597738050; bh=9cfJKmNbYM3xUlTe7WiWAtwn9iaFPqkB9WbfjeMJjdM=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=iYmwpWCL+1iRYCgNXYFIyN3qxoRICOhR+ukh6rRSt+T02ehdJPnejKXnbLAmPHlsW pScu3LuoRPDkkLboYG9A2wvETqHkYDr6bWr+P48fZVALTe81Mgh3+R34OfQK3iQj0n FG4CCmtlxPq+aGu+rPrDOzEk3WjjcNa7qDlaK4IE= Feedback-ID: mattiase@acm.or Received: from
[192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail263c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07I87Rqx031544; Tue, 18 Aug 2020 08:07:29 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <20200817195610.GA70682@breton.holly.idiocy.org> Date: Tue, 18 Aug 2020 10:07:27 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> To: Alan Third X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F1E.5F3B8C42.004D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=e6d4tph/ c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=hIj89exaAAAA:8 a=LNjq3U4WhDRkdH6gGKgA:9 a=CjuIK1q_8ugA:10 a=lS9wXHQM5UdnNJ4u63Ry:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 17 aug. 2020 kl. 21.56 skrev Alan Third : > Sorry if this is a stupid question, but would using UTF-16 be easier? 
> This appears to work (although I'm sure it's not the right way to do this): Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

17 aug. 2020 kl. 21.56 skrev Alan Third :

> Sorry if this is a stupid question, but would using UTF-16 be easier?
> This appears to work (although I'm sure it's not the right way to do this):

A good question! It has the advantage of requiring no new code, but is slightly inferior in that raw bytes are not replaced with U+FFFD but with spaces; we should probably set :default-char to #xfffd for the utf-16 coding systems.

Unpaired surrogates are handled splendidly by accident: the conversion to UTF-16 preserves them, perhaps incorrectly so, but the NS libs display them as a distinctive and very suggestive glyph. Apparently [NSString stringWithCharacters:] doesn't perform any validation at all.

On the other hand, I think we still do need a subroutine for converting to UTF-8 for passing strings to system code where graceful handling of invalid encoding cannot be assumed, as there appears to be nothing in Emacs that can do this.

> + encoded_name = code_convert_string_norecord (name, Qutf_16le, 1);

Presumably this should be utf_16be on big-endian platforms. We still support PowerPC macOS, don't we?
> + str = [NSString stringWithCharacters: (const unichar *) SDATA (encoded_name)

Is SDATA guaranteed to be 16-bit aligned? Doesn't matter on x86 or PowerPC, but strictly speaking...

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 04:43:22 2020 Received: (at 42904) by debbugs.gnu.org; 18 Aug 2020 08:43:22 +0000 Received: from localhost ([127.0.0.1]:33945 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7xDF-0000lR-U3 for submit@debbugs.gnu.org; Tue, 18 Aug 2020 04:43:22 -0400 Received: from mailout-l3b-97.contactoffice.com ([212.3.242.97]:33570) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k7xDE-0000lF-0N for 42904@debbugs.gnu.org; Tue, 18 Aug 2020 04:43:21 -0400 Received: from smtpauth1.co-bxl (smtpauth1.co-bxl [10.2.0.15]) by mailout-l3b-97.contactoffice.com (Postfix) with ESMTP id 3AF99223; Tue, 18 Aug 2020 10:43:14 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1597740194; s=20200222-6h9o; d=idiocy.org; i=alan@idiocy.org; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:Content-Type:Content-Transfer-Encoding:In-Reply-To; l=991; bh=71X479819LWT43b9m1LbKaVIXEVt68myV1cCRpovc28=; b=k0KicsiK9bqglcJaNxi7VMfZ91GALbOxWyJSAuWGFNJRIzgNhzhq/o+4UjRFdl8b vnK/mKjwMp3QqoEA3Oa4Ux3PbRC4283K1EVQSJhxvYfQq7r0pnH8DUgzrVPpA+Tu8zh 6XQfXuWu4pSEAbUg63J8RJzo3guVCa50fAQj40XpegwihaD8twOwfjzhlNdsitz55tL BdRby8HasMFDoVT+VfZGNLC/KXerBUGuxnKXrKhWT9EUlwSqFmv8VZ/Dxu15TsGtsEt 7oyDPG9sDhx/vHZYc/fuPcLgBQ3AY6xNESQN8HPpgOjKv9sbgNMCSFxBhPSulDubtwO xzEUlBhWIA== Received: by smtp.mailfence.com with ESMTPA ; Tue, 18 Aug 2020 10:43:09 +0200 (CEST) Received: by breton.holly.idiocy.org (Postfix, from userid 501) id 1AB0C20249B13F; Tue, 18 Aug 2020 09:43:06 +0100 (BST) Date: Tue, 18 Aug 2020 10:43:10 +0200 (CEST) From: Alan Third To: Mattias =?iso-8859-1?Q?Engdeg=E5rd?= Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Message-ID:
<20200818084306.GA89999@breton.holly.idiocy.org> Mail-Followup-To: Alan Third , Mattias =?iso-8859-1?Q?Engdeg=E5rd?= , Eli Zaretskii , 42904@debbugs.gnu.org References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> X-Spam-Flag: NO X-Spam-Status: No, hits=-2.9 required=4.7 symbols=ALL_TRUSTED, BAYES_00 device=10.2.0.20 X-ContactOffice-Account: com:241649512 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) On Tue, Aug 18, 2020 at 10:07:27AM +0200, Mattias Engdegård wrote: > 17 aug. 2020 kl. 21.56 skrev Alan Third : > > > + encoded_name = code_convert_string_norecord (name, Qutf_16le, 1); > > Presumably this should be utf_16be on big-endian platforms. We still support PowerPC macOS, don't we? No, however I imagine we support GNUstep on big endian systems. > > + str = [NSString stringWithCharacters: (const unichar *) SDATA (encoded_name) > > Is SDATA guaranteed to be 16-bit aligned? Doesn't matter on x86 or > PowerPC, but strictly speaking... I've no idea, I adapted the code from make_multibyte_string in alloc.c, and one of it's callers (although I can't remember which right now). I'm expecting Eli to appear and tell me this is the entirely wrong way of doing this. ;) Anyway, as I understand it the internal representation of NS strings are UTF-16, so the conversion through UTF-8 seems a bit of a waste if we can go direct. 
-- Alan Third From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 07:48:21 2020 Received: (at 42904) by debbugs.gnu.org; 18 Aug 2020 11:48:21 +0000 Received: from localhost ([127.0.0.1]:34352 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k806H-0005vg-3N for submit@debbugs.gnu.org; Tue, 18 Aug 2020 07:48:21 -0400 Received: from mail177c50.megamailservers.eu ([91.136.10.187]:50560 helo=mail51c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k806D-0005vU-QV for 42904@debbugs.gnu.org; Tue, 18 Aug 2020 07:48:19 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1597751294; bh=k0kQgH68HQ5N2IQWfZwEi8aHh+IYFlimNCo6op6AZng=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=MN09LCUwm5QMaLP3KsHjJZr0DkiGSUfhA19a6EAmb8Ucr8OaPrHUmnwYdQNga4WSM Sf+bKEHU+7dIeWuAwqkxJAHH/liTmx9YBMtqM4m/4ALvuxtvKiDdN050ZnEm0DbfGR RDa8QReodHtqPnB+Tid1EbRpNaGIczJWUkrvVJOY= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail51c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07IBmAXL028334; Tue, 18 Aug 2020 11:48:13 +0000 From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= Message-Id: <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> Content-Type: multipart/mixed; boundary="Apple-Mail=_24D051E2-E0BF-46A4-A406-355502745148" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Date: Tue, 18 Aug 2020 13:48:10 +0200 In-Reply-To: <20200818084306.GA89999@breton.holly.idiocy.org> To: Alan Third References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> X-Mailer: Apple Mail 
(2.3445.104.15) X-CTCH-RefID: str=0001.0A782F27.5F3BBFFE.008C, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=MOMeZ/Rl c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=M51BFTxLslgA:10 a=hIj89exaAAAA:8 a=94QdGpDQX0ZQNFkfVvsA:9 a=CjuIK1q_8ugA:10 a=aLiXyXexwL43tN057eoA:9 a=B2y7HmGcmWMA:10 a=lS9wXHQM5UdnNJ4u63Ry:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --Apple-Mail=_24D051E2-E0BF-46A4-A406-355502745148 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

18 aug. 2020 kl. 10.43 skrev Alan Third :

>> Presumably this should be utf_16be on big-endian platforms. We still support PowerPC macOS, don't we?
>
> No, however I imagine we support GNUstep on big endian systems.

Well then, it's easy to deal with.

>>> + str = [NSString stringWithCharacters: (const unichar *) SDATA (encoded_name)
>>
>> Is SDATA guaranteed to be 16-bit aligned? Doesn't matter on x86 or
>> PowerPC, but strictly speaking...
>
> I've no idea, I adapted the code from make_multibyte_string in
> alloc.c, and one of it's callers (although I can't remember which
> right now). I'm expecting Eli to appear and tell me this is the
> entirely wrong way of doing this. ;)

Data alignment trapping is optional on 64-bit ARM but I'd be surprised if macOS enabled it. It might be hazardous for all the GNUStep-on-MIPS workstations.
> Anyway, as I understand it the internal representation of NS strings
> are UTF-16, so the conversion through UTF-8 seems a bit of a waste if
> we can go direct.

Maybe, but the conversion to UTF-16 then has to be done on the Emacs side instead, probably less efficiently than in the NS libs. It's probably a wash.

Anyway, here is an alternative patch using your method. Tell us what you think.

--Apple-Mail=_24D051E2-E0BF-46A4-A406-355502745148 Content-Disposition: attachment; filename*0=0001-UTF-16-Fix-NS-crash-on-invalid-frame-title-string-bug-42904; filename*1=.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-UTF-16-Fix-NS-crash-on-invalid-frame-title-string-bug-42904.patch" Content-Transfer-Encoding: quoted-printable

From 9858f0409cc83eefcc153109e180c230868d20a5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Tue, 18 Aug 2020 12:58:12 +0200
Subject: [PATCH] Fix NS crash on invalid frame title string (bug#42904)

Instead of blindly assuming that all Emacs strings are valid UTF-8,
which they are not, use UTF-16 because this conversion seems to be
more robust (after making sure that they use the correct replacement
character).  Unpaired surrogates will still go through to the NSString
objects, but the NS libs handle them gracefully.

* src/nsfns.m (string_to_nsstring): New function.
(ns_set_name_internal): Use string_to_nsstring.
* lisp/international/mule-conf.el (utf-16le, utf-16be)
(utf-16le-with-signature, utf-16be-with-signature, utf-16):
Use U+FFFD as :default-char instead of ASCII space.
* test/src/coding-tests.el (coding-utf-16-replacement-char): New test.
---
 lisp/international/mule-conf.el |  5 +++++
 src/nsfns.m                     | 34 +++++++++++++++++++--------------
 test/src/coding-tests.el        | 12 ++++++++++++
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/lisp/international/mule-conf.el b/lisp/international/mule-conf.el
index edda79ba4e..b9acafc158 100644
--- a/lisp/international/mule-conf.el
+++ b/lisp/international/mule-conf.el
@@ -1336,6 +1336,7 @@ 'utf-16le
   :mnemonic ?U
   :charset-list '(unicode)
   :endian 'little
+  :default-char #xfffd
   :mime-text-unsuitable t
   :mime-charset 'utf-16le)
 
@@ -1345,6 +1346,7 @@ 'utf-16be
   :mnemonic ?U
   :charset-list '(unicode)
   :endian 'big
+  :default-char #xfffd
   :mime-text-unsuitable t
   :mime-charset 'utf-16be)
 
@@ -1355,6 +1357,7 @@ 'utf-16le-with-signature
   :charset-list '(unicode)
   :bom t
   :endian 'little
+  :default-char #xfffd
   :mime-text-unsuitable t
   :mime-charset 'utf-16)
 
@@ -1365,6 +1368,7 @@ 'utf-16be-with-signature
   :charset-list '(unicode)
   :bom t
   :endian 'big
+  :default-char #xfffd
   :mime-text-unsuitable t
   :mime-charset 'utf-16)
 
@@ -1375,6 +1379,7 @@ 'utf-16
   :charset-list '(unicode)
   :bom '(utf-16le-with-signature . utf-16be-with-signature)
   :endian 'big
+  :default-char #xfffd
   :mime-text-unsuitable t
   :mime-charset 'utf-16)
 
diff --git a/src/nsfns.m b/src/nsfns.m
index 628233ea0d..e514585dc1 100644
--- a/src/nsfns.m
+++ b/src/nsfns.m
@@ -398,29 +398,35 @@ Turn the input menu (an NSMenu) into a lisp list for tracking on lisp side.
         [NSString stringWithUTF8String: SSDATA (arg)]];
 }
 
+static NSString *
+string_to_nsstring (Lisp_Object string)
+{
+  Lisp_Object coding =
+#ifdef WORDS_BIGENDIAN
+    Qutf_16be
+#else
+    Qutf_16le
+#endif
+    ;
+  Lisp_Object utf16 = code_convert_string_norecord (string, coding, true);
+  /* Here we somewhat precariously assume that SDATA is 16-bit aligned,
+     or that unaligned access is OK.  */
+  return [NSString stringWithCharacters: (const unichar *) SDATA (utf16)
+                                 length: SBYTES (utf16) / sizeof (unichar)];
+}
+
 static void
 ns_set_name_internal (struct frame *f, Lisp_Object name)
 {
-  Lisp_Object encoded_name, encoded_icon_name;
-  NSString *str;
   NSView *view = FRAME_NS_VIEW (f);
-
-
-  encoded_name = ENCODE_UTF_8 (name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_name)];
-
+  NSString *str = string_to_nsstring (name);
 
   /* Don't change the name if it's already NAME.  */
   if (! [[[view window] title] isEqualToString: str])
     [[view window] setTitle: str];
 
-  if (!STRINGP (f->icon_name))
-    encoded_icon_name = encoded_name;
-  else
-    encoded_icon_name = ENCODE_UTF_8 (f->icon_name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_icon_name)];
+  if (STRINGP (f->icon_name))
+    str = string_to_nsstring (f->icon_name);
 
   if ([[view window] miniwindowTitle]
       && ! [[[view window] miniwindowTitle] isEqualToString: str])
diff --git a/test/src/coding-tests.el b/test/src/coding-tests.el
index c438ae22ce..8b0adf0ad8 100644
--- a/test/src/coding-tests.el
+++ b/test/src/coding-tests.el
@@ -429,6 +429,18 @@ coding-check-coding-systems-region
                  '((iso-latin-1 3) (us-ascii 1 3))))
   (should-error (check-coding-systems-region "å" nil '(bad-coding-system))))
 
+(ert-deftest coding-utf-16-replacement-char ()
+  (should (equal (encode-coding-string "A\351B" 'utf-16be)
+                 (unibyte-string 0 ?A #xff #xfd 0 ?B)))
+  (should (equal (encode-coding-string "A\351B" 'utf-16le)
+                 (unibyte-string ?A 0 #xfd #xff ?B 0)))
+  (should (equal (encode-coding-string "A\ud8b6BΣ\227D𝄞" 'utf-16be)
+                 (unibyte-string 0 ?A #xd8 #xb6 0 ?B #x03 #xa3 #xff #xfd 0 ?D
+                                 #xd8 #x34 #xdd #x1e)))
+  (should (equal (encode-coding-string "A\ud8b6BΣ\227D𝄞" 'utf-16le)
+                 (unibyte-string ?A 0 #xb6 #xd8 ?B 0 #xa3 #x03 #xfd #xff ?D 0
+                                 #x34 #xd8 #x1e #xdd))))
+
 ;; Local Variables:
 ;; byte-compile-warnings: (not obsolete)
 ;; End:
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_24D051E2-E0BF-46A4-A406-355502745148-- From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 08:22:40 2020 Received: (at 42904) by debbugs.gnu.org; 18 Aug 2020 12:22:40 +0000 Received: from localhost ([127.0.0.1]:34454 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k80dU-0006qB-5f for submit@debbugs.gnu.org; Tue, 18 Aug 2020 08:22:40 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50050) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k80dS-0006py-Mi for 42904@debbugs.gnu.org; Tue, 18 Aug 2020 08:22:39 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:51104) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k80dN-0001wK-52; Tue, 18 Aug 2020 08:22:33 -0400 Received: from [176.228.60.248] (port=3591 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1k80dM-0005q8-1k; Tue, 18 Aug 2020 08:22:32 -0400 Date: Tue, 18 Aug 2020 15:22:11 +0300 Message-Id: <83ft8kf8jw.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Tue, 18 Aug 2020 13:48:10 +0200) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere:
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> From: Mattias Engdegård
> Date: Tue, 18 Aug 2020 13:48:10 +0200
> Cc: Eli Zaretskii , 42904@debbugs.gnu.org
>
> Anyway, here is an alternative patch using your method. Tell us what you think.

Thanks, but I don't think we should modify the :default-char attribute
of the UTF-* encodings as part of this change. It's a separate issue,
and is a backward-incompatible change of sorts. For instance, we
should consider what this will do to display on TTY frames that don't
support Unicode. So I think we should discuss this issue separately
before we make such a change.

Why is it a problem to display a space instead of invalid bytes in
this case?

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 08:24:24 2020 Received: (at 42904) by debbugs.gnu.org; 18 Aug 2020 12:24:24 +0000 Received: from localhost ([127.0.0.1]:34458 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k80fA-0006sy-Ha for submit@debbugs.gnu.org; Tue, 18 Aug 2020 08:24:24 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50448) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k80f9-0006sn-Hn for 42904@debbugs.gnu.org; Tue, 18 Aug 2020 08:24:23 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:51111) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k80f4-00026S-A5; Tue, 18 Aug 2020 08:24:18 -0400 Received: from [176.228.60.248] (port=3708 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1k80f3-0005xq-Li; Tue, 18 Aug 2020 08:24:18 -0400 Date: Tue, 18 Aug 2020 15:24:03 +0300 Message-Id: <83eeo4f8gs.fsf@gnu.org> From: Eli Zaretskii To: Alan Third In-Reply-To: <20200818084306.GA89999@breton.holly.idiocy.org>
(message from Alan Third on Tue, 18 Aug 2020 10:43:10 +0200 (CEST)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, mattiase@acm.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> Date: Tue, 18 Aug 2020 10:43:10 +0200 (CEST)
> From: Alan Third
> Cc: Eli Zaretskii , 42904@debbugs.gnu.org
>
> > Is SDATA guaranteed to be 16-bit aligned? Doesn't matter on x86 or
> > PowerPC, but strictly speaking...
>
> I've no idea, I adapted the code from make_multibyte_string in
> alloc.c, and one of it's callers (although I can't remember which
> right now). I'm expecting Eli to appear and tell me this is the
> entirely wrong way of doing this. ;)

It isn't wrong (and there's no need to worry about alignment in this
case, AFAIK).

Thanks.
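[Editorial note] The alignment and byte-order questions raised in this exchange can be illustrated with a small sketch in plain C. It is not part of either patch: copy_utf16_units and host_is_big_endian are hypothetical names, and the sketch merely shows one way to avoid depending on the alignment of a byte buffer (copying into a properly typed array with memcpy) and one way to probe byte order at run time, where the patch uses the compile-time WORDS_BIGENDIAN macro instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy host-order UTF-16 data from a byte buffer of unknown alignment
   into a uint16_t array, so that later accesses are always aligned.
   At most max_units code units are copied; a trailing odd byte is
   ignored.  Return the number of code units copied.  Hypothetical
   helper, for illustration only.  */
size_t
copy_utf16_units (const unsigned char *bytes, size_t nbytes,
		  uint16_t *units, size_t max_units)
{
  size_t n = nbytes / sizeof (uint16_t);
  if (n > max_units)
    n = max_units;
  /* memcpy has no alignment requirement on its source.  */
  memcpy (units, bytes, n * sizeof (uint16_t));
  return n;
}

/* Return nonzero if the host stores the most significant byte first,
   i.e. if UTF-16BE rather than UTF-16LE matches host order.  */
int
host_is_big_endian (void)
{
  const uint16_t probe = 0x0102;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0x01;
}
```

The extra copy costs one pass over the data; whether that is worth it depends on whether the target platform can trap on unaligned 16-bit loads, which is exactly the point left open in the thread.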
From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 10:11:11 2020
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
From: Mattias Engdegård
In-Reply-To: <83eeo4f8gs.fsf@gnu.org>
Date: Tue, 18 Aug 2020 16:11:02 +0200
To: Eli Zaretskii
Cc: 42904@debbugs.gnu.org, Alan Third

On 18 Aug 2020, at 14:24, Eli Zaretskii wrote:

> It isn't wrong (and there's no need to worry about alignment in this
> case, AFAIK).

Do you mean that SDATA is guaranteed to be aligned, or that no NS platforms that Emacs runs on (or is likely to run on in the near future, such as macOS on arm64) trap on unaligned accesses?

> Thanks, but I don't think we should modify the :default-char attribute
> of the UTF-* encodings as part of this change. It's a separate issue,
> and is a backward-incompatible change of sorts. For instance, we
> should consider what this will do to display on TTY frames that don't
> support Unicode. So I think we should discuss this issue separately
> before we make such a change.

Yes, we can certainly make it a separate change. All bug fixes are backward-incompatible in some respect; it is not reasonable to depend on non-Unicode characters being translated to spaces when converted to UTF-16 since that is neither documented nor reasonably expected behaviour.

> Why is it a problem to display a space instead of invalid bytes in
> this case?

A problem is not necessary for a change to be desirable. The Unicode replacement character clearly indicates that something could not be encoded correctly, and the exact position for it; it's universally recognised and valuable for users and developers alike. Space is the default value for :default-char, and that it isn't U+FFFD for UTF-16 (or other Unicode encodings) is a clear bug, since that is the correct character to use for that purpose.

My guess is that space was chosen as default because it's a character that occurs in all coding systems, but it is clearly wrong for UTF-16. 'us-ascii' uses '?' for :default-char, which is a better character in that repertoire.
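To make the space versus U+FFFD argument concrete, here is a small sketch in plain C of what a default-character substitution does to a title containing a raw byte. It is not Emacs's encoder: `substitute_non_unicode` is a hypothetical function, and the value 0x3FFFC0 assumes that Emacs represents the raw byte \300 internally as a character above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_UNICODE_CHAR 0x10FFFF

/* Hypothetical helper: copy N code points from IN to OUT, replacing
   every value that is not a Unicode code point (above U+10FFFF) with
   DEFAULT_CHAR.  Returns the number of replacements made.  */
static size_t
substitute_non_unicode (const uint32_t *in, size_t n,
                        uint32_t default_char, uint32_t *out)
{
  size_t replaced = 0;
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (in[i] > MAX_UNICODE_CHAR)
        {
          out[i] = default_char;
          replaced++;
        }
      else
        out[i] = in[i];
    }
  return replaced;
}
```

With a default of 0x20 the input {'a', 0x3FFFC0, 'b'} comes out as "a b", which is indistinguishable from a title that really contained a space; with 0xFFFD the position of the unencodable character stays visible.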
From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 10:40:59 2020
Date: Tue, 18 Aug 2020 17:40:33 +0300
Message-Id: <83a6ysf25a.fsf@gnu.org>
From: Eli Zaretskii
To: Mattias Engdegård
Cc: 42904@debbugs.gnu.org, alan@idiocy.org
In-Reply-To: (message from Mattias Engdegård on Tue, 18 Aug 2020 16:11:02 +0200)
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS

> From: Mattias Engdegård
> Date: Tue, 18 Aug 2020 16:11:02 +0200
> Cc: Alan Third, 42904@debbugs.gnu.org
>
> On 18 Aug 2020, at 14:24, Eli Zaretskii wrote:
>
> > It isn't wrong (and there's no need to worry about alignment in this
> > case, AFAIK).
>
> Do you mean that SDATA is guaranteed to be aligned, or that no NS platforms that Emacs runs on (or is likely to run on in the near future, such as macOS on arm64) trap on unaligned?

Both, AFAIK.

> it is not reasonable to depend on non-Unicode characters being translated to spaces

We are not the only program which does that, but.

> > Why is it a problem to display a space instead of invalid bytes in
> > this case?
>
> A problem is not necessary for a change to be desirable. The Unicode replacement character clearly indicates that something could not be encoded correctly, and the exact position for it

But that character only makes sense when it can be displayed, because otherwise no one will realize what the problem was.

Anyway, this discussion should be on emacs-devel, not as part of an unrelated bug report.
From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 11:21:20 2020
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
From: Mattias Engdegård
In-Reply-To: <83a6ysf25a.fsf@gnu.org>
Date: Tue, 18 Aug 2020 17:21:13 +0200
Message-Id: <2249E5B8-723E-434B-B926-41B90C30FC01@acm.org>
To: Eli Zaretskii
Cc: 42904@debbugs.gnu.org, alan@idiocy.org

On 18 Aug 2020, at 16:40, Eli Zaretskii wrote:

>>> It isn't wrong (and there's no need to worry about alignment in this
>>> case, AFAIK).
>>
>> Do you mean that SDATA is guaranteed to be aligned, or that no NS platforms that Emacs runs on (or is likely to run on in the near future, such as macOS on arm64) trap on unaligned?
>
> Both, AFAIK.

Thank you. I'm still wary about making such assumptions but I suppose we commit worse sins.

> But that character only makes sense when it can be displayed, because
> otherwise no one will realize what was the problem.

Certainly. Given that TTYs aren't typically used for displaying UTF-16, and that the Unicode-capable terminals I've tried seem to show it just fine (in case someone converts the UTF-16 back to UTF-8), I think we are reasonably safe.

In any case it's no different from not being able to show any other character.

> Anyway, this discussion should be on emacs-devel, not as part of an
> unrelated bug report.

Not unrelated at all, but by all means.

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 18 13:28:39 2020
Date: Tue, 18 Aug 2020 19:28:26 +0200 (CEST)
From: Alan Third
To: Mattias Engdegård
Cc: 42904@debbugs.gnu.org, Eli Zaretskii
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Message-ID: <20200818172824.GA90575@breton.holly.idiocy.org>
In-Reply-To: <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org>

On Tue, Aug 18, 2020 at 01:48:10PM +0200, Mattias Engdegård wrote:
> On 18 Aug 2020, at 10:43, Alan Third wrote:
> > Anyway, as I understand it the internal representation of NS strings
> > are UTF-16, so the conversion through UTF-8 seems a bit of a waste if
> > we can go direct.
>
> Maybe, but the conversion to UTF-16 then has to be done on the Emacs
> side instead, probably less efficiently than in the NS libs. It's
> probably a wash.
>
> Anyway, here is an alternative patch using your method. Tell us what
> you think.

Looks good to me. The only thought I have is that perhaps we should consider extending NSString to handle these lisp strings rather than making it a separate function? We could provide a method to convert to a lisp string as well, although that's not as complex.

I believe using categories would do it without us having to create a new EmacsString class or similar. I don't know if this is worth it because I don't know if we really need these clean conversions elsewhere, but the neatness of

  newStr = [NSString withLispObject:str];

appeals. :)
-- 
Alan Third

From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 20 05:27:10 2020
From: Mattias Engdegård
Content-Type: multipart/mixed; boundary="Apple-Mail=_A5B67B93-D3CC-4BE3-A104-C62A3F50E586"
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Thu, 20 Aug 2020 11:27:01 +0200
In-Reply-To: <20200818172824.GA90575@breton.holly.idiocy.org>
To: Alan Third
Cc: 42904@debbugs.gnu.org, Eli Zaretskii

--Apple-Mail=_A5B67B93-D3CC-4BE3-A104-C62A3F50E586
Content-Type: text/plain; charset=utf-8

On 18 Aug 2020, at 19:28, Alan Third wrote:

> Looks good to me. The only thought I have is that perhaps we should
> consider extending NSString to handle these lisp strings rather than
> making it a separate function? We could provide a method to convert to
> a lisp string as well, although that's not as complex.
>
> I believe using categories would do it without us having to create a
> new EmacsString class or similar.

Fun, I hadn't done that before! Of course we should.

As it happens I just enjoyed the HOPL paper about the history of Objective-C (https://dl.acm.org/doi/10.1145/3386332). An excellent read in general, and it has some history about how the categories came about.

Here is an updated patch: it is now self-contained and does not change anything outside the NS backend.

There is a minor imperfection: the incoming name string can actually be miscoded if it contains both non-ASCII characters and raw bytes.
As an example, consider

  (rename-buffer "aéb\300")

In xdisp.c:12497, the Lisp name string is created using make_string which decides that the above multibyte string should really be unibyte, and that confuses the converter. It is of no great consequence, but it makes the result look messier than it should have: "a��b��c" instead of "aéb�c".

--Apple-Mail=_A5B67B93-D3CC-4BE3-A104-C62A3F50E586
Content-Disposition: attachment; filename=0001-Fix-NS-crash-on-invalid-frame-title-string-bug-42904.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Fix-NS-crash-on-invalid-frame-title-string-bug-42904.patch"

From 18c2f2d8aee5ca6708232477d12dc1d884c75235 Mon Sep 17 00:00:00 2001
From: Mattias Engdegård
Date: Tue, 18 Aug 2020 12:58:12 +0200
Subject: [PATCH] Fix NS crash on invalid frame title string (bug#42904)

Instead of blindly assuming that all Emacs strings are valid UTF-8,
which they are not, use a more careful conversion going via UTF-16
which is what NSString uses internally.  Unpaired surrogates will
still go through to the NSString objects, but the NS libs handle them
gracefully.

* src/nsterm.h (EmacsString): New category.
* src/nsfns.m (all_nonzero_ascii): New helper function.
([NSString stringWithLispString:]): New method.
(ns_set_name_internal): Use new conversion method.
---
 src/nsfns.m  | 65 +++++++++++++++++++++++++++++++++++++++++-----------
 src/nsterm.h |  5 ++++
 2 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/src/nsfns.m b/src/nsfns.m
index 628233ea0d..8357b8c290 100644
--- a/src/nsfns.m
+++ b/src/nsfns.m
@@ -398,29 +398,66 @@ Turn the input menu (an NSMenu) into a lisp list for tracking on lisp side.
         [NSString stringWithUTF8String: SSDATA (arg)]];
 }
 
+/* Whether N bytes at STR are in the [1,127] range.  */
+static bool
+all_nonzero_ascii (unsigned char *str, ptrdiff_t n)
+{
+  for (ptrdiff_t i = 0; i < n; i++)
+    if (str[i] < 1 || str[i] > 127)
+      return false;
+  return true;
+}
+
+@implementation NSString (EmacsString)
+/* Make an NSString from a Lisp string.  */
++ (NSString *)stringWithLispString:(Lisp_Object)string
+{
+  /* Shortcut for the common case.  */
+  if (all_nonzero_ascii (SDATA (string), SBYTES (string)))
+    return [NSString stringWithCString: SSDATA (string)
+                              encoding: NSASCIIStringEncoding];
+  string = string_to_multibyte (string);
+
+  /* Now the string is multibyte; convert to UTF-16.  */
+  unichar *chars = xmalloc (4 * SCHARS (string));
+  unichar *d = chars;
+  const unsigned char *s = SDATA (string);
+  const unsigned char *end = s + SBYTES (string);
+  while (s < end)
+    {
+      int c = string_char_advance (&s);
+      /* We pass unpaired surrogates through, because they are typically
+         handled fairly well by the NS libraries (displayed with distinct
+         glyphs etc).  */
+      if (c <= 0xffff)
+        *d++ = c;
+      else if (c <= 0x10ffff)
+        {
+          *d++ = 0xd800 + ((c - 0x10000) >> 10);
+          *d++ = 0xdc00 + (c & 0x3ff);
+        }
+      else
+        *d++ = 0xfffd;          /* Not valid for UTF-16.  */
+    }
+  NSString *str = [NSString stringWithCharacters: chars
+                                          length: d - chars];
+  xfree (chars);
+  return str;
+}
+@end
+
 static void
 ns_set_name_internal (struct frame *f, Lisp_Object name)
 {
-  Lisp_Object encoded_name, encoded_icon_name;
-  NSString *str;
   NSView *view = FRAME_NS_VIEW (f);
-
-
-  encoded_name = ENCODE_UTF_8 (name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_name)];
-
+  NSString *str = [NSString stringWithLispString: name];
 
   /* Don't change the name if it's already NAME.  */
   if (! [[[view window] title] isEqualToString: str])
     [[view window] setTitle: str];
 
-  if (!STRINGP (f->icon_name))
-    encoded_icon_name = encoded_name;
-  else
-    encoded_icon_name = ENCODE_UTF_8 (f->icon_name);
-
-  str = [NSString stringWithUTF8String: SSDATA (encoded_icon_name)];
+  if (STRINGP (f->icon_name))
+    str = [NSString stringWithLispString: f->icon_name];
 
   if ([[view window] miniwindowTitle]
       && ! [[[view window] miniwindowTitle] isEqualToString: str])
diff --git a/src/nsterm.h b/src/nsterm.h
index a511fef5b9..ab868ed344 100644
--- a/src/nsterm.h
+++ b/src/nsterm.h
@@ -361,6 +361,11 @@ #define NS_DRAW_TO_BUFFER 1
 
 @end
 
+
+@interface NSString (EmacsString)
++ (NSString *)stringWithLispString:(Lisp_Object)string;
+@end
+
 /* ==========================================================================
 
    The Emacs application
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_A5B67B93-D3CC-4BE3-A104-C62A3F50E586--

From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 20 09:24:23 2020
Date: Thu, 20 Aug 2020 16:24:06 +0300
Message-Id: <83h7sxcux5.fsf@gnu.org>
From: Eli Zaretskii
To: Mattias Engdegård
Cc: 42904@debbugs.gnu.org, alan@idiocy.org
In-Reply-To: (message from Mattias Engdegård on Thu, 20 Aug 2020 11:27:01 +0200)
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS

> From: Mattias Engdegård
> Date: Thu, 20 Aug 2020 11:27:01 +0200
> Cc: Eli Zaretskii, 42904@debbugs.gnu.org
>
> There is a minor imperfection: the incoming name string can actually be miscoded if it contains both non-ASCII characters and raw bytes. As an example, consider
>
> (rename-buffer "aéb\300")
>
> In xdisp.c:12497, the Lisp name string is created using make_string which decides that the above multibyte string should really be unibyte, and that confuses the converter. It is of no great consequence, but it makes the result look messier than it should have: "a��b��c" instead of "aéb�c".

What would you like xdisp.c to do instead in this case? If there's an alternative way of dealing with such frame titles that is better in some sense, we could either adopt it for all platforms, or only for NS.
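The supplementary-plane branch of a UTF-16 conversion loop like the one in the patch is easy to get backwards, so a standalone sketch of the standard rule may help when reviewing it: subtract 0x10000, put the top ten bits in the lead surrogate (0xD800 base) and the bottom ten bits in the trail surrogate (0xDC00 base). This is plain C with a hypothetical function name, `utf16_encode`, and is not code from Emacs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode code point C as UTF-16 into OUT, which
   must have room for two units.  Returns the number of units written.
   Values above U+10FFFF become U+FFFD.  Surrogate code points in the
   BMP are passed through unchanged, as the patch chooses to do.  */
static size_t
utf16_encode (uint32_t c, uint16_t *out)
{
  assert (out != NULL);
  if (c <= 0xFFFF)
    {
      out[0] = (uint16_t) c;
      return 1;
    }
  if (c <= 0x10FFFF)
    {
      uint32_t v = c - 0x10000;                    /* 20-bit offset */
      out[0] = (uint16_t) (0xD800 + (v >> 10));    /* lead: top 10 bits */
      out[1] = (uint16_t) (0xDC00 + (v & 0x3FF));  /* trail: low 10 bits */
      return 2;
    }
  out[0] = 0xFFFD;                                 /* not a Unicode value */
  return 1;
}
```

For U+1F600 the offset is 0xF600, giving the pair 0xD83D 0xDE00. Since each input character yields at most two 2-byte units, a buffer of four bytes per character is always enough.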
From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 20 09:24:46 2020 Received: (at 42904) by debbugs.gnu.org; 20 Aug 2020 13:24:47 +0000 Received: from localhost ([127.0.0.1]:41968 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8kYg-0006eP-KS for submit@debbugs.gnu.org; Thu, 20 Aug 2020 09:24:46 -0400 Received: from mailout-l3b-97.contactoffice.com ([212.3.242.97]:60840) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8kYf-0006eC-Bp for 42904@debbugs.gnu.org; Thu, 20 Aug 2020 09:24:46 -0400 Received: from smtpauth1.co-bxl (smtpauth1.co-bxl [10.2.0.15]) by mailout-l3b-97.contactoffice.com (Postfix) with ESMTP id E81043E8; Thu, 20 Aug 2020 15:24:38 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1597929878; s=20200222-6h9o; d=idiocy.org; i=alan@idiocy.org; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:Content-Type:Content-Transfer-Encoding:In-Reply-To; l=1827; bh=4Of7uwhlyw9E/l+42PMjfHjAauWrySY444SlNVfkzXg=; b=AIOQO7nwUXTThxbVUYsdLdKSCVBlOH6OZjThCbcrUrNTadE7SI/Q3b6qntglM41k U/V2OE/sSt4RWXWENCisg4xMK8qZsFywqnIgv+k7YOnnDBgfI7HHrN49aRl4vRoFKBW g7VEjwgthkkcEq40gd87Jt583RRp3WZrk/G8ysrXTIjWMzp0CIruld/G/jGDrmH3DR+ qwsJW7F88NGc60BX37s+Zn3ncRJrKr8SaSRKMOUmAyg1/tjg4orwh8TXaCbgu1qqL8F hi1Bv9DmQnREH6xg0VX/FoQsnZU/4sHonY8vlmEvdN/4HqWIVtwGxDP9LU+fvnT2uT/ rS/PNWY/+g== Received: by smtp.mailfence.com with ESMTPA ; Thu, 20 Aug 2020 15:24:35 +0200 (CEST) Received: by breton.holly.idiocy.org (Postfix, from userid 501) id 22BCD2024AA321; Thu, 20 Aug 2020 14:24:32 +0100 (BST) Date: Thu, 20 Aug 2020 15:24:36 +0200 (CEST) From: Alan Third To: Mattias =?iso-8859-1?Q?Engdeg=E5rd?= Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Message-ID: <20200820132432.GA38852@breton.holly.idiocy.org> Mail-Followup-To: Alan Third , Mattias =?iso-8859-1?Q?Engdeg=E5rd?= , Eli Zaretskii , 42904@debbugs.gnu.org References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> 
<02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Flag: NO X-Spam-Status: No, hits=-2.9 required=4.7 symbols=ALL_TRUSTED, BAYES_00 device=10.2.0.20 X-ContactOffice-Account: com:241649512 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-)

On Thu, Aug 20, 2020 at 11:27:01AM +0200, Mattias Engdegård wrote:
> 18 aug. 2020 kl. 19.28 skrev Alan Third :
>
> > Looks good to me. The only thought I have is that perhaps we should
> > consider extending NSString to handle these lisp strings rather than
> > making it a separate function? We could provide a method to convert to
> > a lisp string as well, although that's not as complex.
> >
> > I believe using categories would do it without us having to create a
> > new EmacsString class or similar.
>
> Fun, I hadn't done that before! Of course we should.
>
> As it happens I just enjoyed the HOPL paper about the history of
> Objective-C (https://dl.acm.org/doi/10.1145/3386332). An excellent
> read in general, and it has some history about how the categories
> came about.

I haven't seen that before and am just reading through it now. Thanks.

> Here is an updated patch: it is now self-contained and does not
> change anything outside the NS backend.

It looks good to me.
The only thing I'd like you to change is to move the implementation down to the "Class implementations" part of nsfns.m. > There is a minor imperfection: the incoming name string can actually > be miscoded if it contains both non-ASCII characters and raw bytes. > As an example, consider > > (rename-buffer "aéb\300") > > In xdisp.c:12497, the Lisp name string is created using make_string > which decides that the above multibyte string should really be > unibyte, and that confuses the converter. It is of no great > consequence, but it makes the result look messier than it should > have: "a��b��c" instead of "aéb�c". I think we can live with that, it's definitely better than a crash and seems reasonable given that the input is junk. Thanks! -- Alan Third From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 20 13:44:22 2020 Received: (at 42904-done) by debbugs.gnu.org; 20 Aug 2020 17:44:22 +0000 Received: from localhost ([127.0.0.1]:43894 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8obu-0004P1-Bm for submit@debbugs.gnu.org; Thu, 20 Aug 2020 13:44:22 -0400 Received: from mail76c50.megamailservers.eu ([91.136.10.86]:33678 helo=mail70c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8obp-0004On-T7 for 42904-done@debbugs.gnu.org; Thu, 20 Aug 2020 13:44:21 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1597945454; bh=EVYmNrDUqge6YJlrrqfsCtH96umbA2xc5m6ihdZNj/I=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=ZnpdoOdLbPtw+7jNbRUYxEGchbrWuo5unT83nikaahqFQiisil0rDG9HOeWqvHQjX akLzrlwYRfshT7tSeRk84dY5VCV9y+ib5ERs5o0abgPpvei1QR9xFEecC2bHnd75Tk O194JfEtwweYjhpCbHtlKHQ8J0VTCOESMPFLqhJA= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail70c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07KHiB0I004263; Thu, 
20 Aug 2020 17:44:13 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <20200820132432.GA38852@breton.holly.idiocy.org> Date: Thu, 20 Aug 2020 19:44:11 +0200 Content-Transfer-Encoding: 7bit Message-Id: References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <20200820132432.GA38852@breton.holly.idiocy.org> To: Alan Third X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F1A.5F3EB66E.0073, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=OKBZIhSB c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=hIj89exaAAAA:8 a=kErvNtY1--JhOoKpMrsA:9 a=CjuIK1q_8ugA:10 a=LZ7w871ZH3oA:10 a=lS9wXHQM5UdnNJ4u63Ry:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 20 aug. 2020 kl. 15.24 skrev Alan Third : > It looks good to me. The only thing I'd like you to change is to move > the implementation down to the "Class implementations" part of > nsfns.m. 
Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 42904-done Cc: 42904-done@debbugs.gnu.org, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) 20 aug. 2020 kl. 15.24 skrev Alan Third : > It looks good to me. The only thing I'd like you to change is to move > the implementation down to the "Class implementations" part of > nsfns.m. Moved, and pushed. Thank you! From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 20 14:46:30 2020 Received: (at 42904) by debbugs.gnu.org; 20 Aug 2020 18:46:30 +0000 Received: from localhost ([127.0.0.1]:44106 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8pa1-0003Kz-Mo for submit@debbugs.gnu.org; Thu, 20 Aug 2020 14:46:30 -0400 Received: from mail1448c50.megamailservers.eu ([91.136.14.48]:41126 helo=mail265c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8pZz-0003ED-GS for 42904@debbugs.gnu.org; Thu, 20 Aug 2020 14:46:28 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1597949180; bh=B2Yuc34gN2AhfrsvknSKcIDUuzOHSJBsTI+PiSxAQKM=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=dZBNjafnvFfgxOPAtWk+vY3UMHASG2IABP4nkCy31apQ5ea5zhu8RldXQT9Tdc2uA MhdD4N4Km3zTTiov6HI+nr/ps0I2d8LT4xM03jFq9/V4pADwMVZtStTQvNJ2Oej3Iy +U8UJqwHbmUfywzYLPhUHHqZEf9BmMHZtUWZhRXc= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] 
(c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail265c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07KIkHnP001847; Thu, 20 Aug 2020 18:46:19 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <83h7sxcux5.fsf@gnu.org> Date: Thu, 20 Aug 2020 20:46:17 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F1D.5F3EC4FC.0049, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=D5w51cZj c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=udPJVzaD4-sKt6oXm6cA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 20 aug. 2020 kl. 15.24 skrev Eli Zaretskii : > What would you like xdisp.c to do instead in this case? 
If there's an > alternative way of dealing with such frame titles that is better in > some sense, we could either adopt it for all platforms, [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

20 aug. 2020 kl. 15.24 skrev Eli Zaretskii :

> What would you like xdisp.c to do instead in this case? If there's an
> alternative way of dealing with such frame titles that is better in
> some sense, we could either adopt it for all platforms, or only for
> NS.

Not sure how to deal with it, but maybe it's just a matter of settling on multibyte representation when building the title (as in mode_line_noprop_buf and so on)? I presume that the current ambiguity comes from when there were good reasons to build these strings in various unibyte encodings, but maybe it isn't motivated today?

If it is at all any trouble at all, just leave it as it is. On the other hand, perhaps we have found a way to simplify the code by accident.
From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 20 15:13:28 2020 Received: (at 42904) by debbugs.gnu.org; 20 Aug 2020 19:13:28 +0000 Received: from localhost ([127.0.0.1]:44155 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8q07-0007EM-PH for submit@debbugs.gnu.org; Thu, 20 Aug 2020 15:13:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:33648) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k8q04-0007E7-I0 for 42904@debbugs.gnu.org; Thu, 20 Aug 2020 15:13:26 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:46487) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k8pzy-0007SE-87; Thu, 20 Aug 2020 15:13:18 -0400 Received: from [176.228.60.248] (port=3175 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1k8pzx-0003Ln-7f; Thu, 20 Aug 2020 15:13:17 -0400 Date: Thu, 20 Aug 2020 22:13:09 +0300 Message-Id: <834koxcere.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Thu, 20 Aug 2020 20:46:17 +0200) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> From: Mattias Engdegård
> Date: Thu, 20 Aug 2020 20:46:17 +0200
> Cc: alan@idiocy.org, 42904@debbugs.gnu.org
>
> 20 aug. 2020 kl. 15.24 skrev Eli Zaretskii :
>
> > What would you like xdisp.c to do instead in this case? If there's an
> > alternative way of dealing with such frame titles that is better in
> > some sense, we could either adopt it for all platforms, or only for
> > NS.
>
> Not sure how to deal with it, but maybe it's just a matter of settling on multibyte representation when building the title (as in mode_line_noprop_buf and so on)?

I don't think I understand. mode_line_noprop_buf gets the bytes, and then we call make_string on it, so the result is the same as the one you'd like to avoid. Or am I missing something?

By "settling on multibyte representation", do you mean that we should convert raw bytes to their multibyte form? Or do you mean something else?

> I presume that the current ambiguity comes from when there were good reasons to build these strings in various unibyte encodings, but maybe it isn't motivated today?

Again, what would you like to have instead? Would calling str_as_multibyte do what you want?

The reason we build a unibyte string is that the presence of raw bytes generally means a unibyte string is desired; it's a heuristic. It is also the simplest thing to do in this case, and always works because it doesn't change the byte sequence of the original string.

> If it is at all any trouble at all, just leave it as it is. On the other hand, perhaps we have found a way to simplify the code by accident.

See above: maybe str_as_multibyte is what you want?
From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 21 05:39:41 2020 Received: (at 42904) by debbugs.gnu.org; 21 Aug 2020 09:39:41 +0000 Received: from localhost ([127.0.0.1]:45045 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k93WO-0004NN-Vi for submit@debbugs.gnu.org; Fri, 21 Aug 2020 05:39:41 -0400 Received: from mail153c50.megamailservers.eu ([91.136.10.163]:39980 helo=mail50c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k93WK-0004NA-Nb for 42904@debbugs.gnu.org; Fri, 21 Aug 2020 05:39:39 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1598002773; bh=gArLgrf9Qs5gMBuXiHdWBGMB9YQ4mr5EVef/EvhppoU=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=ESuYIzDd0EUEYQUvjHZG4tlAwnIz7wIBR0clErRQNBAGB+f/X3BXaN0Gc66nUNHFL J/9uJn/gFLrcap8jBJ77h53D54ZFTArv7zGDQ6r1e45Dpzx+Kt5Uf4VdtBsdphVDQZ 2R+vwhVurAmlRixWBDrrXt5dvjQI/IaUYdG1vKnw= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail50c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07L9dUXJ011063; Fri, 21 Aug 2020 09:39:32 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <834koxcere.fsf@gnu.org> Date: Fri, 21 Aug 2020 11:39:30 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> 
<5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> <834koxcere.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F20.5F3F9655.003B, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=NoevjPVJ c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=n3ZewxPj0Rvn7b0G9hYA:9 a=QEXdDO2ut3YA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 20 aug. 2020 kl. 21.13 skrev Eli Zaretskii : > I don't think I understand. mode_line_noprop_buf gets the bytes, and > then we call make_string on it, so the result is the same as the one > you'd like to avoid. Or am I missing something? > > By " [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) 20 aug. 2020 kl. 21.13 skrev Eli Zaretskii : > I don't think I understand. 
mode_line_noprop_buf gets the bytes, and
> then we call make_string on it, so the result is the same as the one
> you'd like to avoid. Or am I missing something?
>
> By "settling on multibyte representation", do you mean that we should
> convert raw bytes to their multibyte form? Or do you mean something
> else?

No, I think we are talking about the same thing. Basically, it's about how the bytes end up in mode_line_noprop_buf in the first place, since currently the information of whether it should be interpreted as unibyte or multibyte gets lost as soon as data from the strings it is composed of (like the buffer name for %b, file name for %f etc) is added to it. Then make_string tries to restore that information by looking at the bytes, and it is not always accurate.

One way of doing this is to always make sure that the input strings (buffer name, file name, frame-title-format etc) are always in multibyte form. Another would be to convert to multibyte as those strings are used, presumably in decode_mode_spec. You know this code a lot better than I do, but the former may be slightly more workable (and efficient).

> Again, what would you like to have instead? Would calling
> str_as_multibyte do what you want?

No, I don't think so -- once the unibyte/multibyte bit is lost, it can only be restored imperfectly if all we have is the sequence of bytes. In mathematical terms, the function that maps an arbitrary string object to its bytes has no inverse. (Consider the unibyte string "\xc3\xa5" -- should the bytes {c3, a5} be recreated as that unibyte string, or as the multibyte string "å"?)

Again we are talking about trivialities here, but perhaps the same syndrome will arise in other contexts where it matters more. If we wrote Emacs from scratch we likely wouldn't have unibyte strings at all: they are only there for compatibility and various niche uses and performance hacks.
I don't think it's unreasonable to start normalising strings to multibyte where it matters.

Thanks for your patience!

From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 21 09:26:27 2020 Received: (at 42904) by debbugs.gnu.org; 21 Aug 2020 13:26:27 +0000 Received: from localhost ([127.0.0.1]:45425 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k973r-0007Xg-AL for submit@debbugs.gnu.org; Fri, 21 Aug 2020 09:26:27 -0400 Received: from eggs.gnu.org ([209.51.188.92]:34440) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k973o-0007XO-1e for 42904@debbugs.gnu.org; Fri, 21 Aug 2020 09:26:25 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:43288) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k973h-0007rk-KZ; Fri, 21 Aug 2020 09:26:17 -0400 Received: from [176.228.60.248] (port=2495 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1k973g-0004oq-QW; Fri, 21 Aug 2020 09:26:17 -0400 Date: Fri, 21 Aug 2020 16:26:11 +0300 Message-Id: <83ft8gb05o.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Fri, 21 Aug 2020 11:39:30 +0200) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> <834koxcere.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: 
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> From: Mattias Engdegård
> Date: Fri, 21 Aug 2020 11:39:30 +0200
> Cc: alan@idiocy.org, 42904@debbugs.gnu.org
>
> Basically, it's about how the bytes end up in mode_line_noprop_buf in the first place, since currently the information of whether it should be interpreted as unibyte or multibyte gets lost as soon as data from the strings it is composed of (like the buffer name for %b, file name for %f etc) is added to it. Then make_string tries to restore that information by looking at the bytes, and it is not always accurate.

make_string was written to work on byte sequences that don't begin as the payload of a Lisp string. So it doesn't handle the information you say is being lost, because it doesn't expect such information to be available to begin with. Which is basically just another way of saying "you want something other than make_string" here.

> One way of doing this is to always make sure that the input strings (buffer name, file name, frame-title-format etc) are always in multibyte form.

That's what I thought I was suggesting.

> > Again, what would you like to have instead? Would calling
> > str_as_multibyte do what you want?
>
> No, I don't think so -- once the unibyte/multibyte bit is lost, it can only be restored imperfectly if all we have is the sequence of bytes.

That is true, but str_as_multibyte simply interprets any valid UTF-8 sequence as a character, and any invalid sequence as a raw bytes. I thought this was precisely what you wanted for this use case, no?

> If we wrote Emacs from scratch we likely wouldn't have unibyte strings at all: they are only there for compatibility and various niche uses and performance hacks.
I don't think it's unreasonable to start normalising strings to multibyte where it matters. Emacs (as any other old editor) started with only unibyte strings, so that's history for you. Some modern text-handling environments solve this conundrum by not supporting raw bytes at all, but Emacs knows better. From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 21 10:53:44 2020 Received: (at 42904) by debbugs.gnu.org; 21 Aug 2020 14:53:44 +0000 Received: from localhost ([127.0.0.1]:48014 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k98QK-0002au-08 for submit@debbugs.gnu.org; Fri, 21 Aug 2020 10:53:44 -0400 Received: from mail211c50.megamailservers.eu ([91.136.10.221]:40438 helo=mail194c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k98QG-0002ai-Vm for 42904@debbugs.gnu.org; Fri, 21 Aug 2020 10:53:42 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1598021618; bh=YS773kL1EwMm7xxQRucKTKe0UFRN/ruTStAYy6zR9uk=; h=From:Subject:Date:In-Reply-To:Cc:To:References:From; b=VH0pgjkHuz5Tlr8Oz8lKXZ+j/9X7e+ZSKTrNx0nNowKYU4iaWdnWe9sWpGUMFMJdg jtCyg2TVTlw65nrv8XAv6wICIXXfLbJ42ivBxt5PBvKVPJ2Zf+5DQnCjDvYelFqCAr 4VF6gp+tMx8EIPJPqIwwnzitf5xQIO7M2NshQsIM= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail194c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07LErZF4012154; Fri, 21 Aug 2020 14:53:37 +0000 From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= Message-Id: Content-Type: multipart/mixed; boundary="Apple-Mail=_8420675B-ECF9-4B50-A662-D373F4D826CA" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS Date: Fri, 21 Aug 2020 16:53:34 +0200 In-Reply-To: <83ft8gb05o.fsf@gnu.org> To: Eli Zaretskii References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> 
<02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> <834koxcere.fsf@gnu.org> <83ft8gb05o.fsf@gnu.org> X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F1A.5F3FDFF2.0031, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=KsozJleN c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=TWrO6TkA-wukqx8Q5AwA:9 a=QEXdDO2ut3YA:10 a=Ee3TwFF96ZHXJ2WqImsA:9 a=B2y7HmGcmWMA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

--Apple-Mail=_8420675B-ECF9-4B50-A662-D373F4D826CA
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

21 aug. 2020 kl. 15.26 skrev Eli Zaretskii :

> That is true, but str_as_multibyte simply interprets any valid UTF-8
> sequence as a character, and any invalid sequence as a raw bytes. I
> thought this was precisely what you wanted for this use case, no?

Sorry, I read the comment for that function and got the impression that it would interpret raw bytes as Latin-1. Fortunately that wasn't true, and using it seems to be a clear improvement.
Now a mixture of non-ASCII and raw bytes, like "a\377büc" results in the title "a��büc", which is one � too many but good enough.

What about the attached patch then? Only tested on macOS, admittedly.

--Apple-Mail=_8420675B-ECF9-4B50-A662-D373F4D826CA
Content-Disposition: attachment; filename=0001-Always-make-a-multibyte-string-for-the-frame-title-b.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Always-make-a-multibyte-string-for-the-frame-title-b.patch"
Content-Transfer-Encoding: quoted-printable

From dbe9ee59a179a3d42f337c60b5f426f6ff2913ca Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Fri, 21 Aug 2020 16:09:04 +0200
Subject: [PATCH] Always make a multibyte string for the frame title
 (bug#42904)

* src/xdisp.c (gui_consider_frame_title): Multibyte-encode any raw
bytes in the title, and then pass a multibyte string to the back-end
for use as a frame title.  This cuts down a little on the rubbish
shown when raw bytes sneak in by mistake (as part of the buffer name,
for instance).
---
 src/xdisp.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/xdisp.c b/src/xdisp.c
index ad03ac4605..9eeae43a52 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -12482,6 +12482,11 @@ gui_consider_frame_title (Lisp_Object frame)
       display_mode_element (&it, 0, -1, -1, fmt, Qnil, false);
       len = MODE_LINE_NOPROP_LEN (title_start);
       title = mode_line_noprop_buf + title_start;
+      /* Make sure any raw bytes in the title are properly
+         multibyte-encoded.  */
+      ptrdiff_t nchars = 0;
+      len = str_as_multibyte (title, mode_line_noprop_buf_end - title,
+                              len, &nchars);
       unbind_to (count, Qnil);
 
       /* Set the title only if it's changed.  This avoids consing in
@@ -12493,9 +12498,10 @@ gui_consider_frame_title (Lisp_Object frame)
            || SBYTES (f->name) != len
            || memcmp (title, SDATA (f->name), len) != 0)
           && FRAME_TERMINAL (f)->implicit_set_name_hook)
-	FRAME_TERMINAL (f)->implicit_set_name_hook (f,
-                                                    make_string (title, len),
-                                                    Qnil);
+        {
+          Lisp_Object title_string = make_multibyte_string (title, nchars, len);
+          FRAME_TERMINAL (f)->implicit_set_name_hook (f, title_string, Qnil);
+        }
     }
 }
 
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_8420675B-ECF9-4B50-A662-D373F4D826CA--

From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 21 11:28:07 2020 Received: (at 42904) by debbugs.gnu.org; 21 Aug 2020 15:28:07 +0000 Received: from localhost ([127.0.0.1]:48093 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) 
(envelope-from ) id 1k98xb-00080h-6M for submit@debbugs.gnu.org; Fri, 21 Aug 2020 11:28:07 -0400 Received: from eggs.gnu.org ([209.51.188.92]:34188) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k98xX-000805-P1 for 42904@debbugs.gnu.org; Fri, 21 Aug 2020 11:28:05 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:45150) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k98xR-0005v4-Gt; Fri, 21 Aug 2020 11:27:57 -0400 Received: from [176.228.60.248] (port=2518 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1k98xP-0007YP-Sp; Fri, 21 Aug 2020 11:27:56 -0400 Date: Fri, 21 Aug 2020 18:27:51 +0300 Message-Id: <838se8auiw.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Fri, 21 Aug 2020 16:53:34 +0200) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> <834koxcere.fsf@gnu.org> <83ft8gb05o.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> From: Mattias Engdegård
> Date: Fri, 21 Aug 2020 16:53:34 +0200
> Cc: alan@idiocy.org, 42904@debbugs.gnu.org
>
> > That is true, but str_as_multibyte simply interprets any valid UTF-8
> > sequence as a character, and any invalid sequence as a raw bytes.  I
> > thought this was precisely what you wanted for this use case, no?
>
> Sorry, I read the comment for that function and got the impression that it would interpret raw bytes as Latin-1.

That was a remnant from pre-Unicode Emacs; I've fixed the commentary
to accurately describe what happens now.

> Fortunately that wasn't true, and using it seems to be a clear improvement. Now a mixture of non-ASCII and raw bytes, like "a\377büc" results in the title "a��büc", which is one � too many but good enough.
>
> What about the attached patch then? Only tested on macOS, admittedly.

It looks OK, but someone should see what it does on X before we make
this change on all platforms.  (On w32 frames, the display stops
before the first raw byte, but it also does that with the current
code.)  If the effect on X is for the worse, we will have to condition
this by HAVE_NS.

>        title = mode_line_noprop_buf + title_start;
> +      /* Make sure any raw bytes in the title are properly
> +         multibyte-encoded.  */

It is better not to use "encoded" when talking about internal
representation.  I'd say something like "represented by their
multibyte sequences" instead.

Thanks.
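
The behaviour described above (valid UTF-8 sequences kept as characters, every other byte turned into a raw-byte character) can be illustrated with a small standalone C sketch. This is not the Emacs implementation of str_as_multibyte, only an approximation under stated assumptions: the names `utf8_seq_len` and `raw_bytes_to_multibyte` are hypothetical, the sketch copies into a separate buffer instead of converting in place, and its validity check is a simplified UTF-8 test rather than Emacs' own, wider internal format. The two-byte form used for a raw byte (lead byte 0xC0 or 0xC1 followed by one continuation byte) is meant to mirror Emacs' internal representation of raw 8-bit bytes.

```c
#include <stddef.h>

/* Hypothetical helper: return the length of the UTF-8 sequence that
   starts at P (AVAIL bytes are readable), or 0 if the bytes there do
   not form one.  Simplified: checks the lead byte and that the
   following bytes are continuation bytes, nothing more.  */
static size_t
utf8_seq_len (const unsigned char *p, size_t avail)
{
  size_t len;

  if (p[0] < 0x80)
    return 1;                       /* ASCII */
  else if (p[0] >= 0xC2 && p[0] <= 0xDF)
    len = 2;
  else if (p[0] >= 0xE0 && p[0] <= 0xEF)
    len = 3;
  else if (p[0] >= 0xF0 && p[0] <= 0xF4)
    len = 4;
  else
    return 0;                       /* stray continuation or invalid lead */

  if (len > avail)
    return 0;                       /* truncated sequence */
  for (size_t i = 1; i < len; i++)
    if ((p[i] & 0xC0) != 0x80)
      return 0;                     /* missing continuation byte */
  return len;
}

/* Hypothetical sketch: copy SRC (SRCLEN bytes) to DST (capacity DSTCAP),
   keeping valid UTF-8 sequences and replacing every other byte with a
   two-byte raw-byte sequence.  Stores the character count in *NCHARS
   and returns the number of bytes written, or (size_t) -1 if DST is
   too small.  */
size_t
raw_bytes_to_multibyte (const unsigned char *src, size_t srclen,
                        unsigned char *dst, size_t dstcap, size_t *nchars)
{
  size_t in = 0, out = 0, chars = 0;

  while (in < srclen)
    {
      size_t len = utf8_seq_len (src + in, srclen - in);

      if (len > 0)
        {
          /* Valid sequence: copy it through unchanged.  */
          if (dstcap - out < len)
            return (size_t) -1;
          for (size_t i = 0; i < len; i++)
            dst[out++] = src[in++];
        }
      else
        {
          /* Raw byte (always >= 0x80 here): emit the two-byte form.  */
          if (dstcap - out < 2)
            return (size_t) -1;
          unsigned char b = src[in++];
          dst[out++] = 0xC0 | ((b >> 6) & 1);
          dst[out++] = 0x80 | (b & 0x3F);
        }
      chars++;
    }

  *nchars = chars;
  return out;
}
```

Fed the six bytes of the example "a\377büc", this sketch produces seven bytes and five characters, with \377 stored as the pair C1 BF. A consumer that decodes the result as strict UTF-8 would see two invalid bytes at that position, which might account for the doubled � in the reported title, although nothing in this thread confirms that.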
From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 21 11:50:20 2020 Received: (at 42904) by debbugs.gnu.org; 21 Aug 2020 15:50:21 +0000 Received: from localhost ([127.0.0.1]:48123 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k99J6-0002KG-NK for submit@debbugs.gnu.org; Fri, 21 Aug 2020 11:50:20 -0400 Received: from mail1433c50.megamailservers.eu ([91.136.14.33]:38380 helo=mail263c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k99J4-0002Jy-CK for 42904@debbugs.gnu.org; Fri, 21 Aug 2020 11:50:19 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1598025011; bh=G2h5ZlFSFQ4qjhVMcw9NMoMMT6AjuES24IAyPHwUw0Q=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=DvuG4/1lUXc/yyH8pxhlOX52Uc6rtcrVq4NgLiFR8HqWPiuV/hqVcyj6K7gqIDPmf giqoDBSOP9yLTpDHhpLhcC8kAwqXNHAt/6sae+vBb/TUezufZM6O3bh2CGIRNmjrmF jMUc/4e5aMuB62DozwGNDSINa/WzvXLVh5G8tfEo= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail263c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07LFo8ti028620; Fri, 21 Aug 2020 15:50:10 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <838se8auiw.fsf@gnu.org> Date: Fri, 21 Aug 2020 17:50:07 +0200 Content-Transfer-Encoding: 7bit Message-Id: <2FDA288E-0218-4A21-A3FD-BB187E5DFCFE@acm.org> References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> 
<83h7sxcux5.fsf@gnu.org> <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> <834koxcere.fsf@gnu.org> <83ft8gb05o.fsf@gnu.org> <838se8auiw.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F20.5F3FED32.009D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=e6d4tph/ c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=KrI6s-L9-nqN3BeWhtEA:9 a=CjuIK1q_8ugA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 21 aug. 2020 kl. 17.27 skrev Eli Zaretskii : > That was a remnant from pre-Unicode Emacs; I've fixed the commentary > to accurately describe what happens now. Much appreciated. Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) 21 aug. 2020 kl. 
17.27 skrev Eli Zaretskii : > That was a remnant from pre-Unicode Emacs; I've fixed the commentary > to accurately describe what happens now. Much appreciated. > It looks OK, but someone should see what it does on X before we make > this change on all platforms. (On w32 frames, the display stops > before the first raw byte, but it also does that with the current > code.) If the effect on X is for the worse, we will have to condition > this by HAVE_NS. I will be able to test on X in a few days. > It is better not to use "encoded" when talking about internal > representation. I'd say something like "represented by their > multibyte sequences" instead. Thank you, the comment will be amended. From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 23 13:23:41 2020 Received: (at 42904) by debbugs.gnu.org; 23 Aug 2020 17:23:41 +0000 Received: from localhost ([127.0.0.1]:55049 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k9tiX-0008Pv-Ab for submit@debbugs.gnu.org; Sun, 23 Aug 2020 13:23:41 -0400 Received: from mail205c50.megamailservers.eu ([91.136.10.215]:48102 helo=mail193c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k9tiV-0008Pm-Ej for 42904@debbugs.gnu.org; Sun, 23 Aug 2020 13:23:40 -0400 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1598203417; bh=2/yNvJYsrNmNj5Yz5MzbIKbwGWJkWjpi6WyLPKufwnY=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=VkwPCbE+eP3Qdt1P9/4A6oT1N9daFJRMuLIPDrqOmIU3+rvN7looN8+xgQiHJGgVE aeJzI5OiNZIftVHCaB66Z3c9k94ki1BOhgOGznGkY+Wy3eY8wAmhjyJWc6fwoh8qrf Bn4lg4HiV3SD/j6TY+DF3cr0bK3EbAee6I5SHgUg= Feedback-ID: mattiase@acm.or Received: from stanniol.lan (c-304ee655.032-75-73746f71.bbcust.telenor.se [85.230.78.48]) (authenticated bits=0) by mail193c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 07NHNYEu022733; Sun, 23 Aug 2020 17:23:35 +0000 Content-Type: text/plain; 
charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.15\)) Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <2FDA288E-0218-4A21-A3FD-BB187E5DFCFE@acm.org> Date: Sun, 23 Aug 2020 19:23:33 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <83lfidgtc7.fsf@gnu.org> <838sedgq2x.fsf@gnu.org> <02F52D43-7EAB-4E61-A567-E8CCD11D856B@acm.org> <20200817195610.GA70682@breton.holly.idiocy.org> <3F71EF82-A143-4E3A-AEF3-8A236091891D@acm.org> <20200818084306.GA89999@breton.holly.idiocy.org> <243A5DA8-2865-485D-A8A2-1F543B046BAA@acm.org> <20200818172824.GA90575@breton.holly.idiocy.org> <83h7sxcux5.fsf@gnu.org> <5719A3A9-06A2-42AF-A290-726D96B6E6F1@acm.org> <834koxcere.fsf@gnu.org> <83ft8gb05o.fsf@gnu.org> <838se8auiw.fsf@gnu.org> <2FDA288E-0218-4A21-A3FD-BB187E5DFCFE@acm.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.15) X-CTCH-RefID: str=0001.0A782F1F.5F42A619.0026, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=cM2eTWWN c=1 sm=1 tr=0 a=63Z2wlQ1NB3xHpgKFKE71g==:117 a=63Z2wlQ1NB3xHpgKFKE71g==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=N54-gffFAAAA:8 a=CEq4Y_VM0_W-os3u44IA:9 a=QEXdDO2ut3YA:10 a=6l0D2HzqY3Epnrm8mE3f:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 42904 Cc: 42904@debbugs.gnu.org, alan@idiocy.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

21 aug. 2020 kl. 17.50 skrev Mattias Engdegård :

>> It looks OK, but someone should see what it does on X before we make
>> this change on all platforms.  (On w32 frames, the display stops
>> before the first raw byte, but it also does that with the current
>> code.)  If the effect on X is for the worse, we will have to condition
>> this by HAVE_NS.
>
> I will be able to test on X in a few days.

Now tested on X with no regression observed, thus pushed to master. Thank you!

From unknown Fri Aug 08 19:51:30 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Mon, 21 Sep 2020 11:24:03 +0000 User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator