GNU bug report logs - #42904
[PATCH] Non-Unicode frame title crashes Emacs on macOS

Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Mon, 17 Aug 2020 14:13:02 UTC

Severity: normal

Tags: patch

Merged with 41184

Found in version 28.0.50

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.


Message #92 received at 42904 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 42904 <at> debbugs.gnu.org, alan <at> idiocy.org
Subject: Re: bug#42904: [PATCH] Non-Unicode frame title crashes Emacs on macOS
Date: Fri, 21 Aug 2020 18:27:51 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Fri, 21 Aug 2020 16:53:34 +0200
> Cc: alan <at> idiocy.org, 42904 <at> debbugs.gnu.org
> 
> > That is true, but str_as_multibyte simply interprets any valid UTF-8
> > sequence as a character, and any invalid sequence as raw bytes.  I
> > thought this was precisely what you wanted for this use case, no?
> 
> Sorry, I read the comment for that function and got the impression that it would interpret raw bytes as Latin-1.

That was a remnant from pre-Unicode Emacs; I've fixed the commentary
to accurately describe what happens now.
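
As a rough illustration of that rule (a standalone sketch, not code from
the Emacs sources): the helper names utf8_seq_len and classify_bytes below
are hypothetical, and the sketch assumes strict standard UTF-8, whereas
Emacs's internal representation is a superset of it.  It walks a unibyte
string and counts how many byte runs would be taken as characters and how
many bytes would be left as raw bytes.

```c
#include <stddef.h>

/* Hypothetical helper: return the length (1-4) of the valid UTF-8
   sequence starting at S, or 0 if the bytes at S do not start one.
   AVAIL is the number of bytes remaining.  Assumes strict UTF-8:
   no overlong forms, no surrogates, nothing above U+10FFFF.  */
size_t
utf8_seq_len (const unsigned char *s, size_t avail)
{
  size_t need, i;
  unsigned long cp, min;
  unsigned char c;

  if (avail == 0)
    return 0;
  c = s[0];
  if (c < 0x80)
    return 1;                   /* ASCII */
  else if (c >= 0xC2 && c <= 0xDF)
    { need = 2; cp = c & 0x1F; min = 0x80; }
  else if (c >= 0xE0 && c <= 0xEF)
    { need = 3; cp = c & 0x0F; min = 0x800; }
  else if (c >= 0xF0 && c <= 0xF4)
    { need = 4; cp = c & 0x07; min = 0x10000; }
  else
    return 0;                   /* stray continuation or invalid lead byte */

  if (avail < need)
    return 0;                   /* truncated sequence */
  for (i = 1; i < need; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      cp = (cp << 6) | (s[i] & 0x3F);
    }
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;                   /* overlong, out of range, or surrogate */
  return need;
}

/* Hypothetical helper: count valid UTF-8 sequences (taken as characters)
   and bytes outside any valid sequence (taken as raw bytes).  */
void
classify_bytes (const unsigned char *s, size_t len,
                size_t *n_chars, size_t *n_raw)
{
  size_t i = 0;

  *n_chars = 0;
  *n_raw = 0;
  while (i < len)
    {
      size_t n = utf8_seq_len (s + i, len - i);
      if (n > 0)
        {
          ++*n_chars;           /* whole sequence is one character */
          i += n;
        }
      else
        {
          ++*n_raw;             /* this byte stands alone as a raw byte */
          i++;
        }
    }
}
```

Under these assumptions, a six-byte string made of "a", the byte 0xFF,
"b", the two-byte UTF-8 sequence for "ü", and "c" is classified as four
characters and one raw byte.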

> Fortunately that wasn't true, and using it seems to be a clear improvement. Now a mixture of non-ASCII and raw bytes, like "a\377büc" results in the title "a��büc", which is one � too many but good enough.
> 
> What about the attached patch then? Only tested on macOS, admittedly.

It looks OK, but someone should see what it does on X before we make
this change on all platforms.  (On w32 frames, the display stops
before the first raw byte, but it also does that with the current
code.)  If the effect on X is for the worse, we will have to condition
this by HAVE_NS.

>        title = mode_line_noprop_buf + title_start;
> +      /* Make sure any raw bytes in the title are properly
> +         multibyte-encoded.  */

It is better not to use "encoded" when talking about internal
representation.  I'd say something like "represented by their
multibyte sequences" instead.

Thanks.




This bug report was last modified 4 years and 269 days ago.
