GNU bug report logs - #24206
25.1; Curly quotes generate invalid strings, leading to a segfault
Reported by: Phil <p.stephani2 <at> gmail.com>
Date: Thu, 11 Aug 2016 18:57:02 UTC
Severity: normal
Found in version 25.1
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
> Cc: p.stephani2 <at> gmail.com, 24206 <at> debbugs.gnu.org, johnw <at> gnu.org,
> nicolas <at> petton.fr
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 14 Aug 2016 09:51:43 -0500
>
> On 08/14/2016 09:27 AM, Eli Zaretskii wrote:
> > The "length = 1" part is only correct if the argument string is
> > multibyte, and should otherwise count the number of bytes in
> > uLSQM/uRSQ, right?
>
> This string is by definition multibyte at that point, since that part of
> the code is inserting a Unicode character that is not ASCII.
Sorry, I don't understand what you are saying. The sequence of bytes
"\xE2\x80\x98" can be either a sequence of three unibyte bytes or a single
multibyte character, depending on whether the string it is in is unibyte
or multibyte.
More generally, a Lisp string with the same sequence of bytes as its
data can be treated either as unibyte or as multibyte; I'm sure you
know that. Its multibyteness is entirely in Emacs's imagination.
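To make the point above concrete, here is a minimal sketch of how the same three bytes yield a length of 3 or 1 depending on how they are interpreted. It uses Python's bytes and str types only as stand-ins for unibyte and multibyte strings; it is an illustration under that assumption, not Emacs code, and the variable names are made up for the example.

```python
# Illustration only: Python bytes/str stand in for unibyte/multibyte strings.
# This is not Emacs's internal representation or API.

# The UTF-8 encoding of U+2018 LEFT SINGLE QUOTATION MARK.
raw = b"\xE2\x80\x98"

# Unibyte-style view: the data is three independent bytes.
unibyte_length = len(raw)

# Multibyte-style view: the very same bytes decode to one character.
text = raw.decode("utf-8")
multibyte_length = len(text)

print(unibyte_length, multibyte_length, hex(ord(text)))  # prints: 3 1 0x2018
```

The byte data is identical in both views; only the interpretation changes, which is why a hard-coded "length = 1" is right for one view and wrong for the other.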
> More generally, Fsubstitute_command_keys is quite confused about unibyte
> versus multibyte issues. It merges together a number of strings, and
> assumes that they are all multibyte iff the original string is
> multibyte, which is obviously not true in general.
Could you please point out the specific places where this is done?
Because I'm not sure I agree with your interpretation. (Let's use the
code on emacs-25, where it has not yet been changed, for the purposes of
this discussion.)
Thanks.
This bug report was last modified 8 years and 339 days ago.