#28179 - Fix use of string-to-multibyte in ispell.el

GNU bug report logs - #28179
Fix use of string-to-multibyte in ispell.el

Package: emacs

Reported by: Reuben Thomas <rrt <at> sc3d.org>

Date: Tue, 22 Aug 2017 00:53:01 UTC

Severity: minor

Done: Reuben Thomas <rrt <at> sc3d.org>

Bug is archived. No further changes may be made.

Message #29 received at 28179 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Reuben Thomas <rrt <at> sc3d.org>
Cc: 28179 <at> debbugs.gnu.org
Subject: Re: Fwd: Re: bug#28179: Fix use of string-to-multibyte in ispell.el
Date: Thu, 24 Aug 2017 21:20:46 +0300

> Cc: 28179 <at> debbugs.gnu.org
> From: Reuben Thomas <rrt <at> sc3d.org>
> Date: Thu, 24 Aug 2017 18:45:33 +0100
>
> The reason I am asking again is because you first said:
>
> > What if decode-coding-string returns a pure ASCII string, which is
> > therefore unibyte?
>
> and then later you said:
>
> > The way I meant it, it has to do with the internal flag marking a
> > string either unibyte or multibyte. Observe:
> >   (multibyte-string-p "abcd") => nil
> >
> > but
> >
> >   (multibyte-string-p (decode-coding-string "abcd" 'utf-8)) => t

That example may be conclusive for UTF-8, but is it conclusive for
_any_ encoding? I don't know. E.g., what about the ISO-2022 based
encodings, where all the bytes are (AFAIR) pure ASCII?

> 1. As far as I can tell from the above (and my own confirmatory
> experiments and reading of the documentation), a pure ASCII string can
> be multibyte (it's a matter of the multibyte flag, not the number of
> bytes used to store each character).
>
> 2. decode-coding-string always returns a multibyte string.

Can you show me why 2 is always correct? It might be, I simply don't
know. All I know is that in general relying on plain-ASCII strings to
be always multibyte in any given situation is risky, we were bitten by
that a few times. But maybe it's not an issue in this case. Which is
why I was asking you whether you have sufficient basis to believe this
to be so in this case.

> Since these two observations seemed to mean that you contradicted
> yourself, I was checking whether in fact I had misunderstood (so that
> for example one of my two observations above is wrong), or if your
> original understanding was incomplete (so that in fact your question
> about decode-coding-string is therefore misguided, because it can return
> a pure ASCII unibyte string (in the coding sense) which is nonetheless a
> multibyte string (in the sense that multibyte-string-p on it returns t).

I only used decode-coding-string because I remembered it as an easy
way of creating a multibyte ASCII string, when the coding-system is
UTF-8, that's all. There was no contradiction in what I said, at
least not an intended one.
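
[Editorial aside, not part of the original message.] The open question above, whether decoding a pure-ASCII string yields a multibyte string under coding systems other than UTF-8, could be probed empirically with a sketch like the following. The list of coding systems is an arbitrary illustration chosen here, not one taken from the thread, and results may depend on the Emacs version, so no expected output is shown. Such a probe only samples a few cases and would not by itself prove the general claim.

```elisp
;; Sketch: report the multibyte flag of a pure-ASCII string after
;; decoding it under several coding systems.  The coding systems
;; listed are illustrative assumptions, not an exhaustive set.
(dolist (cs '(utf-8 latin-1 iso-2022-jp undecided))
  (message "%-12s => %S"
           cs
           (multibyte-string-p (decode-coding-string "abcd" cs))))
```

Evaluating this in a scratch buffer prints one line per coding system in the *Messages* buffer, with `t` where the decoded string carries the multibyte flag and `nil` where it does not.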

This bug report was last modified 7 years and 363 days ago.

