GNU bug report logs - #28179
Fix use of string-to-multibyte in ispell.el


Package: emacs

Reported by: Reuben Thomas <rrt <at> sc3d.org>

Date: Tue, 22 Aug 2017 00:53:01 UTC

Severity: minor

Done: Reuben Thomas <rrt <at> sc3d.org>

Bug is archived. No further changes may be made.



Message #29 received at 28179 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Reuben Thomas <rrt <at> sc3d.org>
Cc: 28179 <at> debbugs.gnu.org
Subject: Re: Fwd: Re: bug#28179: Fix use of string-to-multibyte in ispell.el
Date: Thu, 24 Aug 2017 21:20:46 +0300
> Cc: 28179 <at> debbugs.gnu.org
> From: Reuben Thomas <rrt <at> sc3d.org>
> Date: Thu, 24 Aug 2017 18:45:33 +0100
> 
> The reason I am asking again is because you first said:
> 
> > What if decode-coding-string returns a pure ASCII string, which is
> > therefore unibyte?
> 
> and then later you said:
> 
> > The way I meant it, it has to do with the internal flag marking a
> > string either unibyte or multibyte. Observe:
> >   (multibyte-string-p "abcd") => nil
> >
> > but
> >
> >   (multibyte-string-p (decode-coding-string "abcd" 'utf-8)) => t

That example may be conclusive for UTF-8, but is it conclusive for
_any_ encoding?  I don't know.  E.g., what about the ISO-2022 based
encodings, where all the bytes are (AFAIR) pure ASCII?
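One way to explore the question above is to probe several coding systems directly. The following is only a sketch: the helper name `my-probe-decode-multibyte` and the sample list of coding systems are hypothetical choices, not taken from the thread, and a result on one Emacs version proves nothing about other versions or about coding systems not in the list.

```elisp
;; Sketch: for each coding system in a sample list, check whether
;; decoding a pure-ASCII string literal yields a multibyte string.
;; `my-probe-decode-multibyte' is a hypothetical helper name.
(defun my-probe-decode-multibyte (coding-systems)
  "Return an alist of (CODING-SYSTEM . MULTIBYTE-P) for decoding \"abcd\".
Entries that are not valid coding systems are skipped."
  (delq nil
        (mapcar (lambda (cs)
                  (when (coding-system-p cs)
                    (cons cs
                          (multibyte-string-p
                           (decode-coding-string "abcd" cs)))))
                coding-systems)))

;; Example call; the list is an arbitrary sample, not an exhaustive test.
(my-probe-decode-multibyte
 '(utf-8 iso-2022-jp iso-8859-1 raw-text no-conversion))
```

Evaluating the last form returns one pair per coding system, which makes it easy to spot any entry whose flag is nil.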

> 1. As far as I can tell from the above (and my own confirmatory
> experiments and reading of the documentation), a pure ASCII string can
> be multibyte (it's a matter of the multibyte flag, not the number of
> bytes used to store each character).
> 
> 2. decode-coding-string always returns a multibyte string.

Can you show me why 2 is always correct?  It might be, I simply don't
know.  All I know is that in general relying on plain-ASCII strings to
be always multibyte in any given situation is risky; we were bitten by
that a few times.  But maybe it's not an issue in this case.  Which is
why I was asking you whether you have sufficient basis to believe this
to be so in this case.
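If the guarantee cannot be established, a defensive alternative is to normalize the result explicitly instead of relying on it. The sketch below assumes only the documented behaviour of `string-to-multibyte` (it returns a multibyte string with the same characters, and returns its argument unchanged if that is already multibyte). The helper name `my-decode-to-multibyte` is hypothetical and is not the code in ispell.el.

```elisp
;; Sketch: decode, then force the multibyte flag regardless of what
;; `decode-coding-string' returned for this coding system.
;; `my-decode-to-multibyte' is a hypothetical helper name.
(defun my-decode-to-multibyte (string coding-system)
  "Decode STRING using CODING-SYSTEM and return a multibyte string."
  (string-to-multibyte
   (decode-coding-string string coding-system)))

;; Example call with a pure-ASCII literal.
(multibyte-string-p (my-decode-to-multibyte "abcd" 'utf-8))
```

The extra call costs little when the decoded string is already multibyte, since the string is then returned as is.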

> Since these two observations seemed to mean that you contradicted
> yourself, I was checking whether in fact I had misunderstood (so that
> for example one of my two observations above is wrong), or if your
> original understanding was incomplete (so that in fact your question
> about decode-coding-string is therefore misguided, because it can return
> a pure ASCII unibyte string (in the coding sense) which is nonetheless a
> multibyte string (in the sense that multibyte-string-p on it returns t)).

I only used decode-coding-string because I remembered it as an easy
way of creating a multibyte ASCII string, when the coding-system is
UTF-8, that's all.  There was no contradiction in what I said, at
least not an intended one.




This bug report was last modified 7 years and 269 days ago.


