GNU bug report logs - #28179
Fix use of string-to-multibyte in ispell.el


Package: emacs

Reported by: Reuben Thomas <rrt <at> sc3d.org>

Date: Tue, 22 Aug 2017 00:53:01 UTC

Severity: minor

Done: Reuben Thomas <rrt <at> sc3d.org>

Bug is archived. No further changes may be made.


From: Reuben Thomas <rrt <at> sc3d.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 28179 <at> debbugs.gnu.org
Subject: bug#28179: Fwd: Re: bug#28179: Fix use of string-to-multibyte in ispell.el
Date: Thu, 24 Aug 2017 19:50:17 +0100

On 24 August 2017 at 19:20, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> Cc: 28179 <at> debbugs.gnu.org
>> From: Reuben Thomas <rrt <at> sc3d.org>
>> Date: Thu, 24 Aug 2017 18:45:33 +0100
>>
>> The reason I am asking again is because you first said:
>>
>> > What if decode-coding-string returns a pure ASCII string, which is
>> > therefore unibyte?
>>
>> and then later you said:
>>
>> > The way I meant it, it has to do with the internal flag marking a
>> > string either unibyte or multibyte. Observe:
>> >   (multibyte-string-p "abcd") => nil
>> >
>> > but
>> >
>> >   (multibyte-string-p (decode-coding-string "abcd" 'utf-8)) => t
>
> That example may be conclusive for UTF-8, but is it conclusive for
> _any_ encoding?  I don't know.  E.g., what about the ISO-2022 based
> encodings, where all the bytes are (AFAIR) pure ASCII?

(multibyte-string-p (decode-coding-string "abcd" 'iso-2022-jp)) => t

I still don't understand what you're getting at: the bytes in "abcd"
are pure ASCII, whatever coding system one is decoding from.
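
The same check can be repeated for several coding systems at once. A
minimal sketch, assuming a hypothetical helper named
`my-multibyte-after-decode' (it is not part of Emacs or ispell.el); the
list of coding systems is only an example, and the answers may depend
on the Emacs version:

```elisp
;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-multibyte-after-decode (bytes coding-systems)
  "Decode BYTES with each of CODING-SYSTEMS and report the string type.
Return an alist of (CODING-SYSTEM . MULTIBYTE-P), where MULTIBYTE-P is
the value of `multibyte-string-p' for the decoded string."
  (mapcar (lambda (cs)
            (cons cs
                  (multibyte-string-p (decode-coding-string bytes cs))))
          coding-systems))

;; Evaluate this to see which decodings yield a multibyte string.
(my-multibyte-after-decode "abcd" '(utf-8 iso-2022-jp latin-1))
```

This only probes the coding systems it is given, so it cannot show that
the result holds for every coding system.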

> Can you show me why 2 is always correct?  It might be, I simply don't
> know.  All I know is that in general relying on plain-ASCII strings to
> be always multibyte in any given situation is risky, we were bitten by
> that a few times.  But maybe it's not an issue in this case.  Which is
> why I was asking you whether you have sufficient basis to believe this
> to be so in this case.

I don't know.

As I said before, the make-obsolete notice for string-to-multibyte
says "use `decode-coding-string'". If it is as tricky as you suggest
it might be, then the notice should be updated to point to more
detailed guidance.

The relevant commit is:

commit f74d496478cd57f252817bd7437fe1b7972ce01f
Author: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date:   Mon Jan 30 13:02:18 2017 -0500

    * lisp/subr.el (string-make-unibyte, string-make-multibyte): Obsolete.

diff --git a/lisp/subr.el b/lisp/subr.el
index a6ba05c..a204577 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -1417,8 +1417,10 @@ posn-object-width-height
 ;; bug#23850
 (make-obsolete 'string-to-unibyte   "use `encode-coding-string'." "26.1")
 (make-obsolete 'string-as-unibyte   "use `encode-coding-string'." "26.1")
+(make-obsolete 'string-make-unibyte   "use `encode-coding-string'." "26.1")
 (make-obsolete 'string-to-multibyte "use `decode-coding-string'." "26.1")
 (make-obsolete 'string-as-multibyte "use `decode-coding-string'." "26.1")
+(make-obsolete 'string-make-multibyte "use `decode-coding-string'." "26.1")

I'm going to close this bug; if better documentation is needed, both
for the obsolescence of string-to-multibyte and for multibyte strings
in general, that's a new bug.

-- 
https://rrt.sc3d.org




This bug report was last modified 7 years and 268 days ago.


