GNU bug report logs - #1174
23.0.60; Some UTF-8 mails displaying wrongly in Emacs 23

Package: emacs;

Reported by: usenet <at> frank-schmitt.net

Date: Wed, 15 Oct 2008 20:30:02 UTC

Severity: normal

Done: Lars Magne Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


Message #290 received at 1174 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Simon Josefsson <jas <at> extundo.com>
Cc: Frank Schmitt <ich <at> frank-schmitt.net>, ding <at> gnus.org,
        1174 <at> debbugs.gnu.org
Subject: Re: bug#1174: 23.0.60; Some UTF-8 mails displaying wrongly in Emacs 23
Date: Tue, 02 Dec 2008 02:36:31 -0500
> In Emacs 21 (which Gnus still aims to be compatible with), we have
> string-as-multibyte, but not string-to-multibyte.  So your proposed
> code (i.e. mm-string-to-multibyte) runs
>   (string-as-multibyte (char-to-string string))
> whereas we used to run
>   (string-as-multibyte string)
> Does char-to-string matter here?

> (defalias 'mm-string-to-multibyte
>   (cond
>    ((featurep 'xemacs)
>     'identity)
>    ((fboundp 'string-to-multibyte)
>     'string-to-multibyte)
>    (t
>     (lambda (string)
>       "Return a multibyte string with the same individual chars as string."
>       (mapconcat
>        (lambda (ch) (mm-string-as-multibyte (char-to-string ch)))
>        string "")))))

Oh, that's clever: yes, the mapconcat/char-to-string dance does make it
implement the string-to-multibyte behavior because doing the
string-as-multibyte conversion one byte at a time avoids the
problematic case.  To quote myself from mm-util.el:

     ;; string-as-multibyte often doesn't really do what you think it does.
     ;; Example:
     ;;    (aref (string-as-multibyte "\201") 0) -> 129 (aka ?\201)
     ;;    (aref (string-as-multibyte "\300") 0) -> 192 (aka ?\300)
     ;;    (aref (string-as-multibyte "\300\201") 0) -> 192 (aka ?\300)
     ;;    (aref (string-as-multibyte "\300\201") 1) -> 129 (aka ?\201)
     ;; but
     ;;    (aref (string-as-multibyte "\201\300") 0) -> 2240
     ;;    (aref (string-as-multibyte "\201\300") 1) -> <error>

Basically, when the string passed consists of a single byte,
string-as-multibyte is equivalent to string-to-multibyte, which is the
property used by the code you quoted above to build a poor man's
string-to-multibyte.
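
As a minimal sketch of the byte-at-a-time idea described above (not the
mm-util.el code itself), the function below converts a unibyte string by
running string-as-multibyte on one-byte strings only and joining the
results.  The name my-string-to-multibyte-by-bytes is made up for
illustration, and the sketch assumes an Emacs that provides
unibyte-string (Emacs 23 or later, as far as I know), so it is not a
replacement for the Emacs 21 fallback quoted above.

```elisp
;; Hypothetical helper, for illustration only; not part of mm-util.el.
;; Assumes `unibyte-string' is available.
(defun my-string-to-multibyte-by-bytes (string)
  "Return a multibyte copy of unibyte STRING, converted one byte at a time."
  (mapconcat
   (lambda (byte)
     ;; A one-byte string cannot hold a multi-byte sequence, so
     ;; `string-as-multibyte' has nothing to merge here.
     (string-as-multibyte (unibyte-string byte)))
   string ""))
```

On such an Emacs, (equal (my-string-to-multibyte-by-bytes "\201\300")
(string-to-multibyte "\201\300")) should hold.  Recent Emacs versions
mark string-as-multibyte as obsolete, so expect a byte-compiler warning.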


        Stefan



