Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date: Mon, 15 Nov 2010 21:43:01 UTC
Severity: normal
Tags: fixed
Found in version 24.0.50
Fixed in version 24.1
Done: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: 7410 <at> debbugs.gnu.org
Subject: bug#7410: Impossible multibyte->unibyte conversion
Date: Mon, 15 Nov 2010 16:46:53 -0500
Package: Emacs
Version: 24.0.50

I get incorrect treatment of accents in gnus-article-wash-html in the
trunk. More specifically, accents from latin-1 HTML email get turned
into \NNN byte chars.

With extra checks, I get that the accented chars are properly decoded
into the *mm*<4> buffer, and then in mm-shr, we do

    (mm-with-part handle
      (when (and charset
                 (setq charset (mm-charset-to-coding-system charset))
                 (not (eq charset 'ascii)))
        (insert (prog1 (mm-decode-coding-string (buffer-string) charset)
                  (erase-buffer)
                  (mm-enable-multibyte))))
      (libxml-parse-html-region (point-min) (point-max)))

where mm-with-part inserts the `handle' part into a unibyte temp buffer,
thus turning those latin-1 accents back into bytes (well, in my own
branch of Emacs this signals an error instead, which is how I caught it).

It looks like mm-handle-buffer does not consistently return bytes (as it
usually does) but also occasionally returns chars. Such inconsistencies
will hurt until we get rid of them.

        Stefan

In GNU Emacs 24.0.50.1 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2010-11-04 on ceviche
Windowing system distributor `The X.Org Foundation', version 11.0.10707000
configured using `configure 'CFLAGS=-Wall -Wno-pointer-sign -DUSE_LISP_UNION_TYPE -DSYNC_INPUT -DENABLE_CHECKING -DXASSERTS -DFONTSET_DEBUG -g -O1 -I/usr/include/GNUstep' '--enable-maintainer-mode' '--with-x-toolkit=lucid''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: fr_CH.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Article

Minor modes in effect: diff-auto-refine-mode: t electric-pair-mode: t electric-indent-mode: t url-handler-mode: t global-reveal-mode: t reveal-mode: t auto-insert-mode: t savehist-mode: t minibuffer-electric-default-mode: t mouse-wheel-mode: t menu-bar-mode: t
file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t line-number-mode: t transient-mark-mode: t Recent input: <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <switch-frame> <switch-frame> <select-window> <switch-frame> e ( p o p t - o - b u f f e r <backspace> <backspace> <backspace> <backspace> <backspace> <backspace> <backspace> <backspace> <backspace> <backspace> - t o - b u f f e r SPC " SPC * m m * < 4 > > C-e <left> <left> <backspace> <return> M-< <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <help-echo> <switch-frame> <select-window> <switch-frame> <help-echo> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <help-echo> <switch-frame> <select-window> <down-mouse-1> <mouse-1> <C-tab> C-s C-w C-w C-a <switch-frame> <help-echo> <down-mouse-2> <mouse-2> <switch-frame> <select-window> <switch-frame> <select-window> C-e C-c @ C-a <right> <down> <left> <right> <down> <left> <right> <down> <left> <right> <down> <left> <right> <up> <left> <right> <up> <left> <right> <down> <left> <right> <down> <down> <left> <right> <down> <left> <left> <left> <left> <right> <right> <right> <right> <left> <right> 
<up> <left> <right> <up> <left> <right> <down> <left> <right> <down> <left> <right> <down> <left> <right> <down> <left> <right> <down> <left> <right> <up> <left> <right> <up> <left> <right> <up> <left> <right> <up> <left> <right> <up> <left> <right> <switch-frame> <select-window> <switch-frame> <switch-frame> <help-echo> <switch-frame> <switch-frame> <switch-frame> <switch-frame> <help-echo> <switch-frame> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <help-echo> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <help-echo> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <help-echo> <switch-frame> <select-window> <switch-frame> <switch-frame> <select-window> <switch-frame> <select-window> <help-echo> <switch-frame> <select-window> <select-window> M-x r e p o <tab> r <tab> <return> Recent messages: Mark saved where search started mm-shr Mark saved where search started [3 times] Mark set mm-shr Entering debugger... #<buffer *mm*<4>> Mark set Mark saved where search started Making completion list... 
Load-path shadows:
/usr/share/emacs23/site-lisp/bbdb/bbdb-migrate hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-migrate
/usr/share/emacs23/site-lisp/bbdb/bbdb hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb
/usr/share/emacs23/site-lisp/bbdb/bbdb-rmail hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-rmail
/usr/share/emacs23/site-lisp/bbdb/bbdb-gnus hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-gnus
/usr/share/emacs23/site-lisp/bbdb/bbdb-w3 hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-w3
/usr/share/emacs23/site-lisp/bbdb/bbdb-com hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-com
/usr/share/emacs23/site-lisp/bbdb/bbdb-merge hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-merge
/usr/share/emacs23/site-lisp/bbdb/bbdb-ftp hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-ftp
/usr/share/emacs23/site-lisp/bbdb/bbdb-sc hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-sc
/usr/share/emacs23/site-lisp/bbdb/bbdb-vm hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-vm
/usr/share/emacs23/site-lisp/bbdb/bbdb-gui hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-gui
/usr/share/emacs23/site-lisp/bbdb/bbdb-print hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-print
/usr/share/emacs23/site-lisp/bbdb/bbdb-hooks hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-hooks
/usr/share/emacs23/site-lisp/bbdb/bbdb-mhe hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-mhe
/usr/share/emacs23/site-lisp/bbdb/bbdb-whois hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-whois
/usr/share/emacs23/site-lisp/bbdb/bbdb-snarf hides /usr/share/emacs/site-lisp/bbdb/lisp/bbdb-snarf

Features: (emacsbug gnus-topic cl-specs shr url-http url-auth url-gw footnote xscheme warnings trace testcover scheme unsafep re-builder shadow inf-lisp ielm comint ring elp edebug cust-print vc-bzr filecache find-func dabbrev multi-isearch diff-mode jka-compr rect pp descr-text gnus-fun skeleton canlock sha1 hex-util novice woman tutorial help-macro man assoc info-look info help-at-pt ehelp apropos cus-edit cus-start cus-load gnus-html
browse-url xml url-cache mm-url url url-proxy url-privacy url-expand url-methods url-history url-cookie url-util supercite regi flow-fill executable copyright debug gnus-draft gnus-dup mule-util sort smiley ansi-color gnus-cite mail-extr gnus-async gnus-bcklg qp byte-opt bytecomp byte-compile gnus-ml disp-table nnfolder utf-7 nnimap parse-time tls utf7 nndraft nnmh nnagent nnml gnus-agent gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art mm-uu mml2015 epg-config mm-view smime password-cache dig mailcap nntp gnus-cache gnus-sum nnoo gnus-group time-date gnus-undo nnmail mail-source format-spec server gnus-start gnus-spec gnus-int gnus-range message sendmail rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus gnus-ems nnheader mail-utils wid-edit noutline outline easy-mmode flyspell ispell eldoc checkdoc regexp-opt thingatpt help-mode easymenu view prog-mode electric url-handlers url-parse auth-source netrc gnus-util url-vars mm-util mail-prsvr reveal autoinsert uniquify advice help-fns advice-preload savehist minibuf-eldef cl cl-loaddefs proof-site proof-autoloads pg-vars bbdb-autoloads agda2 tooltip ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image fringe lisp-mode register page newcomment menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces cus-face files text-properties overlay md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote make-network-process dbusbind dynamic-setting system-font-setting font-render-setting x-toolkit x multi-tty emacs)
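
The bytes-versus-chars distinction described in the report can be illustrated with a small, self-contained sketch. This is not the code from mm-shr and not the fix that was eventually applied; `bug7410-sketch' is a hypothetical name, and the sample string is made up. It uses only core Emacs functions (`unibyte-string', `decode-coding-string', `set-buffer-multibyte'). What happens at the first `insert' may depend on the Emacs build: the report mentions a private branch where that step signals an error instead of converting.

```elisp
;; Hypothetical helper for illustration only; not part of Gnus or mm-*.
(defun bug7410-sketch ()
  "Return a list contrasting bytes and chars for a latin-1 accented string."
  (let* ((raw (unibyte-string ?c ?a ?f #xe9))          ; latin-1 bytes of "café"
         (decoded (decode-coding-string raw 'latin-1)) ; bytes -> chars
         unibyte-flag multibyte-ok)
    ;; Chars inserted into a unibyte buffer: the situation from the report.
    ;; The accented char has to be forced back into a single byte here.
    (with-temp-buffer
      (set-buffer-multibyte nil)
      (insert decoded)
      (setq unibyte-flag enable-multibyte-characters))
    ;; The same chars inserted into a multibyte buffer survive unchanged.
    (with-temp-buffer
      (set-buffer-multibyte t)
      (insert decoded)
      (setq multibyte-ok (string= (buffer-string) decoded)))
    (list (multibyte-string-p raw)      ; raw input is bytes
          (multibyte-string-p decoded)  ; decoded text is chars
          unibyte-flag                  ; first buffer stayed unibyte
          multibyte-ok)))               ; second buffer kept the chars
```

Evaluating `(bug7410-sketch)' in a scratch buffer shows which of the two strings is multibyte and whether the decoded text survives insertion. The point of the sketch is only that decoded text belongs in a multibyte buffer, which is the consistency the report asks for.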