GNU bug report logs -
#3745
23.0.95; emacs-23.0.95: unibyte-display-via-language-environment
Reported by: Jay Berkenbilt <ejb <at> ql.org>
Date: Fri, 3 Jul 2009 01:45:04 UTC
Severity: normal
Done: Chong Yidong <cyd <at> stupidchicken.com>
Bug is archived. No further changes may be made.
In article <87bpo13ks8.fsf <at> stupidchicken.com>, Chong Yidong <cyd <at> stupidchicken.com> writes:
> > Now `charset_unibyte' is always 0 (i.e. the same as `charset_ascii').
> Is this variable obsolete, then?
Yes, at the moment. But, I'd like to use it for
unibyte-display-via-language-environment.
In article <87y6r560y3.fsf <at> stupidchicken.com>, Chong Yidong <cyd <at> stupidchicken.com> writes:
> > Now `charset_unibyte' is always 0 (i.e. the same as
> > `charset_ascii'). So, unibyte->multibyte conversion always
> > results in an eight-bit multibyte character.
> Looking through the code, I see that the variable `charset_unibyte' is
> not initialized properly. That's the only reason it's 0. We have to
> fix this for sure.
Yes.
> > To fix the above problem, I propose these changes for 23.1
> > and the trunk.
> >
> > (1) Fix all codes accessing charset_unibyte
> > (e.g. Funibyte_char_to_multibyte) not to refer to it.
> Can we use charset_iso_8859_1 instead of charset_unibyte, or add a line
> that says
> charset_unibyte
> = define_charset_internal (...);
> in syms_of_charset?
No. Stefan's change was to make unibyte-char-to-multibyte
(and unibyte_char_to_multibyte) always return an 8-bit
char for an 8-bit byte. To do that, charset_unibyte must be
the same as charset_ascii; but, first of all, we don't have
to use charset_unibyte for such an operation. We can simply
use BYTE8_TO_CHAR.
> > (2) Setup charset_unibyte correctly in Fset_charset_priority.
> >
> > (3) Fix x_produce_glyphs to do DECODE_CHAR (charset_unibyte,
> > it->c) instead of unibyte_char_to_multibyte (it->c).
> Number 3 is not a trivial change. IIUC, unibyte_char_to_multibyte is
> very fast. Changing it to use DECODE_CHAR may lead to a performance
> hit.
But, using unibyte_char_to_multibyte here is a clear bug.
If the overhead of DECODE_CHAR is intolerable (I don't
believe it is), we can do this:
(1) Modify unibyte_char_to_multibyte to use BYTE8_TO_CHAR
instead of the table unibyte_to_multibyte_table.
(2) Set up unibyte_to_multibyte_table for unibyte_charset.
(3) Just look up that table in x_produce_glyphs.
---
Kenichi Handa
handa <at> m17n.org