GNU bug report logs -
#3745
23.0.95; emacs-23.0.95: unibyte-display-via-language-environment
Reported by: Jay Berkenbilt <ejb <at> ql.org>
Date: Fri, 3 Jul 2009 01:45:04 UTC
Severity: normal
Done: Chong Yidong <cyd <at> stupidchicken.com>
Bug is archived. No further changes may be made.
In article <87bpo13ks8.fsf <at> stupidchicken.com>, Chong Yidong <cyd <at> stupidchicken.com> writes:
> > Now `charset_unibyte' is always 0 (i.e. the same as `charset_ascii').
> Is this variable obsolete, then?
Yes, at the moment. But, I'd like to use it for
unibyte-display-via-language-environment.
In article <87y6r560y3.fsf <at> stupidchicken.com>, Chong Yidong <cyd <at> stupidchicken.com> writes:
> > Now `charset_unibyte' is always 0 (i.e. the same as
> > `charset_ascii'). So, unibyte->multibyte conversion always
> > results in an eight-bit multibyte character.
> Looking through the code, I see that the variable `charset_unibyte' is
> not initialized properly. That's the only reason it's 0. We have to
> fix this for sure.
Yes.
> > To fix the above problem, I propose these changes for 23.1
> > and the trunk.
> >
> > (1) Fix all codes accessing charset_unibyte
> > (e.g. Funibyte_char_to_multibyte) not to refer to it.
> Can we use charset_iso_8859_1 instead of charset_unibyte, or add a line
> that says
> charset_unibyte
> = define_charset_internal (...);
> in syms_of_charset?
No. Stefan's change was to make unibyte-char-to-multibyte
(and unibyte_char_to_multibyte) always return an 8-bit
char for an 8-bit byte. To do that, charset_unibyte must be
the same as charset_ascii; but, first of all, we don't have
to use charset_unibyte for such an operation. We can simply
use BYTE8_TO_CHAR.
> > (2) Setup charset_unibyte correctly in Fset_charset_priority.
> >
> > (3) Fix x_produce_glyphs to do DECODE_CHAR (charset_unibyte,
> > it->c) instead of unibyte_char_to_multibyte (it->c).
> Number 3 is not a trivial change. IIUC, unibyte_char_to_multibyte is
> very fast. Changing it to use DECODE_CHAR may lead to a performance
> hit.
But, using unibyte_char_to_multibyte here is a clear bug.
If the overhead of DECODE_CHAR is intolerable (I don't
believe it is), we can do this:
(1) Modify unibyte_char_to_multibyte to use BYTE8_TO_CHAR
instead of the table unibyte_to_multibyte_table.
(2) Set up unibyte_to_multibyte_table for unibyte_charset.
(3) Just look up that table in x_produce_glyphs.
---
Kenichi Handa
handa <at> m17n.org