GNU bug report logs - #23698: 24.5; broken character name
Reported by: ynyaaa <at> gmail.com
Date: Sun, 5 Jun 2016 13:07:02 UTC
Severity: normal
Found in version 24.5
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 23698 in the body.
You can then email your comments to 23698 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#23698; Package emacs. (Sun, 05 Jun 2016 13:07:02 GMT)
Acknowledgement sent to ynyaaa <at> gmail.com: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 05 Jun 2016 13:07:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Just after starting emacs -Q, this form returns the correct value.
(get-char-code-property #xE01 'name)
=>"THAI CHARACTER KO KAI"
Then display THAI characters by typing this.
M-x list-charset-chars RET thai-iso8859-11 RET
After THAI characters are displayed, the form returns a wrong value.
(get-char-code-property #xE01 'name)
=>"LETTER KO KAI"
get-char-code-property also returns wrong names for the other THAI characters.
UCS  correct name                     wrong name
E01  THAI CHARACTER KO KAI            LETTER KO KAI
E02  THAI CHARACTER KHO KHAI          LETTER KHO KHAI
E03  THAI CHARACTER KHO KHUAT         LETTER KHO KHUAT
E04  THAI CHARACTER KHO KHWAI         LETTER KHO KHWAI
E05  THAI CHARACTER KHO KHON          LETTER KHO KHON
E06  THAI CHARACTER KHO RAKHANG       LETTER KHO RAKHANG
E07  THAI CHARACTER NGO NGU           LETTER NGO NGU
E08  THAI CHARACTER CHO CHAN          LETTER CHO CHAN
E09  THAI CHARACTER CHO CHING         LETTER CHO CHING
E0A  THAI CHARACTER CHO CHANG         LETTER CHO CHANG
E0B  THAI CHARACTER SO SO             LETTER SO SO
E0C  THAI CHARACTER CHO CHOE          LETTER CHO CHOE
E0D  THAI CHARACTER YO YING           LETTER YO YING
E0E  THAI CHARACTER DO CHADA          LETTER DO CHADA
E0F  THAI CHARACTER TO PATAK          LETTER TO PATAK
E10  THAI CHARACTER THO THAN          LETTER THO THAN
E11  THAI CHARACTER THO NANGMONTHO    LETTER THO NANGMONTHO
E12  THAI CHARACTER THO PHUTHAO       LETTER THO PHUTHAO
E13  THAI CHARACTER NO NEN            LETTER NO NEN
E14  THAI CHARACTER DO DEK            LETTER DO DEK
E15  THAI CHARACTER TO TAO            LETTER TO TAO
E16  THAI CHARACTER THO THUNG         LETTER THO THUNG
E17  THAI CHARACTER THO THAHAN        LETTER THO THAHAN
E18  THAI CHARACTER THO THONG         LETTER THO THONG
E19  THAI CHARACTER NO NU             LETTER NO NU
E1A  THAI CHARACTER BO BAIMAI         LETTER BO BAIMAI
E1B  THAI CHARACTER PO PLA            LETTER PO PLA
E1C  THAI CHARACTER PHO PHUNG         LETTER PHO PHUNG
E1D  THAI CHARACTER FO FA             LETTER FO FA
E1E  THAI CHARACTER PHO PHAN          LETTER PHO PHAN
E1F  THAI CHARACTER FO FAN            LETTER FO FAN
E20  THAI CHARACTER PHO SAMPHAO       LETTER PHO SAMPHAO
E21  THAI CHARACTER MO MA             LETTER MO MA
E22  THAI CHARACTER YO YAK            LETTER YO YAK
E23  THAI CHARACTER RO RUA            LETTER RO RUA
E24  THAI CHARACTER RU                LETTER RU (Pali vowel letter)
E25  THAI CHARACTER LO LING           LETTER LO LING
E26  THAI CHARACTER LU                LETTER LU (Pali vowel letter)
E27  THAI CHARACTER WO WAEN           LETTER WO WAEN
E28  THAI CHARACTER SO SALA           LETTER SO SALA
E29  THAI CHARACTER SO RUSI           LETTER SO RUSI
E2A  THAI CHARACTER SO SUA            LETTER SO SUA
E2B  THAI CHARACTER HO HIP            LETTER HO HIP
E2C  THAI CHARACTER LO CHULA          LETTER LO CHULA
E2D  THAI CHARACTER O ANG             LETTER O ANG
E2E  THAI CHARACTER HO NOKHUK         LETTER HO NOK HUK
E2F  THAI CHARACTER PAIYANNOI         PAI YAN NOI (abbreviation)
E30  THAI CHARACTER SARA A            VOWEL SIGN SARA A
E31  THAI CHARACTER MAI HAN-AKAT      VOWEL SIGN MAI HAN-AKAT N/S-T
E32  THAI CHARACTER SARA AA           VOWEL SIGN SARA AA
E33  THAI CHARACTER SARA AM           VOWEL SIGN SARA AM
E34  THAI CHARACTER SARA I            VOWEL SIGN SARA I N/S-T
E35  THAI CHARACTER SARA II           VOWEL SIGN SARA II N/S-T
E36  THAI CHARACTER SARA UE           VOWEL SIGN SARA UE N/S-T
E37  THAI CHARACTER SARA UEE          VOWEL SIGN SARA UEE N/S-T
E38  THAI CHARACTER SARA U            VOWEL SIGN SARA U N/S-B
E39  THAI CHARACTER SARA UU           VOWEL SIGN SARA UU N/S-B
E3A  THAI CHARACTER PHINTHU           VOWEL SIGN PHINTHU N/S-B (Pali virama)
E3F  THAI CURRENCY SYMBOL BAHT        BAHT SIGN (currency symbol)
E40  THAI CHARACTER SARA E            VOWEL SIGN SARA E
E41  THAI CHARACTER SARA AE           VOWEL SIGN SARA AE
E42  THAI CHARACTER SARA O            VOWEL SIGN SARA O
E43  THAI CHARACTER SARA AI MAIMUAN   VOWEL SIGN SARA MAI MUAN
E44  THAI CHARACTER SARA AI MAIMALAI  VOWEL SIGN SARA MAI MALAI
E45  THAI CHARACTER LAKKHANGYAO       LAK KHANG YAO
E46  THAI CHARACTER MAIYAMOK          MAI YAMOK (repetition)
E47  THAI CHARACTER MAITAIKHU         VOWEL SIGN MAI TAI KHU N/S-T
E48  THAI CHARACTER MAI EK            TONE MAI EK N/S-T
E49  THAI CHARACTER MAI THO           TONE MAI THO N/S-T
E4A  THAI CHARACTER MAI TRI           TONE MAI TRI N/S-T
E4B  THAI CHARACTER MAI CHATTAWA      TONE MAI CHATTAWA N/S-T
E4C  THAI CHARACTER THANTHAKHAT       THANTHAKHAT N/S-T (cancellation mark)
E4D  THAI CHARACTER NIKHAHIT          NIKKHAHIT N/S-T (final nasal)
E4E  THAI CHARACTER YAMAKKAN          YAMAKKAN N/S-T
E4F  THAI CHARACTER FONGMAN           FONRMAN
E50  THAI DIGIT ZERO                  DIGIT ZERO
E51  THAI DIGIT ONE                   DIGIT ONE
E52  THAI DIGIT TWO                   DIGIT TWO
E53  THAI DIGIT THREE                 DIGIT THREE
E54  THAI DIGIT FOUR                  DIGIT FOUR
E55  THAI DIGIT FIVE                  DIGIT FIVE
E56  THAI DIGIT SIX                   DIGIT SIX
E57  THAI DIGIT SEVEN                 DIGIT SEVEN
E58  THAI DIGIT EIGHT                 DIGIT EIGHT
E59  THAI DIGIT NINE                  DIGIT NINE
E5A  THAI CHARACTER ANGKHANKHU        ANGKHANKHU (ellipsis)
E5B  THAI CHARACTER KHOMUT            KHOMUT (beginning of religious texts)
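
The comparison above could be reproduced with a short loop rather than by checking each character by hand. This is only a sketch: the helper name `my-thai-nonstandard-names` is hypothetical (it is not part of Emacs), and it assumes that every standard name in the U+0E01..U+0E5B range starts with "THAI ", as the "correct name" column suggests.

```elisp
;; Hypothetical helper, not part of Emacs: collect the Thai-block code
;; points whose `name' property does not start with "THAI ".
(defun my-thai-nonstandard-names ()
  "Return an alist of (CODE . NAME) for Thai characters with unexpected names."
  (let (result)
    (dolist (code (number-sequence #xE01 #xE5B) (nreverse result))
      (let ((name (get-char-code-property code 'name)))
        ;; Unassigned code points (e.g. U+0E3B) have no name; skip them.
        (when (and name (not (string-prefix-p "THAI " name)))
          (push (cons code name) result))))))

(my-thai-nonstandard-names)
```

Evaluated right after `emacs -Q` this should return nil; in an affected session, after the Thai characters have been listed, it would be expected to return the overridden names.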
In GNU Emacs 24.5.1 (i686-pc-mingw32)
of 2015-04-11 on LEG570
Windowing system distributor `Microsoft Corp.', version 6.0.6002
Configured using:
`configure --prefix=/c/usr --host=i686-pc-mingw32'
Important settings:
value of $LANG: JPN
locale-coding-system: cp932
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent messages:
Load-path shadows:
None found.
Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils thai-util thai-word mule-util info mule-diag
help-mode easymenu advice help-fns time-date japan-util tooltip electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel dos-w32 ls-lisp
w32-common-fns disp-table w32-win w32-vars tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment lisp-mode prog-mode register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process w32notify w32
multi-tty emacs)
Memory information:
((conses 8 164471 6457)
(symbols 32 27891 0)
(miscs 32 55 271)
(strings 16 24013 5282)
(string-bytes 1 514589)
(vectors 8 11535)
(vector-slots 4 515427 5354)
(floats 8 65 359)
(intervals 28 317 19)
(buffers 508 16))
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23698; Package emacs. (Sun, 05 Jun 2016 15:45:01 GMT)
Message #8 received at 23698 <at> debbugs.gnu.org (full text, mbox):
> From: ynyaaa <at> gmail.com
> Date: Sun, 05 Jun 2016 22:06:23 +0900
>
>
> Just after starting emacs -Q, this form returns the correct value.
>
> (get-char-code-property #xE01 'name)
> =>"THAI CHARACTER KO KAI"
>
> Then display THAI characters by typing this.
> M-x list-charset-chars RET thai-iso8859-11 RET
>
> After THAI characters are displayed, the form returns a wrong value.
>
> (get-char-code-property #xE01 'name)
> =>"LETTER KO KAI"
Displaying the list of Thai characters loads thai-util.el, which
deliberately overwrites the names derived from the Unicode Character
Database with its own variants. I'm CC'ing Handa-san, who added that
code back in 2008, in the hope that he could tell us why we are doing
that, and whether this is still needed nowadays.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23698; Package emacs. (Sun, 05 Jun 2016 16:44:02 GMT)
Message #11 received at 23698 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 05 Jun 2016 18:44:56 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 23698 <at> debbugs.gnu.org
>
> Displaying the list of Thai characters loads thai-util.el, which
> deliberately overwrites the names derived from the Unicode Character
> Database with its own variants. I'm CC'ing Handa-san, who added that
> code back in 2008, in the hope that he could tell us why we are doing
> that, and whether this is still needed nowadays.
Actually, this code is much older: we have had it since 1997, i.e. before
we started using the UCD for these purposes. So I think we can either
remove it or use a property that doesn't clash with the Unicode
standard properties.
Btw, the same problem exists with Lao (see lao-util.el).
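
The second option mentioned above (a property that doesn't clash with the standard Unicode properties) might look roughly like the following. This is an illustrative sketch, not the code that was committed: the property symbol `my-thai-legacy-name` is made up for this example, and it relies on `put-char-code-property` accepting property names that are not among the standard Unicode properties.

```elisp
;; `my-thai-legacy-name' is a made-up property symbol for illustration.
;; Storing the old variant under its own property leaves `name' untouched.
(put-char-code-property #xE01 'my-thai-legacy-name "LETTER KO KAI")

(get-char-code-property #xE01 'my-thai-legacy-name)
;; => "LETTER KO KAI"

;; The standard `name' property is not modified by the call above.
(get-char-code-property #xE01 'name)
```

With this approach, code that wants the descriptive variant would ask for the separate property explicitly, while callers that ask for `name` keep getting the standard value.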
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Mon, 06 Jun 2016 17:50:01 GMT)
Notification sent to ynyaaa <at> gmail.com: bug acknowledged by developer. (Mon, 06 Jun 2016 17:50:01 GMT)
Message #16 received at 23698-done <at> debbugs.gnu.org (full text, mbox):
> we have had it since 1997, i.e. before
> we started using the UCD for these purposes. So I think we can either
> remove it or use a property that doesn't clash with the Unicode
> standard properties.
It appears to be unnecessary these days, so I removed it in the attached
patch to master and am marking the bug as done. If I'm wrong and we need
it we can use a different property as you suggest.
[0001-Use-standard-Unicode-names-for-Thai-Lao.txt (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23698; Package emacs. (Mon, 06 Jun 2016 19:01:01 GMT)
Message #19 received at 23698 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23698-done <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
> Kenichi Handa <handa <at> gnu.org>
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Mon, 6 Jun 2016 10:49:12 -0700
>
> > we have had it since 1997, i.e. before
> > we started using the UCD for these purposes. So I think we can either
> > remove it or use a property that doesn't clash with the Unicode
> > standard properties.
> It appears to be unnecessary these days, so I removed it in the attached
> patch to master and am marking the bug as done. If I'm wrong and we need
> it we can use a different property as you suggest.
I'd still like to hear Handa-san's opinions on this.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23698; Package emacs. (Thu, 09 Jun 2016 14:33:01 GMT)
Message #22 received at 23698 <at> debbugs.gnu.org (full text, mbox):
In article <83y46jtzbb.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> Displaying the list of Thai characters loads thai-util.el, which
> deliberately overwrites the names derived from the Unicode Character
> Database with its own variants. I'm CC'ing Handa-san, who added that
> code back in 2008, in the hope that he could tell us why we are doing
> that, and whether this is still needed nowadays.
Long ago, I discussed with Thai/Lao people how to support their
languages in Mule. At that time, as I can't speak those languages, we
referred to each character by name in the discussion. And, for debugging
the support code, the name property was very useful. That's the reason
for those properties. Now, I think we can get rid of them.
---
K. Handa
handa <at> gnu.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23698; Package emacs. (Thu, 09 Jun 2016 14:38:02 GMT)
Message #25 received at 23698-done <at> debbugs.gnu.org (full text, mbox):
> From: handa <handa <at> gnu.org>
> Cc: ynyaaa <at> gmail.com, 23698 <at> debbugs.gnu.org
> Date: Thu, 09 Jun 2016 23:32:21 +0900
>
> In article <83y46jtzbb.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > Displaying the list of Thai characters loads thai-util.el, which
> > deliberately overwrites the names derived from the Unicode Character
> > Database with its own variants. I'm CC'ing Handa-san, who added that
> > code back in 2008, in the hope that he could tell us why we are doing
> > that, and whether this is still needed nowadays.
>
> Long ago, I discussed with Thai/Lao people how to support their
> languages in Mule. At that time, as I can't speak those languages, we
> referred to each character by name in the discussion. And, for debugging
> the support code, the name property was very useful. That's the reason
> for those properties. Now, I think we can get rid of them.
OK, thanks. So I guess Paul's change did TRT, and we can close this
bug report.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 08 Jul 2016 11:24:05 GMT)