GNU bug report logs - #6866
mule-cmds.el just _assumes_ all of Taiwan uses Big5 and not UTF-8
Reported by: jidanni <at> jidanni.org
Date: Mon, 16 Aug 2010 12:03:02 UTC
Severity: normal
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 12:03:02 GMT) Full text and rfc822 format available.
Acknowledgement sent to jidanni <at> jidanni.org: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 16 Aug 2010 12:03:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
I demand an explanation.
$ zgrep TW mule-cmds.el
("zh_TW" . "Chinese-Big5")
You guys just *assume* that all TW people still use Big5.
One can do LC_ALL=zh_TW.UTF-8 until he is blue in the face, but still
current-language-environment is a variable defined in `mule-cmds.el'.
Its value is "Chinese-BIG5"
emacs-version "24.0.50.1"
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 12:22:02 GMT)
Message #8 received at 6866 <at> debbugs.gnu.org:
> From: jidanni <at> jidanni.org
> Date: Mon, 16 Aug 2010 19:09:48 +0800
> Cc:
>
> I demand an explanation.
> $ zgrep TW mule-cmds.el
> ("zh_TW" . "Chinese-Big5")
> You guys just *assume* that all TW people still use Big5.
That's because they do. Case closed.
> One can do LC_ALL=zh_TW.UTF-8 until he is blue in the face
How dare you??!!!
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 12:24:02 GMT)
Message #11 received at 6866 <at> debbugs.gnu.org:
On 16/8/2010 7:09 PM, jidanni <at> jidanni.org wrote:
> I demand an explanation.
> $ zgrep TW mule-cmds.el
> ("zh_TW" . "Chinese-Big5")
> You guys just *assume* that all TW people still use Big5.
> One can do LC_ALL=zh_TW.UTF-8 until he is blue in the face, but still
> current-language-environment is a variable defined in `mule-cmds.el'.
> Its value is "Chinese-BIG5"
> emacs-version "24.0.50.1"
Please explain what bug you think this caused.
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 12:58:01 GMT)
Message #14 received at 6866 <at> debbugs.gnu.org:
>>>>> "JR" == Jason Rumney <jasonr <at> gnu.org> writes:
JR> Please explain what bug you think this caused.
http://news.gmane.org/group/gmane.emacs.w3m/thread=8661
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 13:17:01 GMT)
Message #17 received at 6866 <at> debbugs.gnu.org:
On 16/8/2010 8:58 PM, jidanni <at> jidanni.org wrote:
>>>>>> "JR" == Jason Rumney <jasonr <at> gnu.org> writes:
> JR> Please explain what bug you think this caused.
> http://news.gmane.org/group/gmane.emacs.w3m/thread=8661
It appears from that thread that there is a bug in w3m. It is not
apparent that it is related to your report here though, as the expected
behavior is that a Japanese search engine should only be chosen if the
current-language matches "Japanese", which this clearly does not.
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 13:39:02 GMT)
Message #20 received at 6866 <at> debbugs.gnu.org:
Well anyway, for our locale me and my friends all use zh_TW.UTF-8 and
stopped using zh_TW.big5 years ago. So at least it looks very dumb there
in mule-cmds.el that the zh_CN people can use UTF-8, but the HK and TW
are locked in the dark ages:
("zh_HK" . "Chinese-Big5")
("zh_TW" . "Chinese-Big5")
("zh_CN.UTF-8" . "Chinese-GBK")
("zh_CN" . "Chinese-GB")
The only big5 thing I apparently sometimes still use is
$ GET http://jidanni.org/comp/configuration/.emacs | grep -i b5
(setq default-input-method 'chinese-py-punct-b5))));no 'utf' ones
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 14:08:01 GMT)
Message #23 received at 6866 <at> debbugs.gnu.org:
On 16/8/2010 9:39 PM, jidanni <at> jidanni.org wrote:
> Well anyway, for our locale me and my friends all use zh_TW.UTF-8 and
> stopped using zh_TW.big5 years ago. So at least it looks very dumb there
> in mule-cmds.el that the zh_CN people can use UTF-8, but the HK and TW
> are locked in the dark ages:
>
> ("zh_HK" . "Chinese-Big5")
> ("zh_TW" . "Chinese-Big5")
> ("zh_CN.UTF-8" . "Chinese-GBK")
> ("zh_CN" . "Chinese-GB")
GBK is a backwards compatible extension of GB with more characters. I'm
not sure that Big5 has an equivalent. In all these cases, the character
set is used to select preferences for fonts, input methods and other
language sensitive things, and has nothing to do with UTF-8 (which is
used as a preference for file encoding when specified).
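The "backwards compatible extension" point can be checked with a short Python sketch. It assumes that "GB" here means GB2312 and relies on Python's built-in `gb2312` and `gbk` codecs rather than on anything in Emacs; the helper names and sample strings are illustrative only.

```python
# Illustrative check using Python's standard codecs, not Emacs code.
# Assumption: "GB" in the thread refers to GB2312.

def same_bytes_in_both(text):
    """True if text encodes to identical bytes in GB2312 and GBK."""
    return text.encode("gb2312") == text.encode("gbk")

def encodable(text, codec):
    """True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

simplified = "中文编码"   # simplified characters, covered by GB2312
traditional = "們"        # traditional character, outside GB2312

# GB2312 text keeps the same byte values under GBK (backwards compatible).
compatible = same_bytes_in_both(simplified)

# GBK adds characters that GB2312 cannot represent.
only_in_gbk = encodable(traditional, "gbk") and not encodable(traditional, "gb2312")
```

Both `compatible` and `only_in_gbk` should come out true, which is what "extension with more characters" amounts to at the byte level.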
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 15:27:02 GMT)
Message #26 received at 6866 <at> debbugs.gnu.org:
>>>>> "JR" == Jason Rumney <jasonr <at> gnu.org> writes:
JR> In all these cases, the character set is used to select preferences
JR> for fonts, input methods and other language sensitive things, and
JR> has nothing to do with UTF-8 (which is used as a preference for file
JR> encoding when specified).
Then it is a sad choice of the name of a character set being used for
other purposes. Many users will say: didn't I make a big effort years
ago to totally convert my environment? Why do I still have traces of
big5 hanging around?
Perhaps there should be a more neutral name used. Since it seems what
you are calling Chinese-Big5 does not have much to do with
http://en.wikipedia.org/wiki/Traditional_Chinese#Computer_encoding
after all.
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 16:27:02 GMT)
Message #29 received at 6866 <at> debbugs.gnu.org:
> GBK is a backwards compatible extension of GB with more characters.
> I'm not sure that Big5 has an equivalent.
Such an extension exists; it is called Big5-plus. However, AFAIK,
nobody has ever used it, and today it is obsolete since Unicode is
much better.
Werner
Reply sent to Eli Zaretskii <eliz <at> gnu.org>: You have taken responsibility. (Mon, 16 Aug 2010 17:40:03 GMT)
Notification sent to jidanni <at> jidanni.org: bug acknowledged by developer. (Mon, 16 Aug 2010 17:40:03 GMT)
Message #34 received at 6866-done <at> debbugs.gnu.org:
> From: jidanni <at> jidanni.org
> Date: Mon, 16 Aug 2010 21:39:10 +0800
> Cc: 6866 <at> debbugs.gnu.org
>
> Well anyway, for our locale me and my friends all use zh_TW.UTF-8 and
> stopped using zh_TW.big5 years ago. So at least it looks very dumb there
> in mule-cmds.el that the zh_CN people can use UTF-8, but the HK and TW
> are locked in the dark ages:
>
> ("zh_HK" . "Chinese-Big5")
> ("zh_TW" . "Chinese-Big5")
> ("zh_CN.UTF-8" . "Chinese-GBK")
> ("zh_CN" . "Chinese-GB")
Are you sure you understand what this data base is used for in Emacs?
The function within mule-cmds.el which uses this data has this
comment:
;; locale-language-names specify both lang-env and coding.
;; But, what specified in locale-preferred-coding-systems
;; has higher priority.
Thus, if you specify UTF-8 as the preferred encoding (e.g., via
LC_ALL), it overrules the Big5 default.
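As a rough illustration of the precedence described in that comment, here is a small Python model. It is not the Emacs implementation: the table names only echo `locale-language-names` and `locale-preferred-coding-systems`, the default-coding table and the prefix/suffix matching are simplifying assumptions, and `resolve` is a hypothetical helper.

```python
# Illustrative model only; this is not how mule-cmds.el is written.

# Locale prefix -> language environment (entries quoted in this thread).
# More specific prefixes must come first for the prefix match below.
LOCALE_LANGUAGE_NAMES = [
    ("zh_HK", "Chinese-Big5"),
    ("zh_TW", "Chinese-Big5"),
    ("zh_CN.UTF-8", "Chinese-GBK"),
    ("zh_CN", "Chinese-GB"),
]

# Assumed default coding system implied by each language environment.
DEFAULT_CODING = {
    "Chinese-Big5": "big5",
    "Chinese-GBK": "gbk",
    "Chinese-GB": "gb2312",
}

# Assumed stand-in for locale-preferred-coding-systems:
# an explicit codeset suffix in the locale name wins over the default.
LOCALE_PREFERRED_CODING = [
    (".UTF-8", "utf-8"),
]

def resolve(locale):
    """Return (language environment, coding system) for a locale name."""
    lang_env = next(
        (env for prefix, env in LOCALE_LANGUAGE_NAMES if locale.startswith(prefix)),
        None,
    )
    coding = DEFAULT_CODING.get(lang_env)
    for suffix, preferred in LOCALE_PREFERRED_CODING:
        if locale.upper().endswith(suffix):
            coding = preferred  # higher priority than the language default
    return lang_env, coding
```

Under these assumptions, `resolve("zh_TW.UTF-8")` gives `("Chinese-Big5", "utf-8")`: the language environment stays Chinese-Big5 while the preferred encoding is UTF-8, which matches the behaviour reported at the top of this bug.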
> The only big5 thing I apparently sometimes still use is
> $ GET http://jidanni.org/comp/configuration/.emacs | grep -i b5
> (setq default-input-method 'chinese-py-punct-b5))));no 'utf' ones
You are confused: an input method can produce Big5 characters, but
that won't prevent Emacs from encoding them in UTF-8 if that's your
preference.
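The same separation between a character's origin and the encoding used on output can be shown outside Emacs. The Python sketch below uses the standard `big5` and `utf-8` codecs; the sample character and variable names are illustrative, and it only models the idea, not Emacs internals.

```python
# Illustrative only: a character that arrived as Big5 bytes is just a
# character once decoded, and can be written out in any encoding that has it.

ch = "中"

big5_bytes = ch.encode("big5")          # what a Big5-based source would supply
decoded = big5_bytes.decode("big5")     # character as held in memory
utf8_bytes = decoded.encode("utf-8")    # bytes written when UTF-8 is preferred

# Same character, different byte sequences on disk.
round_trips = decoded == ch
differs_on_disk = big5_bytes != utf8_bytes
```

Here `utf8_bytes` is `b"\xe4\xb8\xad"`, the UTF-8 form of U+4E2D, even though the character started out as Big5 bytes.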
I'm closing this bug.
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6866; Package emacs. (Mon, 16 Aug 2010 17:44:02 GMT)
Message #37 received at 6866 <at> debbugs.gnu.org:
> From: jidanni <at> jidanni.org
> Date: Mon, 16 Aug 2010 23:27:51 +0800
> Cc: 6866 <at> debbugs.gnu.org
>
> Many users will say: didn't I make a big effort years ago to totally
> convert my environment? Why do I still have traces of big5 hanging
> around?
Users should not look into the code unless they actually read it (as
opposed to grepping it for some random string) and understand what the
code does.
> Perhaps there should be a more neutral name used. Since it seems what
> you are calling Chinese-Big5 does not have much to do with
> http://en.wikipedia.org/wiki/Traditional_Chinese#Computer_encoding
> after all.
It _is_ a name of an encoding, just the Emacs name. It just isn't
used in the way you thought.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 14 Sep 2010 11:24:03 GMT)
This bug report was last modified 14 years and 342 days ago.