GNU bug report logs - #60275
28.1; chinese-cns11643-15 mapping is wrong


Package: emacs;

Reported by: awrhygty <at> outlook.com

Date: Fri, 23 Dec 2022 14:20:01 UTC

Severity: normal

Found in version 28.1

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 60275 in the body.
You can then email your comments to 60275 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#60275; Package emacs. (Fri, 23 Dec 2022 14:20:01 GMT)

Acknowledgement sent to awrhygty <at> outlook.com:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 23 Dec 2022 14:20:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: awrhygty <at> outlook.com
To: bug-gnu-emacs <at> gnu.org
Subject: 28.1; chinese-cns11643-15 mapping is wrong
Date: Fri, 23 Dec 2022 23:19:33 +0900
Typing 'M-x list-charset-chars RET chinese-cns11643-15 RET'
lists the wrong character set.
The listing contains Hangul characters and looks like the korean-ksc5601 charset.


In GNU Emacs 28.1 (build 2, x86_64-w64-mingw32)
 of 2022-04-22 built on AVALON
Windowing system distributor 'Microsoft Corp.', version 10.0.19045
System Description: Microsoft Windows 10 Pro (v10.0.2009.19045.2364)

Configured using:
 'configure --with-modules --without-dbus --with-native-compilation
 --without-compress-install CFLAGS=-O2'

Configured features:
ACL GIF GMP GNUTLS HARFBUZZ JPEG JSON LCMS2 LIBXML2 MODULES NATIVE_COMP
NOTIFY W32NOTIFY PDUMPER PNG RSVG SOUND THREADS TIFF TOOLKIT_SCROLL_BARS
XPM ZLIB

(NATIVE_COMP present but libgccjit not available)

Important settings:
  value of $LANG: JPN
  locale-coding-system: cp932

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  indent-tabs-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(mule-diag gnutls network-stream nsm mailalias smtpmail help-mode pp
shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util rmail
rmail-loaddefs auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map text-property-search mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
term/bobcat misearch multi-isearch seq byte-opt gv bytecomp byte-compile
cconv noutline outline easy-mmode view time-date subr-x cl-loaddefs
cl-lib japan-util iso-transl tooltip eldoc paren electric uniquify
ediff-hook vc-hooks lisp-float-type elisp-mode mwheel dos-w32 ls-lisp
disp-table term/w32-win w32-win w32-vars term/common-win tool-bar dnd
fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget hashtable-print-readable backquote threads w32notify
w32 lcms2 multi-tty make-network-process native-compile emacs)

Memory information:
((conses 16 195583 32392)
 (symbols 48 14557 2)
 (strings 32 30185 13370)
 (string-bytes 1 872033)
 (vectors 16 27031)
 (vector-slots 8 609454 38164)
 (floats 8 30 281)
 (intervals 56 8155 7128)
 (buffers 992 14))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60275; Package emacs. (Fri, 23 Dec 2022 14:59:02 GMT)

Message #8 received at 60275 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: awrhygty <at> outlook.com
Cc: 60275 <at> debbugs.gnu.org
Subject: Re: bug#60275: 28.1; chinese-cns11643-15 mapping is wrong
Date: Fri, 23 Dec 2022 16:57:50 +0200
> From: awrhygty <at> outlook.com
> Date: Fri, 23 Dec 2022 23:19:33 +0900
> 
> 
> Typing 'M-x list-charset-chars RET chinese-cns11643-15 RET'
> lists the wrong character set.
> The listing contains Hangul characters and looks like the korean-ksc5601 charset.

Why do you think this is an error?  CNS11643 has 16 planes and can
contain up to 141376 characters.  What is your reference for judging
which characters belong and don't belong to this character set?  Emacs
uses the glibc mapping.
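
The 141376 figure quoted above can be reproduced with a short calculation. This is only a sketch, assuming each plane is a 94x94 grid whose two bytes each range over 0x21..0x7E (consistent with the row labels 212x..217x in the listing later in this report); the function names are illustrative and are not part of Emacs or glibc.

```python
# Assumption: each CNS 11643 plane is a 94x94 grid, with both bytes
# of a code point in the range 0x21..0x7E.
BYTE_MIN, BYTE_MAX = 0x21, 0x7E
CELLS_PER_ROW = BYTE_MAX - BYTE_MIN + 1  # 94 positions per byte
PLANES = 16  # plane count quoted in the message above


def plane_capacity() -> int:
    """Number of code positions in a single 94x94 plane."""
    return CELLS_PER_ROW * CELLS_PER_ROW


def total_capacity(planes: int = PLANES) -> int:
    """Number of code positions across all planes."""
    return planes * plane_capacity()


print(plane_capacity())   # 8836
print(total_capacity())   # 141376
```

This is an upper bound on code positions, not a count of characters actually assigned.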




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60275; Package emacs. (Fri, 23 Dec 2022 15:27:01 GMT)

Message #11 received at 60275 <at> debbugs.gnu.org:

From: awrhygty <at> outlook.com
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 60275 <at> debbugs.gnu.org
Subject: Re: bug#60275: 28.1; chinese-cns11643-15 mapping is wrong
Date: Sat, 24 Dec 2022 00:25:52 +0900
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: awrhygty <at> outlook.com
>> Date: Fri, 23 Dec 2022 23:19:33 +0900
>> 
>> 
>> Typing 'M-x list-charset-chars RET chinese-cns11643-15 RET'
>> lists the wrong character set.
>> The listing contains Hangul characters and looks like the korean-ksc5601 charset.
>
> Why do you think this is an error?  CNS11643 has 16 planes and can
> contain up to 141376 characters.  What is your reference for judging
> which characters belong and don't belong to this character set?  Emacs
> uses the glibc mapping.

According to emacs-28.1/share/emacs/28.1/etc/charsets/CNS-F.map,
this charset maps only to UCS values of CJK ideographs.
The buffer produced by #'list-charset-chars, however, contains non-Han characters.
For example, the '*Character List*' buffer starts with:
	Characters in the coded character set chinese-cns11643-15.
	-----------------------------------------------------------------------
	        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
	   212x	 	¿	ː	∮	∑	∏	¤	℉	‰	◁	◀	▷	▶	♤	♠	♡
	   213x	♥	♧	♣	⊙	◈	▣	◐	◑	▒	▤	▥	▨	▧	▦	▩	♨
	   214x	☏	☎	☜	☞	¶	†	‡	↕	↗	↙	↖	↘	♭	♩	♪	♬
	   215x	㉿	㈜	№	㏇	™	㏂	㏘	℡	€	®	㉾	 	 	 	 	 
	   216x	 	 	 	 	 	 	 	 	 	 	 	 	 	 	 	 
	   217x	 	!	"	#	$	%	&	'	(	)	*	+	,	-	.
	-----------------------------------------------------------------------
	        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
	   222x	 	/	0	1	2	3	4	5	6	7	8	9	:	;	<	=
	   223x	>	?	@	A	B	C	D	E	F	G	H	I	J	K	L	M
	   224x	N	O	P	Q	R	S	T	U	V	W	X	Y	Z	[	₩	]
	   225x	^	_	`	a	b	c	d	e	f	g	h	i	j	k	l	m
	   226x	n	o	p	q	r	s	t	u	v	w	x	y	z	{	|	}
	   227x	 ̄	ㄱ	ㄲ	ㄳ	ㄴ	ㄵ	ㄶ	ㄷ	ㄸ	ㄹ	ㄺ	ㄻ	ㄼ	ㄽ	ㄾ
	-----------------------------------------------------------------------
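
The observation that the listing contains non-Han characters can be checked mechanically. The following is a minimal sketch, not anything Emacs itself does: it classifies a character as a Han ideograph purely by its Unicode name prefix, which is an approximation chosen for illustration. The helper names are hypothetical, and the sample list holds four characters copied from the listing above plus one Han character (U+4E2D) added for contrast.

```python
import unicodedata


def is_han_ideograph(ch: str) -> bool:
    """Approximate test: Unicode name starts with a CJK ideograph prefix."""
    name = unicodedata.name(ch, "")
    return name.startswith(
        ("CJK UNIFIED IDEOGRAPH", "CJK COMPATIBILITY IDEOGRAPH")
    )


def non_han(chars):
    """Return the characters that are not classified as Han ideographs."""
    return [c for c in chars if not is_han_ideograph(c)]


# U+222E, U+2665, U+3131 and U+20A9 appear in the listing above;
# U+4E2D is not from the listing and is included only for contrast.
samples = ["\u222e", "\u2665", "\u3131", "\u20a9", "\u4e2d"]

for ch in samples:
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{ord(ch):04X} {name} han={is_han_ideograph(ch)}")
```

Running the same check over every character in the '*Character List*' buffer would flag any entry that a Han-only map file should not produce.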

Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 24 Dec 2022 10:11:02 GMT)

Notification sent to awrhygty <at> outlook.com:
bug acknowledged by developer. (Sat, 24 Dec 2022 10:11:02 GMT)

Message #16 received at 60275-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: awrhygty <at> outlook.com
Cc: 60275-done <at> debbugs.gnu.org
Subject: Re: bug#60275: 28.1; chinese-cns11643-15 mapping is wrong
Date: Sat, 24 Dec 2022 12:10:31 +0200
> From: awrhygty <at> outlook.com
> Cc: 60275 <at> debbugs.gnu.org
> Date: Sat, 24 Dec 2022 00:25:52 +0900
> 
> According to emacs-28.1/share/emacs/28.1/etc/charsets/CNS-F.map,
> this charset maps only to UCS values of CJK ideographs.
> The buffer produced by #'list-charset-chars, however, contains non-Han characters.
> For example, the '*Character List*' buffer starts with:

Thanks, this should now be fixed on the emacs-29 branch.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 21 Jan 2023 12:24:10 GMT)

This bug report was last modified 2 years and 152 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.