GNU bug report logs - #5387
MS950 alias for CP950 charset

Packages: emacs, gnus

Reported by: jidanni <at> jidanni.org

Date: Fri, 15 Jan 2010 11:15:02 UTC

Severity: normal

Merged with 5647, 5681

Done: Glenn Morris <rgm <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 5387 in the body.
You can then email your comments to 5387 AT debbugs.gnu.org in the normal way.


Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Fri, 15 Jan 2010 11:15:02 GMT)

Acknowledgement sent to jidanni <at> jidanni.org:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 15 Jan 2010 11:15:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: jidanni <at> jidanni.org
To: bug-gnu-emacs <at> gnu.org
Cc: handa <at> etl.go.jp
Subject: MS950 alias for CP950 charset
Date: Fri, 15 Jan 2010 19:01:11 +0800
I signed up on the Legislature of Taiwan's website, and the confirmation
mail had
From: sysop <at> ly.gov.tw
Subject: =?BIG5?B?pd+qa7B8pf6yebjqsFS69C2/76XBqkGwyLHSsMq9VLt7qOc=?=
Mime-Version: 1.0
Content-Type: text/html; charset=MS950
Content-Transfer-Encoding: quoted-printable

And it turns out MS950 is an alias for CP950, so perhaps emacs should
incorporate this alias, even though this is the first time I've seen it.
Perhaps make all MSxxx be aliases for CPxxx.





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Fri, 15 Jan 2010 11:47:01 GMT)

Message #8 received at 5387 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> m17n.org>
To: jidanni <at> jidanni.org
Cc: 5387 <at> debbugs.gnu.org
Subject: Re: bug#5387: MS950 alias for CP950 charset
Date: Fri, 15 Jan 2010 20:46:47 +0900
In article <87aawfzl6w.fsf <at> jidanni.org>, jidanni <at> jidanni.org writes:

> I signed up on the Legislature of Taiwan's website, and the confirmation
> mail had
> From: sysop <at> ly.gov.tw
> Subject: =?BIG5?B?pd+qa7B8pf6yebjqsFS69C2/76XBqkGwyLHSsMq9VLt7qOc=?=
> Mime-Version: 1.0
> Content-Type: text/html; charset=MS950
> Content-Transfer-Encoding: quoted-printable

> And it turns out MS950 is an alias for CP950,

Where did you get that information?

> so perhaps emacs should
> incorporate this alias, even though this is the first time I've seen it.
> Perhaps make all MSxxx be aliases for CPxxx.

I checked <http://www.iana.org/assignments/character-sets>
and found that only MS936 is listed as an alias of GBK.

---
Kenichi Handa
handa <at> m17n.org




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Fri, 15 Jan 2010 11:56:01 GMT)

Message #11 received at 5387 <at> debbugs.gnu.org:

From: jidanni <at> jidanni.org
To: handa <at> m17n.org
Cc: 5387 <at> debbugs.gnu.org
Subject: Re: bug#5387: MS950 alias for CP950 charset
Date: Fri, 15 Jan 2010 19:55:00 +0800
>> And it turns out MS950 is an alias for CP950,
K> Where did you get that information?
I inferred it from my single encounter.
K> I checked <http://www.iana.org/assignments/character-sets>
K> and found that only MS936 is listed as an alias of GBK.
That makes two... or 1.5.




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Fri, 15 Jan 2010 12:58:02 GMT)

Message #14 received at 5387 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> m17n.org>
To: jidanni <at> jidanni.org
Cc: 5387 <at> debbugs.gnu.org
Subject: Re: bug#5387: MS950 alias for CP950 charset
Date: Fri, 15 Jan 2010 21:57:07 +0900
In article <87my0fy44r.fsf <at> jidanni.org>, jidanni <at> jidanni.org writes:

>>> And it turns out MS950 is an alias for CP950,
>>> Where did you get that information?
> I inferred it from my single encounter.
>>> I checked <http://www.iana.org/assignments/character-sets>
>>> and found that only MS936 is listed as an alias of GBK.
> That makes two... or 1.5.

And,
iconv: MS932, MS936, MSCP949, MSCP1361
python: ms932, ms936, ms949, ms950, ms1361

hmmm...

I've just installed the attached change to accept msXXX.
canonicalize-coding-system-name is used by
coding-system-from-name, and, at least, rmail uses it.

If you are using rmail, please try the latest code, or the
attached patch.

---
Kenichi Handa
handa <at> m17n.org

=== modified file 'lisp/international/mule-cmds.el'
--- lisp/international/mule-cmds.el	2010-01-13 08:35:10 +0000
+++ lisp/international/mule-cmds.el	2010-01-15 12:33:24 +0000
@@ -226,19 +226,22 @@
 ;; and delimiter characters.  Support function of
 ;; coding-system-from-name.
 (defun canonicalize-coding-system-name (name)
-  (if (string-match "^iso[-_ ]?[0-9]" name)
-      ;; "iso-8859-1" -> "8859-1", "iso-2022-jp" ->"2022-jp"
-      (setq name (substring name (1- (match-end 0)))))
-  (let ((idx (string-match "[-_ /]" name)))
-    ;; Delete "-", "_", " ", "/" but do distinguish "16-be" and "16be".
-    (while idx
-      (if (and (>= idx 2)
-	       (eq (string-match "16-[lb]e$" name (- idx 2))
-		   (- idx 2)))
-	  (setq idx (string-match "[-_ /]" name (match-end 0)))
-	(setq name (concat (substring name 0 idx) (substring name (1+ idx)))
-	      idx (string-match "[-_ /]" name idx))))
-    name))
+  (if (string-match "^\\(ms\\|ibm\\|windows-\\)\\([0-9]+\\)$" name)
+      ;; "ms950", "ibm950", "windows-950" -> "cp950"
+      (concat "cp" (match-string 2 name))
+    (if (string-match "^iso[-_ ]?[0-9]" name)
+	;; "iso-8859-1" -> "8859-1", "iso-2022-jp" ->"2022-jp"
+	(setq name (substring name (1- (match-end 0)))))
+    (let ((idx (string-match "[-_ /]" name)))
+      ;; Delete "-", "_", " ", "/" but do distinguish "16-be" and "16be".
+      (while idx
+	(if (and (>= idx 2)
+		 (eq (string-match "16-[lb]e$" name (- idx 2))
+		     (- idx 2)))
+	    (setq idx (string-match "[-_ /]" name (match-end 0)))
+	  (setq name (concat (substring name 0 idx) (substring name (1+ idx)))
+		idx (string-match "[-_ /]" name idx))))
+      name)))
 
 (defun coding-system-from-name (name)
   "Return a coding system whose name matches with NAME (string or symbol)."

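The new first branch of the patch can be tried on its own. What follows
is a minimal sketch, not the installed code: the helper name
my-msxxx-to-cpxxx is hypothetical, and it downcases NAME itself (an
assumption made here so the result does not depend on case-fold-search).

```elisp
;; Hypothetical helper, not part of Emacs.  It mirrors only the new
;; first branch of the patched `canonicalize-coding-system-name'.
(defun my-msxxx-to-cpxxx (name)
  "Return \"cpNNN\" if NAME is msNNN, ibmNNN or windows-NNN, else nil."
  (let ((name (downcase name)))
    ;; Same alternation as the patch; \\` and \\' anchor the whole string.
    (when (string-match "\\`\\(ms\\|ibm\\|windows-\\)\\([0-9]+\\)\\'" name)
      (concat "cp" (match-string 2 name)))))

(my-msxxx-to-cpxxx "MS950")         ; => "cp950"
(my-msxxx-to-cpxxx "windows-1252")  ; => "cp1252"
(my-msxxx-to-cpxxx "iso-8859-1")    ; => nil, left to the older iso/delimiter code
```

Names that do not match fall through to the existing iso and delimiter
handling in the real function, which this sketch leaves out.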




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Sat, 16 Jan 2010 01:16:01 GMT)

Message #17 received at 5387 <at> debbugs.gnu.org:

From: jidanni <at> jidanni.org
To: handa <at> m17n.org
Cc: 5387 <at> debbugs.gnu.org, ding <at> gnus.org
Subject: Re: bug#5387: MS950 alias for CP950 charset
Date: Sat, 16 Jan 2010 09:15:32 +0800
>>>>> "K" == Kenichi Handa <handa <at> m17n.org> writes:
K> I've just installed the attached change to accept msXXX.
K> canonicalize-coding-system-name is used by
K> coding-system-from-name, and, at least, rmail uses it.
I'll CC the gnus people to make sure they will use it too.




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Sat, 16 Jan 2010 12:25:01 GMT)

Message #20 received at 5387 <at> debbugs.gnu.org:

From: Reiner Steib <reinersteib+gmane <at> imap.cc>
To: jidanni <at> jidanni.org, handa <at> m17n.org
Cc: 5387 <at> debbugs.gnu.org, ding <at> gnus.org
Subject: Re: bug#5387: MS950 alias for CP950 charset
Date: Sat, 16 Jan 2010 13:09:23 +0100
On Sat, Jan 16 2010, jidanni <at> jidanni.org wrote:

>>>>>> "K" == Kenichi Handa <handa <at> m17n.org> writes:
> K> I've just installed the attached change to accept msXXX.

Please add "(Bug#5387)" to the ChangeLog entry.

> K> canonicalize-coding-system-name is used by
> K> coding-system-from-name, and, at least, rmail uses it.
> I'll CC the gnus people to make sure they will use it too.

Gnus should use all coding-systems / charsets provided by Emacs.  No
change in Gnus required.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#5387; Package emacs. (Mon, 18 Jan 2010 01:15:02 GMT)

Message #23 received at 5387 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> m17n.org>
To: Reiner Steib <Reiner.Steib <at> gmx.de>
Cc: 5387 <at> debbugs.gnu.org, ding <at> gnus.org, jidanni <at> jidanni.org
Subject: Re: bug#5387: MS950 alias for CP950 charset
Date: Mon, 18 Jan 2010 10:14:50 +0900
In article <871vhqw8ss.fsf <at> marauder.physik.uni-ulm.de>, Reiner Steib <reinersteib+gmane <at> imap.cc> writes:

> On Sat, Jan 16 2010, jidanni <at> jidanni.org wrote:
>>>>>>> "K" == Kenichi Handa <handa <at> m17n.org> writes:
> > K> I've just installed the attached change to accept msXXX.

> Please add "(Bug#5387)" to the ChangeLog entry.

Ah, ok, just done.

> > K> canonicalize-coding-system-name is used by
> > K> coding-system-from-name, and, at least, rmail uses it.
> > I'll CC the gnus people to make sure they will use it too.

> Gnus should use all coding-systems / charsets provided by Emacs.  No
> change in Gnus required.

But MS950 is still not a coding-system in Emacs.
coding-system-from-name is a function to guess a coding
system from the given name.

(coding-system-from-name "MS950")
cp950
(coding-system-p 'MS950)
nil
(coding-system-p 'CP950)
nil
(coding-system-from-name "CP950")
cp950
(coding-system-p 'cp950)
t

---
Kenichi Handa
handa <at> m17n.org
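
The session above shows that the patch only teaches
coding-system-from-name to guess cp950; the symbol ms950 is still not a
coding system. One possible local workaround, sketched here as an
assumption and not something the patch or Gnus does, is to register a
real alias with define-coding-system-alias. Coding-system names are
case-sensitive symbols, so the sketch uses the lowercase name.

```elisp
;; Local workaround sketch, not part of the installed change.
;; Assumes cp950 is defined in the running Emacs, as the session
;; above indicates.
(when (and (coding-system-p 'cp950)
           (not (coding-system-p 'ms950)))
  ;; Arguments are ALIAS, then the existing CODING-SYSTEM.
  (define-coding-system-alias 'ms950 'cp950))

;; With the alias in place, the symbol itself should be accepted.
(coding-system-p 'ms950)
```

Code that checks a charset name with coding-system-p directly, without
going through coding-system-from-name, would then accept ms950 as well.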




Merged 5387 5647. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 25 Feb 2010 17:44:02 GMT)

bug reassigned from package 'emacs' to 'emacs,gnus'. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Tue, 02 Mar 2010 19:47:02 GMT)

Merged 5387 5647 5681. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 04 Mar 2010 19:28:01 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 13 Apr 2010 11:24:04 GMT)

This bug report was last modified 15 years and 67 days ago.
