GNU bug report logs - #23814
24.5; bug of hz coding-system


Package: emacs

Reported by: ynyaaa <at> gmail.com

Date: Tue, 21 Jun 2016 12:23:02 UTC

Severity: normal

Found in version 24.5

Fixed in version 26.1

Done: Glenn Morris <rgm <at> gnu.org>

Bug is archived. No further changes may be made.



Message #35 received at 23814 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: ynyaaa <at> gmail.com
Cc: eliz <at> gnu.org, 23814 <at> debbugs.gnu.org
Subject: Re: bug#23814: 24.5; bug of hz coding-system
Date: Wed, 27 Jul 2016 00:09:24 +0900

In article <87twffigzv.fsf <at> gmail.com>, ynyaaa <at> gmail.com writes:

> But I found other bugs in the decoding of the "~" escape.
> "~~" and "~{!!~}" should be encoded and decoded as below.
>     "~~" -> "~~~~" -> "~~"
>     "~{!!~}" -> "~~{!!~~}" -> "~{!!~}"

> In reality they are encoded properly, but decoded the wrong way.
>     (decode-coding-string (encode-coding-string "~~" 'hz) 'hz)
>>> "~"
>     (decode-coding-string (encode-coding-string "~{!!~}" 'hz) 'hz)
>>> #("\x3000" 0 1 (charset chinese-gb2312))

Thank you for finding those bugs.  Could you please try the attached
patch instead?
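
For anyone testing the patch, the two cases quoted above can be checked with a small round-trip helper. This is only a sketch: `my-hz-round-trip-p` is a hypothetical name, not part of Emacs or of the patch, and it relies only on `encode-coding-string`, `decode-coding-string` and the `hz` coding system already used in the report.

```elisp
;; Hypothetical helper (not part of Emacs): check whether a string
;; survives an encode/decode round trip through the `hz' coding system.
(defun my-hz-round-trip-p (string)
  "Return non-nil if STRING is unchanged after HZ encoding and decoding."
  (let* ((encoded (encode-coding-string string 'hz))
         (decoded (decode-coding-string encoded 'hz)))
    ;; `string=' compares characters only and ignores text properties
    ;; such as the `charset' property seen in the report.
    (string= string decoded)))

;; Report the result for the two strings from the bug report.
(dolist (s '("~~" "~{!!~}"))
  (message "%S: %s" s (if (my-hz-round-trip-p s) "ok" "MISMATCH")))
```

On an unpatched 24.5 both strings should be reported as mismatches, matching the output quoted above; with a working decoder both should come back unchanged.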

---
K. Handa
handa <at> gnu.org

diff --git a/lisp/language/china-util.el b/lisp/language/china-util.el
index e531640..9abdae1 100644
--- a/lisp/language/china-util.el
+++ b/lisp/language/china-util.el
@@ -95,7 +95,12 @@ decode-hz-region
 	(goto-char (point-min))
 	(while (search-forward "~" nil t)
 	  (setq ch (following-char))
-	  (if (or (= ch ?\n) (= ch ?~)) (delete-char -1)))
+          (if (= ch ?{)
+              (search-forward "~}" nil 'move)
+            (when (or (= ch ?\n) (= ch ?~))
+              (delete-char -1)
+              (put-text-property (point) (1+ (point)) 'hz-decoded t)
+              (forward-char 1))))
 
 	;; "^zW...\n" -> Chinese GB2312
 	;; "~{...~}"  -> Chinese GB2312
@@ -104,6 +109,8 @@ decode-hz-region
 	(while (re-search-forward hz/zw-start-gb nil t)
 	  (setq pos (match-beginning 0)
 		ch (char-after pos))
+          (if (and (= ch ?~) (get-text-property pos 'hz-decoded))
+              (forward-char 1)
 	  ;; Record the first position to start conversion.
 	  (or beg (setq beg pos))
 	  (end-of-line)
@@ -122,9 +129,10 @@ decode-hz-region
 				  t)
 		  (delete-char -2))
 	      (setq end (point))
-	      (translate-region pos (point) hz-set-msb-table))))
+	      (translate-region pos (point) hz-set-msb-table)))))
 	(if beg
 	    (decode-coding-region beg end 'euc-china)))
+      (remove-text-properties (point-min) (point-max) '(hz-decoded nil))
       (- (point-max) (point-min)))))
 
 ;;;###autoload
@@ -142,6 +150,7 @@ encode-hz-region
     (save-restriction
       (narrow-to-region beg end)
 
+      (put-text-property beg end 'charset 'chinese-gb2312)
       ;; "~" -> "~~"
       (goto-char (point-min))
       (while (search-forward "~" nil t)	(insert ?~))
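
The first hunk of the patch changes how the "~" escapes are undone: it skips over "~{...~}" runs and steps past each character it has just unescaped, so that character is not read as the start of another escape. The sketch below isolates that loop on a plain string, assuming the same behaviour as the patched code. `my-hz-unescape-tildes` is a hypothetical name, and the `hz-decoded` text property that the real patch uses to protect an unescaped "~{" from the later GB2312 pass is deliberately left out.

```elisp
;; Hypothetical stand-alone version of the first pass in the patched
;; `decode-hz-region'; it is not the function from china-util.el.
(defun my-hz-unescape-tildes (string)
  "Return STRING with \"~~\" turned into \"~\" and \"~\\n\" into \"\\n\".
Text between \"~{\" and \"~}\" is left untouched."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (while (search-forward "~" nil t)
      (let ((ch (following-char)))
        (cond ((= ch ?{)
               ;; Start of a GB2312 run: skip to just after its "~}",
               ;; or to the end of the buffer if it is unterminated.
               (search-forward "~}" nil 'move))
              ((or (= ch ?\n) (= ch ?~))
               ;; Delete the escaping "~", then step over the escaped
               ;; character so it is not treated as a new escape.
               (delete-char -1)
               (forward-char 1)))))
    (buffer-string)))

(my-hz-unescape-tildes "~~~~")      ; => "~~"
(my-hz-unescape-tildes "~~{!!~~}")  ; => "~{!!~}"
```

Without the `forward-char` step the loop would look at the surviving "~" again and collapse "~~~~" down to a single "~", which is the first symptom reported above.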




This bug report was last modified 8 years and 85 days ago.


