GNU bug report logs - #65997
29.1; ?\N{char_name} reference is wrong

Package: emacs

Reported by: awrhygty <at> outlook.com

Date: Fri, 15 Sep 2023 13:04:01 UTC

Severity: normal

Tags: fixed

Found in version 29.1

Fixed in version 29.2

Done: Robert Pluim <rpluim <at> gmail.com>

Bug is archived. No further changes may be made.

From: Robert Pluim <rpluim <at> gmail.com>
To: awrhygty <at> outlook.com
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 65997 <at> debbugs.gnu.org
Subject: bug#65997: 29.1; ?\N{char_name} reference is wrong
Date: Fri, 15 Sep 2023 17:33:41 +0200

>>>>> On Fri, 15 Sep 2023 22:02:37 +0900, awrhygty <at> outlook.com said:

    awrhygty> S-exps in the form of ?\N{char_name} return wrong values for some
    awrhygty> characters.
    awrhygty> The S-exp below inserts a whole list of such characters.

    awrhygty> (dotimes (u (1+ (max-char 'ucs)))
    awrhygty>   (let* ((name (get-char-code-property u 'name)))
    awrhygty>     (when (and name (not (<= #xD800 u #xDFFF)))
    awrhygty>       (let ((u2 (condition-case err
    awrhygty>                     (read (format "?\\N{%s}" name))
    awrhygty>                   (error 0))))
    awrhygty>         (unless (eq u u2)
    awrhygty>           (insert (format "%X\t%s\t%X\t%s\n" u name u2
    awrhygty>                           (if (= 0 u2)
    awrhygty>                               "error"
    awrhygty>                             (get-char-code-property u2 'name)))))))))

For a minute there I thought our hash tables were broken :-). Stefan,
it only took 9 years, but this is no longer true:

lisp/international/mule-cmds.el:

	        ;; In theory this code could end up pushing an "old-name" that
	        ;; shadows a "new-name" but in practice every time an
	        ;; `old-name' conflicts with a `new-name', the newer one has a
	        ;; higher code, so it gets pushed later!

The patch below fixes that issue.

    awrhygty> output(TANGUT COMPONENTs are omitted):

I donʼt know why the ranges in `ucs-names' donʼt cover these
code-points. Itʼs easy enough to change them, but theyʼre
explicitly commented out.

    awrhygty> 16FE4	KHITAN SMALL SCRIPT FILLER	0	error
    awrhygty> 16FF0	VIETNAMESE ALTERNATE READING MARK CA	0	error
    awrhygty> 16FF1	VIETNAMESE ALTERNATE READING MARK NHAY	0	error
    awrhygty> 1B132	HIRAGANA LETTER SMALL KO	0	error

And similarly for these 4.

Robert
-- 

diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el
index c26898f7649..254ecae5bd5 100644
--- a/lisp/international/mule-cmds.el
+++ b/lisp/international/mule-cmds.el
@@ -3135,7 +3135,9 @@ ucs-names
 	        ;; `old-name' conflicts with a `new-name', the newer one has a
 	        ;; higher code, so it gets pushed later!
 	        (if new-name (puthash new-name c names))
-	        (if old-name (puthash old-name c names))
+                (when (and old-name
+                           (not (gethash old-name names)))
+                  (puthash old-name c names))
                 ;; Unicode uses the spelling "lamda" in character
                 ;; names, instead of "lambda", due to "preferences
                 ;; expressed by the Greek National Body" (Bug#30513).
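
To illustrate the shadowing that the patch above guards against, here
is a minimal sketch on toy data. The names `my-toy-entries' and
`my-build-name-table' and the "TOY LETTER" character names are
hypothetical and exist only for this example; the loop imitates the
two `puthash' calls in `ucs-names' but is not the real implementation.

```elisp
;; Hypothetical toy data, not real Unicode names: each entry is
;; (CODE NEW-NAME OLD-NAME).  The second entry's old name collides
;; with the first entry's current name.
(defvar my-toy-entries
  '((#x10 "TOY LETTER ALPHA" nil)
    (#x20 "TOY LETTER BETA" "TOY LETTER ALPHA")))

(defun my-build-name-table (entries guard)
  "Return a hash table mapping names to codes for ENTRIES.
If GUARD is non-nil, an old name is stored only when that name is
not already in the table, as in the patch; otherwise it is stored
unconditionally, as in the code before the patch."
  (let ((names (make-hash-table :test #'equal)))
    (dolist (entry entries)
      (let ((c (nth 0 entry))
            (new-name (nth 1 entry))
            (old-name (nth 2 entry)))
        (when new-name (puthash new-name c names))
        (when (and old-name
                   (or (not guard)
                       (not (gethash old-name names))))
          (puthash old-name c names))))
    names))

;; Unguarded: the old name overwrites the current name's entry.
(gethash "TOY LETTER ALPHA" (my-build-name-table my-toy-entries nil)) ; => 32 (#x20)
;; Guarded: the current name keeps its own code.
(gethash "TOY LETTER ALPHA" (my-build-name-table my-toy-entries t))   ; => 16 (#x10)
```

With the guard, a current name wins whichever entry comes first: an
old name seen later is skipped, and an old name seen earlier is
overwritten by the unconditional `puthash' of the current name.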




This bug report was last modified 1 year and 297 days ago.