#65997 - 29.1; ?\N{char_name} reference is wrong

GNU bug report logs - #65997
29.1; ?\N{char_name} reference is wrong

Package: emacs;

Reported by: awrhygty <at> outlook.com

Date: Fri, 15 Sep 2023 13:04:01 UTC

Severity: normal

Tags: fixed

Found in version 29.1

Fixed in version 29.2

Done: Robert Pluim <rpluim <at> gmail.com>

Bug is archived. No further changes may be made.

From: Robert Pluim <rpluim <at> gmail.com>
To: awrhygty <at> outlook.com
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 65997 <at> debbugs.gnu.org
Subject: bug#65997: 29.1; ?\N{char_name} reference is wrong
Date: Fri, 15 Sep 2023 17:33:41 +0200

>>>>> On Fri, 15 Sep 2023 22:02:37 +0900, awrhygty <at> outlook.com said:

    awrhygty> S-exps in the form of ?\N{char_name} return wrong values for some
    awrhygty> characters.
    awrhygty> The S-exp below inserts a whole list of such characters.

    awrhygty> (dotimes (u (1+ (max-char 'ucs)))
    awrhygty>   (let* ((name (get-char-code-property u 'name)))
    awrhygty>     (when (and name (not (<= #xD800 u #xDFFF)))
    awrhygty>       (let ((u2 (condition-case err
    awrhygty>                     (read (format "?\\N{%s}" name))
    awrhygty>                   (error 0))))
    awrhygty>         (unless (eq u u2)
    awrhygty>           (insert (format "%X\t%s\t%X\t%s\n" u name u2
    awrhygty>                           (if (= 0 u2)
    awrhygty>                               "error"
    awrhygty>                             (get-char-code-property u2 'name)))))))))

For a minute there I thought our hash tables were broken :-).

Stefan, it only took 9 years, but this is no longer true:

lisp/international/mule-cmds.el:
                  ;; In theory this code could end up pushing an "old-name" that
                  ;; shadows a "new-name" but in practice every time an
                  ;; `old-name' conflicts with a `new-name', the newer one has a
                  ;; higher code, so it gets pushed later!

The patch below fixes that issue.

    awrhygty> output(TANGUT COMPONENTs are omitted):

I donʼt know why the ranges in `ucs-names' donʼt cover these
code-points. Itʼs easy enough to change them, but theyʼre explicitly
commented out.

    awrhygty> 16FE4	KHITAN SMALL SCRIPT FILLER	0	error
    awrhygty> 16FF0	VIETNAMESE ALTERNATE READING MARK CA	0	error
    awrhygty> 16FF1	VIETNAMESE ALTERNATE READING MARK NHAY	0	error
    awrhygty> 1B132	HIRAGANA LETTER SMALL KO	0	error

And similarly for these 4.

Robert
-- 

diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el
index c26898f7649..254ecae5bd5 100644
--- a/lisp/international/mule-cmds.el
+++ b/lisp/international/mule-cmds.el
@@ -3135,7 +3135,9 @@ ucs-names
                   ;; `old-name' conflicts with a `new-name', the newer one has a
                   ;; higher code, so it gets pushed later!
                   (if new-name (puthash new-name c names))
-                  (if old-name (puthash old-name c names))
+                  (when (and old-name
+                             (not (gethash old-name names)))
+                    (puthash old-name c names))
                   ;; Unicode uses the spelling "lamda" in character
                   ;; names, instead of "lambda", due to "preferences
                   ;; expressed by the Greek National Body" (Bug#30513).
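
The shadowing that the patch guards against can be reproduced in isolation. The following is only a minimal sketch, not the code in `ucs-names': the function `demo-build-names' and the two sample entries are hypothetical and are not real Unicode data. It assumes entries are processed in ascending code order, as the quoted comment describes, and shows an old name on a higher code overwriting a new name on a lower code unless the `gethash' check is in place.

```elisp
;; Hypothetical sample data: (CODE NEW-NAME OLD-NAME) triples.
;; These are NOT real Unicode entries; the old name of #x20 is chosen
;; to collide with the new name of #x10.
(defvar demo-sample-chars
  '((#x10 "ALPHA" nil)
    (#x20 "BETA"  "ALPHA")))

(defun demo-build-names (entries guard)
  "Return a hash table mapping names to codes for ENTRIES.
ENTRIES is a list of (CODE NEW-NAME OLD-NAME), in ascending code order.
If GUARD is non-nil, an old name never replaces an existing entry,
mirroring the check added by the patch."
  (let ((names (make-hash-table :test #'equal)))
    (dolist (entry entries names)
      (let ((c (nth 0 entry))
            (new-name (nth 1 entry))
            (old-name (nth 2 entry)))
        (when new-name (puthash new-name c names))
        (when (and old-name
                   (or (not guard)
                       (not (gethash old-name names))))
          (puthash old-name c names))))))

;; Without the guard, the old name of #x20 shadows the new name of #x10.
(gethash "ALPHA" (demo-build-names demo-sample-chars nil)) ; => 32 (#x20)
;; With the guard, the new name keeps its own code.
(gethash "ALPHA" (demo-build-names demo-sample-chars t))   ; => 16 (#x10)
```

The guard only helps when the new name has already been stored by the time the conflicting old name is reached, which is the ordering this sketch assumes.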

This bug report was last modified 1 year and 297 days ago.
