Package: emacs
Reported by: Visuwesh <visuweshm <at> gmail.com>
Date: Thu, 30 Jun 2022 12:14:02 UTC
Severity: wishlist
Tags: patch
Found in version 29.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
From: Visuwesh <visuweshm <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 56323 <at> debbugs.gnu.org
Subject: bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
Date: Fri, 01 Jul 2022 22:07:38 +0530

[Friday July 01, 2022] Eli Zaretskii wrote:

>> I mostly meant to ask if the weighted approach was good but I wasn't
>> clear enough, sorry.  Let me try to explain it better:
>>
>> Let's suppose that string-lessp does not work for English for the
>> discussion here.  The task is to sort a list of jumbled English
>> alphabets in alphabetical order.  What I'm currently doing is creating
>> an alist where the key is the alphabet and the value is the alphabet's
>> order (so a will be 1, b will be 2, etc.).  Then in the sort function, I
>> look for this order.  If the alphabet is not in this list, then I fall
>> back to a large number.
>>
>> So the code above would look like this if it were in English,
>>
>>     (sort '("b" "z" "c" "n" "a" "aa" "p")
>>           (lambda (x y)
>>             (let ((cp '(("a" . 0) ("b" . 1) ("c" . 2) ("d" . 3) ("e" . 4)
>>                         ("f" . 5) ("g" . 6) ("h" . 7) ("i" . 8) ("j" . 9)
>>                         ("k" . 10) ("l" . 11) ("m" . 12) ("n" . 13) ("o" . 14)
>>                         ("p" . 15) ("q" . 16) ("r" . 17) ("s" . 18) ("t" . 19)
>>                         ("u" . 20) ("v" . 21) ("w" . 22) ("x" . 23) ("y" . 24)
>>                         ("z" . 25))))
>>               (< (or (assoc-default x cp) 10000)
>>                  (or (assoc-default y cp) 10000)))))
>>
>> and the sorted list comes out as ("a" "b" "c" "n" "p" "z" "aa")
>> which is exactly what I desire.  I hope this is clear enough.
>
> The above just gives each letter its order in the alphabet.  But if
> that is what you wanted, string-lessp (or even just direct comparison
> of characters) would have worked for you.  So there's still something
> important missing from your description, I think.
>

Unfortunately, string-lessp does not do the job.  (string-lessp "ஞ" "ஜ")
should return t but it returns nil, probably because ஞ's codepoint is
2974 and ஜ's codepoint is 2972.  But ஜ is not even part of the "core"
Tamil characters and hence should come last.  This is why I went with
defining an alist with the _actual_ order of the characters.

To demonstrate this using English, it would be as if c's codepoint were
29 and d's codepoint were 27.  Clearly, c comes before d, but since
string-lessp seems to rely on the Unicode codepoint, sorting with
string-lessp gives "... d c ..." in the list instead of the desired
"... c d ...".  I hope this is clear.

>> Yep, it is misalignment.  I could try to use those pixel-resolution
>> alignment features but I really don't think I can do a good enough job.
>> It is something I tried in the past but gave up since it was too complex
>> for me.  The current code produces a Good Enough™ table and I think I
>> will just leave it unless Someone™ complains since after all, the
>> current situation is much better than what we have in Emacs 28 (the
>> docfix that happened as part of bug#50143 isn't in Emacs 28).
>
> I thought vtable.el was about solving such problems?

Okay then, I will use that.  I was mostly unsure whether using vtable
would be alright, especially since it puts keymap properties and the
entire vtable object as a text property -- it seemed too excessive for a
docstring.  Maybe some of this can be addressed?

>> BTW, do you have any other code/documentation review?  And what about
>> the patch I posted in https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
>> No rush but I would like to know if it can go in since it only addresses
>> fallouts from the previous bug in this area.  Thanks.
>
> It sounded to me like you are still working on the code, so I didn't
> see a need to review it.  If you have specific parts that you'd like
> me to review nonetheless, please tell which parts are those.  Thanks.

The patch I posted in
https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html
is done, and can be pushed to master if you see no problems.  All it
does is address a few fallouts that were accidentally left out when
fixing bug#50143.  Specifically, it adds an entry for the TAMIL OM
character, and adds two more Sanskrit consonants to the Tamil itrans
table.
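
The weighted ordering described earlier in this message can be sketched for Tamil itself roughly as follows. This is only an illustration under stated assumptions: the names `my-tamil-sort-order` and `my-tamil-lessp` are hypothetical and not part of the patch, and the table lists just a handful of letters (a few core consonants followed by two Grantha letters) rather than the full alphabet.

```elisp
;; Hypothetical, partial rank table: core consonants first, then the
;; Grantha letters ஜ and ஷ.  A real table would list every letter.
(defvar my-tamil-sort-order
  (let ((rank 0) table)
    (dolist (c '("க" "ங" "ச" "ஞ" "ஜ" "ஷ") (nreverse table))
      (push (cons c rank) table)
      (setq rank (1+ rank))))
  "Alist mapping a Tamil letter (a string) to its rank.")

(defun my-tamil-lessp (x y)
  "Return non-nil if X sorts before Y per `my-tamil-sort-order'.
Strings missing from the table rank after every listed letter."
  (< (or (cdr (assoc x my-tamil-sort-order)) most-positive-fixnum)
     (or (cdr (assoc y my-tamil-sort-order)) most-positive-fixnum)))

;; Codepoint comparison puts ஜ (2972) before ஞ (2974):
(string-lessp "ஞ" "ஜ")                          ; => nil

;; The rank table restores the desired order:
(sort (list "ஜ" "ஞ" "க" "ச") #'my-tamil-lessp)  ; => ("க" "ச" "ஞ" "ஜ")
```

Falling back to `most-positive-fixnum` plays the same role as the 10000 in the English example: anything not in the table sorts after everything that is.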

Also, I would like to know if there's a better way to write the :set
function for the defcustoms tamil-vowel-translation,
tamil-consonant-translation, tamil-misc-translation and
tamil-native-digits without the boundp check chain below:

    (defun tamil--set-variable (sym val)
      (set-default sym val)
      (when (and (boundp 'tamil-vowel-translation)
                 (boundp 'tamil-consonant-translation)
                 (boundp 'tamil-misc-translation)
                 (boundp 'tamil-native-digits))
        (tamil--update-quail-rules)))

I'm also doubtful about the current group being used for these
defcustoms.  Should I go ahead and make a new 'tamil' group and make it
a subgroup of leim or i18n?  And is the prefix tamil- okay or should I
change it to something else?

Finally, I'm unsure if "List of input sequences to translate to ..." is
clear.  I think it sounds like a mouthful and there should be a better
way to put it.  I think "translation rules" is quite nice but I'm afraid
that it is too Quail specific and might not be well understood.
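
One possible way to avoid the boundp chain, offered only as a sketch and not as the approach the patch ended up using: give each option `:initialize #'custom-initialize-default`, which assigns the default value without calling the `:set` function, and rebuild the rules once after the last defcustom. Every `my-im-` name below is hypothetical, and `my-im-update-rules` merely stands in for `tamil--update-quail-rules`.

```elisp
(defgroup my-im nil
  "Hypothetical group used only by this sketch."
  :group 'i18n)

(defvar my-im-update-count 0
  "How many times the stand-in rule rebuild has run.")

(defun my-im-update-rules ()
  "Stand-in for the real rebuild, e.g. `tamil--update-quail-rules'."
  (setq my-im-update-count (1+ my-im-update-count)))

(defun my-im-set-variable (sym val)
  "Set SYM to VAL, then rebuild the rules.
No `boundp' checks are needed because this only runs after load."
  (set-default sym val)
  (my-im-update-rules))

(defcustom my-im-vowels '(("a" . "அ"))
  "Hypothetical vowel table."
  :type '(alist :key-type string :value-type string)
  :group 'my-im
  ;; Assign the default without calling the :set function.
  :initialize #'custom-initialize-default
  :set #'my-im-set-variable)

(defcustom my-im-native-digits nil
  "Hypothetical flag for native digits."
  :type 'boolean
  :group 'my-im
  :initialize #'custom-initialize-default
  :set #'my-im-set-variable)

;; Every option is defined at this point, so build the rules once.
(my-im-update-rules)
```

With this layout, a later `(customize-set-variable 'my-im-native-digits t)` goes through `my-im-set-variable` and triggers a single rebuild, while loading the file rebuilds exactly once regardless of how many options there are.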