GNU bug report logs - #56323
29.0.50; Add new customisable phonetic Tamil input method
Reported by: Visuwesh <visuweshm <at> gmail.com>
Date: Thu, 30 Jun 2022 12:14:02 UTC
Severity: wishlist
Tags: patch
Found in version 29.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
[Sat Jul 02, 2022] Eli Zaretskii wrote:
>> From: Visuwesh <visuweshm <at> gmail.com>
>> Cc: 56323 <at> debbugs.gnu.org
>> Date: Sat, 02 Jul 2022 13:41:17 +0530
>>
>> > (defun sort-by-codepoint (c1 c2)
>> >   (< (string-to-char c1) (string-to-char c2)))
>> >
>> > (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
>> >                          "ந" "ப" "ம" "ய" "ர" "ல"
>> >                          "வ" "ழ" "ள" "ற" "ன")))
>> >
>> >   (sort core-consonants 'sort-by-codepoint))
>> > => ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
>> >
>> > (To understand why, read the doc string of 'sort' carefully, where it
>> > explains what is expected from PREDICATE.)
>>
>> Unfortunately not, since it jumbles up the list. The desired outcome is
>> the same list.
>
> But we already established that you need to break the list in two, and
> always sort any member of one of the two sub-lists before any member
> of the other sub-list. I then suggested to use string-lessp _within_
> each sub-list, but you said it still yielded a wrong order for some
> reason.
>

Yes, I hope I made my point clear below.

> So when you now return to the issue of splitting the list in two, and
> show how sorting the full list doesn't work, you make a step back: we
> already established the list cannot be sorted as a single list.

I think I might not have made my point clear: the sort function above
sorts one of the sub-lists.

> The only remaining issue, AFAIU, is why string-lessp is not good
> enough for sorting within each sub-list.

It is not good enough within each sub-list for the same reason: the order
produced by string-lessp is not the same as the actual order.

I will try to explain the situation using the regular English alphabet
and the extra letter þ (which was used in place of "th", AFAIU).

The core English letters are a-z, and then there are some extra letters
like the þ above.  When we have a list containing _both_ a-z and þ, the
order produced by string-lessp is wrong.  To work around this issue, we
decided to break the list into two.  I think we were on the same page
up to here.

When I did as you suggested and broke the list into two -- a-z and þ --
and sorted the sub-list that contained only a-z with string-lessp, the
sorted sub-list was not in the right alphabetical order, i.e., instead of
"a b c d ..." it was "a c b d ..."

I hope the above makes the situation clear.
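
To make the point concrete, here is a minimal sketch of one way to get the
desired order: compare two letters by their position in a reference list
rather than by codepoint.  The names `my-tamil-consonant-order' and
`my-sort-by-reference' are hypothetical and are not taken from the patch;
the reference list is the consonant list quoted earlier in this thread, and
the predicate assumes both arguments are members of that list.

```elisp
(require 'seq)

;; Hypothetical reference list: the core consonants in the order
;; given earlier in this thread (assumed to be the desired order).
(defvar my-tamil-consonant-order
  '("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ப" "ம" "ய" "ர" "ல" "வ" "ழ" "ள" "ற" "ன")
  "Core Tamil consonants in the desired sort order.")

;; Hypothetical predicate: compares list positions, not codepoints.
;; Signals an error if either string is missing from the reference list.
(defun my-sort-by-reference (s1 s2)
  "Return non-nil if S1 comes before S2 in `my-tamil-consonant-order'."
  (< (seq-position my-tamil-consonant-order s1)
     (seq-position my-tamil-consonant-order s2)))

;; A fresh list is used because `sort' may modify its argument.
(sort (list "ன" "க" "ழ" "ந") #'my-sort-by-reference)
;; => ("க" "ந" "ழ" "ன")
```

Sorting the same four letters by codepoint would instead put "ன" before "ழ",
which is the kind of mismatch described above.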
This bug report was last modified 2 years and 312 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.