Package: emacs
Reported by: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
Date: Wed, 11 May 2022 15:03:02 UTC
Severity: normal
Tags: patch
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
From: Eli Zaretskii <eliz <at> gnu.org>
To: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
Cc: 55370 <at> debbugs.gnu.org
Subject: bug#55370: [PATCH] Add support for the Syloti Nagri script
Date: Thu, 12 May 2022 19:29:23 +0300

> From: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
> Date: Thu, 12 May 2022 20:36:49 +0530
> Cc: 55370 <at> debbugs.gnu.org
>
> For example in tirhuta, when I do this:
>
> ;; Tirhuta composition rules
> (let ((consonant "[\x1148F-\x114AF]")
>       (nukta "\x114C3")
>       (independent-vowel "[\x11481-\x1148E]")
>       (vowel "[\x114B0-\x114BE]")
>       (nasal "[\x114BF\x114C0]")
>       (virama "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114BE)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:" virama
>                                        consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
>
> Notice here, the nasal sign is not included in the range.
> And then I type: 𑒅𑓀 𑒆𑒿
> It is rendered correctly

It is rendered correctly because your rule isn't used. The rule

                        '(#x114B0 . #x114BE)
                        (list (vector
                               ;; Consonant based syllables
                               (concat consonant nukta "?\\(?:" virama
                                       consonant nukta "?\\)*\\(?:"
                                       virama "\\|" vowel "*" nukta "?"
                                       nasal "?\\)")
                               1 'font-shape-gstring))))

says this:

  . find a character C between #x114B0 and #x114BE
  . see if the characters starting one character before C match the
    above regexp
  . if they match, compose them

But your text doesn't include any characters in the range
[\x114B0-\x114BE], so the above rule will never match anything, and
will not cause any composition.

You see the characters composed because the second character in each
pair, #x114C0 and #x114BF, is a combining accent, and for those we have
a catch-all rule in composite.el:

  (when unicode-category-table
    (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
                 [nil 0 compose-gstring-for-graphic])))
      (map-char-table
       #'(lambda (key val)
           (if (memq val '(Mn Mc Me))
               (set-char-table-range composition-function-table key elt)))
       unicode-category-table))

> But when I do:
>
> ;; Tirhuta composition rules
> (let ((consonant "[\x1148F-\x114AF]")
>       (nukta "\x114C3")
>       (independent-vowel "[\x11481-\x1148E]")
>       (vowel "[\x114B0-\x114BE]")
>       (nasal "[\x114BF\x114C0]")
>       (virama "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114C0)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:" virama
>                                        consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
>
> The range now has the nasal signs.
> And then type the above characters: 𑒅𑓀 𑒆𑒿
> They are not rendered correctly

In this case, the characters that trigger examination of the
composition rules, #x114C0 and #x114BF, _are_ in the range
'(#x114B0 . #x114C0). However, the preceding characters, #x11484 and
#x11486, are independent-vowel's, and there is no independent-vowel in
the regexp. So again, the rules will never match. Except that now you
also replaced the default rule we have for the combining accents, so
what worked before no longer does.

> But when I include their composition rules:
>
> ;; Tirhuta composition rules
> (let ((consonant "[\x1148F-\x114AF]")
>       (nukta "\x114C3")
>       (independent-vowel "[\x11481-\x1148E]")
>       (vowel "[\x114B0-\x114BE]")
>       (nasal "[\x114BF\x114C0]")
>       (virama "\x114C2"))
>   (set-char-table-range composition-function-table
>                         '(#x114B0 . #x114C0)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:" virama
>                                        consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring)
>                               (vector
>                                ;; Nasal vowels
>                                (concat independent-vowel nasal "?")
>                                1 'font-shape-gstring))))
>
> They are now once more rendered correctly.

As expected, see above: now you do have a regexp that can match, it's
this one:

  (concat independent-vowel nasal "?")

I hope you now understand how to fix the rules. If not, please ask
more questions and show more examples.
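
The lookup described in this message can be approximated without any font or display machinery, which may help when debugging rules like these. The sketch below is only an illustration, not how Emacs implements composition. The function name `my/tirhuta-check` is hypothetical. Anchoring the regexp at the start of a two-character string with "\\`" is an assumed stand-in for "match starting one character before the trigger character". The character classes and the two regexps are copied from the quoted rules.

```emacs-lisp
;; Illustrative sketch only; `my/tirhuta-check' is a hypothetical helper.
(defun my/tirhuta-check (text)
  "Return (IN-NARROW IN-WIDE CONSONANT-MATCH NASAL-VOWEL-MATCH) for TEXT.
TEXT is a two-character string whose second character is the one
that would trigger the lookup in `composition-function-table'."
  (let* ((consonant "[\x1148F-\x114AF]")
         (nukta "\x114C3")
         (independent-vowel "[\x11481-\x1148E]")
         (vowel "[\x114B0-\x114BE]")
         (nasal "[\x114BF\x114C0]")
         (virama "\x114C2")
         ;; The two regexps from the quoted rules, unchanged.
         (consonant-rule
          (concat consonant nukta "?\\(?:" virama
                  consonant nukta "?\\)*\\(?:"
                  virama "\\|" vowel "*" nukta "?"
                  nasal "?\\)"))
         (nasal-vowel-rule (concat independent-vowel nasal "?"))
         (trigger (aref text 1)))
    (list
     ;; Is the trigger inside the first range, '(#x114B0 . #x114BE)?
     (<= #x114B0 trigger #x114BE)
     ;; Is it inside the widened range, '(#x114B0 . #x114C0)?
     (<= #x114B0 trigger #x114C0)
     ;; Does each regexp match starting one character before the trigger?
     (and (string-match-p (concat "\\`" consonant-rule) text) t)
     (and (string-match-p (concat "\\`" nasal-vowel-rule) text) t))))

(my/tirhuta-check "𑒅𑓀")                     ; => (nil t nil t)
(my/tirhuta-check "𑒆𑒿")                     ; => (nil t nil t)
(my/tirhuta-check (string #x1148F #x114B0)) ; => (t t t nil)
```

For both sample strings from the thread, the trigger falls only in the widened range, and only the nasal-vowel regexp matches. That agrees with the three observations above: no rule applies with the narrow range (so the catch-all rule composes the text), the consonant rule alone fails with the wide range, and adding the nasal-vowel rule makes the match succeed. The third call uses a consonant followed by a vowel sign to show a case where the consonant rule does match.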