GNU bug report logs - #55370
[PATCH] Add support for the Syloti Nagri script


Package: emacs

Reported by: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>

Date: Wed, 11 May 2022 15:03:02 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 55370 <at> debbugs.gnu.org
Subject: bug#55370: [PATCH] Add support for the Syloti Nagri script
Date: Thu, 12 May 2022 22:20:15 +0530
Ah! I think I understand now.
Since Emacs had predefined rules for these characters, they were rendering
fine without my input, but once I included them in the range, the
previous rules were overwritten, so a new rule had to be defined for them.

For example, this is correct now, isn't it?

;; Syloti Nagri composition rules
(let ((consonant            "[\xA807-\xA80A\xA80C-\xA822]")
      (independent-vowel    "[\xA800\xA801\xA803-\xA805]")
      (vowel                "[\xA802\xA823-\xA827]")
      (nasal                "[\xA80B]")
      (virama               "[\xA806\xA82C]"))
  (set-char-table-range composition-function-table
                        '(#xA802 . #xA82C)
                        (list (vector
                               ;; Consonant conjunct based syllables
                               (concat consonant "\\(?:" virama consonant "\\)+"
                                       vowel "?" nasal "?")
                               1 'font-shape-gstring)
                              (vector
                               ;; Vowels based syllables
                               (concat independent-vowel consonant "?" virama "?"
                                       vowel "?" nasal "?")
                               1 'font-shape-gstring))))

Here I have included the nasal sign, virama and vowel signs in the range.
I have also added a rule for independent vowels followed by consonants, virama,
vowel signs and nasal signs, so that Emacs does not hang when they appear
together.
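
One way to sanity-check regexps like these before installing them is to run
them against short sample strings with `string-match-p`. The following is only
a sketch, not part of the patch: the names `my/syloti-conjunct-rule`,
`my/syloti-vowel-rule` and `my/whole-match-p` are hypothetical, and it
exercises only the regexps, not the composition or font-shaping machinery.

```elisp
;; Sketch only: every `my/...' name is hypothetical and not part of the patch.
;; The character classes are copied from the rules above.
(defconst my/syloti-conjunct-rule
  (let ((consonant "[\xA807-\xA80A\xA80C-\xA822]")
        (vowel     "[\xA802\xA823-\xA827]")
        (nasal     "[\xA80B]")
        (virama    "[\xA806\xA82C]"))
    (concat consonant "\\(?:" virama consonant "\\)+" vowel "?" nasal "?"))
  "Regexp of the consonant-conjunct rule.")

(defconst my/syloti-vowel-rule
  (let ((consonant         "[\xA807-\xA80A\xA80C-\xA822]")
        (independent-vowel "[\xA800\xA801\xA803-\xA805]")
        (vowel             "[\xA802\xA823-\xA827]")
        (nasal             "[\xA80B]")
        (virama            "[\xA806\xA82C]"))
    (concat independent-vowel consonant "?" virama "?" vowel "?" nasal "?"))
  "Regexp of the independent-vowel rule.")

(defun my/whole-match-p (regexp string)
  "Return t if REGEXP matches all of STRING, nil otherwise."
  (and (string-match-p (concat "\\`\\(?:" regexp "\\)\\'") string) t))

(list
 ;; Independent vowel followed by the nasal sign.
 (my/whole-match-p my/syloti-vowel-rule "\xA800\xA80B")
 ;; Consonant, virama, consonant.
 (my/whole-match-p my/syloti-conjunct-rule "\xA807\xA806\xA808")
 ;; Consonant followed by a vowel sign, with no virama.
 (my/whole-match-p my/syloti-conjunct-rule "\xA807\xA823"))
```

The first two checks return t. The third returns nil, because the conjunct
regexp requires at least one virama-consonant pair after the first consonant.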

On Thu, May 12, 2022 at 9:59 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
> > Date: Thu, 12 May 2022 20:36:49 +0530
> > Cc: 55370 <at> debbugs.gnu.org
> >
> > For example in tirhuta, when I do this:
> >
> > ;; Tirhuta composition rules
> > (let ((consonant            "[\x1148F-\x114AF]")
> >       (nukta                "\x114C3")
> >       (independent-vowel    "[\x11481-\x1148E]")
> >       (vowel                "[\x114B0-\x114BE]")
> >       (nasal                "[\x114BF\x114C0]")
> >       (virama               "\x114C2"))
> >   (set-char-table-range composition-function-table
> >                         '(#x114B0 . #x114BE)
> >                         (list (vector
> >                                ;; Consonant based syllables
> >                                (concat consonant nukta "?\\(?:"
> >                                        virama consonant nukta "?\\)*\\(?:"
> >                                        virama "\\|" vowel "*" nukta "?"
> >                                        nasal "?\\)")
> >                                1 'font-shape-gstring))))
> >
> > Notice that here the nasal signs are not included in the range.
> > Then I type: 𑒅𑓀 𑒆𑒿
> > It is rendered correctly.
>
> It is rendered correctly because your rule isn't used.
>
> The rule
>
>                         '(#x114B0 . #x114BE)
>                         (list (vector
>                                ;; Consonant based syllables
>                                (concat consonant nukta "?\\(?:"
>                                        virama consonant nukta "?\\)*\\(?:"
>                                        virama "\\|" vowel "*" nukta "?"
>                                        nasal "?\\)")
>                                1 'font-shape-gstring))))
>
> says this:
>
>   . find a character C between #x114B0 and #x114BE
>   . see if the characters starting one character before C match the
>     above regexp
>   . if they match, compose them
>
> But your text doesn't include any characters in the range
> [\x114B0-\x114BE], so the above rule will never match anything, and
> will not cause any composition.
>
> You see the characters composed because the second character in each
> pair, #x114C0 and #x114BF, is a combining accent, and for those we have
> a catch-all rule in composite.el:
>
>   (when unicode-category-table
>     (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
>                  [nil 0 compose-gstring-for-graphic])))
>       (map-char-table
>        #'(lambda (key val)
>            (if (memq val '(Mn Mc Me))
>                (set-char-table-range composition-function-table key elt)))
>        unicode-category-table))
>
>
> > But when I do:
> >
> > ;; Tirhuta composition rules
> > (let ((consonant            "[\x1148F-\x114AF]")
> >       (nukta                "\x114C3")
> >       (independent-vowel    "[\x11481-\x1148E]")
> >       (vowel                "[\x114B0-\x114BE]")
> >       (nasal                "[\x114BF\x114C0]")
> >       (virama               "\x114C2"))
> >   (set-char-table-range composition-function-table
> >                         '(#x114B0 . #x114C0)
> >                         (list (vector
> >                                ;; Consonant based syllables
> >                                (concat consonant nukta "?\\(?:"
> >                                        virama consonant nukta "?\\)*\\(?:"
> >                                        virama "\\|" vowel "*" nukta "?"
> >                                        nasal "?\\)")
> >                                1 'font-shape-gstring))))
> > The range now includes the nasal signs.
> > Then I type the same characters: 𑒅𑓀 𑒆𑒿
> > They are not rendered correctly.
>
> In this case, the characters that trigger examination of the
> composition rules, #x114C0 and #x114BF, _are_ in the range
> '(#x114B0 . #x114C0).  However, the preceding characters, #x11484 and
> #x11486, are independent vowels, and the regexp contains no
> independent-vowel.  So again, the rules will never match.  Except that
> now you have also replaced the default rule we have for the combining
> accents, so what worked before no longer does.
>
> > But when I include their composition rules:
> >
> > ;; Tirhuta composition rules
> > (let ((consonant            "[\x1148F-\x114AF]")
> >       (nukta                "\x114C3")
> >       (independent-vowel    "[\x11481-\x1148E]")
> >       (vowel                "[\x114B0-\x114BE]")
> >       (nasal                "[\x114BF\x114C0]")
> >       (virama               "\x114C2"))
> >   (set-char-table-range composition-function-table
> >                         '(#x114B0 . #x114C0)
> >                         (list (vector
> >                                ;; Consonant based syllables
> >                                (concat consonant nukta "?\\(?:"
> >                                        virama consonant nukta "?\\)*\\(?:"
> >                                        virama "\\|" vowel "*" nukta "?"
> >                                        nasal "?\\)")
> >                                1 'font-shape-gstring)
> >                               (vector
> >                                ;; Nasal vowels
> >                                (concat independent-vowel nasal "?")
> >                                1 'font-shape-gstring))))
> >
> > They are now once more rendered correctly.
>
> As expected, see above: now you do have a regexp that can match, it's
> this one:
>
>     (concat independent-vowel nasal "?")
>
> I hope you now understand how to fix the rules.  If not, please ask
> more questions and show more examples.
>
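
The lookup described in the quoted message (find the rules attached to a
character, then try each regexp starting PREV-CHARS characters earlier) can be
modelled roughly in Lisp. This is only a sketch under assumptions: the real
lookup is done inside Emacs itself and handles more cases,
`my/matching-rules` is a hypothetical helper, and each table entry is assumed
to be a list of `[PATTERN PREV-CHARS FUNC]` vectors.

```elisp
;; Sketch only: `my/matching-rules' is a hypothetical helper, a simplified
;; model of the lookup described above, not the real implementation.
(defun my/matching-rules (string pos &optional rules)
  "Return the rules for the character at POS in STRING whose pattern matches.
RULES defaults to the entry for that character in
`composition-function-table'; each rule is assumed to be a vector
[PATTERN PREV-CHARS FUNC]."
  (let ((rules (or rules
                   (char-table-range composition-function-table
                                     (aref string pos))))
        (result nil))
    (dolist (rule rules (nreverse result))
      (let* ((pattern (aref rule 0))
             (start (- pos (aref rule 1))))
        ;; The match must begin exactly PREV-CHARS characters before POS.
        (when (and (>= start 0)
                   (or (null pattern)
                       (eql (string-match-p pattern string start) start)))
          (push rule result))))))

(let* ((consonant         "[\x1148F-\x114AF]")
       (nukta             "\x114C3")
       (independent-vowel "[\x11481-\x1148E]")
       (vowel             "[\x114B0-\x114BE]")
       (nasal             "[\x114BF\x114C0]")
       (virama            "\x114C2")
       (consonant-rule
        (vector (concat consonant nukta "?\\(?:" virama consonant nukta
                        "?\\)*\\(?:" virama "\\|" vowel "*" nukta "?"
                        nasal "?\\)")
                1 'font-shape-gstring))
       (nasal-vowel-rule
        (vector (concat independent-vowel nasal "?") 1 'font-shape-gstring))
       ;; An independent vowel (#x11486) followed by a nasal sign (#x114BF).
       (text "\x11486\x114BF"))
  ;; Count the rules that match when the nasal sign at index 1 is examined.
  (list (length (my/matching-rules text 1 (list consonant-rule)))
        (length (my/matching-rules text 1 (list consonant-rule
                                                nasal-vowel-rule)))))
```

With only the consonant rule, nothing matches; adding the nasal-vowel rule
gives one match, in line with the three Tirhuta experiments quoted above.
Calling `(my/matching-rules text 1)` without the third argument consults
whatever is currently installed in `composition-function-table`.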

This bug report was last modified 3 years and 8 days ago.


