Package: emacs
Reported by: Binbin YE <phantom2501 <at> gmail.com>
Date: Thu, 21 Aug 2025 15:16:02 UTC
Severity: normal
Tags: patch
Message #50 received at 79285 <at> debbugs.gnu.org:
From: Eli Zaretskii <eliz <at> gnu.org>
To: Binbin YE <phantom2501 <at> gmail.com>
Cc: 79285 <at> debbugs.gnu.org
Subject: Re: bug#79285: [Patch] support :font-features in face
Date: Mon, 08 Sep 2025 19:33:53 +0300

> From: Binbin YE <phantom2501 <at> gmail.com>
> Date: Tue, 9 Sep 2025 00:26:11 +0900
> Cc: 79285 <at> debbugs.gnu.org
>
> On Wed, Sep 3, 2025 at 9:10 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > That's the idea, yes.  It would mean to have a function in hbfont.c
> > that is the subset of hbfont_shape, and which accepts a single
> > character (not a Lisp string) and a font, and then constructs the
> > hb_buffer and submits that to hb_shape_full.
> >
> > But please test if this should give good results by simulating it, as
> > follows:
> >
> >   . make composition-function-table whose cells for several characters
> >     match only that one character, and see how a string of such
> >     characters is rendered using a font with relevant OpenType features
> >   . then compare that with rendering when composition-function-table
> >     has the same rule in the cell of each of those characters, matching
> >     any sequence of these characters (as in "[abcdefg]+")
> >
> > If applying stylistic sets by rendering text one character at a time
> > produces different results from rendering them all as a single string,
> > then this idea is not workable, and we will need to use the (slower
> > and more complex) composite.c machinery instead.
> >
> > If the idea does work, then presumably a change in
> > get_glyph_face_and_encoding for characters that have this special
> > face attribute will be all that's needed, perhaps together with some
> > flag in the 'struct it' to make that faster.  Details later, when we
> > know whether the idea works or not.
>
> I compared the result between
>
> #+begin_src elisp
> (set-char-table-range composition-function-table
>                       ?!
>                       '(["\\(!==\\)" 0 font-shape-gstring]))
> #+end_src
>
> and
>
> #+begin_src elisp
> (set-char-table-range composition-function-table
>                       ?!
>                       '(["\\(!\\)" 0 font-shape-gstring]))
>
> (set-char-table-range composition-function-table
>                       ?=
>                       '(["\\(=\\)" 0 font-shape-gstring]))
> #+end_src
>
> They are different, only matching the sequence produces the desired
> result for multi-character ligatures.
>
> I read the hbfont.c code and the hb buffer is cleared every time
> handling the shaping.  I think it makes sense that it should not
> store the state of the Emacs buffer in hb buffer, and HarfBuzz needs
> to know the whole sequence to shape according to their document.

OK, thanks.  This is what I suspected.

Unfortunately, it means we need to use the full machinery of character
composition to support this face attribute on arbitrary text: the
entire chunk of text which has this attribute, or at least its
individual words, will need to be passed through HarfBuzz en-masse.
It also means using this will probably slow down redisplay of the
relevant text parts, unless we find a way of avoiding some of the slow
code parts.

Let me think about the best way of doing this.  Meanwhile, I invite
you to read the large commentary at the beginning of xdisp.c, which
mentions character composition, and also take at least a cursory look
at the "automatic compositions" parts of composite.c, which is where
most of the code that deals with character compositions lives.

> I did some research on how other programs make use of HarfBuzz.  They
> typically put an entire paragraph or put a line into the shaping
> function.  It is quite an interesting way for Emacs to detect a
> sequence first and specifically shape that sequence using
> HarfBuzz.  It might be historical reason but it seems a lot more work
> needs to be done in composite.c or we need to figure out something
> better.

It's not just a historical accident.  The Emacs display engine is
special, in that it examines text one character (or one grapheme
cluster, in case of compositions) at a time, and makes all the layout
decisions on the fly.  So passing large chunks of text to a shaping
engine, like other programs do, is out of the question, as long as we
keep this basic design of the Emacs display.

The reason why Emacs tries to avoid using the shaper, unless
composition-function-table tells us we must, is that the
implementation of shaping and composition in Emacs is exposed to Lisp
and uses Lisp code for some of its workings, and thus is slow.  Emacs
is unique in this: no other program allows the user to affect
character composition and shaping by a simple change of a
character-indexed table, while the session keeps running.  This gives
Lisp programs and users an unprecedented freedom of affecting how
stuff is displayed, but it comes at a price.
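
For illustration, the second variant of the simulation discussed in
this message (the same rule installed in the cell of each character,
matching any run of those characters) might be sketched as follows.
This is only a sketch: the characters `!' and `=' and the pattern
"[!=]+" are assumptions picked for the example, by analogy with the
"[abcdefg]+" pattern mentioned in the message, and are not taken from
the patch.

#+begin_src elisp
;; Sketch only; the pattern "[!=]+" and the character list are
;; assumptions chosen for illustration.
;; Install one shared rule in the cell of each character, so that any
;; run of `!' and `=' is handed to the shaper as a single unit.
;; Note that composition-function-table is global, so this affects
;; every buffer in the session.
(let ((rule '(["[!=]+" 0 font-shape-gstring])))
  (dolist (ch '(?! ?=))
    (set-char-table-range composition-function-table ch rule)))
#+end_src

Whether a given run is then displayed as a ligature still depends on
the font providing the relevant OpenType features.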