GNU bug report logs - #54562
28.0.91; Emoji sequence not composed


Package: emacs

Reported by: Po Lu <luangruo <at> yahoo.com>

Date: Fri, 25 Mar 2022 09:18:02 UTC

Severity: normal

Found in version 28.0.91


From: Eli Zaretskii <eliz <at> gnu.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: luangruo <at> yahoo.com, larsi <at> gnus.org, 54562 <at> debbugs.gnu.org
Subject: bug#54562: 28.0.91; Emoji sequence not composed
Date: Tue, 29 Mar 2022 14:44:47 +0300
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: luangruo <at> yahoo.com,  larsi <at> gnus.org,  54562 <at> debbugs.gnu.org
> Date: Tue, 29 Mar 2022 12:45:44 +0200
> 
>     Eli> I thought about any Mn character whose canonical-combining-class
>     Eli> property is 200 and above.  The COMBINING ENCLOSING <SOMETHING> stuff
>     Eli> will need to be added to that, of course.  And we could have that
>     Eli> option have multiple possible values, not just on/off...
> 
> OK. Would Me be ok for you, or would you specifically want only the
> codepoints from the "Combining Diacritical Marks for Symbols" block?

Using Me is fine with me.

> I guess you'd want options like:
> 
> 'all => combining-class + enclosing
> 'enclosing
> 'combining-class
> 
> (did we want to cover the 'number followed by U+20E3 => emoji' case with
> an option too?)

That's a separate issue, IMO, and it can be handled via
auto-composition-emoji-eligible-codepoints, I think?  We could even
tell users to do that by themselves.
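
For illustration only, the three option values proposed above could
select characters roughly as in the Python sketch below.  This is not
Emacs code: the function name `eligible` and the mode strings are
hypothetical, and the sketch assumes that "enclosing" means general
category Me and that "combining-class" means category Mn with a
canonical combining class of 200 or above, as discussed earlier in the
thread.

```python
import unicodedata


def eligible(ch, mode="all"):
    """Return True if CH would be selected under MODE.

    Hypothetical helper: MODE mirrors the option values floated in the
    thread ("all", "enclosing", "combining-class").
    """
    category = unicodedata.category(ch)
    # Enclosing marks, e.g. U+20DD COMBINING ENCLOSING CIRCLE.
    enclosing = category == "Me"
    # Nonspacing marks with canonical combining class >= 200,
    # e.g. U+0301 COMBINING ACUTE ACCENT (class 230).
    high_class = category == "Mn" and unicodedata.combining(ch) >= 200

    if mode == "enclosing":
        return enclosing
    if mode == "combining-class":
        return high_class
    if mode == "all":
        return enclosing or high_class
    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions U+0301 is selected by "combining-class" and
"all", U+20DD by "enclosing" and "all", and a low-class mark such as
U+094D DEVANAGARI SIGN VIRAMA (class 9) by none of them.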

> 
>     Eli> Btw, for sequences that include a base character and 2 or more
>     Eli> diacritics, selecting a font that supports the first diacritic (the
>     Eli> one which triggers the composition) might not be enough, since the
>     Eli> rest of the diacritics could be absent from that font.  Instead, we'd
>     Eli> need something like "find the font for each one of them and then use
>     Eli> the one which supports the largest subset of them".
> 
> font_range currently only has access to the first diacritic, so that
> would be a bigger change. And that subset had better have the same
> size as the number of unique diacritics, otherwise itʼs unlikely to
> work.

We could perhaps avoid the complexity by rewriting the composition
rule for diacritics.  Instead of "\\c.\\c^+" with 1-character
look-back, we could have several rules:

   "\\c.\\c^\\c^\\c^\\c^" with 4-character look-back
   "\\c.\\c^\\c^\\c^+"    with 3-character look-back
   "\\c.\\c^\\c^+"        with 2-character look-back
   "\\c.\\c^+"            with 1-character look-back

(in that order).  I didn't test this, but if it works, maybe it could
solve the problem without any deep changes on the C level.
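
The effect of listing the rules in that order can be modelled outside
Emacs.  The Python sketch below is only a toy model, not how the
composition code is implemented: it approximates the "\\c." category
as "any non-mark character" and "\\c^" as "any character whose general
category starts with M", and the names `RULES`, `classify` and
`first_matching_rule` are hypothetical.  Each rule is tried in order,
anchored at the trigger position minus its look-back.

```python
import re
import unicodedata

# (pattern over class letters, look-back), longest look-back first.
# "B" stands for a base character, "C" for a combining mark.
RULES = [
    (re.compile(r"BCCCC"), 4),
    (re.compile(r"BCCC+"), 3),
    (re.compile(r"BCC+"), 2),
    (re.compile(r"BC+"), 1),
]


def classify(text):
    """Map each character to "C" (mark) or "B" (anything else)."""
    return "".join(
        "C" if unicodedata.category(ch).startswith("M") else "B"
        for ch in text
    )


def first_matching_rule(text, trigger):
    """Return (look-back, matched text) for the first rule that matches
    when anchored TRIGGER minus look-back characters into TEXT, or None."""
    classes = classify(text)
    for pattern, lookback in RULES:
        start = trigger - lookback
        if start < 0:
            continue  # not enough characters before the trigger
        m = pattern.match(classes, start)
        # The match must extend past the trigger character itself.
        if m and m.end() > trigger:
            return lookback, text[start:m.end()]
    return None
```

In this model, for a base followed by three diacritics, whichever
diacritic is taken as the trigger, the first rule that matches spans
the whole cluster starting at the base character.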




This bug report was last modified 3 years and 133 days ago.
