GNU bug report logs - #54562
28.0.91; Emoji sequence not composed

Package: emacs

Reported by: Po Lu <luangruo <at> yahoo.com>

Date: Fri, 25 Mar 2022 09:18:02 UTC

Severity: normal

Found in version 28.0.91


Message #134 received at 54562 <at> debbugs.gnu.org (full text, mbox):

From: Robert Pluim <rpluim <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, larsi <at> gnus.org, 54562 <at> debbugs.gnu.org
Subject: Re: bug#54562: 28.0.91; Emoji sequence not composed
Date: Tue, 29 Mar 2022 16:50:10 +0200

>>>>> On Tue, 29 Mar 2022 14:44:47 +0300, Eli Zaretskii <eliz <at> gnu.org> said:

    >> From: Robert Pluim <rpluim <at> gmail.com>
    >> Cc: luangruo <at> yahoo.com,  larsi <at> gnus.org,  54562 <at> debbugs.gnu.org
    >> Date: Tue, 29 Mar 2022 12:45:44 +0200
    >> 
    Eli> I thought about any Mn character whose canonical-combining-class
    Eli> property is 200 and above.  The COMBINING ENCLOSING <SOMETHING> stuff
    Eli> will need to be added to that, of course.  And we could have that
    Eli> option have multiple possible values, not just on/off...
    >> 
    >> OK. Would Me be ok for you, or would you specifically want only the
    >> codepoints from the "Combining Diacritical Marks for Symbols" block?

    Eli> Using Me is fine with me.

OK. There are probably subtleties surrounding things like U+20D2 that
I need to read up on (or we say "overlays are deprecated, letʼs ignore
them").
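
As a rough illustration of the selection criteria discussed above (not
Emacs code, and not a proposed implementation), the sketch below uses
Python's standard unicodedata module to classify a character as Mn with
canonical-combining-class >= 200, as Me, or as either. The function name
is_candidate and the mode strings are hypothetical; the mode strings
simply mirror the option values floated in this thread.

```python
import unicodedata


def is_candidate(ch: str, mode: str = "all") -> bool:
    """Return True if CH falls under the criteria named by MODE.

    Hypothetical helper: the modes mirror the option values suggested
    in the thread ('all, 'enclosing, 'combining-class).
    """
    category = unicodedata.category(ch)      # e.g. "Mn", "Me", "Lu"
    ccc = unicodedata.combining(ch)          # canonical combining class

    by_class = category == "Mn" and ccc >= 200
    enclosing = category == "Me"

    if mode == "combining-class":
        return by_class
    if mode == "enclosing":
        return enclosing
    if mode == "all":
        return by_class or enclosing
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for cp in (0x0301, 0x20D2, 0x20E3):
        ch = chr(cp)
        print(f"U+{cp:04X}", unicodedata.name(ch),
              unicodedata.category(ch), unicodedata.combining(ch),
              is_candidate(ch))
```

Under these criteria U+20D2 (COMBINING LONG VERTICAL LINE OVERLAY) is not
selected: it is Mn, but its combining class is 1 (overlay), so it is
caught neither by the ">= 200" test nor by the Me test. That is one of
the subtleties mentioned above.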

    >> I guess you'd want options like:
    >> 
    >> 'all => combining-class + enclosing
    >> 'enclosing
    >> 'combining-class
    >> 
    >> (did we want to cover the 'number followed by U+20E3 => emoji' case
    >> with an option too?)

    Eli> That's a separate issue, IMO, and it can be handled via
    Eli> auto-composition-emoji-eligible-codepoints, I think?  We could even
    Eli> tell users to do that by themselves.

We could, although my purist side doesnʼt want to do it, since the
standard exists for a reason, dammit.

    Eli> We could perhaps avoid the complexity by rewriting the composition
    Eli> rule for diacritics.  Instead of "\\c.\\c^+" with 1-character
    Eli> look-back, we could have several rules:

    Eli>    "\\c.\\c^\\c^\\c^\\c^" with 4-character look-back
    Eli>    "\\c.\\c^\\c^\\c^+"    with 3-character look-back
    Eli>    "\\c.\\c^\\c^+"        with 2-character look-back
    Eli>    "\\c.\\c^+"            with 1-character look-back

    Eli> (in that order).  I didn't test this, but if it works, maybe it could
    Eli> solve the problem without any deep changes on the C level.

That might work. What would the fallback look like? Suppose we have 4
diacritics, 3 of which are covered by the same font, and one by a
different one. Would you prefer to attempt to use the font of 3 of
them, or would you prefer to fall back to the font of the base
character? (Iʼm not sure which would give better results in practice;
they might both fail.)

Robert
-- 




This bug report was last modified 3 years and 175 days ago.
