GNU bug report logs - #39799
28.0.50; Most emoji sequences don’t render correctly


Package: emacs;

Reported by: Mike FABIAN <mfabian <at> redhat.com>

Date: Wed, 26 Feb 2020 14:30:03 UTC

Severity: normal

Found in version 28.0.50

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Message #221 received at 39799 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: rgm <at> gnu.org, 39799 <at> debbugs.gnu.org, mfabian <at> redhat.com
Subject: Re: bug#39799: 28.0.50; Most emoji sequences don’t render correctly
Date: Tue, 21 Sep 2021 12:16:38 +0300
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: rgm <at> gnu.org,  39799 <at> debbugs.gnu.org,  mfabian <at> redhat.com
> Date: Mon, 20 Sep 2021 22:38:28 +0200
> 
> Iʼve just pushed a change to master that should fix (almost) all the
> issues with displaying emoji sequences (except for keycaps). Feedback
> welcome.

Thanks, this is mostly okay, IMO.  The only issue I have with this is
here:

> --- a/admin/unidata/blocks.awk
> +++ b/admin/unidata/blocks.awk
> @@ -221,6 +221,46 @@ FILENAME ~ "emoji-data.txt" && /^[0-9A-F].*; Emoji_Presentation / {
>  }
>  
>  END {
> +    ## These codepoints have Emoji_Presentation = No, but they are
> +    ## used in emoji-sequences.txt and emoji-zwj-sequences.txt (with a
> +    ## Variation Selector), so force them into the emoji script so
> +    ## they will get composed correctly.  FIXME: delete this when we
> +    ## can change the font used for a codepoint based on whether it's
> +    ## followed by a VS (usually VS-16)
> +    idx = 0
> +    override_start[idx] = "261D"
> +    override_end[idx] = "261D"
> +    idx++
> +    override_start[idx] = "26F9"
> +    override_end[idx] = "26F9"
> +    idx++
> +    override_start[idx] = "270C"
> +    override_end[idx] = "270D"
> +    idx++
> +    override_start[idx] = "2764"
> +    override_end[idx] = "2764"
> +    idx++
> +    override_start[idx] = "1F3CB"
> +    override_end[idx] = "1F3CC"
> +    idx++
> +    override_start[idx] = "1F3F3"
> +    override_end[idx] = "1F3F4"
> +    idx++
> +    override_start[idx] = "1F441"
> +    override_end[idx] = "1F441"
> +    idx++
> +    override_start[idx] = "1F575"
> +    override_end[idx] = "1F575"
> +
> +    for (k in override_start)
> +    {
> +        i++
> +        start[i] = override_start[k]
> +        end[i] = override_end[k]
> +        alt[i] = "emoji"
> +        name[i] = "Autogenerated emoji (override)"
> +    }

Specifically, the U+2xxx codepoints are now in the 'emoji' script,
which I think is undesirable, even if the price is that we won't
support the sequences in which those codepoints are followed by
VS-16.  So I think we should remove those codepoints from the above,
leaving only the U+1Fxxx ones.
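
For reference, here are the override ranges from the patch above split by
plane, as a quick Python sketch (the ranges are copied verbatim from the
diff; the variable names are mine):

```python
# Override ranges from the blocks.awk patch, as (start, end) codepoints.
overrides = [
    (0x261D, 0x261D), (0x26F9, 0x26F9), (0x270C, 0x270D), (0x2764, 0x2764),
    (0x1F3CB, 0x1F3CC), (0x1F3F3, 0x1F3F4), (0x1F441, 0x1F441), (0x1F575, 0x1F575),
]

# The BMP (U+2xxx) entries are the ones proposed for removal here;
# the SMP (U+1Fxxx) entries would stay in the 'emoji' script.
bmp = [r for r in overrides if r[0] <= 0xFFFF]
smp = [r for r in overrides if r[0] > 0xFFFF]
```

So four ranges would be dropped from the override list and four would remain.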

Btw, currently U+261D followed by VS-16 doesn't compose for me.  Is
that because compose-gstring-for-variation-glyph is hardcoded to work
only for Han characters, and U+261D isn't one, or because that
function is not suited to VS-16 (it looks for glyph variations in the
font)?  Or am I missing something?
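
For concreteness, the sequence in question is the base character followed
by VARIATION SELECTOR-16 (U+FE0F), which requests emoji presentation; a
minimal Python check of the codepoint names:

```python
# The "U+261D VS-16" sequence discussed above: base character plus
# VARIATION SELECTOR-16 (U+FE0F), which requests emoji presentation.
import unicodedata

seq = "\u261D\uFE0F"
names = [unicodedata.name(ch) for ch in seq]
# names == ['WHITE UP POINTING INDEX', 'VARIATION SELECTOR-16']
```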

Now to my idea of supporting those "U+2xxx VS-16" sequences without
assigning them to the 'emoji' script:

The function autocmp_chars uses font_range to find out whether the
sequence of characters to be composed is supported by a single font.
font_range currently takes the first character of the sequence, calls
font_for_char for it, then checks that all the rest of the characters
are supported by that font by calling font_encode_char.  In our case,
the first character of the sequence is U+2xxx, which is not in the
'emoji' script, so Emacs is likely to pick a font that doesn't
support Emoji, and the composition will fail.  To avoid that, I
propose the following change:

  . add a new argument to font_range, the codepoint that triggered the
    composition
  . inside font_range, if that codepoint belongs to the 'emoji' script
    (use char-script-table to find that out), call font_for_char with
    a representative character for 'emoji' (from
    script-representative-chars) instead of the first character of the
    sequence, then check that all the sequence characters, including
    the first one, can be supported by that font; if they can, return
    that font to the caller, to be used for the composition
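
A rough Python model of that proposal (font_for_char and
font_encode_char here are stand-in stubs for the C functions of the
same names, and char_script stands in for char-script-table; this is a
sketch of the control flow, not the real Emacs API):

```python
def font_range(seq, trigger, char_script, representative,
               font_for_char, font_encode_char):
    """Return a font that supports every character in SEQ, or None.

    Sketch of the proposed change: if TRIGGER (the codepoint that
    triggered the composition) belongs to the 'emoji' script, probe
    with a representative emoji character instead of seq[0], then
    verify that ALL characters of the sequence, including the first,
    are supported by the chosen font.
    """
    if char_script.get(trigger) == "emoji":
        probe = representative["emoji"]   # from script-representative-chars
    else:
        probe = seq[0]                    # current behavior
    font = font_for_char(probe)
    if font is not None and all(font_encode_char(font, c) for c in seq):
        return font
    return None
```

For a sequence like U+2764 U+FE0F, with U+2764 assigned to 'emoji' in
char-script-table, the probe becomes the representative emoji
character, so an emoji-capable font gets selected and the whole
sequence is then verified against it.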

WDYT?

Btw, if you use Firefox or Chrome, or some other application that can
show Emoji sequences, or maybe just HarfBuzz's hb-view, how does the
display of the U+2xxx codepoints change when they are followed by
VS-16?  Is the change prominent enough for us to try to support it?
If not, perhaps the above should be left out for the moment.






