GNU bug report logs -
#63731
[PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
Reported by: Steven Allen <steven <at> stebalien.com>
Date: Fri, 26 May 2023 03:19:01 UTC
Severity: normal
Tags: fixed, patch
Fixed in version 29.1
Done: Robert Pluim <rpluim <at> gmail.com>
Bug is archived. No further changes may be made.
Message #11 received at 63731 <at> debbugs.gnu.org (full text, mbox):
Disclaimer: I havenʼt looked at the patch yet
>>>>> On Fri, 26 May 2023 09:41:42 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Steven Allen <steven <at> stebalien.com>
>> Date: Thu, 25 May 2023 20:18:02 -0700
>>
>> This patch imports the full list from unicode.org instead of
>> special-casing a few characters as was done previously.
>>
>> With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
>> Without it, it will look like '👍️'.
>>
>> As a simple regression test, '✔' (2714) should still display as "text" while '✔️'
>> (2714 FE0F) should still display as an emoji.
>>
>> Fixes https://github.com/alphapapa/ement.el/issues/137
>>
>> NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
>> Unicode (beyond what was required to implement this patch). But this
>> patch appears to work and I can't find any regressions.
Eli> AFAIU, this change will populate composition-function-table for many
Eli> "normal" characters, including ASCII digits and symbol/punctuation
Eli> characters from the 0x2xxx blocks. E.g., after you build Emacs with
Eli> this patch, what do the following evaluations yield:
Eli> M-: (aref composition-function-table ?0) RET
Eli> M-: (aref composition-function-table #x2122) RET
Eli> If they yield non-nil values, it could mean dramatic slowdown of
Eli> redisplay with these characters. Which is precisely what we wanted to
Eli> avoid when we made the decision which parts of the Unicode-defined
Eli> Emoji sequences to support in Emacs, and how to arrange for that
Eli> support to work.
Yes. We donʼt want to do composition checks for ASCII if we can avoid it.
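To run Eli's check over several characters at once, something like the following could be used. It is only a sketch: the helper name `my/chars-with-composition-rules' is hypothetical and is not part of the patch or of Emacs. It relies on nothing beyond `aref' on `composition-function-table'.

```elisp
;; Hypothetical inspection helper; not part of the patch under review.
(defun my/chars-with-composition-rules (chars)
  "Return the members of CHARS that have composition rules installed.
A character counts when its `composition-function-table' entry is non-nil."
  (let (result)
    (dolist (ch chars)
      (when (aref composition-function-table ch)
        (push ch result)))
    (nreverse result)))

;; ASCII digit, ASCII punctuation, TRADE MARK SIGN, THUMBS UP SIGN, VS-16.
(my/chars-with-composition-rules (list ?0 ?# ?* #x2122 #x1f44d #xfe0f))
```

Any ASCII character appearing in the returned list would be a sign of the redisplay cost discussed above.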
Eli> The issue you cite is strange: according to the "C-u C-x =" display
Eli> there, Emacs did compose #x1f44d with VS-16 using the Noto Color Emoji
Eli> font, so I don't quite understand why VS-16 is then also shown as an
Eli> empty rectangle. On my system Noto Color Emoji doesn't work, and "C-u
Eli> C-x =" says this instead:
Eli> Composed with the following character(s) "️" using this font:
Eli> harfbuzz:-outline-Noto Emoji-regular-normal-normal-mono-15-*-*-*-c-*-iso10646-1
Eli> by these glyphs:
Eli> [0 1 128077 422 19 2 17 14 2 nil]
Eli> [0 1 65039 3 19 0 1 0 1 [0 0 0]]
Eli> with these character(s):
Eli> ️ (#xfe0f) VARIATION SELECTOR-16
Eli> which explains why I see two glyphs and not 1. But in the display
Eli> shown in the above issue, I see
Eli> Composed with the following character(s) "️" using this font:
Eli> ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-18-*-*-*-m-0-iso10646-1
Eli> by these glyphs:
Eli> [0 1 128077 569 22 0 23 17 5 [0 0 136]]
Eli> with these character(s):
Eli> ️ (#xfe0f) VARIATION SELECTOR-16
Eli> which describes only one glyph, not two. So the result ought to be
Eli> what you expect.
I see the emoji followed by a blank box with Noto Color Emoji here. I
donʼt yet understand why.
Eli> Robert, what am I missing here?
1F44D FE0F is a valid sequence according to tr51
(aref composition-function-table #x1f44d)
=> (["\\(?:👍[🏻-🏿]\\)" 0 compose-gstring-for-graphic])
which means that the composition is being triggered by this entry:
(aref composition-function-table #xfe0f)
=> (["\\c.\\c^+" 1 compose-gstring-for-graphic] [nil 0 compose-gstring-for-graphic])
(time passes)
Ugh. The following fixes it for me:
diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..af86d1436d3 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
;; Allow for bootstrapping without uni-*.el.
(when unicode-category-table
(let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
- [nil 0 compose-gstring-for-graphic])))
+ )))
(map-char-table
#'(lambda (key val)
(if (memq val '(Mn Mc Me))
Although the following is less invasive:
diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..333428f008a 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
(if (memq val '(Mn Mc Me))
(set-char-table-range composition-function-table key elt)))
unicode-category-table))
+ ;; for Emoji presentation selector
+ (set-char-table-range
+ composition-function-table
+ #xFE0F
+ `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
;; for dotted-circle
(aset composition-function-table #x25CC
`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
Didnʼt we conclude that composition had some issues with multiple
entries for the same codepoint if there was a mix of forward- and
backward-looking regexps?
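For checking which entries of a given codepoint look forward and which look backward, a rough sketch follows. It assumes each entry is a list of [PATTERN PREV-CHARS FUNC] vectors, as the docstring of `composition-function-table' describes, and treats a positive PREV-CHARS as "backward-looking". The helper name `my/composition-rule-directions' is hypothetical.

```elisp
;; Hypothetical inspection helper; not part of either diff above.
(defun my/composition-rule-directions (char)
  "Return a list of (PATTERN . DIRECTION) for the composition rules of CHAR.
DIRECTION is `backward' when the rule's PREV-CHARS field is positive,
and `forward' otherwise."
  (mapcar (lambda (rule)
            (cons (aref rule 0)
                  (if (> (aref rule 1) 0) 'backward 'forward)))
          (aref composition-function-table char)))

;; Inspect the entry for VARIATION SELECTOR-16.
(my/composition-rule-directions #xfe0f)
```

A result containing both `backward' and `forward' for the same character would be the mixed situation asked about here.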
Robert
--
This bug report was last modified 1 year and 350 days ago.