On Apr 18, 2022, at 1:12 AM, Eli Zaretskii wrote:
>
> We decided not to use the Emoji presentation of these characters by
> default for a good reason: there are many symbols that _can_ have
> Emoji presentation, but they are used frequently in "normal" text.
> Look at the beginning of admin/unidata/emoji-data.txt, and you will
> see what I mean: even ASCII characters like '#' and digits can have
> Emoji presentation, and we definitely don't want them appear as Emoji.
> Don't forget that Emacs display features like this one are _global_,
> so if for some insane reason we decide to have digits displayed as
> Emoji, you will get that everywhere, including on the mode line, for
> example.
>
> So what will that hypothetical variable do? It cannot affect all the
> characters that _may_ have the Emoji presentation, so we will need to
> decide which ones to affect and which ones not to affect. If you look
> at emoji-data.txt, you will realize that the decision is not easy; for
> example, what about U+2122 TRADE MARK SIGN or U+23F0 ALARM CLOCK or
> U+262E PEACE SYMBOL? Some people will want them as Emoji, while
> others won't.

Attached are 3 screenshots of that file in Emacs: the first is
Emacs -Q, then with the 'emoji script set to Apple Color Emoji, and
then with 'symbol set to Apple Color Emoji.

I don't see a difference between -Q and setting 'emoji. Setting
'symbol seems better to me: it fills in a bunch of blanks with few if
any bad choices (I get that's somewhat personal preference). I mean, I
like both ALARM CLOCK and STOPWATCH showing as emoji. I guess this is
because Apple Color Emoji already made "reasonable" choices as to
which "symbols" to render?

> So we'd need another variable with a long list of codepoints that are
> Emoji by default? Such a variable sounds like a PITA to let users customize.
> A character cannot belong to more than one script in Emacs, so that's
> not possible, AFAIU. And I don't see why it would be necessary, since
> one can customize the fontset for individual codepoints without using
> a script symbol.

I agree it would be a pain to customize, and it's also a pain for
users to pick individual code points for a large set of emoji. I was
hoping the Unicode data could be used to define something better that
users could start from and then tweak as needed.

E.g., here's emacs -Q of all the text; L2; codepoints and then with
'symbol set (this is the first screen; the 2nd screen is quite similar
to the bottom of this):

Basically (and maybe exactly) all the blanks are filled by emoji.

For completeness, here are the text L1 chars in -Q and with 'symbol
set:

This font isn't rendering many of these, and those it does render are
probably not great choices.

So it seems like an easy way to tell Emacs to use a font for all the
L2 text codepoints would be useful? At least on Mac? Maybe the answer
is just that, on Mac, Apple Color Emoji should be used for 'symbol?

I'm still not clear what Emacs mechanism finds the Apple Color Emoji
font for me for just some codepoints, since it isn't in the default
fontset at startup. Maybe that mechanism could be taught to do it for
L2 text codepoints too, without the need to set it for 'symbol?

Maybe L2 text codepoints aren't the right grouping (I'm new at this),
and I know that any grouping might have false positives, but I think
it can be better than what's there now (at least on Mac).

Do the commonly used emoji fonts on other systems render way more
than these as emoji?

If I read this correctly:

https://www.unicode.org/emoji/charts/text-style.html

it seems like newer emoji are more regular, and it's perhaps the old
ones (like TRADE MARK SIGN or EXCLAMATION QUESTION MARK) that are more
likely to be false positives. If so, an effort to list these old ones
might be good enough going forward (I'd help that effort).

For completeness, here is that file in the macport with emoji and then
symbol.
I'm not sure why the macport renders more characters as emoji than
emacs -q does, given what I think are the same fontset settings, but I
include it just for comparison.

Howard
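
For anyone wanting to reproduce the comparison above, here is a minimal
sketch of the kind of fontset settings being discussed. It assumes the
default fontset (`t`) and a system where the Apple Color Emoji font is
installed. The exact forms evaluated for the screenshots are not given
in this message, so treat these as illustrative rather than as what was
actually run. The per-codepoint forms at the end follow the suggestion
quoted above that individual codepoints can be customized without using
a script symbol.

```elisp
;; Sketch only: assumes macOS with the "Apple Color Emoji" font installed.
;; `t' as the first argument names the default fontset.

;; Use Apple Color Emoji for characters Emacs assigns to the `emoji' script.
(set-fontset-font t 'emoji (font-spec :family "Apple Color Emoji"))

;; Use it for the `symbol' script as well.  Without the optional ADD
;; argument this replaces any font specs previously set for that script.
(set-fontset-font t 'symbol (font-spec :family "Apple Color Emoji"))

;; Alternatively, target individual codepoints instead of a whole script:
;; U+23F0 ALARM CLOCK and U+23F1 STOPWATCH.
(set-fontset-font t #x23F0 (font-spec :family "Apple Color Emoji"))
(set-fontset-font t #x23F1 (font-spec :family "Apple Color Emoji"))
```

Evaluating only one of the script-level forms at a time in an emacs -Q
session should make it easier to see which setting accounts for which
rendering change.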