GNU bug report logs -
#39799
28.0.50; Most emoji sequences don’t render correctly
Reported by: Mike FABIAN <mfabian <at> redhat.com>
Date: Wed, 26 Feb 2020 14:30:03 UTC
Severity: normal
Found in version 28.0.50
Fixed in version 28.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #77 received at 39799 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 28 Feb 2020 18:19:10 +0200, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: Glenn Morris <rgm <at> gnu.org>, mfabian <at> redhat.com, 39799 <at> debbugs.gnu.org
>> Date: Fri, 28 Feb 2020 15:14:01 +0100
>>
>> >> It matches forward off the first char, so the
>> >> composition-function-table entries all have '0' as the number of chars
>> >> to match. Would it be better to match backwards?
>>
Eli> I don't think matching backwards is better in general. Did you have a
Eli> reason for thinking it was?
>>
>> I thought I saw a comment in composite.c that says matching is done
>> backward, but I see that itʼs done forwards as well.
Eli> Btw, it sometimes _can_ be beneficial to use backward matching: if it
Eli> makes the size of composition-function-table smaller. Since
Eli> composition-function-table is a char-table, and char-tables allocate
Eli> sub-tables only if needed, you can conserve memory (and thus make
Eli> Emacs's memory footprint smaller) and speed up lookups (because 'aref'
Eli> will look up values in a char-table faster) by setting a smaller number of
Eli> slots. For example, if the 2nd character of an Emoji sequence was
Eli> always one specific character, or a small set of characters, you could
Eli> set only the slots of those few characters, which would make the
Eli> char-table smaller. OTOH, if that would yield many different
Eli> composition rules in the list of rules for those few characters,
Eli> redisplay could become slower, because it generally examines the rules
Eli> one by one until it finds an appropriate one. So the winning setup of
Eli> composition-function-table is the one that sets the smallest number of
Eli> slots, but still keeps the lists of rules for those slots short. And
Eli> note that setting the same rule for a range of codepoints generally
Eli> uses up only one slot in the char-table, so rules that can be
Eli> generalized to cover many characters are preferable.
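As a rough illustration of the last point, here is a sketch of one rule shared by a whole range of codepoints. It uses a scratch char-table (demo-flag-table, a made-up name) rather than the real composition-function-table, and the regional-indicator pattern is an assumption chosen for the example, not something taken from the patch under discussion.

```elisp
;; Sketch only: DEMO-FLAG-RULE and DEMO-FLAG-TABLE are hypothetical names.
;; A rule has the shape [PATTERN PREV-CHARS FUNC].
(defconst demo-flag-rule
  (vector "[\U0001F1E6-\U0001F1FF][\U0001F1E6-\U0001F1FF]" ; two regional indicators
          0                                 ; match forward from the first char
          #'compose-gstring-for-graphic))

;; Scratch table so the running session's composition setup is untouched.
(defvar demo-flag-table (make-char-table 'demo))

;; One call covers the whole range U+1F1E6..U+1F1FF with the same rule list.
(set-char-table-range demo-flag-table
                      '(#x1F1E6 . #x1F1FF)
                      (list demo-flag-rule))
```

Looking up any character in the range with char-table-range then yields that one rule list, which is the "generalized rule" case described above.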
I donʼt think that applies in this case. The sequences are all easily
categorised based on the first char in the sequence. It could be done
based on the 2nd, or 3rd or whatever, but I donʼt think that reduces
the number of entries. Plus thereʼs always one rule per character,
since multiple patterns starting with the same character are combined
using regexp-opt.
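A minimal sketch of that "one rule per first character" scheme follows. The helper name demo-emoji-rule, the scratch table, and the two sample sequences are made up for illustration; the real code takes its sequences from the Unicode emoji data, and it targets composition-function-table rather than a scratch table.

```elisp
;; Sketch only: DEMO-EMOJI-RULE is a hypothetical helper.
(defun demo-emoji-rule (sequences)
  "Return one composition rule matching any string in SEQUENCES.
SEQUENCES is assumed to be non-empty, with every string starting
with the same character."
  ;; regexp-opt folds all alternatives into a single regexp, so the
  ;; slot for the shared first character needs only one rule.
  ;; PREV-CHARS is 0 because matching runs forward from that character.
  (vector (regexp-opt sequences) 0 #'compose-gstring-for-graphic))

(let ((table (make-char-table 'demo))   ; scratch table, not the real one
      ;; Two sample ZWJ sequences that both start with U+1F468 (MAN).
      (sequences '("\U0001F468\u200D\U0001F4BB"
                   "\U0001F468\u200D\U0001F373")))
  (set-char-table-range table
                        (aref (car sequences) 0)
                        (list (demo-emoji-rule sequences)))
  (char-table-range table #x1F468))
```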
One thing though: the code currently does set-char-table-range to a
new value. Is there a chance that an entry already exists in
composition-function-table for a particular character? If so Iʼd have
to change it to add the new rule after the existing one (before?).
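One possible shape for that change, as a sketch: read the existing entry with char-table-range and store a new list that keeps the old rules. demo-add-composition-rule is a hypothetical helper, not code from the patch. Whether the new rule belongs before or after the existing ones is exactly the open question here; since rules are examined one by one, position decides which rule wins, and as far as I understand the docstring of composition-function-table, rules should also stay in descending order of PREV-CHARS.

```elisp
;; Sketch only: DEMO-ADD-COMPOSITION-RULE is a hypothetical helper.
(defun demo-add-composition-rule (table char rule &optional at-end)
  "Add RULE to the rule list stored for CHAR in char-table TABLE.
Rules already present are kept.  RULE goes first by default, or
last if AT-END is non-nil."
  (let ((existing (char-table-range table char)))
    ;; Build a fresh list instead of using nconc: the old list may be
    ;; shared with other characters that were set as a range.
    (set-char-table-range table char
                          (if at-end
                              (append existing (list rule))
                            (cons rule existing)))))
```

With the real table this would be called as (demo-add-composition-rule composition-function-table CHAR RULE), where RULE is a [PATTERN PREV-CHARS FUNC] vector.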
Robert
This bug report was last modified 3 years and 255 days ago.