GNU bug report logs - #56237
29.0.50; delete-forward-char fails to delete character


Package: emacs;

Reported by: visuweshm <at> gmail.com

Date: Sun, 26 Jun 2022 16:08:02 UTC

Severity: normal

Tags: moreinfo

Found in version 29.0.50

Done: Visuwesh <visuweshm <at> gmail.com>

Bug is archived. No further changes may be made.



Message #44 received at 56237 <at> debbugs.gnu.org (full text, mbox):

From: Visuwesh <visuweshm <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 56237 <at> debbugs.gnu.org
Subject: Re: bug#56237: 29.0.50; delete-forward-char fails to delete character
Date: Mon, 27 Jun 2022 11:17:25 +0530
[Monday, June 27, 2022] Visuwesh wrote:

> [Sunday, June 26, 2022] Eli Zaretskii wrote:
>
>>> From: Visuwesh <visuweshm <at> gmail.com>
>>> Cc: 56237 <at> debbugs.gnu.org
>>> Date: Sun, 26 Jun 2022 22:36:31 +0530
>>> 
>>> > Invoke find-composition, and you will see that it returns a single
>>> > composition there.
>>> 
>>> If find-composition is indeed right, then its return value is very
>>> unintuitive to a native speaker: ப் and போ are two separate characters,
>>> and combining them into a single cluster is weird...
>>
>> Maybe you are right, but then Someone(TM) will have to either modify
>> find-composition or explain how to interpret its return value
>> differently from what we do now.  What is now in delete-forward-char
>> expresses my level of knowledge in this area, which admittedly is
>> limited.
>>
>
> Turns out that Someone™ was closer to us than I thought: describe-char.
> With a bit of edebug and reading the code in composition.h (for the
> LGLYPH_* macros) and defsubst's in composite.el, I think I figured out
> the logic:
>
> We need to call find-composition with a non-nil DETAIL-P argument to get
> the gstring.  The gstring contains the glyphs that will be used to
> construct the grapheme cluster [1].  According to composition.h, glyphs
> that have the same FROM and TO indices are part of the same grapheme
> cluster, so to get the number of codepoints in a single cluster we need
> to count the glyphs that share the same FROM and TO indices.
>
> Understanding all this, I came up with the following code:
>
>     (let* ((composition (find-composition 0 nil "ப்போ" t)) ; DETAIL-P is t
>            (gstring (nth 2 composition))
>            (num-glyphs (lgstring-glyph-len gstring))
>            (i 1)
>            ;; FROM and TO of the first glyph identify its cluster.
>            (from (lglyph-from (lgstring-glyph gstring 0)))
>            (to (lglyph-to (lgstring-glyph gstring 0))))
>       ;; Count the leading glyphs that share that FROM and TO.
>       (while (and (< i num-glyphs)
>                   (= from (lglyph-from (lgstring-glyph gstring i)))
>                   (= to (lglyph-to (lgstring-glyph gstring i))))
>         (setq i (1+ i)))
>       i)
>
> Here, i is the number of characters we need to delete using delete-char.
>
> [1] For the gstring format, see composition-get-gstring.
>
> But I think we should test this code in cases where a grapheme cluster
> contains more than two codepoints, since all the composed characters in
> Tamil are made up of two Unicode codepoints.  I can't test it on emoji,
> since I don't know of an emoji font that has enough coverage and won't
> potentially crash Xft.
>

I got my hopes too high.  :(

This fails for the simple case of ரு (C-u C-x = also fails!), so I guess
we are back to square one.  Although ரு is composed of U+0BB0 and U+0BC1,
the gstring has only one glyph.
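
A minimal sketch of why counting glyphs breaks down in that case.  It
does not query a real font: the gstrings are hand-built mock data that
only follow the layout documented for composition-get-gstring, and the
FROM/TO values in them are assumptions, not output captured from Emacs.
The my-* names are hypothetical helpers made up for this illustration.
The second function is just one possible reading of FROM and TO as
character indices; it is not something proposed or tested in this thread.

```elisp
;; Sketch only: mock data, hypothetical my-* helpers.
;; Assumed layouts (from the composition-get-gstring docstring):
;;   gstring: [HEADER ID GLYPH ...], HEADER: [FONT-OBJECT CHAR ...]
;;   glyph:   [FROM-IDX TO-IDX C CODE WIDTH ...]

(defun my-mock-glyph (from to)
  "Return a mock glyph covering character indices FROM..TO.
Only the FROM-IDX and TO-IDX slots hold meaningful values."
  (let ((glyph (make-vector 10 0)))
    (aset glyph 0 from)
    (aset glyph 1 to)
    glyph))

(defun my-mock-gstring (chars &rest glyphs)
  "Return a mock gstring for the string CHARS made of GLYPHS."
  (vconcat (vector (vconcat [nil] chars) nil) glyphs))

(defun my-first-cluster-glyph-count (gstring)
  "Count the leading glyphs of GSTRING sharing the first glyph's FROM and TO."
  (let* ((n (lgstring-glyph-len gstring))
         (g0 (lgstring-glyph gstring 0))
         (from (lglyph-from g0))
         (to (lglyph-to g0))
         (i 1)
         glyph)
    (while (and (< i n)
                ;; Defensive guard: stop at an empty (nil) glyph slot.
                (setq glyph (lgstring-glyph gstring i))
                (= from (lglyph-from glyph))
                (= to (lglyph-to glyph)))
      (setq i (1+ i)))
    i))

(defun my-first-cluster-char-count (gstring)
  "Count the characters spanned by the first glyph of GSTRING.
Assumes FROM and TO are inclusive character indices."
  (let ((g0 (lgstring-glyph gstring 0)))
    (1+ (- (lglyph-to g0) (lglyph-from g0)))))

;; Mock: two codepoints drawn with two glyphs of the same cluster.
(defvar my-two-glyph-gstring
  (my-mock-gstring "\u0baa\u0bcb" (my-mock-glyph 0 1) (my-mock-glyph 0 1)))

;; Mock: two codepoints (U+0BB0 U+0BC1) drawn with a single glyph.
(defvar my-one-glyph-gstring
  (my-mock-gstring "\u0bb0\u0bc1" (my-mock-glyph 0 1)))
```

With this mock data, my-first-cluster-glyph-count gives 2 for the first
gstring but 1 for the second, although both cover two codepoints, while
my-first-cluster-char-count gives 2 for both.  Whether a real gstring
reports FROM and TO that way for ரு would still need checking in Emacs.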





This bug report was last modified 2 years and 311 days ago.
