GNU bug report logs - #20140
24.4; M17n shaper output rejected

Package: emacs;

Reported by: Richard Wordingham <richard.wordingham <at> ntlworld.com>

Date: Wed, 18 Mar 2015 22:21:02 UTC

Severity: normal

Tags: moreinfo

Found in version 24.4

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

From: handa <at> gnu.org (K. Handa)
To: Richard Wordingham <richard.wordingham <at> ntlworld.com>
Cc: 20140 <at> debbugs.gnu.org
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Sat, 21 Mar 2015 17:33:17 +0900

In article <20150318222040.4066e6e9 <at> JRWUBU2>, Richard Wordingham <richard.wordingham <at> ntlworld.com> writes:
[...]
> I extract and analyse what was rendered as shaped ('accepted') and what
> was not ('rejected'), quoting the monitoring output.  I suspect the
> problem is the strict testing of the from and to fields in Lisp function
> font-shape-gstring, which is defined in file font.c.
[...]
> The shaping of the following, with vowels or MEDIAL RA that should be
> rendered before the consonant, was rejected:

> mflt_run( 1A3E 1A6E 1A6C 1A65) produced ( 1A6E>872:1:1 1A3E>810:0:3
> 1A6C>869:0:3 1A65>862:0:3)

If U+1A6E is displayed before U+1A3E and they are in
different grapheme clusters, then when you move point forward
one character at a time, the cursor must move back and forth
as shown below (the cursor position is indicated by dashes):

 display: SPC 1A6E 1A3E+1A6C+1A65 SPC
 step 1:  ---    
 step 2:           --------------
 step 3:      ----
 step 4:                          ---

Is that what you want?
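
The back-and-forth movement in the diagram can be reproduced with a
small sketch.  It assumes the four display units are modelled as
symbols; the function name `my-visual-columns' and the symbol names
are hypothetical and are not part of Emacs or m17n.  For each unit in
logical (buffer) order it reports the column at which the unit is
displayed.

```elisp
(require 'cl-lib)

(defun my-visual-columns (logical visual)
  "Return, for each unit in LOGICAL, its zero-based index in VISUAL.
LOGICAL lists display units in buffer order and VISUAL lists the
same units in left-to-right display order.  Illustrative only."
  (mapcar (lambda (unit) (cl-position unit visual)) logical))

;; Buffer order: SPC, the cluster 1A3E+1A6C+1A65, the vowel 1A6E, SPC.
;; Display order: SPC, 1A6E, 1A3E+1A6C+1A65, SPC.
(my-visual-columns '(space-1 base-cluster vowel-1a6e space-2)
                   '(space-1 vowel-1a6e base-cluster space-2))
;; => (0 2 1 3)
```

The result (0 2 1 3) is not monotonic: the cursor goes from column 0
to column 2, back to column 1, and then on to column 3, as in steps 1
to 4 above.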

At least, the support for all Indic scripts (which, like your
Tai Tham example, store characters in logical order) treats
re-ordered glyphs as one grapheme cluster.  That is true not
only of Emacs but also of gtk (pango) applications.  Please try
moving the cursor over this Devanagari text "हिंदी" in Emacs,
gedit, and, for instance, firefox.  They all treat that text
as 2 grapheme clusters, "हिं" and "दी".  The first one
corresponds to the character sequence U+0939 U+093F U+0902, and
U+093F (vowel sign I) is displayed before U+0939 (the base consonant).
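
To see which characters make up the sample word, its code points can
be listed in logical (buffer) order.  This is only a sketch: the
helper name `my-code-points' is hypothetical, and the word is written
with \u escapes so that the snippet does not depend on the file's
encoding.

```elisp
(defun my-code-points (string)
  "Return the characters of STRING as a list of \"U+XXXX\" strings."
  (mapcar (lambda (ch) (format "U+%04X" ch)) string))

;; The Devanagari sample word, in logical order.
(my-code-points "\u0939\u093F\u0902\u0926\u0940")
;; => ("U+0939" "U+093F" "U+0902" "U+0926" "U+0940")
```

The first three code points (HA, vowel sign I, anusvara) form the
first cluster and the last two (DA, vowel sign II) form the second.
In the first cluster the vowel sign U+093F follows the consonant in
the buffer but is displayed to its left.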

[...]

> There does appear to be a work around, which is to have m17n declare
> the orthographic syllables it receives to be 'grapheme clusters'.

I think that's the right solution; i.e. treat all combined
and re-ordered glyphs as one cluster.

> It solves at least some of the problems above.

Which one is not solved by it?

> However, it then makes editing of the 'clusters' more
> difficult.  Note that there are examples above with 5
> characters in a cluster, and this is by no means the
> limit.

But it seems that the current behavior is accepted, at
least by users of Indic scripts.

By the way, long ago I proposed the following commands, which
enable you to move point into a grapheme cluster (by temporarily
suppressing composition of the cluster).  They worked in an old
Emacs (I don't remember the version), but not in the latest Emacs.

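(defun forward-char-intrusive ()
  "Move point forward one character, suppressing point adjustment."
  (interactive)
  ;; Ask the command loop not to move point out of a composition.
  (setq disable-point-adjustment t)
  (forward-char 1))

(defun backward-char-intrusive ()
  "Move point backward one character, suppressing point adjustment."
  (interactive)
  (setq disable-point-adjustment t)
  (forward-char -1))

(global-set-key (kbd "C-S-f") #'forward-char-intrusive)
(global-set-key (kbd "C-S-b") #'backward-char-intrusive)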

---
K. Handa
handa <at> gnu.org



