GNU bug report logs - #24405
24.5; Possibly ``forward-word`` doesn't respect ``word-combining-categories`` for word boundaries on changing between latin/phonetic scripts.


Package: emacs;

Reported by: Oleksandr Gavenko <gavenkoa <at> gmail.com>

Date: Sat, 10 Sep 2016 08:35:01 UTC

Severity: normal

Tags: notabug

Found in version 24.5

Done: Stefan Kangas <stefan <at> marxist.se>

Bug is archived. No further changes may be made.



Message #19 received at 24405 <at> debbugs.gnu.org (full text, mbox):

From: Oleksandr Gavenko <gavenkoa <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24405 <at> debbugs.gnu.org
Subject: Re: bug#24405: 24.5; Possibly ``forward-word`` doesn't respect
 ``word-combining-categories`` for word boundaries on changing between
 latin/phonetic scripts.
Date: Sun, 11 Sep 2016 14:57:33 +0300
On 2016-09-10, Eli Zaretskii wrote:

>> Another solution is to invent own:
>> 
>>   (define-category ?p "Phonetic")
>> 
>> and to add it to IPA characters:
>> 
>>   (mapc (lambda (ch) (modify-category-entry ch "p"))
>>         '(?ʌ ?ə ?ɜ ?ɒ ?ɛ ?θ ?ʊ ?ɪ ?ɔ ?ɑ ?ʃ ?ʧ ?ː ?ˈ ?ˌ ?ʒ ?ŋ))
>> 
>> so it becomes possible to use:
>> 
>>   (add-to-list 'word-combining-categories '(?p . ?l))
>>   (add-to-list 'word-combining-categories '(?l . ?p))
>
> That'd be my second best advice.  But I think regular expressions
> should provide a better and easier solution.
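
A minimal sketch of what the regular-expression route could look like. The names my/ipa-word-regexp and my/forward-ipa-word are hypothetical, and the character set is an assumption: ASCII letters plus the IPA symbols mentioned in this thread.

```elisp
;; Hypothetical names; the character set is an assumption built from
;; ASCII letters and the IPA symbols listed in this thread.
(defconst my/ipa-word-regexp
  "[a-zA-Zˈˌːθðŋɡʒʃʧəɜɛʌɒɔɑæʊɪ]+"
  "Regexp matching a run of ASCII letters and selected IPA symbols.")

(defun my/forward-ipa-word ()
  "Move point past the next run matched by `my/ipa-word-regexp'.
Return the new position, or nil if no match is found."
  (interactive)
  (re-search-forward my/ipa-word-regexp nil t))
```

This leaves the global category, script and syntax tables untouched, at the cost of not affecting ``forward-word`` itself.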

This works for me:

  (defconst my/ipa-chars (list ?ˈ ?ˌ ?ː ?ǁ ?ʲ ?θ ?ð ?ŋ ?ɡ ?ʒ ?ʃ ?ʧ ?ə ?ɜ ?ɛ ?ʌ ?ɒ ?ɔ ?ɑ ?æ ?ʊ ?ɪ))
  (define-category ?p "Phonetic")
  (mapc (lambda (ch)
          (cond
           ((eq (aref char-script-table ch) 'phonetic)
            (modify-category-entry ch ?p)
            ;; RESET is t: remove the Latin category from this character.
            (modify-category-entry ch ?l nil t))
           ;; (aref char-script-table ?ˌ) is 'latin, but
           ;; (char-category-set ?ˌ) is ".j".
           ((eq (aref char-script-table ch) 'latin)
            (modify-category-entry ch ?l))))
        my/ipa-chars)
  (add-to-list 'word-combining-categories '(?p . ?l))
  (add-to-list 'word-combining-categories '(?l . ?p))
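
One way to check the effect of such a setup is to record where ``forward-word`` stops in a sample string, before and after evaluating the configuration. The helper below is only a sketch; my/word-stops is a hypothetical name, and it assumes a temporary buffer with default settings is representative of the buffers you care about.

```elisp
(defun my/word-stops (string)
  "Return the positions where `forward-word' stops while crossing STRING.
The scan runs in a temporary buffer, so it sees the global category and
script tables but not any buffer-local syntax or category tables."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let (stops)
      ;; `forward-word' returns nil once it hits the end of the buffer
      ;; without finding another word, which ends the loop.
      (while (forward-word 1)
        (push (point) stops))
      (nreverse stops))))
```

For example, evaluate (my/word-stops "ˈwɜːd") once in a fresh session and once after the setup above; fewer stops in the second result means the boundaries inside the transcription are gone.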

But adding and removing categories feels too low-level. It also requires a
custom category, (define-category ?p "Phonetic"), that Emacs itself does not
define.
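
Since ``define-category`` signals an error when the category already exists, re-evaluating an init file that defines it can fail. A guarded form is sketched below, assuming ?p is still free in the category table in use, as in the snippet above.

```elisp
;; Assumption: ?p is not defined by Emacs itself, as in the snippet above.
;; `category-docstring' returns nil for an undefined category, so the
;; definition runs only the first time this form is evaluated.
(unless (category-docstring ?p)
  (define-category ?p "Phonetic"))
```

If ?p turns out to be taken, ``get-unused-category`` can report a free letter, but the pairs added to ``word-combining-categories`` would then have to use that letter as well.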

This looks easier to me:

  (mapc (lambda (ch)
          (aset char-script-table ch 'latin)
          (modify-syntax-entry ch "w"))
        my/ipa-chars)

But ``char-script-table`` is derived from Unicode, and some code may depend on
this database...
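
If the script table is overridden anyway, keeping the original values makes the change reversible. The following is a sketch only; all my/ipa-... names in it are hypothetical, and it does not touch syntax entries.

```elisp
(defvar my/ipa-saved-scripts nil
  "Alist of (CHAR . ORIGINAL-SCRIPT) recorded before any override.")

(defun my/ipa-override-script (chars script)
  "Set the script of each character in CHARS to SCRIPT.
The original `char-script-table' entry is saved the first time a
character is overridden."
  (dolist (ch chars)
    (unless (assq ch my/ipa-saved-scripts)
      (push (cons ch (aref char-script-table ch)) my/ipa-saved-scripts))
    (aset char-script-table ch script)))

(defun my/ipa-restore-scripts ()
  "Restore every entry recorded in `my/ipa-saved-scripts'."
  (dolist (entry my/ipa-saved-scripts)
    (aset char-script-table (car entry) (cdr entry)))
  (setq my/ipa-saved-scripts nil))
```

Usage would be (my/ipa-override-script my/ipa-chars 'latin) to apply the override and (my/ipa-restore-scripts) to undo it, for instance when some other code turns out to rely on the Unicode-derived values.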

-- 
http://defun.work/




This bug report was last modified 5 years and 294 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.