GNU bug report logs -
#36923
Combining Diacritical Marks are not Latin only
Reported by: Juri Linkov <juri <at> linkov.net>
Date: Sun, 4 Aug 2019 20:50:02 UTC
Severity: normal
Done: Juri Linkov <juri <at> linkov.net>
Bug is archived. No further changes may be made.
Message #11 received at 36923 <at> debbugs.gnu.org:
>> (aref char-script-table ?\N{COMBINING ACUTE ACCENT})
>>
>> could return
>>
>> (latin greek cyrillic)
>>
>> instead of the current
>>
>> latin
>
> char-script-table is documented to yield a single symbol, so returning
> a list would be an incompatible change, which we should avoid.
The docstring of char-script-table says:

  Char table of script symbols.
  It has one extra slot whose value is a list of script symbols.

So it seems char-script-table should yield a list of script symbols?
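
One way to check what the docstring refers to is to look at both values
directly.  A minimal sketch, assuming (as far as I can tell) that the
extra slot belongs to the table as a whole and is separate from the
per-character value returned by aref:

```elisp
;; Per-character lookup: the value stored for one character.
(aref char-script-table ?a)
(aref char-script-table ?\N{COMBINING ACUTE ACCENT})

;; The table's extra slot 0: a single value for the whole table,
;; not something that varies per character.
(char-table-extra-slot char-script-table 0)
```

Evaluating these forms in *scratch* shows whether the list in the
docstring is a per-character result or a property of the table.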
I searched more for char-script-table in the documentation, and one
place where it's used is forward-word.  But I don't understand why
forward-word doesn't stop between “COMBINING ACUTE ACCENT” (which is in
the Latin script) and non-Latin letters.
It is good that it doesn't stop there, and I'm just trying to
understand why, so that the same logic could be used in markchars-mode.
Maybe it doesn't stop because of special script handling in
‘find-word-boundary-function-table’? Or because it ignores all
combining characters?
BTW, while looking at forward-word and right-word I noticed an inconsistency:
there are left-word and right-word commands, but no left-sexp and right-sexp
to accompany forward-sexp.
> More generally, I think what you describe is a clear conceptual bug in
> markchars-mode: it should only pay attention to the script of the base
> characters, not to the script of combining accents. The latter is
> mostly irrelevant, certainly so for the purpose of detecting
> confusables.
Could you suggest a proper function to strip all combining characters
from the string?
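
In the meantime, here is one possible approach, only a sketch: decompose
the string to NFD and drop every character whose Unicode general
category is a mark (Mn, Mc or Me).  The name my-strip-combining-chars
is hypothetical, not an existing Emacs function, and treating exactly
these three categories as "combining" is my assumption:

```elisp
(require 'ucs-normalize)

(defun my-strip-combining-chars (string)
  "Return STRING in NFD form with all mark characters removed.
Hypothetical helper; a character counts as a mark when its Unicode
general category is Mn, Mc or Me."
  (mapconcat
   (lambda (ch)
     (if (memq (get-char-code-property ch 'general-category)
               '(Mn Mc Me))
         ""                 ; drop combining marks
       (string ch)))        ; keep base characters
   ;; Decompose first so that precomposed letters are split
   ;; into base character + combining mark.
   (ucs-normalize-NFD-string string)
   ""))
```

With this, (my-strip-combining-chars "e\u0301") should leave only the
base letter, so the script check in markchars-mode could be applied to
the result.  Note that NFD also splits precomposed letters, which may or
may not be what markchars-mode wants.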
This bug report was last modified 5 years and 347 days ago.