GNU bug report logs - #37036
[PATCH] Inconsistent ASCII and Latin char categories
Reported by: Mattias Engdegård <mattiase <at> acm.org>
Date: Thu, 15 Aug 2019 12:18:02 UTC
Severity: normal
Tags: patch, wontfix
Done: Mattias Engdegård <mattiase <at> acm.org>
Bug is archived. No further changes may be made.
Message #8 received at 37036 <at> debbugs.gnu.org:
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Thu, 15 Aug 2019 14:17:15 +0200
>
> The ASCII (a) and Latin (l) character categories are inconsistent in what characters they contain.
>
> It should be clear what the ASCII category means, but it omits 00-1f (contrary to a comment in the code).
>
> The Latin category isn't exactly defined anywhere but should reasonably comprise letters from Latin-based scripts. Currently, it also includes many control characters and symbols from the ASCII and Latin-1 Supplement blocks, which seems hard to justify.
>
> Other changes to Latin could be argued: what modifiers/combining chars should be included? What about IPA and non-IPA phonetics? Ligatures? What about Latin-derived forms such as circled letters? &c. The attached patch does not go there but only fixes the glaring errors in the 00-ff range.
Did you try moving by words after these changes? What happens in
words that consist of ASCII and non-ASCII Latin characters, for
example?
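
As a rough illustration of the point in the report that the 00-ff range mixes letters with control characters and symbols, the sketch below groups those code points by Unicode general category. This is only an approximation under stated assumptions: Unicode general categories are not the same thing as Emacs character categories, and the helper name `general_category_ranges` is made up for this example.

```python
import unicodedata


def general_category_ranges(start=0x00, end=0xFF):
    """Group code points start..end into runs sharing one Unicode major class.

    Hypothetical helper for illustration; it reports Unicode general
    categories (L = letter, C = control/other, S = symbol, ...), which are
    NOT Emacs character categories.
    """
    runs = []
    for cp in range(start, end + 1):
        # First letter of the two-letter category, e.g. 'Lu' -> 'L'.
        major = unicodedata.category(chr(cp))[0]
        if runs and runs[-1][2] == major:
            runs[-1][1] = cp  # extend the current run
        else:
            runs.append([cp, cp, major])  # start a new run
    return [(lo, hi, major) for lo, hi, major in runs]


# Print one line per run, e.g. the control characters at 00-1f.
for lo, hi, major in general_category_ranges():
    print(f"{lo:02x}-{hi:02x}  {major}")
```

Runs marked `C` or `S` (for example 00-1f, or the multiplication sign at d7) are the kind of non-letter code points the report argues should not be in a Latin letter category.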
This bug report was last modified 5 years and 275 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.