GNU bug report logs - #37036
[PATCH] Inconsistent ASCII and Latin char categories

Package: emacs

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Thu, 15 Aug 2019 12:18:02 UTC

Severity: normal

Tags: patch, wontfix

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.


Message #5 received at submit <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] Inconsistent ASCII and Latin char categories
Date: Thu, 15 Aug 2019 14:17:15 +0200
The ASCII (a) and Latin (l) character categories are inconsistent in which characters they contain.

It should be clear what the ASCII category means, but it currently omits the control characters 00-1f, contrary to a comment in the code.

The Latin category isn't exactly defined anywhere but should reasonably comprise letters from Latin-based scripts. Currently, it also includes many control characters and symbols from the ASCII and Latin-1 Supplement blocks, which seems hard to justify.
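
To make the two intended ranges concrete, here is a minimal sketch in Python (not Emacs code, and not the attached patch). It assumes one possible reading of the categories: ASCII is the full 7-bit range 00-7f, control characters included, and Latin is any letter whose Unicode character name contains "LATIN". The helper names `is_ascii` and `is_latin_letter` are hypothetical and chosen only for illustration.

```python
import unicodedata

# Hypothetical helpers; the names and the heuristic are assumptions,
# not taken from the Emacs sources or from the attached patch.

def is_ascii(cp: int) -> bool:
    """ASCII read as the full 7-bit range 00-7f, controls included."""
    return 0x00 <= cp <= 0x7F

def is_latin_letter(cp: int) -> bool:
    """Assumed reading of 'Latin': a letter (general category L*)
    whose Unicode character name mentions LATIN."""
    ch = chr(cp)
    # Control characters have no name, so fall back to an empty string.
    name = unicodedata.name(ch, "")
    return unicodedata.category(ch).startswith("L") and "LATIN" in name

# Control characters 00-1f all count as ASCII under this reading.
ascii_controls = [cp for cp in range(0x20) if is_ascii(cp)]

# Latin letters within the 00-ff range discussed in this report.
latin_letters = [cp for cp in range(0x100) if is_latin_letter(cp)]

# Sample control characters and symbols that the heuristic rejects:
# TAB, PLUS SIGN, NO-BREAK SPACE, MULTIPLICATION SIGN, DIVISION SIGN.
rejected = [cp for cp in (0x09, 0x2B, 0xA0, 0xD7, 0xF7)
            if not is_latin_letter(cp)]
```

Under this heuristic the ordinal indicators (aa, ba) and the micro sign (b5) are also rejected, because their names do not mention LATIN. Whether they belong in the category is a judgment call that this sketch does not settle.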

Further changes to the Latin category are debatable: which modifiers and combining characters should be included? What about IPA and non-IPA phonetic characters? Ligatures? What about Latin-derived forms such as circled letters? And so on. The attached patch does not go there; it only fixes the glaring errors in the 00-ff range.

[0001-Fix-ASCII-and-Latin-character-categories.patch (application/octet-stream, attachment)]

This bug report was last modified 5 years and 275 days ago.
