GNU bug report logs -
#16731
24.3.50; Latin small letter sharp s is not considered lower-case
Reported by: Jorgen Schaefer <forcer <at> forcix.cx>
Date: Wed, 12 Feb 2014 17:31:02 UTC
Severity: normal
Merged with 10576
Found in version 24.3.50
Fixed in version 28.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Andreas Röhler <andreas.roehler <at> easy-emacs.de>,
> 16731 <at> debbugs.gnu.org
> Date: Thu, 13 Feb 2014 08:37:45 -0500
>
> > How will we then be able to distinguish between lower-case characters
> > that have no upcase variant and characters that are not lower-case
> > characters at all?
>
> Right: to handle this, we need to distinguish characters that are
> lower-case without an uppercase variant from characters which are
> neither lowercase nor uppercase.
>
> We could do that by saying that the upcase table should return nil or -1
> for ß, to indicate that the upcase version is "missing". But such
> a change will probably require carefully revising "all" the code that
> uses those tables.
Right. I can instead suggest a much less intrusive change below. Its
only disadvantage is that if some user or Lisp program overrides the
standard case tables, and actually _wants_ some lower-case characters
to behave as if they weren't, consulting the Unicode tables will undo
such customizations. If this is a concern, perhaps we could compare
the case table with the standard value, and only use the Unicode
attributes when they are equal?
If the approach below is accepted, a related question is how to treat
letters whose category is Lt, i.e. "titlecase" -- do we consider such
letters upper case or don't we?
--- src/buffer.h~0 2014-01-01 09:46:07.000000000 +0200
+++ src/buffer.h 2014-02-13 18:27:32.225839000 +0200
@@ -1349,7 +1349,19 @@ downcase (int c)
}
/* True if C is upper case. */
-INLINE bool uppercasep (int c) { return downcase (c) != c; }
+INLINE bool uppercasep (int c)
+{
+ Lisp_Object val;
+
+ if (downcase (c) != c)
+ return true;
+
+ if (NILP (Vunicode_category_table))
+ return false;
+
+ val = CHAR_TABLE_REF (Vunicode_category_table, c);
+ return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Lu;
+}
/* Upcase a character C known to be not upper case. */
INLINE int
@@ -1364,7 +1376,16 @@ upcase1 (int c)
INLINE bool
lowercasep (int c)
{
- return !uppercasep (c) && upcase1 (c) != c;
+ Lisp_Object val;
+
+ if (!uppercasep (c) && upcase1 (c) != c)
+ return true;
+
+ if (NILP (Vunicode_category_table))
+ return false;
+
+ val = CHAR_TABLE_REF (Vunicode_category_table, c);
+ return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Ll;
}
/* Upcase a character C, or make no change if that cannot be done. */
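
The fallback logic of the patch above can be modeled outside Emacs. The following is only a minimal sketch under stated assumptions: the `toy_` functions and the `toy_category` enum are hypothetical stand-ins for the case tables and for `Vunicode_category_table`, covering just ASCII letters and ß (U+00DF). They are not the Emacs definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the Unicode general category lookup. */
enum toy_category { TOY_CAT_OTHER, TOY_CAT_Lu, TOY_CAT_Ll };

/* Toy downcase table: only 'A'..'Z' have a lower-case mapping. */
static int toy_downcase (int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Toy upcase table: only 'a'..'z' have an upper-case mapping.
   U+00DF maps to itself, which models the "missing" upcase variant.  */
static int toy_upcase1 (int c)
{
  return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/* Toy category table: ASCII letters plus U+00DF, everything else "other". */
static enum toy_category toy_category_of (int c)
{
  if (c >= 'A' && c <= 'Z')
    return TOY_CAT_Lu;
  if ((c >= 'a' && c <= 'z') || c == 0xDF)
    return TOY_CAT_Ll;
  return TOY_CAT_OTHER;
}

/* Case table first; fall back to the category only if it says nothing. */
static bool toy_uppercasep (int c)
{
  if (toy_downcase (c) != c)
    return true;
  return toy_category_of (c) == TOY_CAT_Lu;
}

/* Same two-step structure as the patched lowercasep. */
static bool toy_lowercasep (int c)
{
  if (!toy_uppercasep (c) && toy_upcase1 (c) != c)
    return true;
  return toy_category_of (c) == TOY_CAT_Ll;
}
```

In this model `toy_lowercasep (0xDF)` is true only because of the category fallback, while `toy_lowercasep ('1')` stays false, which is the distinction the quoted question asks for. The sketch does not model a user-overridden case table or the Lt (titlecase) category, so it says nothing about either open question.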
This bug report was last modified 3 years and 311 days ago.