GNU bug report logs - #69968
Case-folding of Mathematical Alphanumeric Symbols


Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Sat, 23 Mar 2024 20:41:02 UTC

Severity: normal

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.


From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 69968 <at> debbugs.gnu.org
Subject: bug#69968: Case-folding of Mathematical Alphanumeric Symbols
Date: Mon, 25 Mar 2024 09:37:10 +0200
>> >> I wonder why case-folding is not supported for letters from
>> >> the Unicode block "Mathematical Alphanumeric Symbols":
>> >> https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
>> >
>> > These are not letters, they are symbols.  And letter-case is not
>> > defined for symbols.
>>
>> 𝘋𝘰 𝘺𝘰𝘶 𝘳𝘦𝘢𝘭𝘭𝘺 𝘵𝘩𝘪𝘯𝘬 𝘵𝘩𝘪𝘴 𝘵𝘦𝘹𝘵 𝘪𝘴 𝘯𝘰𝘵 𝘸𝘳𝘪𝘵𝘵𝘦𝘯 𝘸𝘪𝘵𝘩 𝙡𝙚𝙩𝙩𝙚𝙧𝙨?
>
> What does that prove?  The fact that the glyphs look like normal
> letters doesn't mean they are.  Like ℵ and ℶ are not Hebrew letters
> they look like (and have left-to-right directionality).  And similarly
> with 𞸀, 𞸁 and other mathematical symbols in that block aren't Arabic
> letters, and in particular don't shape like Arabic letters.

I agree that these characters were intended to be used only
as mathematical symbols.  The problem is that often these symbols
are abused as letters to apply more styles in applications that
don't support styles.  There are special sites such as
https://www.textconverter.net/
that convert ASCII text to styled Unicode characters.

I don't use such sites, but once tried to copy such text to Emacs
and discovered that Isearch already nicely supports the search
of these characters by char-fold.  So it was a surprise that,
unlike char-fold, case-fold has no effect on them, so a search
cannot ignore their case.
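
The difference described above can be reproduced outside Emacs with the Unicode data that ships with Python.  This is only an illustrative sketch, not Emacs code: it assumes that char-fold's equivalences come largely from Unicode compatibility decompositions, and it uses Python's `str.lower`/`str.upper` as a stand-in for the case mappings in UnicodeData.txt.  The helper names `has_case_mapping` and `compat_fold` are made up for this example.

```python
import unicodedata

# One capital "A" from each of the blocks mentioned in this thread.
SAMPLES = {
    "fullwidth": "\uFF21",                    # FULLWIDTH LATIN CAPITAL LETTER A
    "circled": "\u24B6",                      # CIRCLED LATIN CAPITAL LETTER A
    "math sans-serif italic": "\U0001D608",   # MATHEMATICAL SANS-SERIF ITALIC CAPITAL A
}


def has_case_mapping(ch: str) -> bool:
    """Return True if the Unicode data defines a case conversion for CH."""
    return ch.lower() != ch or ch.upper() != ch


def compat_fold(ch: str) -> str:
    """Return the compatibility decomposition (NFKD) of CH."""
    return unicodedata.normalize("NFKD", ch)


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        print(f"{label:24} {unicodedata.name(ch)}")
        print(f"{'':24} case mapping: {has_case_mapping(ch)}, "
              f"decomposes to: {compat_fold(ch)!r}")
```

All three characters decompose to a plain "A", which is why a decomposition-based fold can match them, but only the fullwidth and circled letters carry a case mapping.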

>> >> Case-folding is already supported for some characters from other
>> >> Unicode blocks such e.g. FULLWIDTH LATIN CAPITAL LETTERs,
>> >> CIRCLED LATIN CAPITAL LETTERs, etc.
>> >
>> > That's because UnicodeData.txt defines their letter-case conversions.
>>
>> Ok, then it's very strange that the Unicode standard doesn't define
>> letter-case conversions for other letters.  But what can we do.
>
> We can define case-conversions for them if we decide to do so.
> Moreover, Lisp programs which for some reason need that can do that
> themselves, even if by default there are no case-conversions defined
> for them.  The question is when and why is this needed?
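
As a rough illustration of what "defining case-conversions for them" would involve, the sketch below derives capital/small pairs for the Mathematical Alphanumeric Symbols block from the character names.  It is written in Python rather than Emacs Lisp and is not how Emacs builds its case tables; the names `math_case_pairs`, `PAIRS` and `math_lower` are hypothetical.  It assumes that a small counterpart, where one exists, has the same name with "CAPITAL" replaced by "SMALL"; characters with no such counterpart (the block has gaps, e.g. there is no MATHEMATICAL ITALIC SMALL H) are simply skipped.

```python
import unicodedata

# Code point range of the "Mathematical Alphanumeric Symbols" block.
BLOCK = range(0x1D400, 0x1D800)


def math_case_pairs() -> dict:
    """Map each capital symbol in the block to its small counterpart."""
    pairs = {}
    for cp in BLOCK:
        name = unicodedata.name(chr(cp), "")  # "" for unassigned code points
        if " CAPITAL " not in name:
            continue
        try:
            small = unicodedata.lookup(name.replace(" CAPITAL ", " SMALL "))
        except KeyError:
            continue  # no character with that name: leave it unpaired
        pairs[chr(cp)] = small
    return pairs


PAIRS = math_case_pairs()


def math_lower(text: str) -> str:
    """Lowercase TEXT, also applying the derived pairs for math symbols."""
    return "".join(PAIRS.get(ch, ch.lower()) for ch in text)
```

A table like `PAIRS` is the kind of data a program would have to supply itself, since the Unicode data files define no case mappings for these characters.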

Case-conversions for them could probably be added later, once
there is more support for such symbols in Emacs: for example,
after creating an input method to input them, or better,
a command that converts a region of ASCII chars to them, etc.
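
The conversion mentioned above (ASCII text to styled symbols) is simple for styles whose letters are contiguous in the block.  The sketch below handles only the sans-serif italic style, whose capitals start at U+1D608 and small letters at U+1D622; it is a Python illustration of the idea, not an Emacs command, and `to_sans_serif_italic` is a hypothetical name.  Other styles have gaps and would need a lookup table instead of a fixed offset.

```python
import unicodedata

# First code points of the sans-serif italic alphabet in the block.
CAPITAL_A = 0x1D608  # MATHEMATICAL SANS-SERIF ITALIC CAPITAL A
SMALL_A = 0x1D622    # MATHEMATICAL SANS-SERIF ITALIC SMALL A


def to_sans_serif_italic(text: str) -> str:
    """Replace ASCII letters in TEXT with sans-serif italic math symbols."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)  # digits, spaces, punctuation stay as they are
    return "".join(out)


if __name__ == "__main__":
    styled = to_sans_serif_italic("Do you really think?")
    print(styled)
    # Compatibility normalization recovers the ASCII text.
    print(unicodedata.normalize("NFKC", styled))
```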




This bug report was last modified 1 year and 115 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.