#13041 - 24.2; diacritic-fold-search

GNU bug report logs - #13041
24.2; diacritic-fold-search

Package: emacs

Reported by: perin <at> acm.org

Date: Fri, 30 Nov 2012 18:31:02 UTC

Severity: wishlist

Found in version 24.2

Fixed in version 25.1

Done: Michael Albinus <michael.albinus <at> gmx.de>

Bug is archived. No further changes may be made.


From: "Drew Adams" <drew.adams <at> oracle.com>
To: "'Eli Zaretskii'" <eliz <at> gnu.org>, "'Juri Linkov'" <juri <at> jurta.org>
Cc: perin <at> acm.org, 13041 <at> debbugs.gnu.org, perin <at> panix.com
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Sat, 1 Dec 2012 08:38:45 -0800

> I don't understand why this thread is talking only about Latin
> characters with diacritics.  That is a special case of what Unicode
> calls "compatibility equivalence" (q.e.).  For example, even in the
> Latin environments, don't you want to find "sniﬀ" when searching for
> "sniff", and vice versa?  And there are similar issues in many
> non-Latin scripts.

Actually, in the original thread I made the same point.
Please see that discussion for this and other points.
http://lists.gnu.org/archive/html/help-gnu-emacs/2012-11/msg00429.html

> The decomposition of a character such as 'è' is given by
> the Unicode database...  Emacs already supports these
> decomposition properties.

That's good news (new to me).  So it sounds like even the most hopeful
wanna-haves of the discussion could perhaps be realized without too much
trouble.

> Using these properties, every search string can be converted to a
> sequence of non-decomposable characters (this process is recursive,
> because the 'decomposition' property can use characters that
> themselves are decomposable).  If the user wants to ignore diacritics,
> then the diacritics should be dropped from the decomposition sequence
> before starting the search.  E.g., for the decomposition of è above,
> we will drop the 768 and will be left with 101, which is 'e'.  Then
> searching for that string should apply the same decomposition
> transformation to the text being searched, when comparing them.
>
> This would be the most general way of solving this issue, a way that
> is not limited to diacritics nor to Latin scripts.  And doing that
> will move Emacs closer to the goal of being Unicode compatible, since
> support for this is required by the Unicode Standard.

This sounds great.  I really hope someone with the time and knowledge
adds such a feature soon (even though, to be clear, I personally do not
have much need for it).  I think it would be very handy for many users -
most welcome.
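
For illustration only, the decomposition approach quoted above can be sketched outside Emacs. The following is a minimal Python sketch, not the Emacs implementation: it assumes that the standard `unicodedata` module's NFKD normalization stands in for the recursive 'decomposition' lookup, and that "diacritics" can be approximated as characters with a nonzero canonical combining class. The names `fold` and `fold_search` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Fully decompose text and drop combining marks (a rough diacritic fold)."""
    # NFKD applies canonical and compatibility decompositions recursively,
    # so U+00E8 becomes U+0065 U+0300 (decimal 101, 768) and the
    # ligature U+FB00 becomes the two letters "ff".
    decomposed = unicodedata.normalize("NFKD", text)
    # Assumption: a diacritic is any character with a nonzero canonical
    # combining class, e.g. U+0300 COMBINING GRAVE ACCENT.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


def fold_search(needle: str, haystack: str) -> bool:
    """Apply the same folding to both sides before comparing."""
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    print([ord(ch) for ch in unicodedata.normalize("NFKD", "\u00e8")])
    print(fold("\u00e8"))
    print(fold_search("sniff", "a sni\ufb00 of air"))
```

This sketch only reports whether a match exists. A real search command would also need to map match positions in the folded text back to positions in the original buffer, which is not attempted here.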

This bug report was last modified 8 years and 342 days ago.

