GNU bug report logs - #13041
24.2; diacritic-fold-search
Reported by: perin <at> acm.org
Date: Fri, 30 Nov 2012 18:31:02 UTC
Severity: wishlist
Found in version 24.2
Fixed in version 25.1
Done: Michael Albinus <michael.albinus <at> gmx.de>
Bug is archived. No further changes may be made.
Message #188 received at 13041 <at> debbugs.gnu.org:
> - leave the text alone but give each string that should be handled
> specially a text property with the normalized form. In this case
> searching has to pay attention to these properties, if present.
>
> - normalize the text and give each normalized string a text property
> with the original text. In this case searching will proceed as usual
> but you have to restore the original text when done.
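The second approach quoted above (normalize the text, remember the original, restore it afterwards) can be illustrated outside Emacs with a rough Python sketch. The names `normalize_with_records`, `restore`, and the `special` table are hypothetical and only stand in for text properties; this is not how Emacs implements anything.

```python
def normalize_with_records(text, special):
    """Replace every character listed in `special` by its normalized form.

    Returns (normalized_text, records). Each record is (start, end, original),
    with offsets into normalized_text, playing the role of a text property
    that remembers the original text.
    """
    out = []
    records = []
    pos = 0  # length of the normalized output built so far
    for ch in text:
        repl = special.get(ch)
        if repl is None:
            out.append(ch)
            pos += 1
        else:
            records.append((pos, pos + len(repl), ch))
            out.append(repl)
            pos += len(repl)
    return "".join(out), records


def restore(normalized, records):
    """Undo normalize_with_records using the saved records."""
    parts = []
    prev = 0
    for start, end, original in records:
        parts.append(normalized[prev:start])
        parts.append(original)
        prev = end
    parts.append(normalized[prev:])
    return "".join(parts)


# Hypothetical mapping table; a real one would come from the Unicode data files.
special = {"\ufb00": "ff", "\u00df": "ss"}
original = "su\ufb00er Stra\u00dfe"
normalized, records = normalize_with_records(original, special)
```

An ordinary substring search can then run on `normalized`, and `restore(normalized, records)` gives back the original text as long as the normalized text was not edited in between.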
This reminds me of the idea that searching should take into account the text
displayed with the `display' property and other display-related properties.
That seems more difficult to implement.
> Also I don't know how to handle the return value and/or highlighting
> when, for example, finding a match for "suf" within "suﬀer". For
> example, replacing each occurrence of "suf" with the empty string should
> leave us with "fer" here.
I believe such ligature characters should be handled as a whole,
i.e. "suf" doesn't match "suﬀer", only "suff" should match it.
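The "handle the ligature as a whole" rule can be sketched in Python, assuming the word is written with the ligature "ﬀ" (U+FB00). The helper names `fold_with_origin` and `whole_char_search` are hypothetical. The sketch expands each character with NFKD, remembers which original character every folded character came from, and accepts a match only if it starts and ends on an original character boundary.

```python
import unicodedata


def fold_with_origin(text):
    """Return (folded, origin): origin[i] is the index in `text` of the
    original character that produced folded[i]."""
    folded = []
    origin = []
    for i, ch in enumerate(text):
        for piece in unicodedata.normalize("NFKD", ch):
            folded.append(piece)
            origin.append(i)
    return "".join(folded), origin


def whole_char_search(needle, text):
    """Find `needle` in the folded text, rejecting matches that would split
    an original character. Returns (start, end) offsets into `text`, or None."""
    if not needle:
        return None
    folded, origin = fold_with_origin(text)
    pos = folded.find(needle)
    while pos != -1:
        end = pos + len(needle)
        starts_ok = pos == 0 or origin[pos - 1] != origin[pos]
        ends_ok = end == len(folded) or origin[end - 1] != origin[end]
        if starts_ok and ends_ok:
            return origin[pos], origin[end - 1] + 1
        pos = folded.find(needle, pos + 1)
    return None
```

With `text = "su\ufb00er"`, searching for "suf" finds nothing because the match would end in the middle of the ligature, while "suff" matches the first three original characters, ligature included. Note that NFKD also splits accented letters into a base letter plus combining marks, which this sketch does not strip.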
> I have no idea how many mappings like "ß" -> "ss" exist. The problem is
> that we don't get them from UnicodeData.txt IIUC.
I can't find them in UnicodeData.txt either. Looking at the files in
http://www.unicode.org/Public/UNIDATA/ I can find them in the file
http://www.unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt
which is derived from
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt
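The difference between the two data sources can be seen from Python, used here only as an illustration. Its `unicodedata.normalize` works from the decomposition mappings in UnicodeData.txt, while `str.casefold` applies Unicode full case folding, which is where "ß" -> "ss" comes from. The function `fold_for_search` is a hypothetical helper, a minimal sketch of a search key and not what Emacs does.

```python
import unicodedata


def fold_for_search(s):
    """Case-fold, decompose, and drop combining marks to build a search key."""
    decomposed = unicodedata.normalize("NFKD", s.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Decomposition alone leaves "ß" untouched; case folding expands it.
sharp_s = "\u00df"
decomposed_only = unicodedata.normalize("NFKD", sharp_s)
case_folded = sharp_s.casefold()
```

Here `decomposed_only` is still "ß" and `case_folded` is "ss", so `fold_for_search("Straße")` and `fold_for_search("STRASSE")` produce the same key.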
This bug report was last modified 8 years and 342 days ago.