GNU bug report logs - #540: 23.0.60; Unicode search bug
Reported by: Juri Linkov <juri <at> jurta.org>
Date: Sun, 6 Jul 2008 18:55:05 UTC
Severity: normal
Done: Chong Yidong <cyd <at> stupidchicken.com>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
This is an automatic notification regarding your bug report
which was filed against the emacs package:
#540: 23.0.60; Unicode search bug
It has been closed by Chong Yidong <cyd <at> stupidchicken.com>.
Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Chong Yidong <cyd <at> stupidchicken.com> by
replying to this email.
--
540: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=540
Emacs Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Andreas Schwab <schwab <at> suse.de> writes:
> Should be fixed now.
Thanks!
[Message part 3 (message/rfc822, inline)]
There is a weird bug in searching Unicode text. The search function
fails on Cyrillic letters between codepoints #x0400 and #x041f, but
successfully finds a Cyrillic letter between #x0420 and #x042f.
I tried to debug this and saw that when the search fails
it calls `boyer_moore', and when it succeeds
it calls `simple_search'. I checked the Unicode properties,
but everything seems correct.
This bug didn't exist before the Unicode merge.
The easiest way to reproduce it: run `emacs -Q',
put in the *scratch* buffer the following 4 lines
(note the leading space):
(search-forward " П" nil t)
(search-forward " Р" nil t)
 П
 Р
and type `C-x C-e' after each of the first two lines.
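For a reproduction that does not depend on the contents of *scratch*,
the recipe above can be sketched as a single function. This is an
illustrative sketch, not part of the original report: the name
`bug540-repro' is hypothetical, and `case-fold-search' is bound to t
on the assumption that this matches the `emacs -Q' default.

```elisp
;; Hypothetical helper, for illustration only.
(defun bug540-repro ()
  "Search a temporary buffer for \" П\" and \" Р\".
Return a list with one element per search: the buffer position
after the match, or nil if that search failed."
  (with-temp-buffer
    ;; The same two target lines as in the recipe, each with its
    ;; leading space.
    (insert " П\n Р\n")
    ;; Assumption: t is the default value under `emacs -Q'.
    (let ((case-fold-search t))
      (mapcar (lambda (needle)
                ;; Restart every search from the top of the buffer.
                (goto-char (point-min))
                (search-forward needle nil t))
              '(" П" " Р")))))

(bug540-repro)
```

Each element of the returned list is the buffer position after the
match, or nil if that search failed, so on a build without the bug
both elements should be non-nil.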
In GNU Emacs 23.0.60 (x86_64-pc-linux-gnu)
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
--
Juri Linkov
http://www.jurta.org/emacs/
This bug report was last modified 15 years and 246 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.