#16800 - 24.3; flyspell works slow on very short words at the end of big file

GNU bug report logs - #16800
24.3; flyspell works slow on very short words at the end of big file

Package: emacs;

Reported by: Aleksey Cherepanov <aleksey.4erepanov <at> gmail.com>

Date: Tue, 18 Feb 2014 20:59:02 UTC

Severity: normal

Found in version 24.3

Fixed in version 24.5

Done: Agustin Martin <agustin6martin <at> gmail.com>

Bug is archived. No further changes may be made.

From: Aleksey Cherepanov <aleksey.4erepanov <at> gmail.com>
To: Agustin Martin <agustin.martin <at> hispalinux.es>
Cc: 16800 <at> debbugs.gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Sun, 2 Mar 2014 01:39:06 +0400

On Sat, Mar 01, 2014 at 03:11:41AM +0400, Aleksey Cherepanov wrote:
> I've wrote a small fuzzer. It is in attach. To run it:
> $ LANG=C emacs -Q --eval '(load-file "t2.el")'
> Then C-j to start. It modifies buffer you are in.

A new version is attached. M-j retries the last macro, or the macro
specified in the my-macro variable. For manual experiments, C-o and
C-u C-o define flyspell-word-search-* as my-test-*-(orig|new). I have
improved the output, though, so C-j should be enough.

> > Hope no one will generate files with words containing something in
> > OTHERCHARS.
>
> Why?
>
> Otherchars are not rare as of ' is there for "american" dictionary. So
> even this email contains such words ("while's").
>
> BTW quite interesting flyspell behaviour could be observed with
> "met'met'and": if you jump back and forth over this word then met'met
> is highlighted when you are at the beginning and met'and is
> highlighted when you are at the end.
>
> Also "met'met'and met'and" highlights both met'and as mis-spelled (the
> second met'and is not marked as duplicate).

I think the original search for "n'n" in "n'n'n'n" finds only
(n'n)'(n'n) but not n'(n'n)'n. Our search marks the first word as a
duplicate when running (kbd "n'n SPC en'n'n C-a"), while the original
search does not. Which behaviour is preferable? Should the first word
of "n'n en'n'n" be marked as a duplicate?

> Are there any variables that could affect search like
> case-fold-search? My fuzzer does not set them but users could have
> them set.

Also, my fuzzer does not exercise bounds for the search. We will be in
trouble if the search bound falls on a word boundary, because we need
one more char. We could extend the bound by 1 char to solve that,
though.

Currently only forward search is enabled in my fuzzer. Adjust the setup
at the end of the file as needed.

I've implemented a variant of forward search using a regexp. It seems
that the group in the regexp does not slow forward search down, though
I did not measure it carefully. The function is shorter with a regexp.
Maybe we should make a correct variant before a fast one... %-)

Also, forward search works a bit faster in general, so we could try to
implement backward search through forward search.

I've removed (goto-char (1+ p)) so that it does not fail on
(kbd "nd SPC d'nd SPC nd SPC met C-a").

At the moment the fuzzer can pass several thousand tests without
failing. You need to wait for failures or improve the test generator.

Thanks!

--
Regards,
Aleksey Cherepanov

[t2.el (text/plain, attachment)]
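
The attachment is not reproduced in this log. As a rough illustration only, the regexp-based forward search and the "one more char" bound problem described in the message might look something like the sketch below. The function names, the delimiter regexp, and the default OTHERCHARS value of "'" are all assumptions made for this example; none of it is taken from flyspell.el or from t2.el. The sketch also takes one side of the open question above: a word counts as matched only when it is delimited by characters that are neither letters nor OTHERCHARS, so "n'n" is not found inside "en'n'n".

```elisp
;; Hypothetical sketch; the names and the regexp are assumptions,
;; not code from flyspell.el or from the attached t2.el.
(defun my-sketch-word-search-forward (word bound &optional otherchars)
  "Return the end position of the next whole-word match of WORD, or nil.
Search forward from point up to BOUND (nil means no bound).  A word
counts as whole when it is delimited by characters that are neither
letters nor members of OTHERCHARS (a string, default \"'\")."
  (let* ((case-fold-search nil)         ; keep the match case-sensitive
         (other (or otherchars "'"))
         ;; Assumes OTHERCHARS holds no bracket-special chars (^ ] -).
         (delim (concat "[^[:alpha:]" other "]"))
         ;; Buffer start may stand in for the leading delimiter only
         ;; when the search really starts there.
         (prefix (if (bobp) (concat "\\(?:\\`\\|" delim "\\)") delim))
         (re (concat prefix
                     "\\(" (regexp-quote word) "\\)"
                     "\\(?:" delim "\\|\\'\\)"))
         ;; The trailing delimiter lies one char past the word, so
         ;; extend BOUND by one char.
         (limit (and bound (min (point-max) (1+ bound)))))
    (save-excursion
      ;; Step back so the delimiter just before point can be matched.
      (unless (bobp) (backward-char))
      (when (re-search-forward re limit t)
        (match-end 1)))))

(defun my-sketch-search-in-string (text word &optional start bound)
  "Run the sketch on TEXT in a temporary buffer, starting at START."
  (with-temp-buffer
    (insert text)
    (goto-char (or start (point-min)))
    (my-sketch-word-search-forward word bound)))

(my-sketch-search-in-string "n'n en'n'n" "n'n")      ; => 4, the first word
(my-sketch-search-in-string "n'n'n'n" "n'n")         ; => nil, no whole word
(my-sketch-search-in-string "met and met" "and" 5 8) ; => 8, bound at word end
```

In the last call the bound sits exactly at the end of "and". Without the one-char extension the trailing delimiter would fall outside the bound and the search would return nil, which is the situation the message warns about.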

