#16800 - 24.3; flyspell works slow on very short words at the end of big file

GNU bug report logs - #16800
24.3; flyspell works slow on very short words at the end of big file

Package: emacs;

Reported by: Aleksey Cherepanov <aleksey.4erepanov <at> gmail.com>

Date: Tue, 18 Feb 2014 20:59:02 UTC

Severity: normal

Found in version 24.3

Fixed in version 24.5

Done: Agustin Martin <agustin6martin <at> gmail.com>

Bug is archived. No further changes may be made.


From: Aleksey Cherepanov <aleksey.4erepanov <at> gmail.com>
To: Agustin Martin <agustin.martin <at> hispalinux.es>
Cc: 16800 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Mon, 24 Feb 2014 03:02:51 +0400

I've performed some tests against my .org file (not in emacs -Q):

(insert
 (mapconcat
  (lambda (re)
    (save-excursion
      (let ((time (current-time))
            (count 0))
        (while (re-search-backward re nil t)
          (setq count (1+ count)))
        (format "%d: %S :: %s"
                count
                (subtract-time (current-time) time)
                re))))
  '("\\<[[:alpha:]]"
    "\\b[[:alpha:]]"
    "\\([^[:alpha:]]\\|\\b\\)[[:alpha:]]"
    "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]"
    "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]"
    "\\(?:[^[:alpha:]]\\)[[:alpha:]]"
    "[^[:alpha:]][[:alpha:]]"
    "\\(?:\\b\\|'\\)[[:alpha:]]"
    "\\(?:[^[:alpha:]]\\|\\`\\)\\([[:alpha:]]+\\)"
    "\\([^[:alpha:]]\\|\\`\\)\\(?:[[:alpha:]]+\\)"
    "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]+")
  "\n"))

Matches| Time | Regexp tried
299158: (0 2 841190 614000) :: \<[[:alpha:]]
299158: (0 2 876846 547000) :: \b[[:alpha:]]
307919: (0 3 321676 163000) :: \([^[:alpha:]]\|\b\)[[:alpha:]]
307899: (0 3 291931 838000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 821347 257000) :: \(?:[^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 760125 839000) :: \(?:[^[:alpha:]]\)[[:alpha:]]
307899: (0 2 765410 758000) :: [^[:alpha:]][[:alpha:]]
299518: (0 2 998895 976000) :: \(?:\b\|'\)[[:alpha:]]
307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+

I should admit that word search breaks things even for a setup with
[[:alpha:]]: a0a is 1 word for emacs and 2 for flyspell. I missed it
because Russian behaves differently (there is a word boundary on the
border between digits and Russian letters). My bad.

307899: (0 2 760125 839000) :: \(?:[^[:alpha:]]\)[[:alpha:]]
307899: (0 2 765410 758000) :: [^[:alpha:]][[:alpha:]]

These two suggest that we may get a speedup if we do not check for the
beginning of the buffer in the regexp but check it separately. But I
doubt it is worth it.
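The idea of checking the beginning of the buffer separately could look
roughly like the sketch below. It is only an illustration under stated
assumptions: `my-count-backward' and `my-count-word-starts' are
hypothetical helper names, not functions from flyspell or ispell, and
the sketch counts matches rather than reproducing what flyspell does
with them.

```elisp
;; Sketch only.  Both helper names are hypothetical; nothing here is
;; taken from flyspell.el or ispell.el.

(defun my-count-backward (regexp)
  "Count the matches of REGEXP found by searching backward from point."
  (save-excursion
    (let ((count 0))
      (while (re-search-backward regexp nil t)
        (setq count (1+ count)))
      count)))

(defun my-count-word-starts ()
  "Count starts of alphabetic runs before point.
The regexp itself does not test for the beginning of the buffer;
that case is handled by a separate check."
  (+ (my-count-backward "[^[:alpha:]][[:alpha:]]")
     ;; The regexp requires a non-letter before the letter, so a
     ;; letter in the very first position is never matched by it.
     ;; Count that letter here, but only if it lies before point.
     (if (and (> (point) (point-min))
              (save-excursion
                (goto-char (point-min))
                (looking-at "[[:alpha:]]")))
         1
       0)))

;; Compare the three approaches on a tiny buffer.
(with-temp-buffer
  (insert "a0a bc")
  (list (my-count-backward "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]")
        (my-count-word-starts)
        (my-count-backward "\\<[[:alpha:]]")))
```

On "a0a bc" in a fresh temporary buffer, the first two counts agree
(3), while the \< variant gives 2 because a0a is a single word under
the default syntax table. That is the a0a difference mentioned above.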
On Sun, Feb 23, 2014 at 11:56:59PM +0400, Aleksey Cherepanov wrote:
> Also not capturing group ("\\(?:") could be used because we do not
> need a match data of the first group. It should work faster but I
> don't really know.

307899: (0 3 291931 838000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 821347 257000) :: \(?:[^[:alpha:]]\|\`\)[[:alpha:]]

The test shows that the non-capturing group is faster.

> Maybe it would be faster to not capture word but capture one char or
> void but I doubt the difference would be noticeable.

307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+

Unexpectedly, capturing the word works a bit faster. Maybe the cause
is not the word itself but that it is the second group, and it would
behave differently for a forward search. Or using alpha+ instead of a
fixed word caused it. Anyway, the difference is very small.

Capturing the word allows us to write a function that wraps a word
into a regexp, much as the word-search-regexp function wraps a word
for the word-search-forward/-backward functions.

> I guess that \b would work faster than the group so we could have 'if'
> statement around the whole loop that has one implementation with \b
> for case when casechars are "[[:alpha:]]" and not-casechars are
> "[^[:alpha:]]" and another implementation as above for other cases.
> But it seems cumbersome.

My guess was wrong: \b works slower than the group. It is also
inappropriate here anyway.

Thanks!

--
Regards,
Aleksey Cherepanov

This bug report was last modified 10 years and 137 days ago.

