GNU bug report logs - #7343
Making flyspell incredibly fast when checking whole files

Package: emacs;

Reported by: Brandon Craig Rhodes <brandon <at> rhodesmill.org>

Date: Sat, 6 Nov 2010 15:18:02 UTC

Severity: normal

Done: Agustin Martin <agustin.martin <at> hispalinux.es>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 7343 in the body.
You can then email your comments to 7343 AT debbugs.gnu.org in the normal way.


Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#7343; Package emacs. (Sat, 06 Nov 2010 15:18:02 GMT)

Acknowledgement sent to Brandon Craig Rhodes <brandon <at> rhodesmill.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 06 Nov 2010 15:18:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Brandon Craig Rhodes <brandon <at> rhodesmill.org>
To: bug-gnu-emacs <at> gnu.org
Subject: Making flyspell incredibly fast when checking whole files
Date: Sat, 06 Nov 2010 10:03:52 -0400
[Message part 1 (text/plain, inline)]
Spell-checking programs like "aspell" and "hunspell" are blazingly fast
when simply asked to check words with their "-l" option, but become much
slower (the difference is often one or more orders of magnitude,
depending on the dictionary size) when asked about words interactively
because in that case they generate helpful near-misses.

(Actually, "aspell" can turn this off even in interactive mode, which
might become the basis of a further patch; but right now I will confine
myself to submitting this one, since it has gotten my Emacs running fast
enough again that I am happy.)

Anyway, flyspell does try to take advantage of the above behavior by
checking whether a region is larger than flyspell-large-region
characters, and if so then it runs the spell checker as a separate
process with "-l".  But then it does something that, in many cases, is
rather ruinous: it takes every misspelling so identified, and passes it
*back* through the normal interactive spell-checking logic!  This is
because all of the real logic of what to do with a misspelling - how to
highlight it, how to search for nearby instances of the same word, how
to cache spellings, and so forth - is bound up in flyspell-word, so the
flyspell-external-point-words function, which processes the actual
misspellings discovered by flyspell-large-region, really has no other
choice but to call flyspell-word for each misspelling.
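
To make the cost concrete, here is a rough sketch of that flow in Emacs
Lisp.  It is not the flyspell.el code: every `sketch-' name is
hypothetical, the threshold value is made up, and both spellers are
stubs (the interactive one merely counts how often it is asked).

```elisp
;; Sketch only: all `sketch-' names are hypothetical, not flyspell.el API.

(defvar sketch-large-region 1000
  "Made-up threshold standing in for `flyspell-large-region'.")

(defvar sketch-interactive-queries 0
  "How many times the stubbed interactive speller has been asked.")

(defun sketch-interactive-check (word)
  "Stub for the slow interactive query about WORD.
A real speller would also compute near-misses here.
Return non-nil if WORD is spelled correctly."
  (setq sketch-interactive-queries (1+ sketch-interactive-queries))
  (not (member word '("teh" "recieve"))))

(defun sketch-batch-misspellings (words)
  "Stub for the fast \"-l\" pass: return the misspelled members of WORDS."
  (let (bad)
    (dolist (w words (nreverse bad))
      (when (member w '("teh" "recieve"))
        (push w bad)))))

(defun sketch-check-region (words size)
  "Check WORDS from a region of SIZE characters; return the misspellings."
  (if (> size sketch-large-region)
      ;; Large region: the batch pass finds the misspellings, and then
      ;; each one is handed to the interactive check a second time.
      (let ((bad (sketch-batch-misspellings words)))
        (dolist (w bad bad)
          (sketch-interactive-check w)))
    ;; Small region: every word goes through the interactive check.
    (let (bad)
      (dolist (w words (nreverse bad))
        (unless (sketch-interactive-check w)
          (push w bad))))))
```

Starting from a counter of zero, evaluating
(sketch-check-region '("teh" "cat" "recieve") 5000) takes the large-region
branch: the batch pass already knows both misspellings, yet the counter
still ends at 2, one redundant interactive query per misspelling.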

So to let flyspell-large-region enjoy the speed that it really should,
we need to tell it never to re-check its words against the live *spell
process attached to Emacs, because that is (a) redundant and (b) very
expensive since, this second time, the spell checker will pause to
generate near-misses.

A patch is attached that fixes this problem, and - here on my laptop, at
least - makes flyspell blazing fast at even large files.  The mechanism
is simple: I have added a second optional argument to flyspell-word,
named "known-misspelling", that tells flyspell-word that the word has
already been checked and is a misspelling and does not need to be
checked again.  Then, down in the function, I simply placed the entire
interactive session with ispell/aspell/hunspell inside of an "if".
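
Stripped of everything else that flyspell-word does, the idea can be
sketched as below.  This is only an illustration under assumptions: the
`demo-' names are hypothetical, and the stub stands in for the real
round trip to the spell-checker process.  The forged value has the same
shape as the one used in the attached patch.

```elisp
;; Sketch only: the `demo-' names are hypothetical, not flyspell.el API.

(defvar demo-speller-calls 0
  "How many times the stubbed interactive speller has been queried.")

(defun demo-ask-speller (word)
  "Stub for the interactive round trip about WORD.
Return t for a correct word, otherwise a list in the same
\(WORD OFFSET MISSES GUESSES) shape that the patch forges."
  (setq demo-speller-calls (1+ demo-speller-calls))
  (if (member word '("teh" "recieve"))
      (list word 0 '("placeholder") '())
    t))

(defun demo-check-word (word &optional known-misspelling)
  "Return the spelling verdict for WORD.
With KNOWN-MISSPELLING non-nil, skip the speller entirely and
forge a misspelling result that carries no suggestions."
  (if (not known-misspelling)
      (demo-ask-speller word)
    (list word 0 '() '())))
```

Here (demo-check-word "teh" t) returns ("teh" 0 nil nil) and leaves
demo-speller-calls untouched, while (demo-check-word "teh") goes through
the stub and bumps the counter by one.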

I apologize in advance that this diff is constructed against the Ubuntu
version of flyspell, which has who-knows-how-many differences with the
official Emacs one.  If the patch is too difficult to apply, let me
know, and I will find the time to check out Emacs from trunk myself and
reapply the patch there.

Thanks!

[flyspell-large-fast.patch (text/x-diff, inline)]
diff -r 0de00de3360c -r 06af33083844 site-lisp/flyspell.el
--- a/site-lisp/flyspell.el	Sat Nov 06 08:10:07 2010 -0400
+++ b/site-lisp/flyspell.el	Sat Nov 06 08:23:58 2010 -0400
@@ -1009,7 +1009,7 @@
 ;;*---------------------------------------------------------------------*/
 ;;*    flyspell-word ...                                                */
 ;;*---------------------------------------------------------------------*/
-(defun flyspell-word (&optional following)
+(defun flyspell-word (&optional following known-misspelling)
   "Spell check a word."
   (interactive (list ispell-following-word))
   (ispell-set-spellchecker-params)    ; Initialize variables and dicts alists
@@ -1071,29 +1071,35 @@
 	    (setq flyspell-word-cache-end end)
 	    (setq flyspell-word-cache-word word)
 	    ;; now check spelling of word.
-	    (ispell-send-string "%\n")
-	    ;; put in verbose mode
-	    (ispell-send-string (concat "^" word "\n"))
-	    ;; we mark the ispell process so it can be killed
-	    ;; when emacs is exited without query
-	    (set-process-query-on-exit-flag ispell-process nil)
-	    ;; Wait until ispell has processed word.  Since this code is often
-            ;; executed from post-command-hook but the ispell process may not
-            ;; be responsive, it's important to make sure we re-enable C-g.
-	    (with-local-quit
-	      (while (progn
-		       (accept-process-output ispell-process)
-		       (not (string= "" (car ispell-filter))))))
-	    ;; (ispell-send-string "!\n")
-	    ;; back to terse mode.
-	    ;; Remove leading empty element
-	    (setq ispell-filter (cdr ispell-filter))
-	    ;; ispell process should return something after word is sent.
-	    ;; Tag word as valid (i.e., skip) otherwise
-	    (or ispell-filter
-		(setq ispell-filter '(*)))
-	    (if (consp ispell-filter)
-		(setq poss (ispell-parse-output (car ispell-filter))))
+            (if (not known-misspelling)
+                (progn
+                  (ispell-send-string "%\n")
+                  ;; put in verbose mode
+                  (ispell-send-string (concat "^" word "\n"))
+                  ;; we mark the ispell process so it can be killed
+                  ;; when emacs is exited without query
+                  (set-process-query-on-exit-flag ispell-process nil)
+                  ;; Wait until ispell has processed word.  Since this
+                  ;; code is often executed from post-command-hook but
+                  ;; the ispell process may not be responsive, it's
+                  ;; important to make sure we re-enable C-g.
+                  (with-local-quit
+                    (while (progn
+                             (accept-process-output ispell-process)
+                             (not (string= "" (car ispell-filter))))))
+                  ;; (ispell-send-string "!\n")
+                  ;; back to terse mode.
+                  ;; Remove leading empty element
+                  (setq ispell-filter (cdr ispell-filter))
+                  ;; ispell process should return something after word is sent.
+                  ;; Tag word as valid (i.e., skip) otherwise
+                  (or ispell-filter
+                      (setq ispell-filter '(*)))
+                  (if (consp ispell-filter)
+                      (setq poss (ispell-parse-output (car ispell-filter)))))
+              ;; Else, this was a known misspelling to begin with, and
+              ;; we should forge an ispell return value.
+              (setq poss (list word 0 '() '())))
 	    (let ((res (cond ((eq poss t)
 			      ;; correct
 			      (setq flyspell-word-cache-result t)
@@ -1424,7 +1430,7 @@
 					t
 				      nil))))
 			(setq keep nil)
-			(flyspell-word)
+			(flyspell-word nil t)
 			;; Search for next misspelled word will begin from
 			;; end of last validated match.
 			(setq buffer-scan-pos (point))))
[Message part 3 (text/plain, inline)]
-- 
Brandon Craig Rhodes   brandon <at> rhodesmill.org   http://rhodesmill.org/brandon

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#7343; Package emacs. (Mon, 08 Nov 2010 18:07:02 GMT)

Message #8 received at 7343 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Brandon Craig Rhodes <brandon <at> rhodesmill.org>
Cc: Agustin Martin <agustin.martin <at> hispalinux.es>, 7343 <at> debbugs.gnu.org
Subject: Re: bug#7343: Making flyspell incredibly fast when checking whole files
Date: Mon, 08 Nov 2010 13:10:42 -0500
> Anyway, flyspell does try to take advantage of the above behavior by
> checking whether a region is larger than flyspell-large-region
> characters, and if so then it runs the spell checker as a separate
> process with "-l".  But then it does something that, in many cases, is
> rather ruinous: it takes every misspelling so identified, and passes it
> *back* through the normal interactive spell-checking logic!  This is
> because all of the real logic of what to do with a misspelling - how to
> highlight it, how to search for nearby instances of the same word, how
> to cache spellings, and so forth - is bound up in flyspell-word, so the
> flyspell-external-point-words function, which processes the actual
> misspellings discovered by flyspell-large-region, really has no other
> choice but to call flyspell-word for each misspelling.

IIUC this sounds very good (tho it only speeds up flyspell-region and
not flyspell-post-command-hook) and the patch looks good and small
enough for inclusion as a "tiny patch".  Agustin, could you double check
that it's OK and install it in the trunk if so?


        Stefan




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#7343; Package emacs. (Mon, 08 Nov 2010 18:42:02 GMT)

Message #11 received at 7343 <at> debbugs.gnu.org:

From: Agustin Martin <agustin.martin <at> hispalinux.es>
To: 7343 <at> debbugs.gnu.org, Brandon Craig Rhodes <brandon <at> rhodesmill.org>
Subject: Re: bug#7343: Making flyspell incredibly fast when checking whole files
Date: Mon, 8 Nov 2010 19:46:37 +0100
On Mon, Nov 08, 2010 at 01:10:42PM -0500, Stefan Monnier wrote:
> > Anyway, flyspell does try to take advantage of the above behavior by
> > checking whether a region is larger than flyspell-large-region
> > characters, and if so then it runs the spell checker as a separate
> > process with "-l".  But then it does something that, in many cases, is
> > rather ruinous: it takes every misspelling so identified, and passes it
> > *back* through the normal interactive spell-checking logic!  This is
> > because all of the real logic of what to do with a misspelling - how to
> > highlight it, how to search for nearby instances of the same word, how
> > to cache spellings, and so forth - is bound up in flyspell-word, so the
> > flyspell-external-point-words function, which processes the actual
> > misspellings discovered by flyspell-large-region, really has no other
> > choice but to call flyspell-word for each misspelling.
> 
> IIUC this sounds very good (tho it only speeds up flyspell-region and
> not flyspell-post-command-hook) and the patch looks good and small
> enough for inclusion as a "tiny patch".  Agustin, could you double check
> that it's OK and install it in the trunk if so?

Hi,

I also agree that Brandon's patch sounds very good, although I have not yet
been able to test it properly. I hope to find time for that within the next
two or three days.

I will also add something to the docstring about the new argument.

-- 
Agustin




Reply sent to Agustin Martin <agustin.martin <at> hispalinux.es>:
You have taken responsibility. (Wed, 10 Nov 2010 14:40:05 GMT)

Notification sent to Brandon Craig Rhodes <brandon <at> rhodesmill.org>:
bug acknowledged by developer. (Wed, 10 Nov 2010 14:40:05 GMT)

Message #16 received at 7343-done <at> debbugs.gnu.org:

From: Agustin Martin <agustin.martin <at> hispalinux.es>
To: Brandon Craig Rhodes <brandon <at> rhodesmill.org>, 7343-done <at> debbugs.gnu.org
Subject: Re: bug#7343: Making flyspell incredibly fast when checking whole files
Date: Wed, 10 Nov 2010 15:44:14 +0100
On Sat, Nov 06, 2010 at 10:03:52AM -0400, Brandon Craig Rhodes wrote:
> A patch is attached that fixes this problem, and - here on my laptop, at
> least - makes flyspell blazing fast at even large files.  The mechanism
> is simple: I have added a second optional argument to flyspell-word,
> named "known-misspelling", that tells flyspell-word that the word has
> already been checked and is a misspelling and does not need to be
> checked again.  Then, down in the function, I simply placed the entire
> interactive session with ispell/aspell/hunspell inside of an "if".

Installed in the Emacs bzr repo. Closing bug report.

Thanks a lot for your patch,

-- 
Agustin




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 09 Dec 2010 12:24:03 GMT)

This bug report was last modified 14 years and 196 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.