GNU bug report logs - #4511
23.1; flyspell-mode slow editing near end of big html file

Package: emacs;

Reported by: Kevin Ryde <user42 <at> zip.com.au>

Date: Mon, 21 Sep 2009 22:35:10 UTC

Severity: normal

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 4511 in the body.
You can then email your comments to 4511 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Mon, 21 Sep 2009 22:35:11 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>:
New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Mon, 21 Sep 2009 22:35:11 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Kevin Ryde <user42 <at> zip.com.au>
To: bug-gnu-emacs <at> gnu.org
Subject: 23.1; flyspell-mode slow editing near end of big html file
Date: Tue, 22 Sep 2009 08:24:38 +1000
When flyspell-mode is enabled in a big html file, and point is somewhere
near the end of the buffer, typing text or moving point with C-f and C-b
becomes sluggish, to the point of being nearly unusable.

(This is a regression from emacs 22, where flyspell-mode was fine on
such files.)

I expect "big file" is relative to cpu speed, but 300 kbytes (not an
outrageously huge file) is bad on my slow pc.  To reproduce, try this,
which makes a buffer of about 600 kbytes:

    (progn
      (switch-to-buffer "foo")
      (dotimes (i 50000) (insert (format "<p> abc def\n" i)))

      (html-mode)
      (flyspell-mode))

It takes a few seconds to create the buffer, but of course that's not
the bug.  The bad bit is if you move point around with C-f / C-b near
the end of the buffer, or type some plain text there outside of a <tag>,
where it's sluggish between keystrokes.  (Try upping the 50000 on a fast
cpu if necessary.)


I track the slowness to where `sgml-mode-flyspell-verify' does

    (looking-back "<[^>\n]*")

I take it this func is asking whether point is within a <tag> or not.
Does that regexp end up asking re-search-backward to consider every "<"
in the buffer or something, before deciding no match is possible?

I find it hugely faster to do an old fashioned skip-chars-backward as
below -- assuming I'm not mistaken that the "\n" in the existing
`looking-back' is supposed to mean examining no more than the current line.

2009-09-21  Kevin Ryde  <user42 <at> zip.com.au>

	* textmodes/flyspell.el (sgml-mode-flyspell-verify): Use
	skip-chars-backward instead of looking-back, to avoid a very slow
	regexp match when far into a big buffer with lots of "<" chars.

[flyspell.el.sgml-verify.diff (text/x-diff, inline)]
--- flyspell.el.~1.146.~	2009-09-18 08:23:13.000000000 +1000
+++ flyspell.el	2009-09-21 16:36:12.000000000 +1000
@@ -363,7 +363,9 @@
   "Function used for `flyspell-generic-check-word-predicate' in SGML mode."
   (not (save-excursion
 	 (or (looking-at "[^<\n]*>")
-	     (ispell-looking-back "<[^>\n]*")
+	     (save-excursion
+	       (skip-chars-backward "^<>\n")   ;; \n only look at current line
+	       (equal ?< (char-before)))       ;; non-nil if "<" opened a tag
 	     (and (looking-at "[^&\n]*;")
 		  (ispell-looking-back "&[^;\n]*"))))))
 

In GNU Emacs 23.1.1 (i486-pc-linux-gnu, GTK+ Version 2.16.5)
 of 2009-08-03 on raven, modified by Debian
configured using `configure  '--build=i486-linux-gnu' '--host=i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs23:/etc/emacs:/usr/local/share/emacs/23.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/23.1/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/23.1/leim' '--with-x=yes' '--with-x-toolkit=gtk' '--with-toolkit-scroll-bars' 'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -g -O2' 'LDFLAGS=-g' 'CPPFLAGS=''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_AU
  value of $XMODIFIERS: nil
  locale-coding-system: iso-latin-1-unix
  default-enable-multibyte-characters: t
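
A standalone sketch of the skip-chars-backward idea described above, for
trying it outside flyspell.  The function name `my-point-in-tag-p' is
made up for illustration and is not part of flyspell.el; it assumes, as
the report does, that a tag never spans more than one line.

```elisp
;; Hypothetical helper, not part of flyspell.el.
(defun my-point-in-tag-p ()
  "Return non-nil if the nearest \"<\", \">\" or newline before point is \"<\"."
  (save-excursion
    ;; Move back over everything that is not "<", ">" or newline, so
    ;; the scan never leaves the current line.
    (skip-chars-backward "^<>\n")
    ;; At the start of the buffer `char-before' is nil, giving nil here.
    (eq (char-before) ?<)))
```

The cost depends only on the distance to the nearest delimiter, not on
how much buffer text precedes point.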

Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Tue, 22 Sep 2009 21:50:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> IRO.UMontreal.CA>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Tue, 22 Sep 2009 21:50:04 GMT) Full text and rfc822 format available.

Message #10 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Kevin Ryde <user42 <at> zip.com.au>
Cc: 4511 <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Tue, 22 Sep 2009 17:40:03 -0400
> I track the slowness to where `sgml-mode-flyspell-verify' does

>     (looking-back "<[^>\n]*")

> I take it this func is asking whether point is within a <tag> or not.
> Does that regexp end up asking re-search-backward to consider every "<"
> in the buffer or something, before deciding no match is possible?

Yes, looking-back is a dog.  You need to pass it a `limit' argument.
I think the `limit' argument should be mandatory.


        Stefan
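
For context, `looking-back' is built on `re-search-backward', with the
regexp anchored to point by "\\=".  A simplified sketch of that mechanism
follows; the name `my-simple-looking-back' is made up, and the real
function in subr.el also accepts a GREEDY argument that is left out here.

```elisp
;; Simplified illustration only, not the subr.el definition.
(defun my-simple-looking-back (regexp &optional limit)
  "Return t if REGEXP matches text that ends exactly at point.
LIMIT, if non-nil, is the earliest position where a match may start."
  (save-excursion
    ;; "\\=" pins the END of the match to point, so the backward search
    ;; has to try candidate start positions one after another until it
    ;; reaches LIMIT, or the start of the buffer when LIMIT is nil.
    (and (re-search-backward (concat "\\(?:" regexp "\\)\\=") limit t)
         t)))
```

With no LIMIT and no possible match, every "<" back to the start of the
buffer is a candidate start, which fits the slowdown seen in the report.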




Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Wed, 23 Sep 2009 01:05:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Wed, 23 Sep 2009 01:05:05 GMT) Full text and rfc822 format available.

Message #20 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Kevin Ryde <user42 <at> zip.com.au>
To: 4511 <at> debbugs.gnu.org
Cc: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Wed, 23 Sep 2009 10:56:07 +1000
Stefan Monnier <monnier <at> IRO.UMontreal.CA> writes:
>
> You need to pass it a `limit' argument.

I thought about that a bit.  The limit would be the immediately
preceding "<", ">", or "\n", since whichever of them is hit first
answers whether you're in a tag or not.

There'd be no need for a separate limit calculation if a regexp could be
cooked up to stop on the first of those three.  I suppose it'd be along
the lines of (untested) ...

     (and (looking-back "\\([<>\n]\\)[^<>\n]*?")
          (equal "<" (match-string 1)))

but `skip-chars-backward' seems clearer to me, and might be a couple of
nanoseconds quicker too in fact.
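
The untested expression above can be wrapped in a function to try it on
a small buffer.  The name `my-in-tag-by-regexp-p' is made up for
illustration; the regexp itself is the one quoted in this message.

```elisp
;; Hypothetical wrapper around the expression quoted above.
(defun my-in-tag-by-regexp-p ()
  "Return non-nil if the nearest \"<\", \">\" or newline before point is \"<\"."
  ;; Going backward, the first start position that can match is the
  ;; nearest delimiter, so the search ends there.  `looking-back'
  ;; leaves the match data set on success, which `match-string' reads.
  (and (looking-back "\\([<>\n]\\)[^<>\n]*?" nil)
       (equal "<" (match-string 1))))
```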



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Wed, 23 Sep 2009 03:20:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Wed, 23 Sep 2009 03:20:04 GMT) Full text and rfc822 format available.

Message #25 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Kevin Ryde <user42 <at> zip.com.au>
Cc: 4511 <at> debbugs.gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Tue, 22 Sep 2009 23:13:32 -0400
>> You need to pass it a `limit' argument.
> I thought about that a bit.  The limit would be the immediately
> preceding "<", ">", or "\n", since whichever of them is hit first
> answers whether you're in a tag or not.

(line-beginning-position) will do fine.

> There'd be no need for a separate limit calculation if a regexp could be
> cooked up to stop on the first of those three.

The given regexp is actually plenty, in this respect.  It's just that
looking-back is a dog and doesn't make good use of the regexp.


        Stefan
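
The bounded call suggested above can be sketched as follows.  The names
`my-in-tag-bounded-p' and `my-looking-back-timings' are made up for
illustration.  The timing function only returns what `benchmark-run'
measures; it assumes nothing about how large the difference is on a
given machine or Emacs version.

```elisp
(require 'benchmark)

;; Hypothetical helper, not from flyspell.el.
(defun my-in-tag-bounded-p ()
  "Return non-nil if an unclosed \"<\" precedes point on the current line."
  ;; The second argument is LIMIT: the backward search stops at the
  ;; start of the line instead of continuing to the start of the buffer.
  (looking-back "<[^>\n]*" (line-beginning-position)))

;; Hypothetical timing harness modelled on the recipe in the report.
(defun my-looking-back-timings (lines)
  "Return a list (UNBOUNDED BOUNDED) of seconds taken by 100 calls each.
The test buffer holds LINES copies of \"<p> abc def\" with point at the end."
  (with-temp-buffer
    (dotimes (_ lines) (insert "<p> abc def\n"))
    (insert "abc")
    (list (car (benchmark-run 100 (looking-back "<[^>\n]*")))
          (car (benchmark-run 100 (my-in-tag-bounded-p))))))
```

Evaluating (my-looking-back-timings 50000) uses the same buffer size as
the recipe in the original report.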



Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
You have taken responsibility. (Wed, 23 Sep 2009 23:15:03 GMT) Full text and rfc822 format available.

Notification sent to Kevin Ryde <user42 <at> zip.com.au>:
bug acknowledged by developer. (Wed, 23 Sep 2009 23:15:03 GMT) Full text and rfc822 format available.

Message #30 received at 4511-done <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Kevin Ryde <user42 <at> zip.com.au>
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Wed, 23 Sep 2009 19:06:04 -0400
>>> You need to pass it a `limit' argument.
>> I thought about that a bit.  The limit would be the immediately
>> preceding "<", ">", or "\n", since whichever of them is hit first
>> answers whether you're in a tag or not.

> (line-beginning-position) will do fine.

Installed,


        Stefan



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Fri, 16 Oct 2009 22:05:06 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Fri, 16 Oct 2009 22:05:06 GMT) Full text and rfc822 format available.

Message #35 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Kevin Ryde <user42 <at> zip.com.au>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 4511 <at> debbugs.gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Sat, 17 Oct 2009 08:57:06 +1100
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
> The given regexp is actually plenty, in this respect.  It's just that
> looking-back is a dog and doesn't make good use of the regexp.

Oh, well, I suppose a genuine reverse matcher could do the right thing,
probably if "<" was added to the exclusions like "<[^<>\n]*" -- not that
that helps since there isn't a reverse matcher :-).


But on the principle "why can't someone else do it", what about letting
`sgml-lexical-context' determine the context.  Tested only briefly:

(defun sgml-mode-flyspell-verify ()
  "Function used for `flyspell-generic-check-word-predicate' in SGML mode."
  (not (memq (car (sgml-lexical-context))
             '(tag pi))))

Seems fast enough for me, and I think it means CDATA text is checked,
which I think would be desirable, but I'm not well up on that stuff.
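
A small sketch for seeing what `sgml-lexical-context' reports and how
the predicate above would decide.  The helper names are made up for
illustration, and the temporary buffer is an assumption made to keep the
example self-contained; results for comments may vary (see bug 4781).

```elisp
(require 'sgml-mode)

;; Hypothetical helper: context symbol at the end of TEXT.
(defun my-sgml-context-at-end (text)
  "Return the symbol `sgml-lexical-context' reports at the end of TEXT.
TEXT is inserted into a temporary `sgml-mode' buffer."
  (with-temp-buffer
    (sgml-mode)
    (insert text)
    ;; `sgml-lexical-context' returns (TYPE . START); keep only TYPE.
    (car (sgml-lexical-context))))

;; Hypothetical helper mirroring the predicate proposed above.
(defun my-sgml-check-word-p (text)
  "Return non-nil if a word at the end of TEXT would be spell checked.
Everything except the `tag' and `pi' contexts is checked."
  (not (memq (my-sgml-context-at-end text) '(tag pi))))
```

For example, (my-sgml-check-word-p "<p> abc") asks about plain text
after a closed tag, and (my-sgml-check-word-p "<p class") asks about an
attribute name inside an open tag.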



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Sat, 17 Oct 2009 02:25:08 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sat, 17 Oct 2009 02:25:08 GMT) Full text and rfc822 format available.

Message #40 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Kevin Ryde <user42 <at> zip.com.au>
Cc: 4511 <at> debbugs.gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Fri, 16 Oct 2009 22:14:56 -0400
> But on the principle "why can't someone else do it", what about letting
> `sgml-lexical-context' determine the context.  Tested only briefly:

> (defun sgml-mode-flyspell-verify ()
>   "Function used for `flyspell-generic-check-word-predicate' in SGML mode."
>   (not (memq (car (sgml-lexical-context))
>              '(tag pi))))

> Seems fast enough for me, and I think it means CDATA text is checked,
> which I think would be desirable, but I'm not well up on that stuff.

If performance is good enough, then it's a very good option, indeed.


        Stefan



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Sat, 07 Nov 2009 00:30:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sat, 07 Nov 2009 00:30:04 GMT) Full text and rfc822 format available.

Message #45 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Kevin Ryde <user42 <at> zip.com.au>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 4511 <at> debbugs.gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Sat, 07 Nov 2009 11:21:49 +1100
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
> If performance is good enough,

I've been using it, it seems good.  What chance putting it in for
everyone to have a go?

The only thing to note is right now sgml-lexical-context doesn't
recognise <!-- ... --> comments (bug 4781).  But the current
sgml-mode-flyspell-verify code doesn't recognise such comments either,
so nothing is lost.

I think it makes sense to spell check comments.  The net effect of
excluding just "tag" and "pi" parts is that tag and attribute names are
skipped, but basically everything else is checked.  String valued
attributes are checked, which makes sense for

    <img ... alt="Some text">

though string values which are urls might not want checking.  Maybe some
tag/attribute type info could distinguish the two cases, if it seemed
important enough ...



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Tue, 10 Nov 2009 22:25:29 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> IRO.UMontreal.CA>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Tue, 10 Nov 2009 22:25:30 GMT) Full text and rfc822 format available.

Message #50 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Kevin Ryde <user42 <at> zip.com.au>
Cc: 4511 <at> debbugs.gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Tue, 10 Nov 2009 17:18:21 -0500
>> If performance is good enough,
> I've been using it, it seems good.  What chance putting it in for
> everyone to have a go?

Try it.


        Stefan



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4511; Package emacs. (Tue, 17 Nov 2009 00:30:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Tue, 17 Nov 2009 00:30:04 GMT) Full text and rfc822 format available.

Message #55 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Kevin Ryde <user42 <at> zip.com.au>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: 4511 <at> debbugs.gnu.org
Subject: Re: bug#4511: 23.1; flyspell-mode slow editing near end of big html file
Date: Tue, 17 Nov 2009 11:22:52 +1100
Stefan Monnier <monnier <at> IRO.UMontreal.CA> writes:
>
> Try it.

Done.  It'll benefit from bug#4781 when that's addressed.



bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> emacsbugs.donarmstrong.com. (Tue, 15 Dec 2009 15:24:14 GMT) Full text and rfc822 format available.

This bug report was last modified 15 years and 184 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.