GNU bug report logs - #4511
23.1; flyspell-mode slow editing near end of big html file
Reported by: Kevin Ryde <user42 <at> zip.com.au>
Date: Mon, 21 Sep 2009 22:35:10 UTC
Severity: normal
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#4511; Package emacs. (Mon, 21 Sep 2009 22:35:11 GMT)
Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>: New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Mon, 21 Sep 2009 22:35:11 GMT)
Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
[Message part 1 (text/plain, inline)]
When flyspell-mode is enabled in a big html file and point is somewhere
near the end of the buffer, typing text or moving point with C-f and C-b
becomes sluggish, to the point of being nearly unusable.
(This is a regression from emacs 22, where flyspell-mode was fine on
such files.)
I expect "big file" is relative to cpu speed, but 300 kbytes is already bad
on my slow pc (not an outrageously huge file). To reproduce, try this, which
creates a buffer of about 600 kbytes,
(progn
  (switch-to-buffer "foo")
  ;; 50000 lines of 12 bytes each, about 600 kbytes
  (dotimes (i 50000) (insert "<p> abc def\n"))
  (html-mode)
  (flyspell-mode))
It takes a few seconds to create the buffer, but of course that's not
the bug. The bad bit is if you move point around with C-f / C-b near
the end of the buffer, or type some plain text there outside of a <tag>,
where it's sluggish between keystrokes. (Try upping the 50000 on a fast
cpu if necessary.)
I track the slowness to where `sgml-mode-flyspell-verify' does
(looking-back "<[^>\n]*")
I take it this func is asking whether point is within a <tag> or not.
Does that regexp end up asking re-search-backward to consider every "<"
in the buffer or something, before deciding no match is possible?
I find it hugely faster to do an old fashioned skip-chars-backward as
below -- assuming I'm not mistaken that the "\n" in the existing
`looking-back' is supposed to mean examining no more than the current line.
2009-09-21 Kevin Ryde <user42 <at> zip.com.au>
* textmodes/flyspell.el (sgml-mode-flyspell-verify): Use
skip-chars-backward instead of looking-back, to avoid a very slow
regexp match when far into a big buffer with lots of "<" chars.
[flyspell.el.sgml-verify.diff (text/x-diff, inline)]
--- flyspell.el.~1.146.~ 2009-09-18 08:23:13.000000000 +1000
+++ flyspell.el 2009-09-21 16:36:12.000000000 +1000
@@ -363,7 +363,9 @@
   "Function used for `flyspell-generic-check-word-predicate' in SGML mode."
   (not (save-excursion
          (or (looking-at "[^<\n]*>")
-             (ispell-looking-back "<[^>\n]*")
+             (save-excursion
+               (skip-chars-backward "^<>\n") ;; \n only look at current line
+               (not (equal ?< (char-before)))) ;; "<" if in a tag
              (and (looking-at "[^&\n]*;")
                   (ispell-looking-back "&[^;\n]*"))))))
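A minimal sketch of the `skip-chars-backward' idea as a standalone predicate, for trying outside flyspell. The name `my-inside-tag-before-point-p' is hypothetical and not part of flyspell.el. It returns non-nil when an unclosed "<" precedes point on the current line, which is the sense an `or' of in-tag tests needs, so it does not carry the extra `not' that appears in the diff as posted.

```elisp
;; Hypothetical helper for illustration; not part of flyspell.el.
(defun my-inside-tag-before-point-p ()
  "Return non-nil if an unclosed \"<\" precedes point on the current line."
  (save-excursion
    ;; Move back over every char that is not "<", ">" or newline.
    ;; This never leaves the current line, however big the buffer is.
    (skip-chars-backward "^<>\n")
    ;; At the beginning of the buffer `char-before' is nil, giving nil here.
    (eq (char-before) ?<)))

;; Try it with point at the end of some sample text.
(with-temp-buffer
  (insert "<p> abc <img al")
  (my-inside-tag-before-point-p))
```

Like the regexp it stands in for, this looks only at the current line, so a tag continued from an earlier line is not detected.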
[Message part 3 (text/plain, inline)]
In GNU Emacs 23.1.1 (i486-pc-linux-gnu, GTK+ Version 2.16.5)
of 2009-08-03 on raven, modified by Debian
configured using `configure '--build=i486-linux-gnu' '--host=i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs23:/etc/emacs:/usr/local/share/emacs/23.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/23.1/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/23.1/leim' '--with-x=yes' '--with-x-toolkit=gtk' '--with-toolkit-scroll-bars' 'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -g -O2' 'LDFLAGS=-g' 'CPPFLAGS=''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_AU
value of $XMODIFIERS: nil
locale-coding-system: iso-latin-1-unix
default-enable-multibyte-characters: t
Acknowledgement sent to Stefan Monnier <monnier <at> IRO.UMontreal.CA>: Extra info received and forwarded to list. (Tue, 22 Sep 2009 21:50:04 GMT)
Message #10 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
> I track the slowness to where `sgml-mode-flyspell-verify' does
> (looking-back "<[^>\n]*")
> I take it this func is asking whether point is within a <tag> or not.
> Does that regexp end up asking re-search-backward to consider every "<"
> in the buffer or something, before deciding no match is possible?
Yes, looking-back is a dog. You need to pass it a `limit' argument.
I think the `limit' argument should be mandatory.
Stefan
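A rough way to see the difference the `limit' argument makes is to time both forms near the end of a buffer shaped like the one in the report. This is only a sketch: the function name `my-looking-back-timings' and the default line count are assumptions made for illustration, and the timings depend on the machine and Emacs version.

```elisp
(require 'benchmark)

;; Hypothetical helper for illustration only.
(defun my-looking-back-timings (&optional lines)
  "Return (UNBOUNDED BOUNDED) elapsed seconds for `looking-back' at buffer end.
LINES is the number of \"<p> abc def\" lines to insert, default 5000."
  (with-temp-buffer
    (dotimes (_ (or lines 5000)) (insert "<p> abc def\n"))
    ;; Point is now at the end of the buffer, outside any tag.
    (list
     ;; No LIMIT: the backward search is free to go far above point.
     (car (benchmark-run 20 (looking-back "<[^>\n]*")))
     ;; LIMIT at line start: the search cannot leave the current line.
     (car (benchmark-run 20
            (looking-back "<[^>\n]*" (line-beginning-position)))))))

(my-looking-back-timings)
```

Raising the line count should widen the gap between the two numbers if the unbounded search really is walking back through the buffer.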
Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>: Extra info received and forwarded to list. (Wed, 23 Sep 2009 01:05:05 GMT)
Message #20 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
Stefan Monnier <monnier <at> IRO.UMontreal.CA> writes:
>
> You need to pass it a `limit' argument.
I thought about that a bit. The limit would be the immediately
preceding "<", ">", or "\n", since whichever of them is hit first
answers whether you're in a tag or not.
There'd be no need for a separate limit calculation if a regexp could be
cooked up to stop on the first of those three. I suppose it'd be along
the lines of (untested) ...
(and (looking-back "\\([<>\n]\\)[^<>\n]*?")
     (equal "<" (match-string 1)))
but `skip-chars-backward' seems clearer to me, and might be a couple of
nanoseconds quicker too in fact.
Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>: Extra info received and forwarded to list. (Wed, 23 Sep 2009 03:20:04 GMT)
Message #25 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
>> You need to pass it a `limit' argument.
> I thought about that a bit. The limit would be the immediately
> preceding "<", ">", or "\n", since whichever of them is hit first
> answers whether you're in a tag or not.
(line-beginning-position) will do fine.
> There'd be no need for a separate limit calculation if a regexp could be
> cooked up to stop on the first of those three.
The given regexp is actually plenty, in this respect. It's just that
looking-back is a dog and doesn't make good use of the regexp.
Stefan
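A minimal sketch of the bounded form suggested here, keeping the regexp from the report and passing `(line-beginning-position)' as the limit. The helper name `my-inside-tag-bounded-p' is hypothetical. Whether this is exactly the change that was installed is not shown in this log.

```elisp
;; Hypothetical helper; the regexp is the one quoted in the report.
(defun my-inside-tag-bounded-p ()
  "Return non-nil if point follows an unclosed \"<\" on the current line."
  ;; The second argument stops the backward search at the start of the
  ;; current line.  The regexp cannot match across a newline anyway, so
  ;; the limit does not change the answer, only the work done.
  (looking-back "<[^>\n]*" (line-beginning-position)))

;; Point at the end of sample text that stops inside a tag.
(with-temp-buffer
  (insert "<p> abc def\n<img al")
  (my-inside-tag-bounded-p))
```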
Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>: You have taken responsibility. (Wed, 23 Sep 2009 23:15:03 GMT)
Notification sent to Kevin Ryde <user42 <at> zip.com.au>: bug acknowledged by developer. (Wed, 23 Sep 2009 23:15:03 GMT)
Message #30 received at 4511-done <at> emacsbugs.donarmstrong.com (full text, mbox):
>>> You need to pass it a `limit' argument.
>> I thought about that a bit. The limit would be the immediately
>> preceding "<", ">", or "\n", since whichever of them is hit first
>> answers whether you're in a tag or not.
> (line-beginning-position) will do fine.
Installed,
Stefan
Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>: Extra info received and forwarded to list. (Fri, 16 Oct 2009 22:05:06 GMT)
Message #35 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
> The given regexp is actually plenty, in this respect. It's just that
> looking-back is a dog and doesn't make good use of the regexp.
Oh, well, I suppose a genuine reverse matcher could do the right thing,
probably if "<" was added to the exclusions like "<[^<>\n]*" -- not that
that helps since there isn't a reverse matcher :-).
But on the principle "why can't someone else do it", what about letting
`sgml-lexical-context' determine the context. Tested only briefly:
(defun sgml-mode-flyspell-verify ()
  "Function used for `flyspell-generic-check-word-predicate' in SGML mode."
  (not (memq (car (sgml-lexical-context))
             '(tag pi))))
Seems fast enough for me, and I think it means CDATA text is checked,
which I think would be desirable, but I'm not well up on that stuff.
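To see what `sgml-lexical-context' reports in a few situations, something like the following can be evaluated. It is a sketch under assumptions: `my-sgml-context-after' is a hypothetical helper, and it passes `(point-min)' as the optional LIMIT so that parsing starts from the top of the scratch buffer, whereas the predicate above calls the function with no argument.

```elisp
(require 'sgml-mode)

;; Hypothetical helper: put TEXT in a scratch html-mode buffer and return
;; the context type reported with point at the end of TEXT.
(defun my-sgml-context-after (text)
  "Return the `car' of `sgml-lexical-context' at the end of TEXT."
  (with-temp-buffer
    (html-mode)
    (insert text)
    ;; LIMIT = (point-min): parse forward from the start of the buffer.
    (car (sgml-lexical-context (point-min)))))

(my-sgml-context-after "<p> abc")          ; point in body text
(my-sgml-context-after "<img al")          ; point inside an open tag
(my-sgml-context-after "<img alt=\"Some")  ; point inside an attribute string
```

Only results of `tag' or `pi' would make the proposed predicate skip the word, so body text and attribute strings remain candidates for checking.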
Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>: Extra info received and forwarded to list. (Sat, 17 Oct 2009 02:25:08 GMT)
Message #40 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
> But on the principle "why can't someone else do it", what about letting
> `sgml-lexical-context' determine the context. Tested only briefly:
> (defun sgml-mode-flyspell-verify ()
> "Function used for `flyspell-generic-check-word-predicate' in SGML mode."
> (not (memq (car (sgml-lexical-context))
> '(tag pi))))
> Seems fast enough for me, and I think it means CDATA text is checked,
> which I think would be desirable, but I'm not well up on that stuff.
If performance is good enough, then it's a very good option, indeed.
Stefan
Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>: Extra info received and forwarded to list. (Sat, 07 Nov 2009 00:30:04 GMT)
Message #45 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
> If performance is good enough,
I've been using it, it seems good. What chance putting it in for
everyone to have a go?
The only thing to note is right now sgml-lexical-context doesn't
recognise <!-- ... --> comments (bug 4781). But the current
sgml-mode-flyspell-verify code doesn't recognise such comments either,
so nothing is lost.
I think it makes sense to spell check comments. The net effect of
excluding just "tag" and "pi" parts is that tag and attribute names are
skipped, but basically everything else is checked. String valued
attributes are checked, which makes sense for
<img ... alt="Some text">
though string values which are urls might not want checking. Maybe some
tag/attribute type info could distinguish the two cases, if it seemed
important enough ...
Acknowledgement sent to Stefan Monnier <monnier <at> IRO.UMontreal.CA>: Extra info received and forwarded to list. (Tue, 10 Nov 2009 22:25:30 GMT)
Message #50 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
>> If performance is good enough,
> I've been using it, it seems good. What chance putting it in for
> everyone to have a go?
Try it.
Stefan
Acknowledgement sent to Kevin Ryde <user42 <at> zip.com.au>: Extra info received and forwarded to list. (Tue, 17 Nov 2009 00:30:04 GMT)
Message #55 received at 4511 <at> emacsbugs.donarmstrong.com (full text, mbox):
Stefan Monnier <monnier <at> IRO.UMontreal.CA> writes:
>
> Try it.
Done. It'll benefit from bug#4781 when that's addressed.
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> emacsbugs.donarmstrong.com. (Tue, 15 Dec 2009 15:24:14 GMT)