From unknown Sat Aug 16 14:25:58 2025
From: Aleksey Cherepanov
To: 16800@debbugs.gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Wed, 19 Feb 2014 00:56:45 +0400
Message-ID: <85zjlo5ecy.fsf@gmail.com>

Package: emacs
Version: 24.3
Severity: normal

Dear Maintainers,

It is a copy of bug #739412 in Debian. Debian uses bug tracker similar
to this one.
The bug on web:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=739412
Address to continue thread there: 739412@bugs.debian.org

   * What led up to the situation?

I faced a problem editing my big .org file (2mb+) with flyspell-mode
enabled. I edit it every day, regularly mistype and get words of one
or two letters that are wrong in Russian and cause flyspell work slow.

This one-liner produces "good" file to reproduce the bug.
perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt

Typing "nd" at the end of file gives a huge pause even on a fast
computer. But "mw" or "md" does not give pauses because they are not
substrings in this file. It is repeatable with emacs -Q.

   * What exactly did you do (or not do) that was effective (or
     ineffective)?

So exact sequence is
$ emacs --version
GNU Emacs 24.3.1
$ emacs23 --version
GNU Emacs 23.4.1
$ perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt
$ LANG=C emacs -Q t.txt
Then in emacs:
M-x flyspell-mode RET
M-> nd SPC

'emacs23 -Q t.txt' works the same way.

LANG=C affects regular words because default dictionary is Russian on
my system so without LANG=C all words ("met" and "and") are considered
misspelled. But it does not affect huge pause at the end.

   * What was the outcome of this action?

Huge pause when emacs does not react on keys except C-g. Word "nd" is
colored as misspelled after the pause.
C-g stops emacs internal thinking and I could work without waiting but
word "nd" is not colored as misspelled word.

   * What outcome did you expect instead?

I expect it to work as fast as with other words like "md" or "mw" that
does not produce a pause and are colored immediately.

I tried to patch flyspell-word-search-backward and
flyspell-word-search-forward functions from flyspell.el replacing
search-backward with word-search-backward and search-forward with
word-search-forward (perl -pe 's/\(search-/(word-search-/' ). It
solved the problem but I do not know what it broke.

I expect problems with this solution because I do not know if
flyspell's meaning of word is the same as emacs' one. I think it is
described in flyspell-get-word function that is called after search-*
in the patched functions.

flyspell-duplicate-distance variable on its own could mitigate the
problem but it changes the behaviour so I do not want to use this
variable.

Thanks!

-- 
Regards,
Aleksey Cherepanov

In GNU Emacs 24.3.1 (x86_64-pc-linux-gnu, GTK+ Version 3.8.6)
 of 2013-12-23 on brahms, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.11405000
System Description: Debian GNU/Linux testing (jessie)

Configured using:
 `configure '--build' 'x86_64-linux-gnu' '--build' 'x86_64-linux-gnu'
 '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib'
 '--localstatedir=/var/lib' '--infodir=/usr/share/info'
 '--mandir=/usr/share/man' '--with-pop=yes'
 '--enable-locallisppath=/etc/emacs24:/etc/emacs:/usr/local/share/emacs/24.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.3/site-lisp:/usr/share/emacs/site-lisp'
 '--with-crt-dir=/usr/lib/x86_64-linux-gnu' '--with-x=yes'
 '--with-x-toolkit=gtk3' '--with-toolkit-scroll-bars'
 'build_alias=x86_64-linux-gnu'
 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall'
 'LDFLAGS=-Wl,-z,relro' 'CPPFLAGS=-D_FORTIFY_SOURCE=2''

Important settings:
  value of $LANG: ru_RU.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

From unknown Sat Aug 16 14:25:58 2025
From: Eli Zaretskii
To: Aleksey Cherepanov
Cc: 16800@debbugs.gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Fri, 21 Feb 2014 12:15:00 +0200
Message-id: <83ob204vrv.fsf@gnu.org>
In-reply-to: <85zjlo5ecy.fsf@gmail.com>
References: <85zjlo5ecy.fsf@gmail.com>

> From: Aleksey Cherepanov
> Date: Wed, 19 Feb 2014 00:56:45 +0400
>
> I faced a problem editing my big .org file (2mb+) with flyspell-mode
> enabled. I edit it every day, regularly mistype and get words of one
> or two letters that are wrong in Russian and cause flyspell work slow.
>
> This one-liner produces "good" file to reproduce the bug.
> perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt
>
> Typing "nd" at the end of file gives a huge pause even on a fast
> computer. But "mw" or "md" does not give pauses because they are not
> substrings in this file. It is repeatable with emacs -Q.

This seems to be due to the Flyspell's feature of recognizing
duplicates of mis-spelled words, and, if found, highlighting such
duplicates in a different face.  If you customize the variable
flyspell-duplicate-distance to some small value (or even zero), the
delay goes away.  Evidently, with the default value of -1, Flyspell
searches all the way to the beginning of the giant buffer, looking for
a duplicate of "nd".

Interestingly, I don't see this when the speller is Ispell, but I do
see it with Hunspell.  Not sure how using Ispell avoids this problem.
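
A rough way to see why "nd" triggers the pause while "md" does not,
assuming (as the original report suggests) that the duplicate search
looks for plain substrings and only afterwards checks whether the hit
is a whole word: count substring hits against whole-word hits in the
test file. This is only a sketch. awk and grep stand in for Emacs's
search functions, the temporary file replaces t.txt, and the helper
names count_substr and count_words are made up for the illustration.

```shell
# Rebuild the test file from the report without perl:
# 30000 lines, each consisting of "met and" repeated 10 times.
f=$(mktemp)
awk 'BEGIN {
  line = "met and"
  for (i = 1; i < 10; i++) line = line " met and"
  for (n = 0; n < 30000; n++) print line
}' > "$f"

# grep -o prints every match on its own line, so wc -l counts matches.
# "|| true" keeps a no-match grep from aborting under "set -e".
count_substr() { { grep -o  -- "$1" "$f" || true; } | wc -l | tr -d ' '; }
# Same, but -w only accepts matches that form a whole word.
count_words()  { { grep -ow -- "$1" "$f" || true; } | wc -l | tr -d ' '; }

substr_nd=$(count_substr nd)
word_nd=$(count_words nd)
substr_md=$(count_substr md)

echo "substring hits for nd:  $substr_nd"
echo "whole-word hits for nd: $word_nd"
echo "substring hits for md:  $substr_md"
rm -f "$f"
```

"nd" occurs 300000 times as a substring (once inside every "and") and
never as a whole word, while "md" does not occur at all. Under the
assumption above, each of those substring hits is a candidate that has
to be examined and rejected, which fits the observation that "md" and
"mw" cause no pause.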
From unknown Sat Aug 16 14:25:58 2025
From: Agustin Martin
To: Aleksey Cherepanov
Cc: 16800@debbugs.gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Fri, 21 Feb 2014 15:38:55 +0100
Message-ID: <20140221143855.GA6018@agmartin.aq.upm.es>
In-Reply-To: <83ob204vrv.fsf@gnu.org>
References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org>

On Fri, Feb 21, 2014 at 12:15:00PM +0200, Eli Zaretskii wrote:
> > From: Aleksey Cherepanov
> > Date: Wed, 19 Feb 2014 00:56:45 +0400
> >
> > I faced a problem editing my big .org file (2mb+) with flyspell-mode
> > enabled. I edit it every day, regularly mistype and get words of one
> > or two letters that are wrong in Russian and cause flyspell work slow.
> >
> > This one-liner produces "good" file to reproduce the bug.
> > perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt
> >
> > Typing "nd" at the end of file gives a huge pause even on a fast
> > computer. But "mw" or "md" does not give pauses because they are not
> > substrings in this file. It is repeatable with emacs -Q.
>
> This seems to be due to the Flyspell's feature of recognizing
> duplicates of mis-spelled words, and, if found, highlighting such
> duplicates in a different face.  If you customize the variable
> flyspell-duplicate-distance to some small value (or even zero), the
> delay goes away.  Evidently, with the default value of -1, Flyspell
> searches all the way to the beginning of the giant buffer, looking for
> a duplicate of "nd".
>
> Interestingly, I don't see this when the speller is Ispell, but I do
> see it with Hunspell.  Not sure how using Ispell avoids this problem.

Hi,

On the other hand, I can reproduce this also with ispell, as well as with
aspell and hunspell.

On Wed, Feb 19, 2014 at 12:56:45AM +0400, Aleksey Cherepanov wrote:
> flyspell-duplicate-distance variable on its own could mitigate the
> problem but it changes the behaviour so I do not want to use this
> variable.

For the records, I was playing with a customized value of 50000 for that
distance and even if there is still a minor delay it is reasonable. I am
in a fast box, do not know in other boxes.

> I tried to patch flyspell-word-search-backward and
> flyspell-word-search-forward functions from flyspell.el replacing
> search-backward with word-search-backward and search-forward with
> word-search-forward (perl -pe 's/\(search-/(word-search-/' ). It
> solved the problem but I do not know what it broke.
>
> I expect problems with this solution because I do not know if
> flyspell's meaning of word is the same as emacs' one. I think it is
> described in flyspell-get-word function that is called after search-*
> in the patched functions.

I have never played with Emacs syntax tables, but I'd expect differences
only if there is a mismatch between chars in OTHERCHARS and non
alphabetic chars that Emacs considers as possible parts of a word.

Regards,

-- 
Agustin

From unknown Sat Aug 16 14:25:58 2025
From: Eli Zaretskii
To: Agustin Martin
Cc: 16800@debbugs.gnu.org, aleksey.4erepanov@gmail.com
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Fri, 21 Feb 2014 17:12:54 +0200
Message-id: <83k3co4hzd.fsf@gnu.org>
In-reply-to: <20140221143855.GA6018@agmartin.aq.upm.es>
References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es>

> Date: Fri, 21 Feb 2014 15:38:55 +0100
> From: Agustin Martin
> Cc: 16800@debbugs.gnu.org
>
> On Wed, Feb 19, 2014 at 12:56:45AM +0400, Aleksey Cherepanov wrote:
> > flyspell-duplicate-distance variable on its own could mitigate the
> > problem but it changes the behaviour so I do not want to use this
> > variable.

What behavior does it change?  Do you really care to have a
mis-spelled word be highlighted in a different face just because
there's an identical mis-spelling half a megabyte away?

> For the records, I was playing with a customized value of 50000 for that
> distance and even if there is still a minor delay it is reasonable. I am
> in a fast box, do not know in other boxes.

I would suggest to change the default to something finite, like 20000
perhaps.  Having it set to -1 by default is IMO unwieldy, since
buffers can be very large.

> > I tried to patch flyspell-word-search-backward and
> > flyspell-word-search-forward functions from flyspell.el replacing
> > search-backward with word-search-backward and search-forward with
> > word-search-forward (perl -pe 's/\(search-/(word-search-/' ). It
> > solved the problem but I do not know what it broke.

And this doesn't change behavior?  See below.

> > I expect problems with this solution because I do not know if
> > flyspell's meaning of word is the same as emacs' one. I think it is
> > described in flyspell-get-word function that is called after search-*
> > in the patched functions.
>
> I have never played with Emacs syntax tables, but I'd expect differences
> only if there is a mismatch between chars in OTHERCHARS and non
> alphabetic chars that Emacs considers as possible parts of a word.

The effect depends on the language, I think.

From unknown Sat Aug 16 14:25:58 2025
From: Eli Zaretskii
To: agustin.martin@hispalinux.es
Cc: 16800@debbugs.gnu.org, aleksey.4erepanov@gmail.com
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Fri, 21 Feb 2014 17:21:54 +0200
Message-id: <83ios84hkd.fsf@gnu.org>
In-reply-to: <83k3co4hzd.fsf@gnu.org>
References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org>

> Date: Fri, 21 Feb 2014 17:12:54 +0200
> From: Eli Zaretskii
> Cc: 16800@debbugs.gnu.org, aleksey.4erepanov@gmail.com
>
> > > I expect problems with this solution because I do not know if
> > > flyspell's meaning of word is the same as emacs' one. I think it is
> > > described in flyspell-get-word function that is called after search-*
> > > in the patched functions.
> >
> > I have never played with Emacs syntax tables, but I'd expect differences
> > only if there is a mismatch between chars in OTHERCHARS and non
> > alphabetic chars that Emacs considers as possible parts of a word.
>
> The effect depends on the language, I think.

Actually, having looked at flyspell-word-search-backward, it is quite
clear to me that replacing that with word-search-backward will change
behavior: for example, it will disregard the current spelling
language.  In general, flyspell-word-search-backward or its
replacement absolutely must agree with flyspell-word about what is a
"word", because words is what flyspell feeds to the speller.
From unknown Sat Aug 16 14:25:58 2025
From: Aleksey Cherepanov
To: Eli Zaretskii
Cc: 16800@debbugs.gnu.org, Agustin Martin
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Sat, 22 Feb 2014 16:44:13 +0400
Message-ID: <20140222124413.GA4971@openwall.com>
In-Reply-To: <83k3co4hzd.fsf@gnu.org>
References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org>

On Fri, Feb 21, 2014 at 05:12:54PM +0200, Eli Zaretskii wrote:
> > Date: Fri, 21 Feb 2014 15:38:55 +0100
> > From: Agustin Martin
> > Cc: 16800@debbugs.gnu.org
> >
> > On Wed, Feb 19, 2014 at 12:56:45AM +0400, Aleksey Cherepanov wrote:
> > > flyspell-duplicate-distance variable on its own could mitigate the
> > > problem but it changes the behaviour so I do not want to use this
> > > variable.
>
> What behavior does it change?  Do you really care to have a
> mis-spelled word be highlighted in a different face just because
> there's an identical mis-spelling half a megabyte away?

Yes, as a user I really care about this and as a programmer I believe
the bug could be solved well.

Also GNU coding standards say to avoid arbitrary limits (parts 2.1
and 4.2).
http://www.gnu.org/prep/standards/standards.html

But I could accept that this bug has low severity because there is
flyspell-duplicate-distance variable that could be used as a
workaround.

> > For the records, I was playing with a customized value of 50000 for that
> > distance and even if there is still a minor delay it is reasonable. I am
> > in a fast box, do not know in other boxes.
>
> I would suggest to change the default to something finite, like 20000
> perhaps.  Having it set to -1 by default is IMO unwieldy, since
> buffers can be very large.
>
> > > I tried to patch flyspell-word-search-backward and
> > > flyspell-word-search-forward functions from flyspell.el replacing
> > > search-backward with word-search-backward and search-forward with
> > > word-search-forward (perl -pe 's/\(search-/(word-search-/' ). It
> > > solved the problem but I do not know what it broke.
>
> And this doesn't change behavior?  See below.

No, it seems that my setup works the same. See below.

> > > I expect problems with this solution because I do not know if
> > > flyspell's meaning of word is the same as emacs' one. I think it is
> > > described in flyspell-get-word function that is called after search-*
> > > in the patched functions.
> >
> > I have never played with Emacs syntax tables, but I'd expect differences
> > only if there is a mismatch between chars in OTHERCHARS and non
> > alphabetic chars that Emacs considers as possible parts of a word.
>
> The effect depends on the language, I think.

If I'd believe that it is a right solution I'd send a patch.

The difference is in word bounds. We are in trouble if flyspell's word
on its ends does not have ends of emacs' word. If flyspell's word has
ends of emacs' word on its ends and even contain them inside then we
are ok (try to search "a b" over "aa bb a b aa bb"). So could ends of
flyspell's word do not match with ends of emacs' word?

So I think word search instead of regular search does not change
behaviour of my setup with EN and RU languages (both at the same time)
and explicitly specified ispell-dictionary-alist (I used some popular
instructions as is to setup it so long time ago).
ispell-dictionary-alist contains character sets used by
flyspell-get-word.

As an alternative I think we could generate regexps on the fly with
flyspell's word boundaries around words and search them. It would be
like
(re-search-forward (concat "\\<" (regexp-quote word) "\\>") bound t)
instead of
(word-search-forward word bound t)
but with flyspell's word boundaries instead of "\\<" and "\\>".

Thanks!

-- 
Regards,
Aleksey Cherepanov

From unknown Sat Aug 16 14:25:58 2025
From: Eli Zaretskii
To: Aleksey Cherepanov
Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Sat, 22 Feb 2014 15:10:02 +0200
Message-id: <83vbw72t05.fsf@gnu.org>
In-reply-to: <20140222124413.GA4971@openwall.com>
References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com>

> Date: Sat, 22 Feb 2014 16:44:13 +0400
> From: Aleksey Cherepanov
> Cc: Agustin Martin, 16800@debbugs.gnu.org
>
> Also GNU coding standards say to avoid arbitrary limits (parts 2.1
> and 4.2).
> http://www.gnu.org/prep/standards/standards.html

This limit is not arbitrary.

> > > > I tried to patch flyspell-word-search-backward and
> > > > flyspell-word-search-forward functions from flyspell.el replacing
> > > > search-backward with word-search-backward and search-forward with
> > > > word-search-forward (perl -pe 's/\(search-/(word-search-/' ). It
> > > > solved the problem but I do not know what it broke.
> >
> > And this doesn't change behavior?  See below.
>
> No, it seems that my setup works the same. See below.

Your setup _might_ work the same, especially if you don't mix
different languages in the same buffer.  But in general, your change
does affect behavior.

> The difference is in word bounds. We are in trouble if flyspell's word
> on its ends does not have ends of emacs' word. If flyspell's word has
> ends of emacs' word on its ends and even contain them inside then we
> are ok (try to search "a b" over "aa bb a b aa bb"). So could ends of
> flyspell's word do not match with ends of emacs' word?

Yes, definitely.  See what flyspell-get-word does to find where the
word begins and where it ends.  Flyspell's "words" are
language-sensitive, whereas Emacs's words are not.
From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 22 Feb 2014 16:03:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13930849512010 (code B ref 16800); Sat, 22 Feb 2014 16:03:01 +0000 Received: (at 16800) by debbugs.gnu.org; 22 Feb 2014 16:02:31 +0000 Received: from localhost ([127.0.0.1]:35834 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHF25-0000WK-KI for submit@debbugs.gnu.org; Sat, 22 Feb 2014 11:02:30 -0500 Received: from mail-la0-f48.google.com ([209.85.215.48]:51136) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHF22-0000W6-Bd for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 11:02:27 -0500 Received: by mail-la0-f48.google.com with SMTP id gf5so580408lab.21 for <16800@debbugs.gnu.org>; Sat, 22 Feb 2014 08:02:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=rOGfM68JouqkI5GV9FbYWhwl15vm8alb+j8G/gWvgvQ=; b=Dvhya7npSDaEGSv/B36l+TXQnkYjm+GDgYMxg59wIn6AKciNkLXRb6N9zIAweley8Y dpnKk4keZoJH1hOLNrfP4qjc5D+EbsbGVhLRRYYw4LHKoueFGNHuUxGFkJwC5IMtCawf 9aVCfIe4k3/TXk3PoYla9up4FFtRdwrS8ZIgwuEtJZqz0n5anDCV/3A8ybTwOlH+ZMC0 V195qdf1iezIMW5WtA3kd0BTEq+snekOrt6gZf4IwYFRlVA8DRwKc7kSRTpmo2lBPSZl /PGBnIlt+MUtH6rlX2G0o7x8xdYSzGysg1DH3bdIpKMsKNvwMS6SNZDkcHMKY5k7Ekw3 yNIA== X-Received: by 10.112.39.167 with SMTP id q7mr6806808lbk.82.1393084940089; Sat, 22 Feb 2014 08:02:20 -0800 (PST) Received: from 
openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id y2sm16253477lal.10.2014.02.22.08.02.18 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 22 Feb 2014 08:02:19 -0800 (PST) Date: Sat, 22 Feb 2014 20:02:17 +0400 From: Aleksey Cherepanov Message-ID: <20140222160217.GA15616@openwall.com> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <83vbw72t05.fsf@gnu.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Sat, Feb 22, 2014 at 03:10:02PM +0200, Eli Zaretskii wrote: > > Date: Sat, 22 Feb 2014 16:44:13 +0400 > > From: Aleksey Cherepanov > > Cc: Agustin Martin , 16800@debbugs.gnu.org > > > > Also GNU coding standards say to avoid arbitrary limits (parts 2.1 > > and 4.2). > > http://www.gnu.org/prep/standards/standards.html > > This limit is not arbitrary. Anyway it is a limit that could be avoided. > > > > > I tried to patch flyspell-word-search-backward and > > > > > flyspell-word-search-forward functions from flyspell.el replacing > > > > > search-backward with word-search-backward and search-forward with > > > > > word-search-forward (perl -pe 's/\(search-/(word-search-/' ). It > > > > > solved the problem but I do not know what it broke. > > > > > > And this doesn't change behavior? See below. > > > > No, it seems that my setup works the same. See below. > > Your setup _might_ work the same, especially if you don't mix > different languages in the same buffer. 
But in general, your change
> does affect behavior.

I mix languages. I am pretty sure that my setup works the same.

BTW, the solution based on reducing the number of jump points does not
affect faces: "nd" or "badnd" at the end of "good badnd good " does not
trigger a spell check of the first "badnd".

> > The difference is in word bounds. We are in trouble if flyspell's word
> > on its ends does not have ends of emacs' word. If flyspell's word has
> > ends of emacs' word on its ends and even contain them inside then we
> > are ok (try to search "a b" over "aa bb a b aa bb"). So could ends of
> > flyspell's word do not match with ends of emacs' word?
>
> Yes, definitely.  See what flyspell-get-word does to find where the
> word begins and where it ends.  Flyspell's "words" are
> language-sensitive, whereas Emacs's words are not.

I saw this function.

Emacs words are language sensitive too. Emacs stops at every word end
between RU and EN text even if the words are not separated.

Example: a word of 3 parts: English "asdf", Russian "фыва"
(execute-kbd-macro (kbd "C-q 02104 RET C-q 02113 RET C-q 02062 RET C-q 02060 RET")),
English "asdf".

asdfфываasdf
^   ^   ^   ^
b   b   b
    f   f   f

M-b and C-M-r \< jump through the positions marked by b.
M-f and C-M-r \> jump through the positions marked by f.

It is one word for flyspell in my setup and in LANG=C emacs -Q.

But I do not know whether this applies to other languages and/or other
setups.

How could it be improved? Other solutions?

As a variant I'd propose using emacs' words for flyspell and vice versa,
but it would be a bad idea. My emacs jumps over asdf'asdf in one hop, and
similar behaviour could be implemented for other languages with their
respective 'otherchars', but it would be inconvenient for some users,
including me (python-mode treats _ as part of a word by default; people
ask both how to enable that everywhere and how to disable it in
python-mode).

Thanks!
-- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 22 Feb 2014 16:42:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Aleksey Cherepanov Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es Reply-To: Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13930872826300 (code B ref 16800); Sat, 22 Feb 2014 16:42:01 +0000 Received: (at 16800) by debbugs.gnu.org; 22 Feb 2014 16:41:22 +0000 Received: from localhost ([127.0.0.1]:35852 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHFdh-0001dW-R3 for submit@debbugs.gnu.org; Sat, 22 Feb 2014 11:41:22 -0500 Received: from mtaout26.012.net.il ([80.179.55.182]:58947) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHFde-0001d8-4c for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 11:41:19 -0500 Received: from conversion-daemon.mtaout26.012.net.il by mtaout26.012.net.il (HyperSendmail v2007.08) id <0N1E00800OJLRQ00@mtaout26.012.net.il> for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 18:39:24 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout26.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N1E001H5OXOW480@mtaout26.012.net.il>; Sat, 22 Feb 2014 18:39:24 +0200 (IST) Date: Sat, 22 Feb 2014 18:41:08 +0200 From: Eli Zaretskii In-reply-to: <20140222160217.GA15616@openwall.com> X-012-Sender: halo1@inter.net.il Message-id: <83ios72j8b.fsf@gnu.org> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> X-Spam-Score: 1.0 (+) X-BeenThere: 
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > Date: Sat, 22 Feb 2014 20:02:17 +0400 > From: Aleksey Cherepanov > Cc: agustin.martin@hispalinux.es, 16800@debbugs.gnu.org > > > Your setup _might_ work the same, especially if you don't mix > > different languages in the same buffer. But in general, your change > > does affect behavior. > > I mix languages. I am pretty sure that my setup works the same. Not in general, it isn't. See below. > BTW solution around reduction of jump points does not not affect > faces: "nd" or "badnd" at the end of "good badnd good " does not call > spell check on the first "badnd". Not sure I understand what you are saying here. What "first badnd"? you have only one in this example. > Emacs words are language sensitive too. But not in the same way as ispell/flyspell is. The CASECHARS, NON-CASECHARS, and OTHERCHARS parameters of the dictionary are only taken into account by ispell/flyspell. 
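The idea raised at the start of this thread, building a regexp whose boundaries follow the dictionary rather than Emacs's word syntax, can be sketched as follows. This is a hedged illustration only: `my-delimited-word-regexp` and `my-find-delimited-word` are hypothetical names, the NOT-CASECHARS value is passed in by hand instead of being read from the dictionary, and OTHERCHARS is ignored entirely.

```elisp
;; Hypothetical helpers, not taken from flyspell.el.
(defun my-delimited-word-regexp (word not-casechars)
  "Return a regexp matching WORD delimited by NOT-CASECHARS.
NOT-CASECHARS is a regexp matching a single non-word character, in the
style of the NOT-CASECHARS field of an ispell dictionary entry.  The
start and the end of the buffer also count as delimiters."
  (concat "\\(?:\\`\\|" not-casechars "\\)"
          "\\(" (regexp-quote word) "\\)"
          "\\(?:\\'\\|" not-casechars "\\)"))

(defun my-find-delimited-word (word not-casechars text)
  "Return the start of the first delimited match of WORD in TEXT, or nil."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward
           (my-delimited-word-regexp word not-casechars) nil t)
      ;; Group 1 is the word itself, without the delimiters around it.
      (match-beginning 1))))

;; With "[^[:alpha:]]" as NOT-CASECHARS a digit ends a word, so "abc" is
;; accepted at the very start of "abc1 abc".  Emacs's own word search
;; would skip that occurrence, because "abc1" is a single Emacs word.
(my-find-delimited-word "abc" "[^[:alpha:]]" "abc1 abc")
```

Because the delimiters are consumed by the match, a caller would have to move point to `(match-beginning 1)` before inspecting the word, and matches that OTHERCHARS would have joined into a longer word still need to be filtered out afterwards.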
From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 22 Feb 2014 18:56:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139309532422005 (code B ref 16800); Sat, 22 Feb 2014 18:56:02 +0000 Received: (at 16800) by debbugs.gnu.org; 22 Feb 2014 18:55:24 +0000 Received: from localhost ([127.0.0.1]:35895 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHHjO-0005ip-QV for submit@debbugs.gnu.org; Sat, 22 Feb 2014 13:55:23 -0500 Received: from mail-la0-f54.google.com ([209.85.215.54]:65387) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHHjM-0005iZ-7o for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 13:55:21 -0500 Received: by mail-la0-f54.google.com with SMTP id mc6so252983lab.13 for <16800@debbugs.gnu.org>; Sat, 22 Feb 2014 10:55:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=8DYvz8opW+XJeDnO2m+QL23ctWLkWiAarwEhU+zEofg=; b=Sgka+epAMuV7eclsKXGXG3hOYtHUnRacNsdUecAGDr39+ftZtq+oCkxYOiVeKVu449 z9sdGanWi/91rVkdwb5Qp7nQUEQshTMFU8PkNZzTx1ZrY2WjTz/ZbFL5P9+oVURiqoK4 auYalWch6sjT5sZ+uPiUv23nigYapB+1st4RXkf673kRuyP1Zcj880pRUGwWCE+wzdQ3 gf/iTOhZd+lY6LOWIe0BGguknsDygmmj73/cQBJ2WdTqbmhWWY/gTUC97vDWlDYr1c34 Sd5MbCYo7bputQn6nZKEeQ+t/966kku3q0FKK/bzJ08LVnl+wJLaK3RaN34kJzYtnB/C 439w== X-Received: by 10.152.206.104 with SMTP id ln8mr7526393lac.67.1393095313746; Sat, 22 Feb 2014 10:55:13 -0800 (PST) Received: from openwall.com 
([188.123.230.115]) by mx.google.com with ESMTPSA id n1sm16878470lae.6.2014.02.22.10.55.12 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 22 Feb 2014 10:55:13 -0800 (PST) Date: Sat, 22 Feb 2014 22:55:11 +0400 From: Aleksey Cherepanov Message-ID: <20140222185511.GA23643@openwall.com> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <83ios72j8b.fsf@gnu.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Sat, Feb 22, 2014 at 06:41:08PM +0200, Eli Zaretskii wrote: > > Date: Sat, 22 Feb 2014 20:02:17 +0400 > > From: Aleksey Cherepanov > > Cc: agustin.martin@hispalinux.es, 16800@debbugs.gnu.org > > > > > Your setup _might_ work the same, especially if you don't mix > > > different languages in the same buffer. But in general, your change > > > does affect behavior. > > > > I mix languages. I am pretty sure that my setup works the same. > > Not in general, it isn't. See below. I agree. Oh, not even for my setup. But for my setup together with my files. I've got an example. > > BTW solution around reduction of jump points does not not affect > > faces: "nd" or "badnd" at the end of "good badnd good " does not call > > spell check on the first "badnd". > > Not sure I understand what you are saying here. What "first badnd"? > you have only one in this example. "nd" does not cause spell check of "badnd". Another "badnd" at the end does not cause spell check of the first "badnd". 
> > Emacs words are language sensitive too.
>
> But not in the same way as ispell/flyspell is.  The CASECHARS,
> NON-CASECHARS, and OTHERCHARS parameters of the dictionary are only
> taken into account by ispell/flyspell.

I think one could define a dictionary like: ("my" "[a]" "[^a]" "" ...)
so that the only letter for flyspell words is "a". That way "qqaaqqaaqq"
is one word for emacs, and two words with garbage around them for
flyspell. I think my solution fails in such a case.

So flyspell's character set should consist of full emacs categories for
my solution to work. The code for emacs word boundaries is in category.h,
macro WORD_BOUNDARY_P. We could use regular search for bad setups and
word search for good setups, though it does not seem trivial to check
whether flyspell's dictionary setup is good for my solution.

The Russian alphabet is not a full emacs (Unicode, I guess) category. The
full category is the Cyrillic script (or even wider). My solution does
not work if there is a letter from the complement (for instance, Lje
02131) right next to my misspelled word. So I was wrong about the
behaviour: it is not the same, I just do not see the differences in my
files.

We could mix the two: regular search for short distances and word search
for longer distances, though that seems ugly to me.

I still think that we could build regexps with word boundaries that
follow flyspell's notion of a word.

Thanks!
-- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 22 Feb 2014 20:18:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139310022231042 (code B ref 16800); Sat, 22 Feb 2014 20:18:01 +0000 Received: (at 16800) by debbugs.gnu.org; 22 Feb 2014 20:17:02 +0000 Received: from localhost ([127.0.0.1]:35947 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHJ0Q-00084c-8E for submit@debbugs.gnu.org; Sat, 22 Feb 2014 15:17:02 -0500 Received: from mail-la0-f46.google.com ([209.85.215.46]:65235) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHJ0N-000841-A0 for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 15:16:59 -0500 Received: by mail-la0-f46.google.com with SMTP id b8so3800828lan.19 for <16800@debbugs.gnu.org>; Sat, 22 Feb 2014 12:16:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=wq7RuRcqV65aVZ8HE6EpF5y62jkG+m725DCr8ZGs2go=; b=X4J5Cwg7YC8Sibmf7MfuQzq3iSk0XtBmv0f0yThv30Cu0PCL6lFOPIC18BXV4wW7ph +gn7P2vRvBcBDyY1pG+XXg22EXnHsYEivs0y62egeadYtEWl38Sb5dPVAHSNDJp5nPSp KIvKQuA4R5Fcm744j9bpBW+n0Gm7SxXqCU41y5H1ulqpV1KQn8Q+bRkPknBMvm/84HjH k4bYHDvRoswI7DOVOdPG3zMsIeZdTP+52lCQ3CXHTfJr1vQfd9rpDMZsrq3vYvEhwFr5 W2onVL8w2XaLX/qJyZISLpCKCTTMQXuClneHwZipXcM6UyUMVb2db0rCVudI8+nYPeGs KpPg== X-Received: by 10.152.203.193 with SMTP id ks1mr7895409lac.0.1393100213090; Sat, 22 Feb 2014 12:16:53 -0800 (PST) Received: 
from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id gb8sm12387332lbc.13.2014.02.22.12.16.52 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 22 Feb 2014 12:16:52 -0800 (PST) Date: Sun, 23 Feb 2014 00:16:50 +0400 From: Aleksey Cherepanov Message-ID: <20140222201650.GA30683@openwall.com> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20140222185511.GA23643@openwall.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Sat, Feb 22, 2014 at 10:55:11PM +0400, Aleksey Cherepanov wrote: > On Sat, Feb 22, 2014 at 06:41:08PM +0200, Eli Zaretskii wrote: > > > Date: Sat, 22 Feb 2014 20:02:17 +0400 > > > From: Aleksey Cherepanov > > > Cc: agustin.martin@hispalinux.es, 16800@debbugs.gnu.org > > > > > > > Your setup _might_ work the same, especially if you don't mix > > > > different languages in the same buffer. But in general, your change > > > > does affect behavior. > > > > > > I mix languages. I am pretty sure that my setup works the same. > > > > Not in general, it isn't. See below. > > I agree. > > Oh, not even for my setup. But for my setup together with my files. > I've got an example. > > > > BTW solution around reduction of jump points does not not affect > > > faces: "nd" or "badnd" at the end of "good badnd good " does not call > > > spell check on the first "badnd". 
> >
> > Not sure I understand what you are saying here.  What "first badnd"?
> > you have only one in this example.
>
> "nd" does not cause spell check of "badnd". Another "badnd" at the end
> does not cause spell check of the first "badnd".
>
> > > Emacs words are language sensitive too.
> >
> > But not in the same way as ispell/flyspell is.  The CASECHARS,
> > NON-CASECHARS, and OTHERCHARS parameters of the dictionary are only
> > taken into account by ispell/flyspell.
>
> I think one could define a dictionary like: ("my" "[a]" "[^a]" "" ...)
> So the only letter for flyspell words is "a". That way "qqaaqqaaqq" is
> one word for emacs and two words with garbage around for flyspell. I
> think my solution fails in such case.
>
> So flyspell's set should be consisted of full emacs categories to make
> my solution work. Code for emacs word boundaries is in category.h,
> macro WORD_BOUNDARY_P. We could use regular search for bad setups and
> word search for good setups. Though it does not seem trivial to check
> if flyspell's dictionary setup is good for my solution.
>
> Russian alphabet is not a full emacs (Unicode, I guess) category. The
> full category is Cyrillic script (or even wider). My solution does not
> work if there is a letter from the complement (for instance, Lje
> 02131) right near my mis-spelling word. So I was wrong about the
> behaviour: it is not the same, I just do not see differences in my
> files.

Oh, my setup is wrong. Default setup uses
    "[[:alpha:]]"   ; casechars
    "[^[:alpha:]]"  ; not-casechars
due to ispell-set-spellchecker-params function:
    ;; If Emacs flavor supports [:alpha:] use it for global dicts.  If
    ;; spellchecker also supports UTF-8 via command-line option use it
    ;; in communication.  This does not affect definitions in your
    ;; init file.

My solution should work well with such setup.

Thanks!
-- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 22 Feb 2014 21:04:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Aleksey Cherepanov Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es Reply-To: Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13931029988341 (code B ref 16800); Sat, 22 Feb 2014 21:04:01 +0000 Received: (at 16800) by debbugs.gnu.org; 22 Feb 2014 21:03:18 +0000 Received: from localhost ([127.0.0.1]:35997 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHJjC-0002AT-5o for submit@debbugs.gnu.org; Sat, 22 Feb 2014 16:03:18 -0500 Received: from mtaout25.012.net.il ([80.179.55.181]:33924) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHJjA-0002AB-Av for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 16:03:17 -0500 Received: from conversion-daemon.mtaout25.012.net.il by mtaout25.012.net.il (HyperSendmail v2007.08) id <0N1F00F000YDPY00@mtaout25.012.net.il> for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 23:01:48 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout25.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N1F006GB130DH90@mtaout25.012.net.il>; Sat, 22 Feb 2014 23:01:48 +0200 (IST) Date: Sat, 22 Feb 2014 23:03:02 +0200 From: Eli Zaretskii In-reply-to: <20140222185511.GA23643@openwall.com> X-012-Sender: halo1@inter.net.il Message-id: <838ut23lo9.fsf@gnu.org> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> 
<20140222185511.GA23643@openwall.com> X-Spam-Score: 3.7 (+++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: > Date: Sat, 22 Feb 2014 22:55:11 +0400 > From: Aleksey Cherepanov > Cc: agustin.martin@hispalinux.es, 16800@debbugs.gnu.org > > > > BTW solution around reduction of jump points does not not affect > > > faces: "nd" or "badnd" at the end of "good badnd good " does not call > > > spell check on the first "badnd". > > > > Not sure I understand what you are saying here. What "first badnd"? > > you have only one in this example. > > "nd" does not cause spell check of "badnd". Another "badnd" at the end > does not cause spell check of the first "badnd". [...] Content analysis details: (3.7 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 2.7 RCVD_IN_PSBL RBL: Received via a relay in PSBL [80.179.55.181 listed in psbl.surriel.com] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit"
> Date: Sat, 22 Feb 2014 22:55:11 +0400 > From: Aleksey Cherepanov > Cc: agustin.martin@hispalinux.es, 16800@debbugs.gnu.org > > > > BTW solution around reduction of jump points does not not affect > > > faces: "nd" or "badnd" at the end of "good badnd good " does not call > > > spell check on the first "badnd". > > > > Not sure I understand what you are saying here. What "first badnd"? > > you have only one in this example. > > "nd" does not cause spell check of "badnd". Another "badnd" at the end > does not cause spell check of the first "badnd". Of course, it isn't; why should it? > > > Emacs words are language sensitive too. > > > > But not in the same way as ispell/flyspell is. The CASECHARS, > > NON-CASECHARS, and OTHERCHARS parameters of the dictionary are only > > taken into account by ispell/flyspell. > > I think one could define a dictionary like: ("my" "[a]" "[^a]" "" ...) > So the only letter for flyspell words is "a". That way "qqaaqqaaqq" is > one word for emacs and two words with garbage around for flyspell. I > think my solution fails in such case.
It's more complex than that: with some languages, and at least with aspell, we take these parameters from the dictionary. So they cannot be known in advance in some cases. From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Agustin Martin Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 23 Feb 2014 01:27:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 16800@debbugs.gnu.org Cc: Aleksey Cherepanov Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13931187703954 (code B ref 16800); Sun, 23 Feb 2014 01:27:02 +0000 Received: (at 16800) by debbugs.gnu.org; 23 Feb 2014 01:26:10 +0000 Received: from localhost ([127.0.0.1]:36154 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHNpZ-00011g-Ac for submit@debbugs.gnu.org; Sat, 22 Feb 2014 20:26:09 -0500 Received: from mail-la0-f46.google.com ([209.85.215.46]:64318) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHNpW-000118-W3 for 16800@debbugs.gnu.org; Sat, 22 Feb 2014 20:26:07 -0500 Received: by mail-la0-f46.google.com with SMTP id b8so3968850lan.19 for <16800@debbugs.gnu.org>; Sat, 22 Feb 2014 17:26:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=llbQGLA6310BQB2spz72BZcFGZBZAVWscyljwg65cx0=; b=Xyl2Dvc+KcQmGjtuzM9ODebZh8YnoAJdUR19eQ1laqkJnWqOhLsKgqZOj9f2K1CHJ8 hP76i+r6UD76Obl0sM8AEEN4dEhrTRUM4H79Lk2Mb0hE1e+lgtG7iCDLr77IjKCBSVHw 2nTk5Rig32ktspufQ83fw7Ucd4r8EINN56qd0+5yKoYfQu//2wtziqcbR1Osr45zmCz6 D/TxOLv3NAuYQONZw/XzOtpPG37n3vXs9/85bnUuD6Mq6q3MH0tF1CQNZcQya5bNN2R4 dXvQnZxcFCY4mesGDdWpOGtue51Sl+riKRX99f2kzrHc36AlbuLSlCs3fI/Lf5Met82f T5dw== MIME-Version: 1.0 X-Received: by 
10.152.190.69 with SMTP id go5mr8031621lac.79.1393118760923; Sat, 22 Feb 2014 17:26:00 -0800 (PST) Received: by 10.112.44.163 with HTTP; Sat, 22 Feb 2014 17:26:00 -0800 (PST) In-Reply-To: <838ut23lo9.fsf@gnu.org> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> Date: Sun, 23 Feb 2014 02:26:00 +0100 X-Google-Sender-Auth: k2E8KPPifoPzmrXnolMW-qhiyKo Message-ID: From: Agustin Martin Content-Type: multipart/mixed; boundary=001a1136c86ce00c1004f308bc55 X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a1136c86ce00c1004f308bc55 Content-Type: multipart/alternative; boundary=001a1136c86ce00c0d04f308bc53 --001a1136c86ce00c0d04f308bc53 Content-Type: text/plain; charset=ISO-8859-1 2014-02-22 22:03 GMT+01:00 Eli Zaretskii : > > Date: Sat, 22 Feb 2014 22:55:11 +0400 > > From: Aleksey Cherepanov > > > > > > Emacs words are language sensitive too. > > > > > > But not in the same way as ispell/flyspell is. The CASECHARS, > > > NON-CASECHARS, and OTHERCHARS parameters of the dictionary are only > > > taken into account by ispell/flyspell. > > > > I think one could define a dictionary like: ("my" "[a]" "[^a]" "" ...) > > So the only letter for flyspell words is "a". That way "qqaaqqaaqq" is > > one word for emacs and two words with garbage around for flyspell. I > > think my solution fails in such case. > > It's more complex than that: with some languages, and at least with > aspell, we take these parameters from the dictionary. 
So they cannot
> be known in advance in some cases.
>

Hi,

Not yet sure if I am missing something important, but I am playing with a
regexp search in the flyspell-word-search-* functions, based on what
flyspell thinks is the word to spellcheck (`word') and what it thinks
should not be part of a word (`NOTCASECHARS'). Since OTHERCHARS is not
used, some intermediate matches may be false positives, which will be
discarded once flyspell-word checks them.

I have tested this on Aleksey's file; it apparently works well and, in
this particular case, with much better efficiency. I still need to think
about more ad-hoc situations where it may fail or slow things down.
Suggestions for possible failures are welcome.

The patch is attached. I ran the tests against an old and patched version
of flyspell.el (the one shipped with Debian stable) and built the patch
against it. It should apply and work similarly with trunk's flyspell.el.

--
Agustin

--001a1136c86ce00c0d04f308bc53
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--001a1136c86ce00c0d04f308bc53-- --001a1136c86ce00c1004f308bc55 Content-Type: text/plain; charset=US-ASCII; name="flyspell.el_flyspell-word-search.2.diff" Content-Disposition: attachment; filename="flyspell.el_flyspell-word-search.2.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_hrzooe670 LS0tIGZseXNwZWxsLmVsLm9yaWcJMjAxNC0wMi0yMyAwMjoxNzowMy42ODAxMDc1MTkgKzAxMDAK KysrIGZseXNwZWxsLmVsCTIwMTQtMDItMjMgMDI6NTA6NTAuNjM0NjI1MjQ4ICswMTAwCkBAIC0x MDUwLDggKzEwNTAsMTkgQEAKICAgKHNhdmUtZXhjdXJzaW9uCiAgICAgKGxldCAoKHIgJygpKQog CSAgKGluaGliaXQtcG9pbnQtbW90aW9uLWhvb2tzIHQpCisJICAoZmx5c3BlbGwtbm90LWNhc2Vj aGFycyAoZmx5c3BlbGwtZ2V0LW5vdC1jYXNlY2hhcnMpKQogCSAgcCkKLSAgICAgICh3aGlsZSAo YW5kIChub3QgcikgKHNldHEgcCAoc2VhcmNoLWJhY2t3YXJkIHdvcmQgYm91bmQgdCkpKQorICAg ICAgKHdoaWxlIAorCSAgKGFuZCAobm90IHIpIAorCSAgICAgICAoc2V0cSBwIAorCQkgICAgIChy ZS1zZWFyY2gtYmFja3dhcmQKKwkJICAgICAgKGNvbmNhdAorCQkgICAgICAgIlxcKCIgZmx5c3Bl bGwtbm90LWNhc2VjaGFycyAiXFx8XFxiXFwpIgorCQkgICAgICAgIlxcKCIgd29yZCAiXFwpIgor CQkgICAgICAgZmx5c3BlbGwtbm90LWNhc2VjaGFycworCQkgICAgICAgKQorCQkgICAgICBib3Vu ZCB0KSkpCisJKGdvdG8tY2hhciAobWF0Y2gtYmVnaW5uaW5nIDIpKQogCShsZXQgKChsdyAoZmx5 c3BlbGwtZ2V0LXdvcmQpKSkKIAkgIChpZiAoYW5kIChjb25zcCBsdykKIAkJICAgKGlmIGlnbm9y ZS1jYXNlCkBAIC0xMDY4LDggKzEwNzksMTkgQEAKICAgKHNhdmUtZXhjdXJzaW9uCiAgICAgKGxl dCAoKHIgJygpKQogCSAgKGluaGliaXQtcG9pbnQtbW90aW9uLWhvb2tzIHQpCisJICAoZmx5c3Bl bGwtbm90LWNhc2VjaGFycyAoZmx5c3BlbGwtZ2V0LW5vdC1jYXNlY2hhcnMpKQogCSAgcCkKLSAg ICAgICh3aGlsZSAoYW5kIChub3QgcikgKHNldHEgcCAoc2VhcmNoLWZvcndhcmQgd29yZCBib3Vu ZCB0KSkpCisgICAgICAod2hpbGUgCisJICAoYW5kIChub3QgcikgCisJICAgICAgIChzZXRxIHAg CisJCSAgICAgKHJlLXNlYXJjaC1mb3J3YXJkIAorCQkgICAgICAoY29uY2F0CisJCSAgICAgICBm bHlzcGVsbC1ub3QtY2FzZWNoYXJzCisJCSAgICAgICAiXFwoIiB3b3JkICJcXCkiCisJCSAgICAg ICAiXFwoIiBmbHlzcGVsbC1ub3QtY2FzZWNoYXJzICJcXHxcXGJcXCkiCisJCSAgICAgICApCisJ CSAgICAgIGJvdW5kIHQpKSkKKwkoZ290by1jaGFyIChtYXRjaC1iZWdpbm5pbmcgMSkpCiAJKGxl dCAoKGx3IChmbHlzcGVsbC1nZXQtd29yZCkpKQogCSAgKGlmIChhbmQgKGNvbnNwIGx3KSAoc3Ry 
aW5nLWVxdWFsIChjYXIgbHcpIHdvcmQpKQogCSAgICAgIChzZXRxIHIgcCkK --001a1136c86ce00c1004f308bc55-- From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 23 Feb 2014 18:37:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org, aleksey.4erepanov@gmail.com Reply-To: Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139318058720070 (code B ref 16800); Sun, 23 Feb 2014 18:37:01 +0000 Received: (at 16800) by debbugs.gnu.org; 23 Feb 2014 18:36:27 +0000 Received: from localhost ([127.0.0.1]:36887 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHduc-0005Dd-HF for submit@debbugs.gnu.org; Sun, 23 Feb 2014 13:36:26 -0500 Received: from mtaout20.012.net.il ([80.179.55.166]:39011) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHduZ-0005DM-Dr for 16800@debbugs.gnu.org; Sun, 23 Feb 2014 13:36:24 -0500 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0N1G00E00OQ6V500@a-mtaout20.012.net.il> for 16800@debbugs.gnu.org; Sun, 23 Feb 2014 20:36:16 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N1G00EADP0GS220@a-mtaout20.012.net.il>; Sun, 23 Feb 2014 20:36:16 +0200 (IST) Date: Sun, 23 Feb 2014 20:36:04 +0200 From: Eli Zaretskii In-reply-to: X-012-Sender: halo1@inter.net.il Message-id: <83mwhh1xt7.fsf@gnu.org> References: <85zjlo5ecy.fsf@gmail.com> <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> 
<83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > Date: Sun, 23 Feb 2014 02:26:00 +0100 > From: Agustin Martin > Cc: Aleksey Cherepanov > > Not yet sure if I am missing something important, but I am playing with a > regexp search in flyspell-word-search-* functions based on what flyspell > thinks is the word to spellcheck (`word') and what thinks should not be > part of a word (`NOTCASECHARS'). Since no OTHERCHARS is used there may be > some intermediate matches being false positives that will be discarded once > flyspell-word checks them. > > I have tested this in Alekseys's file and is apparently working well and in > this particular case with much better efficiency. Need to think about more > ad-hoc situations where it may fail or slow down things. Suggestions for > possible failures are welcome. > > Patch is attached. I did the tests against an old and patched version of > flyspell.el (that shipped with Debian stable) and built the patch for it. > Should apply and work similarly in trunk's flyspell.el. Thanks, it's good to know it's possible to speed up the search for duplicate mis-spellings without sacrificing correctness. However, for any speedup that we will be able to come up with, there can always be a buffer large enough to make the delay annoyingly long. Therefore, I think the default of flyspell-duplicate-distance should not be -1, but some finite and reasonably small value. Or maybe we should turn off this feature by default. 
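
Until the default changes, the limit discussed above can be set per user. This is a minimal sketch for an init file; the value 40000 is an illustrative assumption, not a recommended default. Per the variable's docstring, -1 means search the whole buffer and 0 means do not search for duplicates at all.

```elisp
;; Illustrative value only: search for duplicate misspellings at most
;; 40000 characters away from the word being checked.
(setq flyspell-duplicate-distance 40000)

;; Alternative: disable the duplicate search entirely.
;; (setq flyspell-duplicate-distance 0)
```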
From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 23 Feb 2014 19:58:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org, Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139318543328730 (code B ref 16800); Sun, 23 Feb 2014 19:58:02 +0000 Received: (at 16800) by debbugs.gnu.org; 23 Feb 2014 19:57:13 +0000 Received: from localhost ([127.0.0.1]:36948 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHfAm-0007TJ-8v for submit@debbugs.gnu.org; Sun, 23 Feb 2014 14:57:12 -0500 Received: from mail-la0-f50.google.com ([209.85.215.50]:38857) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHfAj-0007Sz-1c for 16800@debbugs.gnu.org; Sun, 23 Feb 2014 14:57:10 -0500 Received: by mail-la0-f50.google.com with SMTP id y1so631788lam.37 for <16800@debbugs.gnu.org>; Sun, 23 Feb 2014 11:57:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=3hZJBPbMpa61YYp350tKJ2aV2neKhwAM05H7kk/ZNCw=; b=pMLuSvUlN5/6kvZsgm8KaHDqNKPEm+wf4+lLOOFNnTseemFTheRhY0PNKmaAS9Pbzg QmaC1cEvG/ezWXJ41jTwK84AZLRBEDk8Ro8yzPo61fcqmQcq0yiI//qNQd8BnCQbkjaS uaAMDbLPr3+kc14IsMeX+QkIYencqJ8NcWJgUvtdTCH46rWg6eawezVs3VIfWqm63c1l zKQptuaJUcARea53echH9q9uCkSv/XBOqjm4CXc8CT5SlWnoSKsucaKbXX8KLFmWaNN0 dZLXtuMEu0z5J9Ru13Ug/wGW4xcBEqUmnLOosRPqfKs3e92IEaKmDQKw501fVwwIc/HT wecw== X-Received: by 10.112.134.134 with SMTP id pk6mr9414741lbb.85.1393185422714; Sun, 23 Feb 2014 11:57:02 -0800 (PST) Received: from openwall.com ([188.123.230.115]) by 
mx.google.com with ESMTPSA id yq2sm22012891lab.3.2014.02.23.11.57.01 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sun, 23 Feb 2014 11:57:01 -0800 (PST) Date: Sun, 23 Feb 2014 23:56:59 +0400 From: Aleksey Cherepanov Message-ID: <20140223195659.GA23581@openwall.com> References: <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Sun, Feb 23, 2014 at 02:26:00AM +0100, Agustin Martin wrote: > 2014-02-22 22:03 GMT+01:00 Eli Zaretskii : > > > > Date: Sat, 22 Feb 2014 22:55:11 +0400 > > > From: Aleksey Cherepanov > > > > > > > > Emacs words are language sensitive too. > > > > > > > > But not in the same way as ispell/flyspell is. The CASECHARS, > > > > NON-CASECHARS, and OTHERCHARS parameters of the dictionary are only > > > > taken into account by ispell/flyspell. > > > > > > I think one could define a dictionary like: ("my" "[a]" "[^a]" "" ...) > > > So the only letter for flyspell words is "a". That way "qqaaqqaaqq" is > > > one word for emacs and two words with garbage around for flyspell. I > > > think my solution fails in such case. > > > > It's more complex than that: with some languages, and at least with > > aspell, we take these parameters from the dictionary. So they cannot > > be known in advance in some cases. 
> >
>
> Hi,
>
> Not yet sure if I am missing something important, but I am playing with a
> regexp search in flyspell-word-search-* functions based on what flyspell
> thinks is the word to spellcheck (`word') and what thinks should not be
> part of a word (`NOTCASECHARS'). Since no OTHERCHARS is used there may be
> some intermediate matches being false positives that will be discarded once
> flyspell-word checks them.
>
> I have tested this in Aleksey's file and is apparently working well and in
> this particular case with much better efficiency. Need to think about more
> ad-hoc situations where it may fail or slow down things. Suggestions for
> possible failures are welcome.
>
> Patch is attached. I did the tests against an old and patched version of
> flyspell.el (that shipped with Debian stable) and built the patch for it.
> Should apply and work similarly in trunk's flyspell.el.
>

> --- flyspell.el.orig 2014-02-23 02:17:03.680107519 +0100
> +++ flyspell.el 2014-02-23 02:50:50.634625248 +0100
> @@ -1050,8 +1050,19 @@
>    (save-excursion
>      (let ((r '())
>            (inhibit-point-motion-hooks t)
> +          (flyspell-not-casechars (flyspell-get-not-casechars))

I'd move the concat here too, so that it is out of the inner loop.

>            p)
> -      (while (and (not r) (setq p (search-backward word bound t)))
> +      (while
> +          (and (not r)
> +               (setq p
> +                     (re-search-backward
> +                      (concat
> +                       "\\(" flyspell-not-casechars "\\|\\b\\)"

I think \b here could be replaced with \` (beginning of buffer). It is
the only boundary we need that is not already covered by the
not-casechars + word sequence. Similarly, \' (end of buffer) could be
used for the forward search.

Also, a non-capturing group ("\\(?:") could be used because we do not
need the match data of the first group. It should work faster, but I
don't really know.

Maybe it would be faster to not capture the word but to capture one
char or nothing, but I doubt the difference would be noticeable.

> +                       "\\(" word "\\)"

I think regexp-quote around the word is necessary here.

> +                       flyspell-not-casechars
> +                       )
> +                      bound t)))
> +        (goto-char (match-beginning 2))

s/2/1/ if the first group is not capturing.

>          (let ((lw (flyspell-get-word)))
>            (if (and (consp lw)
>                     (if ignore-case
> @@ -1068,8 +1079,19 @@
>    (save-excursion
>      (let ((r '())
>            (inhibit-point-motion-hooks t)
> +          (flyspell-not-casechars (flyspell-get-not-casechars))

concat here as above.

>            p)
> -      (while (and (not r) (setq p (search-forward word bound t)))
> +      (while
> +          (and (not r)
> +               (setq p
> +                     (re-search-forward
> +                      (concat
> +                       flyspell-not-casechars
> +                       "\\(" word "\\)"

regexp-quote as above.

> +                       "\\(" flyspell-not-casechars "\\|\\b\\)"

I think \b could be replaced by \' here, as described above. The second
group could be non-capturing here.

> +                       )
> +                      bound t)))
> +        (goto-char (match-beginning 1))

I guess match-end should be used here.

>          (let ((lw (flyspell-get-word)))
>            (if (and (consp lw) (string-equal (car lw) word))
>                (setq r p)

I guess that \b would work faster than the group, so we could have an
'if' statement around the whole loop with one implementation using \b
for the case when casechars are "[[:alpha:]]" and not-casechars are
"[^[:alpha:]]", and another implementation as above for other cases.
But it seems cumbersome.

Thanks!
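
The review comments above (quote the word, use non-capturing groups for the delimiters, use \` and \' for the buffer edges, build the regexp once) can be sketched as follows. This is only an illustration under stated assumptions: `my-sketch-word-regexp` and `my-sketch-find-backward` are hypothetical helpers, not part of flyspell.el, and the NOT-CASECHARS string is passed in by the caller instead of being taken from the dictionary.

```elisp
;; Hypothetical helpers for illustration; not part of flyspell.el.
(defun my-sketch-word-regexp (word not-casechars)
  "Return a regexp matching WORD delimited by NOT-CASECHARS or a buffer edge."
  (concat "\\(?:" not-casechars "\\|\\`\\)"   ; shy group: delimiter or beginning of buffer
          "\\(" (regexp-quote word) "\\)"      ; group 1: WORD taken literally
          "\\(?:" not-casechars "\\|\\'\\)"))  ; shy group: delimiter or end of buffer

(defun my-sketch-find-backward (word not-casechars)
  "Return the start of the nearest delimited WORD before point, or nil."
  ;; The regexp is built once here, outside any search loop.
  (let ((re (my-sketch-word-regexp word not-casechars)))
    (save-excursion
      (when (re-search-backward re nil t)
        (match-beginning 1)))))

;; Usage: "." is treated as a word character in this toy setup, so the
;; unquoted pattern "a.b" would also match "axb"; regexp-quote prevents that.
(with-temp-buffer
  (insert "axb a.b end")
  (list (my-sketch-find-backward "a.b" "[^[:alpha:].]")
        (my-sketch-find-backward "axb" "[^[:alpha:].]")
        (my-sketch-find-backward "x" "[^[:alpha:].]")))
```

Because the word is the only capturing group, its position is always `(match-beginning 1)`, regardless of which delimiter alternative matched.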
-- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 23 Feb 2014 20:41:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org, Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.1393188011800 (code B ref 16800); Sun, 23 Feb 2014 20:41:02 +0000 Received: (at 16800) by debbugs.gnu.org; 23 Feb 2014 20:40:11 +0000 Received: from localhost ([127.0.0.1]:36970 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHfqM-0000Cp-L5 for submit@debbugs.gnu.org; Sun, 23 Feb 2014 15:40:11 -0500 Received: from mail-la0-f48.google.com ([209.85.215.48]:57871) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHfqJ-0000C9-Dv for 16800@debbugs.gnu.org; Sun, 23 Feb 2014 15:40:08 -0500 Received: by mail-la0-f48.google.com with SMTP id gf5so1586811lab.7 for <16800@debbugs.gnu.org>; Sun, 23 Feb 2014 12:40:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=UHMtVDqz4N8n4i+5An4pOw9rrQyMFMAAMIv4IlPTqWA=; b=Bimntl+ph/HzDZmx2GBkZMI0+cmVV9hiak4aCyvQo4hEPser/+WhqxBjmOaGx0bIRN aLfjpVdFwiaH1dPmk8Zqr5ximL1CNJgXn7XxDZ90w5cmSKfdqrc0IQicin8tQOfEewPm PRrPJgeQ+xC/KOTkdWwHqA81z9E4rW1ER9m2NV4NQla34Up9YTY32SNHjR2l+Ysu49B5 cFYh468T5GrC4/q1HtwVU6hzPvXhUo0hOOy/q9Hv2HRoGTvUozfmrpEdmheTPOX+vW4I L5vBZZ4xze8FY93ECDewBsshxa4RPkJ1yZ+RGvta24fOp3Sv2TuRUfB5TyN/89TzHD0b uNFA== X-Received: by 10.112.125.225 with SMTP id mt1mr9533251lbb.35.1393188001142; Sun, 23 Feb 2014 12:40:01 -0800 (PST) Received: from 
openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id jf8sm16061492lbc.8.2014.02.23.12.40.00 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sun, 23 Feb 2014 12:40:00 -0800 (PST) Date: Mon, 24 Feb 2014 00:39:58 +0400 From: Aleksey Cherepanov Message-ID: <20140223203958.GA26665@openwall.com> References: <83ob204vrv.fsf@gnu.org> <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Sun, Feb 23, 2014 at 02:26:00AM +0100, Agustin Martin wrote: > --- flyspell.el.orig 2014-02-23 02:17:03.680107519 +0100 > +++ flyspell.el 2014-02-23 02:50:50.634625248 +0100 > @@ -1050,8 +1050,19 @@ > (save-excursion > (let ((r '()) > (inhibit-point-motion-hooks t) > + (flyspell-not-casechars (flyspell-get-not-casechars)) > p) > - (while (and (not r) (setq p (search-backward word bound t))) > + (while > + (and (not r) > + (setq p > + (re-search-backward > + (concat > + "\\(" flyspell-not-casechars "\\|\\b\\)" > + "\\(" word "\\)" > + flyspell-not-casechars > + ) > + bound t))) > + (goto-char (match-beginning 2)) I am not yet sure that it is important. But as written we store position - 1 instead of position if we matched with flyspell-not-casechars branch. (goto-char ...) could be placed inside (setq p ...) to fix it: (setq p (and (re-search ...) (goto-char ...))) I do not know real difference yet though. 
Respectively we store position + 1 in forward search. Thanks! -- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 23 Feb 2014 23:04:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org, Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139319658415482 (code B ref 16800); Sun, 23 Feb 2014 23:04:02 +0000 Received: (at 16800) by debbugs.gnu.org; 23 Feb 2014 23:03:04 +0000 Received: from localhost ([127.0.0.1]:37023 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHi4e-00041e-B8 for submit@debbugs.gnu.org; Sun, 23 Feb 2014 18:03:04 -0500 Received: from mail-la0-f43.google.com ([209.85.215.43]:40122) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHi4a-000415-Lh for 16800@debbugs.gnu.org; Sun, 23 Feb 2014 18:03:01 -0500 Received: by mail-la0-f43.google.com with SMTP id pv20so4688342lab.2 for <16800@debbugs.gnu.org>; Sun, 23 Feb 2014 15:02:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=gX8OXt6LM/Owk+QtnJ/rECpZznDQLe/WmlSPsUmF++8=; b=mb+XX92gLuMvUhACBscv+lkWqoUx0q+tIWj7lt7m+GO3S7SfYHUWvInwp8r4joM1X2 DoCS42vJYOusH61Y9FukAxdd6aEs1KGnOI6d2la4t0F7GSd+PBOY33BzN1XQ66eX6lv1 UCPcm+VYQ4/qOdDRjfGnKwCbtXl8ZURVxNQPZuPRensVdhM7W/luS+W1WAXBOyZCAeJg 89ICThkHRvmNfd0O+Jma/4+priRD0alvFq1eIZbxOFVs5zAr1xGwf6A+MyLB+Ei2G/iD UKxVDegErTo4CP5Es6lRdgg75uOVsrPBrWZDkMeUaTWOAGAC+MleKfJtDHxQL+HeGxnW bHGg== X-Received: by 10.113.5.167 with SMTP id cn7mr9567316lbd.1.1393196574255; 
Sun, 23 Feb 2014 15:02:54 -0800 (PST) Received: from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id jt7sm11595790lbc.15.2014.02.23.15.02.53 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sun, 23 Feb 2014 15:02:53 -0800 (PST) Date: Mon, 24 Feb 2014 03:02:51 +0400 From: Aleksey Cherepanov Message-ID: <20140223230251.GA30257@openwall.com> References: <20140221143855.GA6018@agmartin.aq.upm.es> <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20140223195659.GA23581@openwall.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

I've performed some tests against my .org file (not in emacs -Q):

(insert
 (mapconcat
  (lambda (re)
    (save-excursion
      (let ((time (current-time))
            (count 0))
        (while (re-search-backward re nil t)
          (setq count (1+ count)))
        (format "%d: %S :: %s"
                count
                (subtract-time (current-time) time)
                re))))
  '("\\<[[:alpha:]]"
    "\\b[[:alpha:]]"
    "\\([^[:alpha:]]\\|\\b\\)[[:alpha:]]"
    "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]"
    "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]"
    "\\(?:[^[:alpha:]]\\)[[:alpha:]]"
    "[^[:alpha:]][[:alpha:]]"
    "\\(?:\\b\\|'\\)[[:alpha:]]"
    "\\(?:[^[:alpha:]]\\|\\`\\)\\([[:alpha:]]+\\)"
    "\\([^[:alpha:]]\\|\\`\\)\\(?:[[:alpha:]]+\\)"
    "\\([^[:alpha:]]\\|\\`\\)[[:alpha:]]+")
  "\n"))

Matches| Time                  | Regexp tried
299158: (0 2 841190 614000) :: \<[[:alpha:]]
299158: (0 2 876846 547000) :: \b[[:alpha:]]
307919: (0 3 321676 163000) :: \([^[:alpha:]]\|\b\)[[:alpha:]]
307899: (0 3 291931 838000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 821347 257000) :: \(?:[^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 760125 839000) :: \(?:[^[:alpha:]]\)[[:alpha:]]
307899: (0 2 765410 758000) :: [^[:alpha:]][[:alpha:]]
299518: (0 2 998895 976000) :: \(?:\b\|'\)[[:alpha:]]
307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+

I should admit that word search breaks things even for a setup with
[[:alpha:]]: a0a is 1 word for Emacs and 2 for flyspell. I missed it
because Russian behaves differently (there is a word boundary on the
border between digits and Russian letters). My bad.

307899: (0 2 760125 839000) :: \(?:[^[:alpha:]]\)[[:alpha:]]
307899: (0 2 765410 758000) :: [^[:alpha:]][[:alpha:]]

These two suggest that we may get a speedup if we do not check for the
beginning of the buffer in the regexp but check it separately. But I
doubt it is worth it.

On Sun, Feb 23, 2014 at 11:56:59PM +0400, Aleksey Cherepanov wrote:
> Also not capturing group ("\\(?:") could be used because we do not
> need a match data of the first group. It should work faster but I
> don't really know.

307899: (0 3 291931 838000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]
307899: (0 2 821347 257000) :: \(?:[^[:alpha:]]\|\`\)[[:alpha:]]

The test shows that the non-capturing group is faster.

> Maybe it would be faster to not capture word but capture one char or
> void but I doubt the difference would be noticeable.

307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+

Unexpectedly, capturing the word works a bit faster. Maybe it is not
about the word but about it being the second group, and it would work
differently for a forward search. Or alpha+ instead of a fixed word
caused it. Anyway, the difference is very small.
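
The measurements above can be reproduced with a small harness. The following is a sketch, assuming the standard `benchmark-run` macro from benchmark.el instead of the manual `current-time` arithmetic; `my-sketch-count-backward` and `my-sketch-time-regexps` are hypothetical names introduced only for this illustration. Absolute timings depend on the buffer and the machine, so only the match counts are comparable between runs.

```elisp
(require 'benchmark)

;; Hypothetical helpers for illustration.
(defun my-sketch-count-backward (re)
  "Count matches of RE, searching backward from the end of the buffer."
  (save-excursion
    (goto-char (point-max))
    (let ((n 0))
      (while (re-search-backward re nil t)
        (setq n (1+ n)))
      n)))

(defun my-sketch-time-regexps (regexps)
  "Return a list of (REGEXP COUNT SECONDS) for REGEXPS in the current buffer."
  (mapcar (lambda (re)
            (let* ((count 0)
                   ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
                   (timing (benchmark-run 1
                             (setq count (my-sketch-count-backward re)))))
              (list re count (car timing))))
          regexps))

;; Usage: compare a plain delimiter with a shy group that also accepts
;; the beginning of the buffer.
(with-temp-buffer
  (insert "ab cd ef")
  (my-sketch-time-regexps
   '("[^[:alpha:]][[:alpha:]]"
     "\\(?:[^[:alpha:]]\\|\\`\\)[[:alpha:]]")))
```

In the usage example the second regexp reports one more match than the first, because only it can match the word at the very beginning of the buffer.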
Capturing word allows us to make a function to wrap a word into regexp like word-search-regexp function wraps a word for word-search-forward/-backward functions. > I guess that \b would work faster than the group so we could have 'if' > statement around the whole loop that has one implementation with \b > for case when casechars are "[[:alpha:]]" and not-casechars are > "[^[:alpha:]]" and another implementation as above for other cases. > But it seems cumbersome. My guess is wrong: \b works slower than the group. Also it is inappropriate at all. Thanks! -- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 24 Feb 2014 16:04:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org, Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139325781217409 (code B ref 16800); Mon, 24 Feb 2014 16:04:01 +0000 Received: (at 16800) by debbugs.gnu.org; 24 Feb 2014 16:03:32 +0000 Received: from localhost ([127.0.0.1]:37982 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHy0B-0004Wi-9B for submit@debbugs.gnu.org; Mon, 24 Feb 2014 11:03:32 -0500 Received: from mail-la0-f54.google.com ([209.85.215.54]:55912) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WHy07-0004WR-0J for 16800@debbugs.gnu.org; Mon, 24 Feb 2014 11:03:28 -0500 Received: by mail-la0-f54.google.com with SMTP id mc6so2539172lab.13 for <16800@debbugs.gnu.org>; Mon, 24 Feb 2014 08:03:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version 
:content-type:content-disposition:in-reply-to:user-agent; bh=BIbyzYMNZsJxY09IsdFgbZ4dyHXFEzP7WAWm+Tkgk2E=; b=Ov1gE2imE24ZNhcuryfUoW0CFVgNs2Mm1/pnlrS1t0wa+I9B0hTtoMr4QM5C2sqQtm pPXTWj6D6BehaeEeZQsODcotxF89IafajqMZItbvtmT0SrRDiP4vN9MrXRJ63KTqCxh2 Q3lMhEMbViHauwrqHBnZkoNmRQnCmKZSuCA/8gkVKSsDE6crakWHgDbpXjSQUPIknDr8 DfCRy+ISbtDDy4cPrrWCAxYE4wsn418OFFQZ3/KJeMtEE0b9v8ok7so55PuA560Y8m0G ERcvkagk1eyPnDbC3+KC3jel/1MrNt6xtsnzPGBb8MwSvKCKYPSKXVLlK582IfBT7q2m yQ8Q== X-Received: by 10.112.72.170 with SMTP id e10mr11891369lbv.43.1393257800736; Mon, 24 Feb 2014 08:03:20 -0800 (PST) Received: from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id cl5sm19046701lbb.14.2014.02.24.08.03.19 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Mon, 24 Feb 2014 08:03:19 -0800 (PST) Date: Mon, 24 Feb 2014 20:03:17 +0400 From: Aleksey Cherepanov Message-ID: <20140224160317.GA2475@openwall.com> References: <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com> <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="SUOF0GtieIMvvwua" Content-Disposition: inline In-Reply-To: <20140223230251.GA30257@openwall.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--SUOF0GtieIMvvwua
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I played with different (maybe wrong) implementations of
flyspell-word-search-backward and measured their time against t.txt
(produced by the one-liner below). All implementations are attached.

perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt

my-test-agustin
 - Implementation from Agustin Martin with regexp-quote
my-test-concat-up
 - and concat moved up, out of the loop
my-test-concat-up-goto
 - and goto-char moved into setq
my-test-concat-up-goto-notcap
 - and ?: added to the first group
my-test-concat-up-goto-notcap-bob
 - and \b replaced by \`
my-test-concat-up-goto-notcap-bob-bobp
 - and goto-char replaced with a conditional forward-char (on bobp)
my-test-concat-up-goto-notcap-nobob-bobp
 - and the first group is removed; this case is handled separately
my-test-concat-up-goto-notcap-nobob-nobobp
 - and the bobp check is replaced by progn due to the separate handling
my-test-goto-notcap-nobob-nobobp
 - and concat moved down (back)
my-test-concat-up-goto-notcap-nobob-bobp-fixed
 - fixed for correct handling of the beginning of the buffer

 # |String| Time                 |Result| Function name
 1   nd   (0 0 192227 640000)   nil   my-test-agustin
 2   nd   (0 0 192569 63000)    nil   my-test-concat-up
 3   nd   (0 0 193895 468000)   nil   my-test-concat-up-goto
 4   nd   (0 0 194372 743000)   nil   my-test-concat-up-goto-notcap
 5   nd   (0 0 151535 868000)   nil   my-test-concat-up-goto-notcap-bob
 6   nd   (0 0 131831 49000)    nil   my-test-concat-up-goto-notcap-bob-bobp
 7   nd   (0 0 92012 191000)    nil   my-test-concat-up-goto-notcap-nobob-bobp
 8   nd   (0 0 93928 281000)    nil   my-test-concat-up-goto-notcap-nobob-nobobp
 9   nd   (0 0 93796 52000)     nil   my-test-goto-notcap-nobob-nobobp
10   nd   (0 0 94061 645000)    nil   my-test-concat-up-goto-notcap-nobob-bobp-fixed

This is from *Messages* after running (my-try "nd") in t.txt.

The last 4 functions are quite close, and their order often changes due
to fluctuations. Really, they cannot be measured properly against this
file because re-search-backward should always return nil there, I think.

Functions 7, 8 and 9 are not correct: they find a word when we search
for the word at the beginning of the buffer while point is in the
middle of it. Function 10 has logic to handle this case. Other corner
cases should be thought through and tried too.

The times could be different for other files and other words.

On Mon, Feb 24, 2014 at 03:02:51AM +0400, Aleksey Cherepanov wrote:
> I've performed some tests against my .org file (not in emacs -Q):

> On Sun, Feb 23, 2014 at 11:56:59PM +0400, Aleksey Cherepanov wrote:
> > Maybe it would be faster to not capture word but capture one char or
> > void but I doubt the difference would be noticeable.
>
> 307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
> 307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
> 307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+

> Unexpectedly capturing of word works a bit faster. Maybe it is not a
> word but the second group and it would work differently for search
> forward. Or alpha+ instead of fixed word caused it. Anyway the
> difference is very small.

We could avoid capturing altogether, and that works faster, as shown by
the last 4 functions.

Thanks!

--
Regards,
Aleksey Cherepanov

--SUOF0GtieIMvvwua
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="t.el"

;; Implementation from Agustin Martin with additional regexp-quote
(defun my-test-agustin (word bound &optional ignore-case)
  (save-excursion
    (let ((r '())
          (inhibit-point-motion-hooks t)
          (flyspell-not-casechars (flyspell-get-not-casechars))
          p)
      (while
          (and (not r)
               (setq p
                     (re-search-backward
                      (concat
                       "\\(" flyspell-not-casechars "\\|\\b\\)"
                       "\\(" (regexp-quote word) "\\)"
                       flyspell-not-casechars
                       )
                      bound t)))
        (goto-char (match-beginning 2))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     "\\(" flyspell-not-casechars "\\|\\b\\)"
                     "\\(" (regexp-quote word) "\\)"
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (re-search-backward word-re bound t)))
        (goto-char (match-beginning 2))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     "\\(" flyspell-not-casechars "\\|\\b\\)"
                     "\\(" (regexp-quote word) "\\)"
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (goto-char (match-beginning 2)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto-notcap (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     "\\(?:" flyspell-not-casechars "\\|\\b\\)"
                     "\\(" (regexp-quote word) "\\)"
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            ;; Group 1 is the word here, since the
                            ;; leading group is non-capturing.
                            (goto-char (match-beginning 1)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto-notcap-bob (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     "\\(?:" flyspell-not-casechars "\\|\\`\\)"
                     "\\(" (regexp-quote word) "\\)"
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            ;; Group 1 is the word here, since the
                            ;; leading group is non-capturing.
                            (goto-char (match-beginning 1)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto-notcap-bob-bobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     "\\(?:" flyspell-not-casechars "\\|\\`\\)"
                     (regexp-quote word)
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (unless (bobp)
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

;; Wrong
(defun my-test-concat-up-goto-notcap-nobob-bobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     flyspell-not-casechars
                     (regexp-quote word)
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (unless (bobp)
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (setq p (goto-char (point-min)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p))))
      r)))

;; Wrong
(defun my-test-concat-up-goto-notcap-nobob-nobobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     flyspell-not-casechars
                     (regexp-quote word)
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (progn
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (setq p (goto-char (point-min)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p))))
      r)))

;; Wrong
(defun my-test-goto-notcap-nobob-nobobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward
                             (concat
                              flyspell-not-casechars
                              (regexp-quote word)
                              flyspell-not-casechars)
                             bound t)
                            (progn
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (setq p (goto-char (point-min)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p))))
      r)))

(defun my-test-concat-up-goto-notcap-nobob-bobp-fixed (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     flyspell-not-casechars
                     (regexp-quote word)
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (unless (bobp)
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (let ((pos (point)))
          (setq p (goto-char (point-min)))
          (and (search-forward word (length word) t)
               (let ((lw (flyspell-get-word)))
                 (if (and (consp lw)
                          (if ignore-case
                              (string-equal (downcase (car lw)) (downcase word))
                            (string-equal (car lw) word)))
                     (setq r p))))))
      r)))

(defun my-try (word)
  (message "%s"
           (mapconcat
            (lambda (func)
              (end-of-buffer)
              (let* ((time (current-time))
                     (res (apply func '("nd" nil))))
                (format ":>: %s %S =%S %S" word
(subtract-time (current-time) time) res func)))
(let ((lst '(my-test-agustin
             my-test-concat-up
             my-test-concat-up-goto
             my-test-concat-up-goto-notcap
             my-test-concat-up-goto-notcap-bob
             my-test-concat-up-goto-notcap-bob-bobp
             my-test-concat-up-goto-notcap-nobob-bobp
             my-test-concat-up-goto-notcap-nobob-nobobp
             my-test-goto-notcap-nobob-nobobp
             my-test-concat-up-goto-notcap-nobob-bobp-fixed)))
  (concatenate 'list lst lst))
"\n")))

--SUOF0GtieIMvvwua--

From unknown Sat Aug 16 14:25:58 2025
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
To: Aleksey Cherepanov, 16800@debbugs.gnu.org
Date: Wed, 26 Feb 2014 21:32:02 +0100
From: Agustin Martin
Message-ID: <20140226203202.GA23749@agmartin.aq.upm.es>
In-Reply-To: <20140224160317.GA2475@openwall.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Mon, Feb 24, 2014 at 08:03:17PM +0400, Aleksey Cherepanov wrote:
> I played with different (maybe wrong) implementations of
> flyspell-word-search-backward and measured time against t.txt
> (produced by the one-liner). All implementations are attached.

[ ... Tons of extensive and impressive debugging ... ]

> We could avoid capturing at all. And it works faster as shown by 4
> last functions.

Hi,

Thanks a lot for the extensive debugging and for all the suggestions.
I have been playing with something based on your last function, but
trying to get something more compact; see the current status below.

;; -----------------------------------
(defun my-test-concat-up-goto-notcap-nobob-bobp-if (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (if (re-search-backward word-re bound t)
                           (progn (forward-char) (point))
                         ;; Check if word is at bob
                         (goto-char (point-min))
                         (search-forward word (length word) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))
;; -----------------------------------

I did some efficiency tests and the results seemed similar to those of
your efficient functions. I still need to check further for corner
cases, bugs, etc.

Big thanks!
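A minimal sketch of the delimited search used above, assuming `[^[:alpha:]]` as a stand-in for whatever `flyspell-get-not-casechars` returns (the real value depends on the dictionary). The helper name `my-demo-delimited-search` is hypothetical. It shows that the regexp skips "and" when looking for "nd", and that it cannot match a word at the beginning of the buffer, since there is no delimiter before it; that is why a separate check at bob is needed.

```elisp
;; Sketch only: "[^[:alpha:]]" is an assumed stand-in for the value of
;; `flyspell-get-not-casechars'; `my-demo-delimited-search' is hypothetical.
(defun my-demo-delimited-search (text word)
  "Return start positions of WORD in TEXT that have a delimiter on both sides."
  (let* ((not-casechars "[^[:alpha:]]")
         (word-re (concat not-casechars (regexp-quote word) not-casechars))
         (hits '()))
    (with-temp-buffer
      (insert text)
      (goto-char (point-max))
      (while (re-search-backward word-re nil t)
        ;; The match begins one char before the word itself.
        (forward-char)
        (push (point) hits)))
    hits))

;; "nd" occurs at positions 1, 8 and 15; the one at position 1 is missed.
(my-demo-delimited-search "nd and nd and nd " "nd")
;; => (8 15)
```

The same reasoning applies at the end of the buffer for a forward search: a word with no trailing delimiter is not matched by this regexp.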
--
Agustin

From unknown Sat Aug 16 14:25:58 2025
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
To: Aleksey Cherepanov, 16800@debbugs.gnu.org
Date: Fri, 28 Feb 2014 12:45:45 +0100
From: Agustin Martin
Message-ID: <20140228114545.GA8669@agmartin.aq.upm.es>
In-Reply-To: <20140226203202.GA23749@agmartin.aq.upm.es>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="TB36FDmn/VVEgNH/"

--TB36FDmn/VVEgNH/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Feb 26, 2014 at 09:32:02PM +0100, Agustin Martin wrote:
> On Mon, Feb 24, 2014 at 08:03:17PM +0400, Aleksey Cherepanov wrote:
> > I played with different (maybe wrong) implementations of
> > flyspell-word-search-backward and measured time against t.txt
> > (produced by the one-liner). All implementations are attached.
>
> [ ... Tons of extensive and impressive debugging ... ]
>
> > We could avoid capturing at all. And it works faster as shown by 4
> > last functions.
>
> Hi,
>
> Thanks a lot for the extensive debugging and for all the suggestions. I
> have been playing with something based in your last function, but trying
> to get something more compact, see below current status

[ ... ]

> I did some efficiency test and it seemed similar to those of your efficient
> functions. Need to check further for corner cases, bugs, etc ...

Hi, Aleksey

Please find attached my first candidate for commit. It is similar to
what I sent before, but I needed to add an explicit check for a word at
eob in `flyspell-word-search-forward'.

I will try to do more testing before committing. It seems to work well
with the file generated by your one-liner, even with corner cases like
new misspellings added at bob or eob, but the wider the testing the
better.

Hope no one will generate files with words containing something in
OTHERCHARS.
Thanks for all your help -- Agustin --TB36FDmn/VVEgNH/ Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="flyspell.el_flyspell-word-search.3.diff" --- flyspell.el.orig 2014-02-26 19:05:29.651986038 +0100 +++ flyspell.el 2014-02-28 12:01:03.930010553 +0100 @@ -1048,10 +1048,21 @@ ;;*---------------------------------------------------------------------*/ (defun flyspell-word-search-backward (word bound &optional ignore-case) (save-excursion - (let ((r '()) - (inhibit-point-motion-hooks t) - p) - (while (and (not r) (setq p (search-backward word bound t))) + (let* ((r '()) + (inhibit-point-motion-hooks t) + (flyspell-not-casechars (flyspell-get-not-casechars)) + (word-re (concat flyspell-not-casechars + (regexp-quote word) + flyspell-not-casechars)) + p) + (while + (and (not r) + (setq p (if (re-search-backward word-re bound t) + ;; word-re match begins one char before word + (progn (forward-char) (point)) + ;; Check above does not match similar word at b-o-b + (goto-char (point-min)) + (search-forward word (length word) t)))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (if ignore-case @@ -1066,10 +1077,25 @@ ;;*---------------------------------------------------------------------*/ (defun flyspell-word-search-forward (word bound) (save-excursion - (let ((r '()) - (inhibit-point-motion-hooks t) - p) - (while (and (not r) (setq p (search-forward word bound t))) + (let* ((r '()) + (inhibit-point-motion-hooks t) + (word-end (nth 2 (flyspell-get-word))) + (flyspell-not-casechars (flyspell-get-not-casechars)) + (word-re (concat flyspell-not-casechars + (regexp-quote word) + flyspell-not-casechars)) + p) + (while + (and (not r) + (setq p (if (= word-end (point-max)) + nil ;; Current word is at e-o-b. 
No forward search
+                         (if (re-search-forward word-re bound t)
+                             ;; word-re match ends one char after word
+                             (progn (backward-char) (point))
+                           ;; Check above does not match similar word at e-o-b
+                           (goto-char (point-max))
+                           (search-backward word (- (point-max)
+                                                    (length word)) t)))))
       (let ((lw (flyspell-get-word)))
         (if (and (consp lw) (string-equal (car lw) word))
             (setq r p)

--TB36FDmn/VVEgNH/--

From unknown Sat Aug 16 14:25:58 2025
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
To: Agustin Martin
Cc: 16800@debbugs.gnu.org, aleksey.4erepanov@gmail.com
Reply-To: Eli Zaretskii
Date: Fri, 28 Feb 2014 13:51:41 +0200
From: Eli Zaretskii
In-reply-to: <20140228114545.GA8669@agmartin.aq.upm.es>
Message-id: <83fvn3wj3m.fsf@gnu.org>

> Date: Fri, 28 Feb 2014 12:45:45 +0100
> From: Agustin Martin
>
> Please find attached my first candidate for commit. Is similar to what I
> sent before, but needed to add an explicit check for word at eob in
> `flyspell-word-search-forward'.
>
> Will try to have more testing before committing. Seems to work well with the
> file generated by your one-liner, even with corner cases like new
> misspellings added at bob or eob, but the wider the testing the better.
>
> Hope no one will generate files with words containing something in
> OTHERCHARS.
>
> Thanks for all your help

Thanks to both of you, but I still think that having flyspell search
without limits for duplicate mis-spellings is not a good idea. We have
no control over how big user buffers could be. So I think we should
limit that search by default.
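A rough sketch of what limiting the search could look like, under stated assumptions: the helper `my-demo-bounded-duplicate-p` and its DISTANCE argument are hypothetical and are not flyspell's API. flyspell.el does have a `flyspell-duplicate-distance` user option in this area, but the sketch does not depend on it. The point is only the clamping arithmetic: the bound is DISTANCE characters before point, never before `(point-min)`.

```elisp
;; Hypothetical helper: DISTANCE is an assumed parameter for illustration.
(defun my-demo-bounded-duplicate-p (word distance)
  "Return non-nil if WORD occurs within DISTANCE chars before point."
  (save-excursion
    (let ((bound (max (point-min) (- (point) distance))))
      (search-backward word bound t))))

(with-temp-buffer
  (insert "nd " (make-string 1000 ?x) " nd")
  ;; Place point just before the last "nd".
  (goto-char (- (point-max) 2))
  (list (my-demo-bounded-duplicate-p "nd" 100)     ; earlier "nd" is too far
        (my-demo-bounded-duplicate-p "nd" 2000)))  ; bound clamps to point-min
;; => (nil 1)
```

With a bound like this the cost of each lookup depends on DISTANCE rather than on the size of the buffer, at the price of missing duplicates that are farther away.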
From unknown Sat Aug 16 14:25:58 2025
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
To: Agustin Martin
Cc: 16800@debbugs.gnu.org
Date: Sat, 1 Mar 2014 03:11:41 +0400
From: Aleksey Cherepanov
Message-ID: <20140228231141.GA20782@openwall.com>
In-Reply-To: <20140228114545.GA8669@agmartin.aq.upm.es>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="KsGdsel6WgEHnImy"

--KsGdsel6WgEHnImy
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi, Agustin!

Wow! Your 'if' in 'while's condition is very elegant. Nice!

On Fri, Feb 28, 2014 at 12:45:45PM +0100, Agustin Martin wrote:
> On Wed, Feb 26, 2014 at 09:32:02PM +0100, Agustin Martin wrote:
> > On Mon, Feb 24, 2014 at 08:03:17PM +0400, Aleksey Cherepanov wrote:
> > > I played with different (maybe wrong) implementations of
> > > flyspell-word-search-backward and measured time against t.txt
> > > (produced by the one-liner). All implementations are attached.
> >
> > [ ... Tons of extensive and impressive debugging ... ]
> >
> > > We could avoid capturing at all. And it works faster as shown by 4
> > > last functions.
> >
> > Hi,
> >
> > Thanks a lot for the extensive debugging and for all the suggestions. I
> > have been playing with something based in your last function, but trying
> > to get something more compact, see below current status
> [ ... ]
> > I did some efficiency test and it seemed similar to those of your efficient
> > functions. Need to check further for corner cases, bugs, etc ...
>
> Hi, Aleksey
>
> Please find attached my first candidate for commit. Is similar to what I
> sent before, but needed to add an explicit check for word at eob in
> `flyspell-word-search-forward'.
>
> Will try to have more testing before committing. Seems to work well with the
> file generated by your one-liner, even with corner cases like new
> misspellings added at bob or eob, but the wider the testing the better.

I've written a small fuzzer. It is attached. To run it:
$ LANG=C emacs -Q --eval '(load-file "t2.el")'
Then press C-j to start. It modifies the buffer you are in.

Your -forward function gets stuck. (kbd "nd SPC and C-a") reproduces
it. my-test-forward-agustin-fixed contains a fix. It incorporates
simplified word-end logic: we slip forward using flyspell-get-word,
then we check eobp. At first I did not understand why -backward does
not need a similar fix, then I found the answer: my mistake with
(length word) did not allow one word to be marked as duplicate.

(if condition nil ...) could be replaced with (unless condition ...)
but I do not know which one is more readable.

(kbd "nd SPC and SPC nd C-b") fails to highlight the second "nd" as
duplicate. It is a problem with the bound equal to (length word) in the
-backward function. I did not check it when I wrote it.

> + (search-forward word (length word) t))))
(search-forward word (1+ (length word)) t))))

One "nd" is colored as duplicate due to the -backward function after
that fix. I did not touch it yet because it is time for a break for me.

> Hope no one will generate files with words containing something in
> OTHERCHARS.

Why? Otherchars are not rare: ' is among them for the "american"
dictionary.
So even this email contains such words ("while's"). BTW quite interesting flyspell behaviour could be observed with "met'met'and": if you jump back and forth over this word then met'met is highlighted when you are at the beginning and met'and is highlighted when you are at the end. Also "met'met'and met'and" highlights both met'and as mis-spelled (the second met'and is not marked as duplicate). Are there any variables that could affect search like case-fold-search? My fuzzer does not set them but users could have them set. Thanks! -- Regards, Aleksey Cherepanov --KsGdsel6WgEHnImy Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="t2.el" (require 'cl) (require 'flyspell) (setq my-fuzzer-buffer-name "*temp for fuzzer*") (switch-to-buffer my-fuzzer-buffer-name) (unless (= (point-min) (point-max)) (error "Could not operate on non-empty buffer")) (flyspell-mode 1) (random t) ;; Orig (defun my-test-backward-orig (word bound &optional ignore-case) (save-excursion (let ((r '()) (inhibit-point-motion-hooks t) p) (while (and (not r) (setq p (search-backward word bound t))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (if ignore-case (string-equal (downcase (car lw)) (downcase word)) (string-equal (car lw) word))) (setq r p) (goto-char p)))) r))) (defun my-test-forward-orig (word bound) (save-excursion (let ((r '()) (inhibit-point-motion-hooks t) p) (while (and (not r) (setq p (search-forward word bound t))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (string-equal (car lw) word)) (setq r p) (goto-char (1+ p))))) r))) ;; Agustin Martin (defun my-test-backward-agustin (word bound &optional ignore-case) (save-excursion (let* ((r '()) (inhibit-point-motion-hooks t) (flyspell-not-casechars (flyspell-get-not-casechars)) (word-re (concat flyspell-not-casechars (regexp-quote word) flyspell-not-casechars)) p) (while (and (not r) (setq p (if (re-search-backward word-re bound t) (progn (forward-char) (point)) ;; Check if word is at bob 
(goto-char (point-min)) (search-forward word (length word) t)))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (if ignore-case (string-equal (downcase (car lw)) (downcase word)) (string-equal (car lw) word))) (setq r p) (goto-char p)))) r))) (defun my-test-forward-agustin (word bound) (save-excursion (let* ((r '()) (inhibit-point-motion-hooks t) (word-end (nth 2 (flyspell-get-word))) (flyspell-not-casechars (flyspell-get-not-casechars)) (word-re (concat flyspell-not-casechars (regexp-quote word) flyspell-not-casechars)) p) (while (and (not r) (setq p (if (= word-end (point-max)) nil ;; Current word is at e-o-b. No forward search (if (re-search-forward word-re bound t) ;; word-re match ends one char after word (progn (backward-char) (point)) ;; Check above does not match similar word at e-o-b (goto-char (point-max)) (search-backward word (- (point-max) (length word)) t))))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (string-equal (car lw) word)) (setq r p) (goto-char (1+ p))))) r))) ;; Fixed (defun my-test-backward-agustin-fixed (word bound &optional ignore-case) ;; (my-test-backward-agustin word bound ignore-case)) (save-excursion (let* ((r '()) (inhibit-point-motion-hooks t) (flyspell-not-casechars (flyspell-get-not-casechars)) (word-re (concat flyspell-not-casechars (regexp-quote word) flyspell-not-casechars)) p) (while (and (not r) (setq p (if (re-search-backward word-re bound t) (progn (forward-char) (point)) ;; Check if word is at bob (goto-char (point-min)) (search-forward word (1+ (length word)) t)))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (if ignore-case (string-equal (downcase (car lw)) (downcase word)) (string-equal (car lw) word))) (setq r p) (goto-char p)))) r))) (defun my-test-forward-agustin-fixed (word bound) (save-excursion (let* ((r '()) (inhibit-point-motion-hooks t) (flyspell-not-casechars (flyspell-get-not-casechars)) (word-re (concat flyspell-not-casechars (regexp-quote word) flyspell-not-casechars)) p) 
(flyspell-get-word) (while (and (not r) (setq p (if (eobp) nil ;; Current word is at e-o-b. No forward search (if (re-search-forward word-re bound t) ;; word-re match ends one char after word (progn (backward-char) (point)) ;; Check above does not match similar word at e-o-b (goto-char (point-max)) (and (search-backward word (- (point-max) (length word)) t) (goto-char (point-max))))))) (let ((lw (flyspell-get-word))) (if (and (consp lw) (string-equal (car lw) word)) (setq r p) (goto-char (1+ p))))) r))) (defun my-make-test-macro () (let* ((good "met") (sep "SPC") (bad "nd") (oc "'") (bol "C-a") ;; not really eol but enough (eol "C-e") (parts (list good sep bad oc bol eol)) (len (length parts))) (eval `(kbd ,(mapconcat (lambda (a) (nth (random len) parts)) (make-list (1+ (random 100)) 0) " "))))) ;; nil if everythings is equal, ;; 'badtext if text is not equal, ;; position is the first position with different properties. (defun my-compare-strings-with-properties (a b) (if (string= (car a) (car b)) (let ((len (length (car a))) (pos 0) (badpos nil) (faces1 (cadr a)) (faces2 (cadr b))) (while (and (not badpos) (< pos len)) (unless (equal (nth pos faces1) (nth pos faces2)) (setq badpos pos)) ;; (message ">> %d" pos) (setq pos (1+ pos))) ;; (if badpos ;; (progn ;; (message ":>> faces1 %S" faces1) ;; (message ":>> faces2 %S" faces2))) badpos) 'badtext)) (defun my-try-macro (macro) (let ((strings ;; (message ">> count = %d, macro = %S" count macro) (mapcar (lambda (name) (delete-region (point-min) (point-max)) (letf (((symbol-function 'flyspell-word-search-forward) (intern (concat "my-test-forward-" (symbol-name name)))) ((symbol-function 'flyspell-word-search-backward) (intern (concat "my-test-backward-" (symbol-name name))))) ;; (message ">> pre %S %d" name count) (execute-kbd-macro macro) ;; (message ">> post %S %d" name count) ) (list (buffer-string) (mapcar (lambda (pos) (get-char-property pos 'face)) (number-sequence (point-min) (point-max))))) '(orig new)))) 
(my-compare-strings-with-properties (car strings) (cadr strings)))) ;; It may not reduce to the minimun in one run. It fails at reductions ;; if 2 or more chars should be removed at the same time. (defun my-reduce (macro) (let ((bad (my-try-macro macro)) (fails 0) newmacro) (if bad (while (< fails 100) (let ((pos (random (length macro)))) (setq newmacro (concat (substring macro 0 pos) (substring macro (1+ pos)))) ;; (message ">> %S" macro) ;; (message ">> %S" newmacro) (if (my-try-macro newmacro) (progn (setq fails 0) (setq macro newmacro)) (setq fails (1+ fails))))) (message ":>> We reduce only faulty macros")) macro)) (defun my-reset-new () (defun my-test-backward-new (word bound &optional ignore-case) (my-test-backward-agustin-fixed word bound ignore-case)) (defun my-test-forward-new (word bound) (my-test-forward-agustin-fixed word bound))) (my-reset-new) (defun my-try-mixed-pairs (macro) (if (my-try-macro macro) (progn (my-reset-new) (defun my-test-backward-new (word bound &optional ignore-case) (my-test-backward-orig word bound ignore-case)) (if (my-try-macro macro) (message ":>> Difference is from -forward function")) (my-reset-new) (defun my-test-forward-new (word bound) (my-test-forward-orig word bound)) (if (my-try-macro macro) (message ":>> Difference is from -backward function"))) (message ":>> We mix pairs only for faulty macros"))) (defun my-fuzz () (interactive) (unless (string= (ispell-get-otherchars) "[']") (error "Unexpected not-casechars value")) (buffer-disable-undo) (unwind-protect (let ((more t) (count 0) (time (current-time))) (while (and more (< count (if my-macro 1 15))) (let* ((macro (or my-macro (my-make-test-macro))) (bad (my-try-macro macro))) (setq more (not bad)) (unless more (message ":>> Bad at %S running %S" bad macro) (my-try-mixed-pairs macro) (message ":>> Reduced macro: %S" (my-reduce macro)))) (setq count (1+ count))) (message ":>> Fuzzing: %d macros are finished in %S" count (subtract-time (current-time) time)) (message ":>> 
%s" (if more "Without differences" "There are differences"))) (buffer-enable-undo)) nil) (global-set-key (kbd "C-j") 'my-fuzz) (split-window-right) (other-window 1) (view-echo-area-messages) (other-window 1) ;; For manual debug ;; (defun flyspell-word-search-backward (word bound &optional ignore-case) ;; (my-test-backward-agustin-fixed word bound ignore-case)) ;; (defun flyspell-word-search-forward (word bound) ;; (my-test-forward-agustin-fixed word bound)) ;; Define non-nil to run only one test with this macro not randomly (setq my-macro nil) ;; (setq my-macro (kbd "nd SPC and SPC nd C-a")) ;; (setq my-macro (kbd "nd SPC nd C-a")) ;; (setq my-macro (kbd "nd SPC and C-a")) ;; (setq my-macro (kbd "n SPC n C-a")) (setq my-macro (kbd "nd C-e")) --KsGdsel6WgEHnImy-- From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 01 Mar 2014 10:34:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139366999514448 (code B ref 16800); Sat, 01 Mar 2014 10:34:02 +0000 Received: (at 16800) by debbugs.gnu.org; 1 Mar 2014 10:33:15 +0000 Received: from localhost ([127.0.0.1]:45350 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJhEI-0003kx-3x for submit@debbugs.gnu.org; Sat, 01 Mar 2014 05:33:15 -0500 Received: from mail-la0-f52.google.com ([209.85.215.52]:45157) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJhEE-0003kl-CU for 16800@debbugs.gnu.org; Sat, 01 Mar 2014 05:33:11 -0500 Received: by mail-la0-f52.google.com with SMTP id ec20so2656591lab.11 for <16800@debbugs.gnu.org>; Sat, 01 Mar 2014 02:33:09 -0800 (PST) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=g29TMMy4cKgaVeAE5dZG98wixsPcUtRFxDUJ32kI0FY=;
 b=VHyxvYt1xABbKfBXN7CpyBm0qVddP3m0WiRhZJ62UiJ4W46WLxdHDZ2iDGJgZZ0K4N
 LqGoQZeXSl59t5qDmas8bKgHuJrlsjQbBhEgaRaTJvpbshka8FkIroyfI7G15Fghwa62
 M9LAygkynPrcN6XePydvAWJ/+EP/b1tGSALRCv1FSGgB7Nr59KSvqwGCKa1oGPdN57My
 Kj8RCNRE7b/pyWD2MLF3WuhWriMI3Hb5vfLtRTNeXrLU4FEaMQL7V9phZ8xwvGR2+sMX
 +8862/vixiPsnG6rW9rQiiNdAOw5RekFxeSCsa9jOWT2iSnePxSX4gZLH2VHWLYfCXum
 saPA==
X-Received: by 10.112.29.236 with SMTP id n12mr777140lbh.61.1393669989203; Sat, 01 Mar 2014 02:33:09 -0800 (PST)
Received: from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id h7sm7573245lbj.1.2014.03.01.02.33.07 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 01 Mar 2014 02:33:07 -0800 (PST)
Date: Sat, 1 Mar 2014 14:33:05 +0400
From: Aleksey Cherepanov
Message-ID: <20140301103305.GA600@openwall.com>
References: <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <20140228231141.GA20782@openwall.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="azLHFNyN32YCQGCU"
Content-Disposition: inline
In-Reply-To: <20140228231141.GA20782@openwall.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.7 (/)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.7 (/)

--azLHFNyN32YCQGCU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Mar 01, 2014 at 03:11:41AM +0400, Aleksey Cherepanov wrote:
> Wow! Your 'if' in 'while's condition is very elegant. Nice!
>
> On Fri, Feb 28, 2014 at 12:45:45PM +0100, Agustin Martin wrote:
> > Please find attached my first candidate for commit. Is similar to what I
> > sent before, but needed to add an explicit check for word at eob in
> > `flyspell-word-search-forward'.
> >
> > Will try to have more testing before committing. Seems to work well with the
> > file generated by your one-liner, even with corner cases like new
> > misspellings added at bob or eob, but the wider the testing the better.
>
> I've wrote a small fuzzer. It is in attach. To run it:
> $ LANG=C emacs -Q --eval '(load-file "t2.el")'
> Then C-j to start. It modifies buffer you are in.

There is a mistake in my-try-mixed-pairs; a fixed version is attached.

> (kbd "nd SPC and SPC nd C-b") fails to highlight the second "nd" as
> duplicate. It is a problem with bound equal to (length word) in
> -backward function. I did not check it when I wrote it.
> > + (search-forward word (length word) t))))
> (search-forward word (1+ (length word)) t))))

(1+ ...) is wrong; it should be similar to -forward: (+ (point-min) ...),
because (point-min) is not always 1 (narrowing can change it).

BTW flyspell does not escape restrictions/narrowing when it searches
for a duplicate. Wouldn't it be more convenient to widen before the
search? Like (save-restriction (widen) ... search ...

> One "nd" is colored as duplicate due to -backward function after that
> fix. I did not touch it yet because it is a time for a break for me.

Thanks!
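A minimal sketch of the narrowing point above, for illustration only. It is not part of the patch or of t2.el; the function name my-demo-bob-bound and the sample text are made up. It shows that (point-min) is not 1 in a narrowed buffer, so a bound computed as (1+ (length word)) falls before the accessible region, while (+ (point-min) (length word)) still covers a word at the beginning of it, and that (save-restriction (widen) ...) makes the whole buffer visible again:

```elisp
;; Hypothetical demo, not flyspell code.
(defun my-demo-bob-bound (word)
  "Return (POINT-MIN BAD-BOUND GOOD-BOUND FOUND WIDE-POINT-MIN) for WORD."
  (with-temp-buffer
    (insert "xx " word " tail")
    ;; Narrow so that WORD starts at the beginning of the accessible region.
    (narrow-to-region 4 (point-max))
    (goto-char (point-min))
    (list (point-min)                    ; 4 here, not 1
          (1+ (length word))             ; bound that assumes point-min is 1
          (+ (point-min) (length word))  ; bound relative to point-min
          ;; Search for WORD at the beginning of the accessible region.
          (save-excursion
            (search-forward word (+ (point-min) (length word)) t))
          ;; After widening, point-min is 1 again.
          (save-restriction (widen) (point-min)))))

(my-demo-bob-bound "nd")   ; => (4 3 6 6 1)
```

The sketch deliberately never passes the bad bound to search-forward, since a bound on the wrong side of point signals an error rather than returning nil.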
-- 
Regards,
Aleksey Cherepanov

--azLHFNyN32YCQGCU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="t2.el"

(require 'cl)
(require 'flyspell)

(setq my-fuzzer-buffer-name "*temp for fuzzer*")
(switch-to-buffer my-fuzzer-buffer-name)
(unless (= (point-min) (point-max))
  (error "Could not operate on non-empty buffer"))
(flyspell-mode 1)
(random t)

;; Orig

(defun my-test-backward-orig (word bound &optional ignore-case)
  (save-excursion
    (let ((r '())
          (inhibit-point-motion-hooks t)
          p)
      (while (and (not r) (setq p (search-backward word bound t)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-forward-orig (word bound)
  (save-excursion
    (let ((r '())
          (inhibit-point-motion-hooks t)
          p)
      (while (and (not r) (setq p (search-forward word bound t)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            (goto-char (1+ p)))))
      r)))

;; Agustin Martin

(defun my-test-backward-agustin (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p
                     (if (re-search-backward word-re bound t)
                         (progn (forward-char) (point))
                       ;; Check if word is at bob
                       (goto-char (point-min))
                       (search-forward word (length word) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-forward-agustin (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (word-end (nth 2 (flyspell-get-word)))
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p
                     (if (= word-end (point-max))
                         nil ;; Current word is at e-o-b. No forward search
                       (if (re-search-forward word-re bound t)
                           ;; word-re match ends one char after word
                           (progn (backward-char) (point))
                         ;; Check above does not match similar word at e-o-b
                         (goto-char (point-max))
                         (search-backward word (- (point-max)
                                                  (length word)) t)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            (goto-char (1+ p)))))
      r)))

;; Fixed

(defun my-test-backward-agustin-fixed (word bound &optional ignore-case)
  ;; (my-test-backward-agustin word bound ignore-case))
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p
                     (if (re-search-backward word-re bound t)
                         (progn (forward-char) (point))
                       ;; Check if word is at bob
                       (goto-char (point-min))
                       (search-forward word (+ (point-min) (length word)) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-forward-agustin-fixed (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (flyspell-get-word)
      (while
          (and (not r)
               (setq p
                     (if (eobp)
                         nil ;; Current word is at e-o-b. No forward search
                       (if (re-search-forward word-re bound t)
                           ;; word-re match ends one char after word
                           (progn (backward-char) (point))
                         ;; Check above does not match similar word at e-o-b
                         (goto-char (point-max))
                         (and (search-backward word (- (point-max)
                                                       (length word)) t)
                              (goto-char (point-max)))))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            (goto-char (1+ p)))))
      r)))

(defun my-make-test-macro ()
  (let* ((good "met")
         (sep "SPC")
         (bad "nd")
         (oc "'")
         (bol "C-a")
         ;; not really eol but enough
         (eol "C-e")
         (parts (list good sep bad oc bol eol))
         (len (length parts)))
    (eval `(kbd ,(mapconcat (lambda (a) (nth (random len) parts))
                            (make-list (1+ (random 100)) 0)
                            " ")))))

;; nil if everything is equal,
;; 'badtext if text is not equal,
;; position is the first position with different properties.
(defun my-compare-strings-with-properties (a b)
  (if (string= (car a) (car b))
      (let ((len (length (car a)))
            (pos 0)
            (badpos nil)
            (faces1 (cadr a))
            (faces2 (cadr b)))
        (while (and (not badpos) (< pos len))
          (unless (equal (nth pos faces1) (nth pos faces2))
            (setq badpos pos))
          ;; (message ">> %d" pos)
          (setq pos (1+ pos)))
        ;; (if badpos
        ;;     (progn
        ;;       (message ":>> faces1 %S" faces1)
        ;;       (message ":>> faces2 %S" faces2)))
        badpos)
    'badtext))

(defun my-try-macro (macro)
  (let ((strings
         ;; (message ">> count = %d, macro = %S" count macro)
         (mapcar (lambda (name)
                   (delete-region (point-min) (point-max))
                   (letf (((symbol-function 'flyspell-word-search-forward)
                           (intern (concat "my-test-forward-" (symbol-name name))))
                          ((symbol-function 'flyspell-word-search-backward)
                           (intern (concat "my-test-backward-" (symbol-name name)))))
                     ;; (message ">> pre %S %d" name count)
                     (execute-kbd-macro macro)
                     ;; (message ">> post %S %d" name count)
                     )
                   (list (buffer-string)
                         (mapcar (lambda (pos) (get-char-property pos 'face))
                                 (number-sequence (point-min) (point-max)))))
                 '(orig new))))
    (my-compare-strings-with-properties (car strings) (cadr strings))))

;; It may not reduce to the minimum in one run. It fails at reductions
;; if 2 or more chars should be removed at the same time.
(defun my-reduce (macro)
  (let ((bad (my-try-macro macro))
        (fails 0)
        newmacro)
    (if bad
        (while (< fails 100)
          (let ((pos (random (length macro))))
            (setq newmacro (concat (substring macro 0 pos) (substring macro (1+ pos))))
            ;; (message ">> %S" macro)
            ;; (message ">> %S" newmacro)
            (if (my-try-macro newmacro)
                (progn
                  (setq fails 0)
                  (setq macro newmacro))
              (setq fails (1+ fails)))))
      (message ":>> We reduce only faulty macros"))
    macro))

;; Change this to use other functions instead of -agustin-fixed
(defun my-reset-new ()
  (defun my-test-backw+ard-new (word bound &optional ignore-case)
    (my-test-backward-agustin-fixed word bound ignore-case))
  (defun my-test-forward-new (word bound)
    (my-test-forward-agustin-fixed word bound)))
(my-reset-new)

(defun my-try-mixed-pairs (macro)
  (unwind-protect
      (if (my-try-macro macro)
          (progn
            (my-reset-new)
            (defun my-test-backward-new (word bound &optional ignore-case)
              (my-test-backward-orig word bound ignore-case))
            (if (my-try-macro macro)
                (message ":>> Difference is from -forward function"))
            (my-reset-new)
            (defun my-test-forward-new (word bound)
              (my-test-forward-orig word bound))
            (if (my-try-macro macro)
                (message ":>> Difference is from -backward function")))
        (message ":>> We mix pairs only for faulty macros"))
    (my-reset-new)))

(defun my-fuzz ()
  (interactive)
  (unless (string= (ispell-get-otherchars) "[']")
    (error "Unexpected not-casechars value"))
  (buffer-disable-undo)
  (unwind-protect
      (let ((more t)
            (count 0)
            (time (current-time)))
        (while (and more (< count (if my-macro 1 15)))
          (let* ((macro (or my-macro (my-make-test-macro)))
                 (bad (my-try-macro macro)))
            (setq more (not bad))
            (unless more
              (message ":>> Bad at %S running %S" bad macro)
              (my-try-mixed-pairs macro)
              (message ":>> Reduced macro: %S" (my-reduce macro))))
          (setq count (1+ count)))
        (message ":>> Fuzzing: %d macros are finished in %S"
                 count
                 (subtract-time (current-time) time))
        (message ":>> %s" (if more "Without differences" "There are differences")))
    (buffer-enable-undo))
  nil)
(global-set-key (kbd "C-j") 'my-fuzz)

(split-window-right)
(other-window 1)
(view-echo-area-messages)
(other-window 1)

;; For manual debug
;; (defun flyspell-word-search-backward (word bound &optional ignore-case)
;;   (my-test-backward-agustin-fixed word bound ignore-case))
;; (defun flyspell-word-search-forward (word bound)
;;   (my-test-forward-agustin-fixed word bound))

;; Define non-nil to run only one test with this macro not randomly
(setq my-macro nil)
;; (setq my-macro (kbd "nd SPC and SPC nd C-a"))
;; (setq my-macro (kbd "nd SPC nd C-a"))
;; (setq my-macro (kbd "nd SPC and C-a"))
(setq my-macro (kbd "n SPC n C-a"))
;; (setq my-macro (kbd "nd C-e"))

--azLHFNyN32YCQGCU--

From unknown Sat Aug 16 14:25:58 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Resent-From: Aleksey Cherepanov
Original-Sender: "Debbugs-submit"
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Sat, 01 Mar 2014 15:51:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 16800
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords:
To: Agustin Martin
Cc: 16800@debbugs.gnu.org
Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139368903820249 (code B ref 16800); Sat, 01 Mar 2014 15:51:02 +0000
Received: (at 16800) by debbugs.gnu.org; 1 Mar 2014 15:50:38 +0000
Received: from localhost ([127.0.0.1]:46851 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJmBR-0005GX-Uy for submit@debbugs.gnu.org; Sat, 01 Mar 2014 10:50:38 -0500
Received: from mail-lb0-f177.google.com ([209.85.217.177]:51852) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJmBN-0005GM-Dc for 16800@debbugs.gnu.org; Sat, 01 Mar 2014 10:50:34 -0500
Received: by mail-lb0-f177.google.com with SMTP id z11so3513865lbi.36 for <16800@debbugs.gnu.org>; Sat, 01 Mar 2014 07:50:32 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=W7BohbB8v10rwwzoS7B0EhyWSNvZMPfeYIGWD7GJ2Mw=;
 b=jY4XWCRdk0Ag1EdubQC4UM2h8H8bywRnRYco+B9aZXXEbUw0sF/9G/kzgyCxHCyOfS
 8yNYuBmX8v3asLvMb+05PCJhDnu0jPHqYKAinv4Po9Bch7dNyvHh90QsrddAAwl/4TJo
 cw0k3X65tEZ5CU3CmuLop25q36XnkSHAt/VYX66xBILZGrx2oV/zs0buPkmtVwWgjXvT
 woyibtey48qCZRyZmgmCYB31ole1img2jLPGNFZuY/sHxAY80LfRlOBjT+XAc+wXZTPP
 Rti4POXNhkCitFzN2t7xtSKyOEJggN36KJAp7jPrVHRJYmVMEaQ4C4aD6krYyLhWs3SD
 De1A==
X-Received: by 10.112.63.193 with SMTP id i1mr1912382lbs.54.1393689031989; Sat, 01 Mar 2014 07:50:31 -0800 (PST)
Received: from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id sx1sm2690128lac.1.2014.03.01.07.50.30 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 01 Mar 2014 07:50:31 -0800 (PST)
Date: Sat, 1 Mar 2014 19:50:29 +0400
From: Aleksey Cherepanov
Message-ID: <20140301155029.GA6421@openwall.com>
References: <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <20140228231141.GA20782@openwall.com> <20140301103305.GA600@openwall.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20140301103305.GA600@openwall.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.7 (/)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.7 (/)

On Sat, Mar 01, 2014 at 02:33:05PM +0400, Aleksey Cherepanov wrote:
> ;; Change this to use other functions instead of -agustin-fixed
> (defun my-reset-new ()
>   (defun my-test-backw+ard-new (word bound &optional ignore-case)

Sorry! The '+' is there by mistake. It should be

  (defun my-test-backward-new (word bound &optional ignore-case)

Thanks!

-- 
Regards,
Aleksey Cherepanov

From unknown Sat Aug 16 14:25:58 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Resent-From: Aleksey Cherepanov
Original-Sender: "Debbugs-submit"
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Sat, 01 Mar 2014 21:40:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 16800
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords:
To: Agustin Martin
Cc: 16800@debbugs.gnu.org
Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139370995629255 (code B ref 16800); Sat, 01 Mar 2014 21:40:02 +0000
Received: (at 16800) by debbugs.gnu.org; 1 Mar 2014 21:39:16 +0000
Received: from localhost ([127.0.0.1]:47034 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJrco-0007bm-IL for submit@debbugs.gnu.org; Sat, 01 Mar 2014 16:39:15 -0500
Received: from mail-lb0-f176.google.com ([209.85.217.176]:61720) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJrck-0007bb-7d for 16800@debbugs.gnu.org; Sat, 01 Mar 2014 16:39:11 -0500
Received: by mail-lb0-f176.google.com with SMTP id 10so3601972lbg.7 for <16800@debbugs.gnu.org>; Sat, 01 Mar 2014 13:39:09 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-type:content-disposition:in-reply-to:user-agent;
 bh=9m1HZPjZMlr+lhVUxZn1QmTp35QfMVwnJRS8oSz0/zY=;
 b=JCEY91xGZqpk+hiLKkPhOqEWC0soSg2f6BMkUFn/m8z4pJQNtV+nlVtksZqSuNb4Ac
 /qms56erFUYpTW+HR3t4Y/efnrUxrdUvMUerVEY3m9VLd8zqdWnJT6AVLwOD9BYaBPUz
 gp/nrjlV9eelLfMndk8h3Ep5BhOEsUTCFgfWfOo2aq6YnZJV1UOxGEZ/bzSK7LkEZKAd
 5EMz1Wq3iHmiv5biCIk3m656AIMTRoiZcnyAWISMYgs0BLnSTHYbq1Ng9e5fFI2ZG3/U
 /cphHxyB6drmMC0XX7/1ELoQb3gz9YWhgJMG8sMQ74/8gJpvw0LVkc6LdIlpLzlexfew
 nOrA==
X-Received: by 10.112.14.1 with SMTP id l1mr16290224lbc.39.1393709949000; Sat, 01 Mar 2014 13:39:09 -0800 (PST)
Received: from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id pz10sm9517213lbb.10.2014.03.01.13.39.07 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 01 Mar 2014 13:39:08 -0800 (PST)
Date: Sun, 2 Mar 2014 01:39:06 +0400
From: Aleksey Cherepanov
Message-ID: <20140301213906.GA13523@openwall.com>
References: <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <20140228231141.GA20782@openwall.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="liOOAslEiF7prFVr"
Content-Disposition: inline
In-Reply-To: <20140228231141.GA20782@openwall.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.7 (/)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.7 (/)

--liOOAslEiF7prFVr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Mar 01, 2014 at 03:11:41AM +0400, Aleksey Cherepanov wrote:
> I've wrote a small fuzzer. It is in attach. To run it:
> $ LANG=C emacs -Q --eval '(load-file "t2.el")'
> Then C-j to start. It modifies buffer you are in.

A new version is attached. M-j retries the last macro, or the macro
specified in the my-macro variable. For manual experiments, C-o defines
flyspell-word-search-* as my-test-*-new and C-u C-o defines them as
my-test-*-orig. Though I improved the output, so C-j should be enough.

> > Hope no one will generate files with words containing something in
> > OTHERCHARS.
>
> Why?
>
> Otherchars are not rare as of ' is there for "american" dictionary. So
> even this email contains such words ("while's").
>
> BTW quite interesting flyspell behaviour could be observed with
> "met'met'and": if you jump back and forth over this word then met'met
> is highlighted when you are at the beginning and met'and is
> highlighted when you are at the end.
>
> Also "met'met'and met'and" highlights both met'and as mis-spelled (the
> second met'and is not marked as duplicate).

I think the original search for "n'n" in "n'n'n'n" finds only
(n'n)'(n'n) but not n'(n'n)'n. Our search marks the first word as a
duplicate when running (kbd "n'n SPC en'n'n C-a"), while the original
search does not. Which behaviour is preferable? Should the first word of
"n'n en'n'n" be marked as a duplicate?

> Are there any variables that could affect search like
> case-fold-search? My fuzzer does not set them but users could have
> them set.

Also, my fuzzer does not try bounds for the search. But we will be in
trouble if the search bound is at a word boundary, because we want one
more char. Though we could extend the bound by 1 char to solve that.

Now only the forward search is enabled in my fuzzer. Set it up at the
end of the file as you need.

I've implemented a variant of the forward search using a regexp. It
seems that the forward search does not get slower from the group in the
regexp. I did not measure it carefully, though. The function is shorter
with the regexp. Maybe we should make a correct variant before a fast
one... %-)

Also, forward search works a bit faster in general. So we could try to
implement backward search through forward search.

I've removed (goto-char (1+ p)) to not fail on
(kbd "nd SPC d'nd SPC nd SPC met C-a").

At the moment the fuzzer can pass several thousand tests without a
failure. You need to wait for failures or improve the test generator.

Thanks!
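For illustration only (this is not the attached t2.el): a minimal sketch of the regexp idea described above, where the trailing delimiter is allowed to be either a non-word character or the end of the buffer. The names my-demo-not-casechars and my-demo-word-ends are made up, and "[^A-Za-z]" is only an assumed stand-in for whatever (flyspell-get-not-casechars) returns for the active dictionary:

```elisp
;; Hypothetical demo; `my-demo-not-casechars' stands in for the value of
;; (flyspell-get-not-casechars) and is an assumption of this sketch.
(defvar my-demo-not-casechars "[^A-Za-z]")

(defun my-demo-word-ends (word text)
  "Return end positions of WORD in TEXT, delimited by non-word chars or eob."
  (let ((re (concat my-demo-not-casechars
                    (regexp-quote word)
                    "\\(?:" my-demo-not-casechars "\\|\\'\\)"))
        ends)
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (let ((case-fold-search nil))
        (while (re-search-forward re nil t)
          ;; The match includes one trailing delimiter unless it ended at
          ;; eob; step back so that the delimiter can start the next match.
          (unless (eobp) (backward-char))
          (push (point) ends))))
    (nreverse ends)))

(my-demo-word-ends "nd" " nd and nd")   ; => (4 11)
```

The "nd" inside "and" is skipped because it is preceded by a letter, and the last "nd" is found even though nothing follows it. Note that the leading delimiter is still required, so a word at the very beginning of the buffer is not matched by this regexp; that case needs separate handling, as in the -backward functions.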
-- 
Regards,
Aleksey Cherepanov

--liOOAslEiF7prFVr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="t2.el"
Content-Transfer-Encoding: quoted-printable

(require 'cl)
(require 'flyspell)

(setq my-fuzzer-buffer-name "*temp for fuzzer*")
(switch-to-buffer my-fuzzer-buffer-name)
(unless (= (point-min) (point-max))
  (error "Could not operate on non-empty buffer"))
(flyspell-mode 1)
(random t)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Implementations

;; Orig

(defun my-test-backward-orig (word bound &optional ignore-case)
  (save-excursion
    (let ((r '())
          (inhibit-point-motion-hooks t)
          p)
      (while (and (not r) (setq p (search-backward word bound t)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-forward-orig (word bound)
  (save-excursion
    (let ((r '())
          (inhibit-point-motion-hooks t)
          p)
      (while (and (not r) (setq p (search-forward word bound t)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            (goto-char (1+ p)))))
      r)))

;; Agustin Martin

(defun my-test-backward-agustin (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p
                     (if (re-search-backward word-re bound t)
                         (progn (forward-char) (point))
                       ;; Check if word is at bob
                       (goto-char (point-min))
                       (search-forward word (length word) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-forward-agustin (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (word-end (nth 2 (flyspell-get-word)))
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p
                     (if (= word-end (point-max))
                         nil ;; Current word is at e-o-b. No forward search
                       (if (re-search-forward word-re bound t)
                           ;; word-re match ends one char after word
                           (progn (backward-char) (point))
                         ;; Check above does not match similar word at e-o-b
                         (goto-char (point-max))
                         (search-backward word (- (point-max)
                                                  (length word)) t)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            (goto-char (1+ p)))))
      r)))

;; Fixed

(defun my-test-backward-agustin-fixed (word bound &optional ignore-case)
  ;; (my-test-backward-agustin word bound ignore-case))
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p
                     (if (re-search-backward word-re bound t)
                         (progn (forward-char) (point))
                       ;; Check if word is at bob
                       (goto-char (point-min))
                       (search-forward word (+ (point-min) (length word)) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-forward-agustin-fixed (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (flyspell-get-word)
      (while
          (and (not r)
               (setq p
                     (if (eobp)
                         nil ;; Current word is at e-o-b. No forward search
                       (if (re-search-forward word-re bound t)
                           ;; word-re match ends one char after word
                           (progn (backward-char) (point))
                         ;; Check above does not match similar word at e-o-b
                         (goto-char (point-max))
                         (and (search-backward word (- (point-max)
                                                       (length word)) t)
                              (goto-char (point-max)))))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            ;; We don't need to move forward due to additional char
            ;; before word in regexp
            ;; (goto-char (1+ p))
            )))
      r)))

;; With eob in regexp

(defun my-test-forward-eob (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            "\\(?:" flyspell-not-casechars "\\|\\'\\)"))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-forward word-re bound t)
                            (if (eobp)
                                (point)
                              (backward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            ;; We don't need to move forward due to additional char
            ;; before word in regexp
            ;; (goto-char (1+ p))
            )))
      r)))

;; End of Implementations
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Fuzzer

(defun my-make-test-macro ()
  (let* ((good "met")
         (sep "SPC")
         (bad "nd")
         (oc "'")
         (bol "C-a")
         ;; not really eol but enough
         (eol "C-e")
         (parts (list good sep bad oc bol eol))
         (len (length parts)))
    (eval `(kbd ,(mapconcat (lambda (a) (nth (random len) parts))
                            (make-list (1+ (random 100)) 0)
                            " ")))))

;; nil if everything is equal,
;; 'badtext if text is not equal,
;; position is the first position with different properties.
(defun my-compare-strings-with-properties (a b)
  (if (string= (car a) (car b))
      (let ((len (length (car a)))
            (pos 0)
            (badpos nil)
            (faces1 (cadr a))
            (faces2 (cadr b)))
        (while (and (not badpos) (< pos len))
          (unless (equal (nth pos faces1) (nth pos faces2))
            (setq badpos pos))
          ;; (message ">> %d" pos)
          (setq pos (1+ pos)))
        (if my-show-faces
            (if badpos
                (progn
                  (message ":>> faces1 %S" faces1)
                  (message ":>> faces2 %S" faces2))
              (message ":>> No diff")))
        badpos)
    'badtext))

(defun my-make-string-with-faces (a)
  (let ((str (car a))
        (faces (cadr a)))
    (mapcar (lambda (pos)
              (set-text-properties pos (1+ pos)
                                   `(fontified t font-lock-face ,(nth pos faces))
                                   str))
            (number-sequence 0 (1- (length str))))
    str))

(defun my-make-strings-with-faces (a b)
  (concat "\n:>> orig:"
          (my-make-string-with-faces a)
          "\n:>> new:"
          (my-make-string-with-faces b)
          "\n"))

(setq my-show-nice-faces nil)

(defun my-try-macro (macro)
  (let ((strings
         ;; (message ">> count = %d, macro = %S" count macro)
         (mapcar (lambda (name)
                   (delete-region (point-min) (point-max))
                   (letf (((symbol-function 'flyspell-word-search-forward)
                           (intern (concat "my-test-forward-" (symbol-name name))))
                          ((symbol-function 'flyspell-word-search-backward)
                           (intern (concat "my-test-backward-" (symbol-name name)))))
                     ;; (message ">> pre %S %d" name count)
                     (execute-kbd-macro macro)
                     ;; (message ">> post %S %d" name count)
                     )
                   (list (buffer-string)
                         (mapcar (lambda (pos) (get-char-property pos 'face))
                                 (number-sequence (point-min) (point-max)))))
                 '(orig new))))
    (let ((bad (my-compare-strings-with-properties (car strings) (cadr strings))))
      (if (and bad my-show-nice-faces)
          (with-current-buffer "*Messages*"
            (insert (my-make-strings-with-faces (car strings) (cadr strings)))))
      bad)))

;; It may not reduce to the minimum in one run. It fails at reductions
;; if 2 or more chars should be removed at the same time.
(defun my-reduce (macro)
  (let ((bad (my-try-macro macro))
        (fails 0)
        newmacro)
    (if bad
        (while (< fails 100)
          (let ((pos (random (length macro))))
            (setq newmacro (concat (substring macro 0 pos) (substring macro (1+ pos))))
            ;; (message ">> %S" macro)
            ;; (message ">> %S" newmacro)
            (if (my-try-macro newmacro)
                (progn
                  (setq fails 0)
                  (setq macro newmacro))
              (setq fails (1+ fails)))))
      (message ":>> We reduce only faulty macros"))
    macro))

(defun my-try-mixed-pairs (macro)
  (unwind-protect
      (if (my-try-macro macro)
          (progn
            (my-reset-new)
            (defun my-test-backward-new (word bound &optional ignore-case)
              (my-test-backward-orig word bound ignore-case))
            (if (my-try-macro macro)
                (message ":>> Difference is from -forward function"))
            (my-reset-new)
            (defun my-test-forward-new (word bound)
              (my-test-forward-orig word bound))
            (if (my-try-macro macro)
                (message ":>> Difference is from -backward function")))
        (message ":>> We mix pairs only for faulty macros"))
    (my-reset-new)))

(defun my-fuzz ()
  (interactive)
  (unless (string= (ispell-get-otherchars) "[']")
    (error "Unexpected not-casechars value"))
  (buffer-disable-undo)
  (unwind-protect
      (let ((more t)
            (count 0)
            (update-step 100)
            (time (current-time)))
        (while (and more (< count (if my-macro 1 1000)))
          (let* ((macro (or my-macro (my-make-test-macro)))
                 (bad (let ((my-show-nice-faces t))
                        (my-try-macro macro))))
            (setq more (not bad))
            (unless more
              (if (numberp bad)
                  (message ":>> pos :%s^" (make-string bad ? )))
              (message ":>> Bad at %S running %S" bad macro)
              (my-try-mixed-pairs macro)
              (setq my-macro-last (my-reduce macro))
              (message ":>> Reduced macro: %S" my-macro-last))
            (if (= 0 (% count update-step))
                (message ":>> In progress, count = %d (shows between every %d)"
                         count update-step)))
          (setq count (1+ count)))
        (message ":>> Fuzzing: %d macros are finished in %S"
                 count
                 (subtract-time (current-time) time))
        (message ":>> %s" (if more "Without differences" "There are differences")))
    (buffer-enable-undo))
  nil)
(global-set-key (kbd "C-j") 'my-fuzz)

;; use -orig with prefix arg,
;; use -new without prefix arg
(defun my-choose-flyspell-funcs (arg)
  (interactive "P")
  (if arg
      (progn
        (defun flyspell-word-search-backward (word bound &optional ignore-case)
          (my-test-backward-orig word bound ignore-case))
        (defun flyspell-word-search-forward (word bound)
          (my-test-forward-orig word bound))
        (message ">> Using orig"))
    (defun flyspell-word-search-backward (word bound &optional ignore-case)
      (my-test-backward-new word bound ignore-case))
    (defun flyspell-word-search-forward (word bound)
      (my-test-forward-new word bound))
    (message ">> Using new")))
(global-set-key (kbd "C-o") 'my-choose-flyspell-funcs)

(setq my-macro-last nil)
(setq my-show-faces nil)
(defun my-show-faces-func ()
  (interactive)
  (let ((macro (or my-macro my-macro-last)))
    (if macro
        (let ((my-show-faces t))
          (my-try-macro macro))
      (error "No macro specified"))))
(global-set-key (kbd "M-j") 'my-show-faces-func)

(split-window-right)
(other-window 1)
(view-echo-area-messages)
(other-window 1)

;; For manual debug
;; (defun flyspell-word-search-backward (word bound &optional ignore-case)
;;   (my-test-backward-agustin-fixed word bound ignore-case))
;; (defun flyspell-word-search-forward (word bound)
;;   (my-test-forward-agustin-fixed word bound))

;; Change this to use other functions instead of -agustin-fixed
(defun my-reset-new ()
  (defun my-test-backward-new (word bound &optional ignore-case)
    ;; (my-test-backward-agustin-fixed word bound ignore-case))
    (my-test-backward-orig word bound ignore-case))
  (defun my-test-forward-new (word bound)
    ;; (my-test-forward-agustin-fixed word bound)))
    ;; (my-test-forward-agustin word bound)))
    (my-test-forward-eob word bound)))
(my-reset-new)

;; Define non-nil to run only one test with this macro not randomly
(setq my-macro nil)
;; (setq my-macro (kbd "nd SPC and SPC nd C-a"))
;; (setq my-macro (kbd "nd SPC nd C-a"))
;; (setq my-macro (kbd "nd SPC and C-a"))
;; (setq my-macro (kbd "n SPC n C-a"))
;; (setq my-macro (kbd "nd C-e"))
;; (setq my-macro "'nd end'nd'nd\C-and\C-a")
;; (setq my-macro "'n en'n'n\C-an\C-a")
(setq my-macro
      ;; "'nd end'nd'nd\C-and\C-a"
      ;; "'n en'n'n\C-an\C-a"
      ;; "n'n en'n'n\C-a"
      "a n'n en'n'n\C-a"
      ;; "n'n n'n'n\C-a"
      ;; "n'n n'n'n\C-b\C-b\C-b\C-b\C-b\C-b\C-b\C-b"
      ;; "d'nd\C-andmet \C-emet ndmet\C-a"
      ;; "d'n\C-and \C-ed nd\C-a"
      ;; "nd d'nd nd\C-a"
      ;; "nd d'nd nd met\C-a"
      )

--liOOAslEiF7prFVr--

From unknown Sat Aug 16 14:25:58 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Resent-From: Aleksey Cherepanov
Original-Sender: "Debbugs-submit"
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Sat, 01 Mar 2014 21:45:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 16800
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords:
To: Eli Zaretskii
Cc: 16800@debbugs.gnu.org, Agustin Martin
Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.139371027429810 (code B ref 16800); Sat, 01 Mar 2014 21:45:02 +0000
Received: (at 16800) by debbugs.gnu.org; 1 Mar 2014 21:44:34 +0000
Received: from localhost ([127.0.0.1]:47040 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJrhy-0007kj-9z for submit@debbugs.gnu.org; Sat, 01 Mar 2014 16:44:34 -0500
Received: from mail-lb0-f178.google.com ([209.85.217.178]:44546) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJrhv-0007kV-HZ for 16800@debbugs.gnu.org; Sat, 01 Mar 2014 16:44:32
-0500 Received: by mail-lb0-f178.google.com with SMTP id s7so3612691lbd.23 for <16800@debbugs.gnu.org>; Sat, 01 Mar 2014 13:44:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=Mu27ARoqz6yzItRg76Vvj6hUXrkPuWosCCQrsSYarbQ=; b=ogXM8W24seZ6jczwASLlpVUtVy45ujYIgmB7JxN/rWxi5VGlY9sIqIHyh1DqIFeoPm Wv46EoivMjdrERFcAddu5PW+g9LMF3JhBSIyX1GrR2k3XywWhUyoeQITeAOmND+S1zmb WGI2/hRpBfLamsjmtqzbKaxa9vomGO3pIBdYnjeQZRXPfnADu040Hkw5/yT+mWwor5Nd 2T3LdObc+BIg+pBcQtyryWJSzXb5X5SGI1Xe3ZTVcqagUAPvIzof8IBiAzWYoWfXQVnE 8Iyh6yJorRccukykeonaHp6ipuT2KJipZToIhyTDbgb+OQU5GCheGyeRjTbzOVlANydH kYqA== X-Received: by 10.152.43.103 with SMTP id v7mr157115lal.46.1393710270255; Sat, 01 Mar 2014 13:44:30 -0800 (PST) Received: from openwall.com ([188.123.230.115]) by mx.google.com with ESMTPSA id pz10sm9538012lbb.10.2014.03.01.13.44.29 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 01 Mar 2014 13:44:29 -0800 (PST) Date: Sun, 2 Mar 2014 01:44:27 +0400 From: Aleksey Cherepanov Message-ID: <20140301214427.GA14106@openwall.com> References: <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <83fvn3wj3m.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <83fvn3wj3m.fsf@gnu.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Fri, Feb 28, 2014 at 01:51:41PM +0200, 
Eli Zaretskii wrote:
> Thanks to both of you, but I still think that having flyspell search
> without limits for duplicate mis-spellings is not a good idea. We
> have no control on how big user buffers could be.
>
> So I think we should limit that search by default.

Maybe. We could try to limit execution time, though the checks for that
could themselves slow down the search.

Thanks!

-- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 02 Mar 2014 03:58:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Aleksey Cherepanov Cc: 16800@debbugs.gnu.org, agustin.martin@hispalinux.es Reply-To: Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13937326353131 (code B ref 16800); Sun, 02 Mar 2014 03:58:01 +0000 Received: (at 16800) by debbugs.gnu.org; 2 Mar 2014 03:57:15 +0000 Received: from localhost ([127.0.0.1]:47250 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJxWb-0000oO-P6 for submit@debbugs.gnu.org; Sat, 01 Mar 2014 22:57:14 -0500 Received: from mtaout26.012.net.il ([80.179.55.182]:56320) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WJxWW-0000oB-GE for 16800@debbugs.gnu.org; Sat, 01 Mar 2014 22:57:09 -0500 Received: from conversion-daemon.mtaout26.012.net.il by mtaout26.012.net.il (HyperSendmail v2007.08) id <0N1S00M00GHI5200@mtaout26.012.net.il> for 16800@debbugs.gnu.org; Sun, 02 Mar 2014 05:55:03 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout26.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N1S003V8IVQ3510@mtaout26.012.net.il>; Sun, 02 Mar 2014 05:55:03 +0200 (IST) Date: Sun, 02 Mar 2014 05:56:48 +0200 From: Eli Zaretskii In-reply-to:
<20140301214427.GA14106@openwall.com> X-012-Sender: halo1@inter.net.il Message-id: <83ob1ptfr3.fsf@gnu.org> References: <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <83fvn3wj3m.fsf@gnu.org> <20140301214427.GA14106@openwall.com> X-Spam-Score: 3.7 (+++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: > Date: Sun, 2 Mar 2014 01:44:27 +0400 > From: Aleksey Cherepanov > Cc: Agustin Martin , 16800@debbugs.gnu.org > > We could try to limit execution time. Though checks for that could > slow down the search. [...] Content analysis details: (3.7 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 2.7 RCVD_IN_PSBL RBL: Received via a relay in PSBL [80.179.55.182 listed in psbl.surriel.com] 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit"

> Date: Sun, 2 Mar 2014 01:44:27 +0400
> From: Aleksey Cherepanov
> Cc: Agustin Martin , 16800@debbugs.gnu.org
>
> We could try to limit execution time. Though checks for that could
> slow down the search.

Exactly.

From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Agustin Martin Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 09 Mar 2014 17:26:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Aleksey Cherepanov , 16800@debbugs.gnu.org Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.1394385956566 (code B ref 16800); Sun, 09 Mar 2014 17:26:03 +0000 Received: (at 16800) by debbugs.gnu.org; 9 Mar 2014 17:25:56 +0000 Received: from localhost ([127.0.0.1]:58317 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMhU3-000093-65 for submit@debbugs.gnu.org; Sun, 09 Mar 2014 13:25:56 -0400 Received: from mail-la0-f49.google.com ([209.85.215.49]:53561) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMhTv-00008h-RK for 16800@debbugs.gnu.org; Sun, 09 Mar 2014 13:25:49 -0400 Received: by mail-la0-f49.google.com with SMTP id mc6so4036956lab.36 for <16800@debbugs.gnu.org>; Sun, 09 Mar 2014 10:25:46 -0700 (PDT) DKIM-Signature: v=1;
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=Y8jpy6RxV2capscZbmxEoVPijP00mii1rt4ySL05ORs=; b=NqVHd2RTN58RTte9zDSUNGxre1zPCh5D4kC3iTY4dBBtBa44pd7MV2vw90K7slmND/ ZLzIMZWS++WyiM3xAHPD2mt/FbSI1wCfWNwnne9b5xylqmQ1iOpzWYR7G+1p3OgcSvXS lTlKdWndhO704rwNtRry3C7dqPr6ZgFsv0+ZEjD25DXKE2xgNM0RfEfy3rXD5WnnWR0m MCib/DnN1Q1jpZh7OCRF0EzXlASRNTm0OJU2v6ZCoiGfPHvfJYmozVEiS7Swb4diMlqB tyXJxmIsSdPh52sfZnFkbpbpleTrWejGTaQEt+AMjkEZ4HDLq/mXKSNzuHCeaDFnC4gv ZXGQ== MIME-Version: 1.0 X-Received: by 10.112.142.161 with SMTP id rx1mr18941781lbb.33.1394385946609; Sun, 09 Mar 2014 10:25:46 -0700 (PDT) Received: by 10.112.201.165 with HTTP; Sun, 9 Mar 2014 10:25:46 -0700 (PDT) In-Reply-To: <20140228231141.GA20782@openwall.com> References: <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <20140228231141.GA20782@openwall.com> Date: Sun, 9 Mar 2014 18:25:46 +0100 Message-ID: From: Agustin Martin Content-Type: multipart/alternative; boundary=089e0115fb16070dd304f42fc723 X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --089e0115fb16070dd304f42fc723 Content-Type: text/plain; charset=ISO-8859-1 2014-03-01 0:11 GMT+01:00 Aleksey Cherepanov : > On Fri, Feb 28, 2014 at 12:45:45PM +0100, Agustin Martin wrote: > > > > Please find attached my first candidate for commit. Is similar to what I > > sent before, but needed to add an explicit check for word at eob in > > `flyspell-word-search-forward'. 
> >
> > Will try to have more testing before committing. Seems to work well with the
> > file generated by your one-liner, even with corner cases like new
> > misspellings added at bob or eob, but the wider the testing the better.
>
> I've wrote a small fuzzer. It is in attach. To run it:
> $ LANG=C emacs -Q --eval '(load-file "t2.el")'
> Then C-j to start. It modifies buffer you are in.
>
> Your -forward function gets stuck. (kbd "nd SPC and C-a") could repeat
> it. my-test-forward-agustin-fixed contains fix. It incorporates
> simplified word-end logic: we slip forward using flyspell-get-word,
> then we check eobp. Though I did not understand why -backward does not
> need a similar fix and I got the answer: my mistake with (length word)
> did not allow one word to be marked as duplicate.
>

Hi,

Sorry for the delay, I am rather busy lately.

The problem is that the final condition can cause an endless loop. I'd
prefer to use a flag to make sure that test is passed at most once, see
below.

> (if condition nil ...) could be replaced with (unless condition ...)
> but I do not know what one is more readable.
>

While there was `nil' in the version I attached, the actual code in my
personal debug version was not exactly `nil' but something like

(progn (message "word %s is at eob" word) nil)

Besides being more readable, that is why the structure was left there. I
have to think about this, but with the full word-re test we may even get
rid of that test and of the whole flyspell-get-word call. I need some time
to test all that. If still needed, (unless ...) may be used in the final
version.

> (kbd "nd SPC and SPC nd C-b") fails to highlight the second "nd" as
> duplicate. It is a problem with bound equal to (length word) in
> -backward function. I did not check it when I wrote it.
> > +                      (search-forward word (length word) t))))
>                        (search-forward word (1+ (length word)) t))))
>
> One "nd" is colored as duplicate due to -backward function after that
> fix. I did not touch it yet because it is a time for a break for me.
>

Fixed in my copy. Later I noticed that the backward function has some other
glitches, like flyspell-get-word looking at the wrong word. Fixed in my
copy, but I will look at it later.

> > Hope no one will generate files with words containing something in
> > OTHERCHARS.
>
> Why?
>

Sorry for not being clear, I was thinking about a really rare corner case
that the current code may not handle well.

Write 'nd-et' some thousand times and then write 'et' in a language where
'nd' and 'et' are misspellings and '-' is part of otherchars. I'd expect to
have a lot of false positives in the search.

> BTW quite interesting flyspell behaviour could be observed with
> "met'met'and": if you jump back and forth over this word then met'met
> is highlighted when you are at the beginning and met'and is
> highlighted when you are at the end.
>
> Also "met'met'and met'and" highlights both met'and as mis-spelled (the
> second met'and is not marked as duplicate).
>

Funny :-)

I do not expect to have time to check everything shortly. Also, note that
Emacs 24.4 is on the way, so trunk is frozen for anything potentially
problematic. Unless I consider the code absolutely rock solid, I'd prefer
to wait until 24.4 is released.

Sorry I could not yet play with your fuzzer, will try next week. Thanks for
all the work you are putting in here.

This is what I am using to break the loop in the forward function (see
`keep'). The min limit in the last condition should also be adjusted to
bound if not unlimited.

;; --------------------------- 8< ----------------------------------------------------------
(defun flyspell-word-search-forward (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (word-end (nth 2 (flyspell-get-word)))
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           (keep t) ;; Control flag to exit loop below after checking word at eob.
           p)
      (while
          (and (not r)
               keep
               (setq p (if (= word-end (point-max))
                           nil ;; Current word is at e-o-b. No forward search
                         (if (re-search-forward word-re bound t)
                             ;; word-re match ends one char after word
                             (progn (backward-char) (point))
                           ;; Check above does not match similar word at e-o-b
                           (goto-char (point-max))
                           (setq keep nil) ;; Ensure this is last iteration
                           (search-backward word (- (point-max)
                                                    (length word)) t)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw) (string-equal (car lw) word))
              (setq r p)
            (goto-char (1+ p)))))
      r)))
;; --------------------------- 8< ----------------------------------------------------------

Regards,

--
Agustin

--089e0115fb16070dd304f42fc723
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--089e0115fb16070dd304f42fc723-- From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Agustin Martin Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 09 Mar 2014 17:37:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Aleksey Cherepanov , 16800@debbugs.gnu.org Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13943866192060 (code B ref 16800); Sun, 09 Mar 2014 17:37:02 +0000 Received: (at 16800) by debbugs.gnu.org; 9 Mar 2014 17:36:59 +0000 Received: from localhost ([127.0.0.1]:58323 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMhek-0000X6-LJ for submit@debbugs.gnu.org; Sun, 09 Mar 2014 13:36:58 -0400 Received: from mail-lb0-f175.google.com ([209.85.217.175]:58132) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMhei-0000Ww-VP for 16800@debbugs.gnu.org; Sun, 09 Mar 2014 13:36:57 -0400 Received: by mail-lb0-f175.google.com with SMTP id w7so4063564lbi.34 for <16800@debbugs.gnu.org>; Sun, 09 Mar 2014 10:36:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=V3jzbsIgtb7fmzpOyxG8KcbWgDRo51+OY8qKL8XHhU4=; b=AMUniXwXYAzaiSXA1vS9mGFF1wkNQQrHRH2yojvPWIY4okfVKCoF1fRYlKiKuwwKCI E4NUXi1T2F3yssq415iM7MvXlXzk/WANxs6CktPRgg66azHRfnUdxqJOpdajdeInLVER doy60u/GglNLr4PdssbgiJA20Hh3fbASpBCQ1Mfc6GsKyuM36oaAQTupoGKX3jMWOxuQ 6FFYGZiJs+8dYVMmrGK03JdLYTxgEwC4zFUA4W0CktZJPZQOFpZSpNQmu45lx9ZJkSkk pMyqQmwdDczEnzptFhDV1LfmhL5RGLYKe6Tj1yqYZs4EB8mJMy/Nvh2yecU21YgPr2kP QLqQ== MIME-Version: 1.0 X-Received: by 10.112.22.196 with SMTP id g4mr9385lbf.47.1394386615761; Sun, 09 Mar 2014 10:36:55 -0700 (PDT) Received: by 10.112.201.165 with HTTP; Sun, 9 Mar 2014 
10:36:55 -0700 (PDT) In-Reply-To: <83ob1ptfr3.fsf@gnu.org> References: <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <83fvn3wj3m.fsf@gnu.org> <20140301214427.GA14106@openwall.com> <83ob1ptfr3.fsf@gnu.org> Date: Sun, 9 Mar 2014 18:36:55 +0100 Message-ID: From: Agustin Martin Content-Type: multipart/alternative; boundary=14dae9473ba3e97f5e04f42feef9 X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --14dae9473ba3e97f5e04f42feef9 Content-Type: text/plain; charset=ISO-8859-1

2014-03-02 4:56 GMT+01:00 Eli Zaretskii :

> > Date: Sun, 2 Mar 2014 01:44:27 +0400
> > From: Aleksey Cherepanov
> > Cc: Agustin Martin , 16800@debbugs.gnu.org
> >
> > We could try to limit execution time. Though checks for that could
> > slow down the search.
>
> Exactly.
>

I see another problem with that: it is not deterministic, since the limit
depends on system load.

I have mixed feelings about changing the current default from unlimited,
but I am slowly changing my mind towards having a big but not unlimited
value as the default.

On the one hand, not putting limits in the default value looks nicer, but
on the other hand that may have a non-negligible impact on performance for
really huge files, as Eli points out.

Aleksey's one-liner is 30000 lines and 2.4e6 chars in size. While the new
code seems to work for it, I'd put the limit somewhere lower, no more than
1e6. This should be huge enough for any practical use and for anyone to
notice the difference.
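
A minimal sketch of how such a cap could be set from an init file, assuming
the limit under discussion is Flyspell's user option
`flyspell-duplicate-distance' (a distance in characters, where -1 means no
limit and 0 disables the duplicate search). The value 1000000 is only the
upper figure suggested above, not a tested recommendation:

```elisp
;; Assumption: `flyspell-duplicate-distance' is the option whose default
;; is being discussed.  -1 would mean "search the whole buffer".
(setq flyspell-duplicate-distance 1000000)
```

With a value like this, a misspelling is compared only against duplicates
within that many characters of it, so the cost no longer grows with the
total buffer size.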
Regards, -- Agustin --14dae9473ba3e97f5e04f42feef9 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
--14dae9473ba3e97f5e04f42feef9-- From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Aleksey Cherepanov Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 09 Mar 2014 18:03:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13943881724989 (code B ref 16800); Sun, 09 Mar 2014 18:03:01 +0000 Received: (at 16800) by debbugs.gnu.org; 9 Mar 2014 18:02:52 +0000 Received: from localhost ([127.0.0.1]:58334 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMi3n-0001IO-Os for submit@debbugs.gnu.org; Sun, 09 Mar 2014 14:02:52 -0400 Received: from mail-la0-f45.google.com ([209.85.215.45]:54977) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMi3l-0001IF-5w for 16800@debbugs.gnu.org; Sun, 09 Mar 2014 14:02:49 -0400 Received: by mail-la0-f45.google.com with SMTP id hr17so4122931lab.32 for <16800@debbugs.gnu.org>; Sun, 09 Mar 2014 11:02:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=Cpp5ab4C4kxsj9TvE10j7h5a9OzlMP4mHY7jUK9/Mc0=; b=dilHaO6FV9trptOIUDNW+UiXDgqLpia1oliiHznGFLYvCW0Gyvj3914Q3lpMaPbAly xpGJD2u2x1FhRPI9IcMDcbdH7kVb5QdTbmAH8oRLlNbhB+KK2elSV+yOavtGQv39bXk1 V62KikKxZLOCvCt5+hNeh4kqbILe8ohJTWvOG9eo1geTgyoLgscSD0HwHuKU6eZN+DxA /h1IYrJ3s5QsrQ3PXzvWtRPNdJTJbE0xWA8Pm3dzxwlOQj5EoMSfB9KlO2upqqQcK+pC JLyjsjGUkGseYV+OBOFGiYtTT8xRQyksuxIBmvYGBEZHktuWWAajLrGY4pnayKiRQxjL PYMg== X-Received: by 10.112.142.40 with SMTP id rt8mr9236lbb.52.1394388167812; Sun, 09 Mar 2014 11:02:47 -0700 (PDT) Received: from openwall.com 
([188.123.230.115]) by mx.google.com with ESMTPSA id r5sm13807216lbb.7.2014.03.09.11.02.46 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sun, 09 Mar 2014 11:02:46 -0700 (PDT) Date: Sun, 9 Mar 2014 22:02:44 +0400 From: Aleksey Cherepanov Message-ID: <20140309180244.GA24331@openwall.com> References: <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <83fvn3wj3m.fsf@gnu.org> <20140301214427.GA14106@openwall.com> <83ob1ptfr3.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

On Sun, Mar 09, 2014 at 06:36:55PM +0100, Agustin Martin wrote:
> 2014-03-02 4:56 GMT+01:00 Eli Zaretskii :
>
> > > Date: Sun, 2 Mar 2014 01:44:27 +0400
> > > From: Aleksey Cherepanov
> > > Cc: Agustin Martin , 16800@debbugs.gnu.org
> > >
> > > We could try to limit execution time. Though checks for that could
> > > slow down the search.
> >
> > Exactly.
> >
>
> I see other problem with that, it is not deterministic, since the limit
> depends on system load.

We could limit by time and add a lower bound on the search radius, so that
the search always covers at least N chars and, if it looks further, runs
for no more than M milliseconds.

Thanks!

-- Regards, Aleksey Cherepanov From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 09 Mar 2014 18:25:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Aleksey Cherepanov Cc: 16800@debbugs.gnu.org, agustimartin@gmail.com Reply-To: Eli Zaretskii Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.13943894957480 (code B ref 16800); Sun, 09 Mar 2014 18:25:01 +0000 Received: (at 16800) by debbugs.gnu.org; 9 Mar 2014 18:24:55 +0000 Received: from localhost ([127.0.0.1]:58344 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMiP8-0001wZ-V6 for submit@debbugs.gnu.org; Sun, 09 Mar 2014 14:24:55 -0400 Received: from mtaout22.012.net.il ([80.179.55.172]:55937) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WMiP6-0001wQ-5t for 16800@debbugs.gnu.org; Sun, 09 Mar 2014 14:24:53 -0400 Received: from conversion-daemon.a-mtaout22.012.net.il by a-mtaout22.012.net.il (HyperSendmail v2007.08) id <0N2600900LMHYB00@a-mtaout22.012.net.il> for 16800@debbugs.gnu.org; Sun, 09 Mar 2014 20:24:50 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout22.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N26009NILTEWO20@a-mtaout22.012.net.il>; Sun, 09 Mar 2014 20:24:50 +0200 (IST) Date: Sun, 09 Mar 2014 20:24:33 +0200 From: Eli Zaretskii In-reply-to: <20140309180244.GA24331@openwall.com> X-012-Sender: halo1@inter.net.il Message-id: <8361nnp6vy.fsf@gnu.org> References: <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <83fvn3wj3m.fsf@gnu.org> 
<20140301214427.GA14106@openwall.com> <83ob1ptfr3.fsf@gnu.org> <20140309180244.GA24331@openwall.com> X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > Date: Sun, 9 Mar 2014 22:02:44 +0400 > From: Aleksey Cherepanov > Cc: 16800@debbugs.gnu.org > > On Sun, Mar 09, 2014 at 06:36:55PM +0100, Agustin Martin wrote: > > 2014-03-02 4:56 GMT+01:00 Eli Zaretskii : > > > > > > Date: Sun, 2 Mar 2014 01:44:27 +0400 > > > > From: Aleksey Cherepanov > > > > Cc: Agustin Martin , 16800@debbugs.gnu.org > > > > > > > > We could try to limit execution time. Though checks for that could > > > > slow down the search. > > > > > > Exactly. > > > > > > > I see other problem with that, it is not deterministic, since the limit > > depends on system load. > > We may have limit by time with additional limit for lowest radius so > the search works not less than for N chars and if it looks further > then not more than M milliseconds. FWIW, I think this would be over-engineering. Emacs never does anything like that, we always have simple, deterministic limits in terms of characters or lines. This makes it easy for users to customize their sessions in a way whose effect is simple to understand and predictable. 
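
A sketch of the kind of deterministic, character-based limit described
above. The helper name `my-duplicate-search-bounds' is hypothetical and is
not part of flyspell.el; it only shows how one distance in characters turns
into fixed search bounds around point, independent of timing or system
load. The convention that a negative distance means "no limit" is an
assumption borrowed from this thread:

```elisp
(defun my-duplicate-search-bounds (distance)
  "Return (BACKWARD . FORWARD) search bounds around point.
DISTANCE is a number of characters.  A negative DISTANCE means no
limit, in which case the edges of the accessible buffer are used."
  (if (< distance 0)
      (cons (point-min) (point-max))
    ;; Clamp to the buffer so the bounds are always valid positions.
    (cons (max (point-min) (- (point) distance))
          (min (point-max) (+ (point) distance)))))
```

The cdr of the result could then be passed as the BOUND argument of a
forward search and the car as the bound of a backward search, so the same
buffer and the same setting always give the same search range.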
From unknown Sat Aug 16 14:25:58 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.503 (Entity 5.503) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Aleksey Cherepanov Subject: bug#16800: closed (Re: bug#16800: 24.3; flyspell works slow on very short words at the end of big file) Message-ID: References: <85zjlo5ecy.fsf@gmail.com> X-Gnu-PR-Message: they-closed 16800 X-Gnu-PR-Package: emacs Reply-To: 16800@debbugs.gnu.org Date: Fri, 06 Mar 2015 21:48:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1425678482-11731-1" This is a multi-part message in MIME format... ------------=_1425678482-11731-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #16800: 24.3; flyspell works slow on very short words at the end of big file which was filed against the emacs package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 16800@debbugs.gnu.org. 
--=20 16800: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D16800 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1425678482-11731-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 16800-done) by debbugs.gnu.org; 6 Mar 2015 21:47:08 +0000 Received: from localhost ([127.0.0.1]:37898 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YU05L-00031z-Db for submit@debbugs.gnu.org; Fri, 06 Mar 2015 16:47:07 -0500 Received: from mail-lb0-f179.google.com ([209.85.217.179]:43304) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YU05J-00031U-9q for 16800-done@debbugs.gnu.org; Fri, 06 Mar 2015 16:47:05 -0500 Received: by lbvp9 with SMTP id p9so25230588lbv.10 for <16800-done@debbugs.gnu.org>; Fri, 06 Mar 2015 13:46:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:content-type; bh=tQk6wzshp5UhdsUSjtFZMD73n5j0L/Y/bRp+KxveAik=; b=0f+v4E5vYYf/9xM5/ufDs12kmpjTr62qrLrkUQwmLBPnkgS84dHOkw8I9SFJR12b7z jWD1uVxFxXNutzwToCYi/ldjYG9jqS6vh2Kjb3zVfuiklNdY6gzlg+ByTNidCaD7bDm1 NWAAQU1fGkAVPAqQKti1eQc6qsqwZaxk79Gd2nXEcx7OBLtzQqsX0m6mM9Txh3I60u6A F8KW+YPB1qS4XLsyma3ljP3FGNAy805/IvarYhEykV7vSeOFA/P1ngMpdspxZatgurm4 /2G9RT70cizzlQHGfi3aamm6i4pXd5iUqA1LIpZGh8cX2j4vErwtPML3E/ar2tbyI+tY fsFA== MIME-Version: 1.0 X-Received: by 10.112.134.167 with SMTP id pl7mr14563657lbb.63.1425678419378; Fri, 06 Mar 2015 13:46:59 -0800 (PST) Received: by 10.112.15.81 with HTTP; Fri, 6 Mar 2015 13:46:59 -0800 (PST) In-Reply-To: References: <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> 
<20140228231141.GA20782@openwall.com> Date: Fri, 6 Mar 2015 22:46:59 +0100 X-Google-Sender-Auth: 0gsDJPStJfgh5unQlxuh73XXZFg Message-ID: Subject: Re: bug#16800: 24.3; flyspell works slow on very short words at the end of big file From: Agustin Martin To: Aleksey Cherepanov , 16800-done@debbugs.gnu.org Content-Type: multipart/alternative; boundary=089e011767f9c0498b0510a59f44 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 16800-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --089e011767f9c0498b0510a59f44 Content-Type: text/plain; charset=UTF-8 Version: 24.5 2014-03-09 18:25 GMT+01:00 Agustin Martin : > > I do not expect to have time to check everything shortly. Also, note that > Emacs 24.4 is on the way, so trunk is frozen for anything potentially > problematic. Unless I consider the code absolutely rock solid, I'd prefer > to wait until 24.4 is released. > Hi, emacs24 was released, fix committed to emacs-24 branch and merged into trunk. It is time to close this bug report. Thanks for your contribution to the project. Best regards, -- Agustin --089e011767f9c0498b0510a59f44 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--089e011767f9c0498b0510a59f44-- ------------=_1425678482-11731-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 18 Feb 2014 20:58:34 +0000 Received: from localhost ([127.0.0.1]:58979 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFrkP-0007mP-BK for submit@debbugs.gnu.org; Tue, 18 Feb 2014 15:58:34 -0500 Received: from eggs.gnu.org ([208.118.235.92]:45516) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFrjU-0007kO-ID for submit@debbugs.gnu.org; Tue, 18 Feb 2014 15:57:37 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WFrjJ-0008ST-CI for submit@debbugs.gnu.org; Tue, 18 Feb 2014 15:57:31 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:52014) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WFrjJ-0008SP-9W for submit@debbugs.gnu.org; Tue, 18 Feb 2014 15:57:25 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47205) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WFrjD-0000TE-8q for bug-gnu-emacs@gnu.org; Tue, 18 Feb 2014 15:57:25 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WFrj7-0008QJ-Ct for bug-gnu-emacs@gnu.org; Tue, 18 Feb 2014 15:57:19 -0500 Received: from mail-lb0-x22f.google.com ([2a00:1450:4010:c04::22f]:55379) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WFrj7-0008Q1-5A for bug-gnu-emacs@gnu.org; Tue, 18 Feb 2014 15:57:13 -0500 Received: by mail-lb0-f175.google.com with SMTP id p9so12512074lbv.6 for ; Tue, 18 Feb 2014 12:57:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; 
h=from:to:subject:date:message-id:mime-version:content-type; bh=8zuBCm6c4EwDny8RU7eDAh1FMMINcJHudOVjrroOOTU=; b=BWEH9qsdEP/v8TkyBuRBKbxERRC6btm40cxYhy98Kh/srOPNgch1jtgxQGAMYpad+t VHfWdVjyBE4uOC75Xx6+jK/EWjbWfecbz8nSr8iX/yog9JEe2EoR3Wc8oouFMo/7YFQ6 hlZgYcX9p1QAlKFb+03ybDUGrN99OmYeTIMHSgXJC1aLvHk3br3mTd0XcnyXIFhwNaok 4Va6a2uD+XK/tZj1Eb9I6vEqYdWaA9b2sfH3ID0FD1srHlCd62ywTKibg20MEimGGHli yWFT8lrXTck1qFzPA5JuUZHhiNMjf8oFwqGq9NFN/N118ZVgp4pkBvkmyWjbkXOeDRzJ ZjrA== X-Received: by 10.152.234.3 with SMTP id ua3mr142928lac.63.1392757031511; Tue, 18 Feb 2014 12:57:11 -0800 (PST) Received: from debian ([188.123.230.115]) by mx.google.com with ESMTPSA id y2sm33703583lal.10.2014.02.18.12.57.10 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Tue, 18 Feb 2014 12:57:10 -0800 (PST) From: Aleksey Cherepanov To: bug-gnu-emacs@gnu.org Subject: 24.3; flyspell works slow on very short words at the end of big file Date: Wed, 19 Feb 2014 00:56:45 +0400 Message-ID: <85zjlo5ecy.fsf@gmail.com> MIME-Version: 1.0 Content-Type: text/plain X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Tue, 18 Feb 2014 15:58:32 -0500 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) Package: emacs Version: 24.3 Severity: normal Dear Maintainers, It is a copy of bug #739412 in Debian. Debian uses bug tracker similar to this one. The bug on web: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=739412 Address to continue thread there: 739412@bugs.debian.org * What led up to the situation? 
I ran into a problem editing my big .org file (2 MB+) with flyspell-mode enabled. I edit it every day, regularly mistype, and end up with words of one or two letters that are wrong in Russian and make flyspell slow.

This one-liner produces a "good" file for reproducing the bug:

  perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt

Typing "nd" at the end of the file causes a huge pause even on a fast computer. But "mw" or "md" do not cause pauses, because they do not occur as substrings in this file. It is reproducible with emacs -Q.

* What exactly did you do (or not do) that was effective (or ineffective)?

The exact sequence is:

  $ emacs --version
  GNU Emacs 24.3.1
  $ emacs23 --version
  GNU Emacs 23.4.1
  $ perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt
  $ LANG=C emacs -Q t.txt

Then in emacs:

  M-x flyspell-mode RET M-> nd SPC

'emacs23 -Q t.txt' behaves the same way.

LANG=C affects regular words, because the default dictionary is Russian on my system, so without LANG=C all words ("met" and "and") are considered misspelled. But it does not affect the huge pause at the end.

* What was the outcome of this action?

A huge pause during which emacs does not react to any keys except C-g. The word "nd" is colored as misspelled after the pause. C-g stops emacs' internal thinking and I can work without waiting, but then the word "nd" is not colored as misspelled.

* What outcome did you expect instead?

I expect it to work as fast as with other words like "md" or "mw", which do not produce a pause and are colored immediately.

I tried to patch the flyspell-word-search-backward and flyspell-word-search-forward functions from flyspell.el, replacing search-backward with word-search-backward and search-forward with word-search-forward (perl -pe 's/\(search-/(word-search-/'). It solved the problem, but I do not know what it broke. I expect problems with this solution because I do not know whether flyspell's notion of a word is the same as emacs' one.
I think it is described in flyspell-get-word function that is called after search-* in the patched functions. flyspell-duplicate-distance variable on its own could mitigate the problem but it changes the behaviour so I do not want to use this variable. Thanks! -- Regards, Aleksey Cherepanov In GNU Emacs 24.3.1 (x86_64-pc-linux-gnu, GTK+ Version 3.8.6) of 2013-12-23 on brahms, modified by Debian Windowing system distributor `The X.Org Foundation', version 11.0.11405000 System Description: Debian GNU/Linux testing (jessie) Configured using: `configure '--build' 'x86_64-linux-gnu' '--build' 'x86_64-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs24:/etc/emacs:/usr/local/share/emacs/24.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.3/site-lisp:/usr/share/emacs/site-lisp' '--with-crt-dir=/usr/lib/x86_64-linux-gnu' '--with-x=yes' '--with-x-toolkit=gtk3' '--with-toolkit-scroll-bars' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall' 'LDFLAGS=-Wl,-z,relro' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'' Important settings: value of $LANG: ru_RU.UTF-8 value of $XMODIFIERS: @im=ibus locale-coding-system: utf-8-unix default enable-multibyte-characters: t ------------=_1425678482-11731-1-- From unknown Sat Aug 16 14:25:58 2025 X-Loop: help-debbugs@gnu.org Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 07 Mar 2015 08:10:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 16800 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Agustin Martin Cc: 16800@debbugs.gnu.org, aleksey.4erepanov@gmail.com, agustin6martin@gmail.com Reply-To: Eli Zaretskii 
Received: via spool by 16800-submit@debbugs.gnu.org id=B16800.142571575215946 (code B ref 16800); Sat, 07 Mar 2015 08:10:01 +0000 Received: (at 16800) by debbugs.gnu.org; 7 Mar 2015 08:09:12 +0000 Received: from localhost ([127.0.0.1]:38083 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YU9nM-000498-41 for submit@debbugs.gnu.org; Sat, 07 Mar 2015 03:09:12 -0500 Received: from mtaout27.012.net.il ([80.179.55.183]:45339) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YU9nI-00048t-TN for 16800@debbugs.gnu.org; Sat, 07 Mar 2015 03:09:10 -0500 Received: from conversion-daemon.mtaout27.012.net.il by mtaout27.012.net.il (HyperSendmail v2007.08) id <0NKU00B000Y65Y00@mtaout27.012.net.il> for 16800@debbugs.gnu.org; Sat, 07 Mar 2015 10:03:37 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout27.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NKU0071P121DD40@mtaout27.012.net.il>; Sat, 07 Mar 2015 10:03:37 +0200 (IST) Date: Sat, 07 Mar 2015 10:09:07 +0200 From: Eli Zaretskii In-reply-to: X-012-Sender: halo1@inter.net.il Message-id: <83mw3pmdp8.fsf@gnu.org> References: <20140222160217.GA15616@openwall.com> <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com> <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com> <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com> <20140226203202.GA23749@agmartin.aq.upm.es> <20140228114545.GA8669@agmartin.aq.upm.es> <20140228231141.GA20782@openwall.com> X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > Date: Fri, 6 Mar 2015 22:46:59 +0100 > From: Agustin Martin > > emacs24 was released, fix committed to emacs-24 branch and merged into trunk. Thanks. 
But since we are currently pretesting 24.5, please avoid committing to the emacs-24 branch changes that do not fix clear regressions in 24.4.