GNU bug report logs - #22090
Isearch is sluggish and eventually refuses further service with "[Too many words]".


Package: emacs;

Reported by: Alan Mackenzie <acm <at> muc.de>

Date: Fri, 4 Dec 2015 04:26:01 UTC

Severity: normal

Done: Alan Mackenzie <acm <at> muc.de>

Bug is archived. No further changes may be made.



Message #67 received at 22090 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Alan Mackenzie <acm <at> muc.de>
Cc: 22090 <at> debbugs.gnu.org
Subject: Re: bug#22090: Isearch is sluggish and eventually refuses further
 service with "[Too many words]".
Date: Sat, 5 Dec 2015 17:23:53 +0000

2015-12-04 23:00 GMT+00:00 Alan Mackenzie <acm <at> muc.de>:
>> When case-fold-search is on the previous code would simply join these
>> regexps with "\\(\\(a[´`]?\\|[áà𝑎]\\)\\|\\(A[`´]?\\|[ÁÀ]\\)\\)".
>
> Quick question: _why_ do you need to join them?  Given that
> case-fold-search is enabled, couldn't you just use, say, the lower case
> version?

Because there are some characters in each regexp that don't have
lower/upper-case equivalents. For instance, if I use the
"\\(\\(a[´`]?\\|[áà𝑎]\\)" regexp, that's enough to match A or À, but
it's not enough to match a variety of other chars (𝔸𝕬𝖠𝗔𝘈𝘼𝙰🄰).
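
To make the point concrete, here is a minimal sketch in Python rather than Emacs Lisp. It assumes the Emacs fragments can be transliterated to Python's `re` syntax (`\(`, `\|`, `\)` become `(?:`, `|`, `)`), and `UPPER_A_FULL` is a hypothetical upper-case fragment that also lists the caseless letters; it is not the actual char-fold-table entry.

```python
import re

# Lower-case fragment for "a", transliterated from the Emacs regexp above.
LOWER_A = "(?:a[´`]?|[áà𝑎])"

# Hypothetical upper-case fragment that also lists letters without a
# lower-case counterpart (an assumption made for illustration only).
UPPER_A_FULL = "(?:A[`´]?|[ÁÀ𝔸𝕬𝖠𝗔𝘈𝘼𝙰🄰])"

# Joining both fragments, as the previous code did under case folding.
JOINED = f"(?:{LOWER_A}|{UPPER_A_FULL})"


def matches(pattern: str, text: str) -> bool:
    """Return True if `pattern` matches all of `text`, ignoring case."""
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None


# Case folding covers characters that have an upper/lower counterpart.
folded = [ch for ch in "AÀ" if matches(LOWER_A, ch)]

# These letters have no case mapping, so the lower-case fragment alone
# misses them even when case is ignored.
missed = [ch for ch in "𝔸𝕬𝖠𝗔𝘈𝘼𝙰🄰" if not matches(LOWER_A, ch)]

# The joined fragment matches all of them.
covered = all(matches(JOINED, ch) for ch in "aAàÀ𝔸𝕬𝖠𝗔𝘈𝘼𝙰🄰")
```

Python's case-insensitive matching is only a stand-in for case-fold-search here; the two engines need not agree on every character.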

> it looks to me that this redundancy would
> be quite easy to eliminate - you just need three regexp fragments for
> the letter "a" - a lower case one, an upper case one and a
> case-fold-search one.

Yes, we could go that route. It's just going to add complexity to the
code that generates the char-fold-table (which is already quite dense),
and I wonder if it's worth it for such a corner case. Like I said, 'a'
already matches A and À, so how much do we want to support this extra
case-folding?
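
For reference, the three-fragment idea quoted above could look roughly like the following Python sketch. Everything in it is hypothetical: the `Fragments` type, the `fragment_for` helper, and the contents of each fragment are illustrative assumptions, not the structure of the real char-fold-table.

```python
import re
from typing import NamedTuple


class Fragments(NamedTuple):
    lower: str  # for a lower-case input when case matters
    upper: str  # for an upper-case input when case matters
    fold: str   # for any input when case folding is on


# Letters without a case mapping (illustrative selection).
CASELESS_UPPER = "𝔸𝕬𝖠𝗔𝘈𝘼𝙰🄰"

# Hypothetical entry for "a". The fold fragment lists each cased letter
# once and leaves the other case to the engine's case folding.
A = Fragments(
    lower="(?:a[´`]?|[áà𝑎])",
    upper=f"(?:A[`´]?|[ÁÀ{CASELESS_UPPER}])",
    fold=f"(?:a[´`]?|[áà𝑎{CASELESS_UPPER}])",
)


def fragment_for(entry, typed, case_fold):
    """Pick the fragment and regexp flags for one typed character."""
    if case_fold:
        return entry.fold, re.IGNORECASE
    return (entry.upper if typed.isupper() else entry.lower), 0


def matches(entry, typed, case_fold, text):
    """Return True if the chosen fragment matches all of `text`."""
    pattern, flags = fragment_for(entry, typed, case_fold)
    return re.fullmatch(pattern, text, flags) is not None


# For comparison: the joined form repeats the cased letters in both cases.
joined = f"(?:{A.lower}|{A.upper})"
saving = len(joined) - len(A.fold)
```

The fold fragment is shorter than the joined one because the cased letters appear only once, which is the redundancy the suggestion aims to remove; the cost is a third fragment to generate and store per character.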

> The other thing is that for that single character "a" a 39 character
> regexp fragment is being generated.  Might this have something to do
> with the "[Too many words]" error I got last night (which comes from the
> regexp engine returning a "too long regexp" error)?

Yes.
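
As a rough illustration of how fast the regexp grows, the Python sketch below measures the joined fragment for "a" and multiplies it by the length of the search string. It assumes every typed character expands to a fragment of the same size, which is a simplification, and it assumes no particular limit for the regexp engine, since none is stated in this thread.

```python
# The joined fragment for "a" as the regexp engine sees it: one backslash
# per operator, which is what "\\(" denotes inside a Lisp string.
JOINED_A = r"\(\(a[´`]?\|[áà𝑎]\)\|\(A[`´]?\|[ÁÀ]\)\)"


def expanded_length(search: str, fragment_len: int) -> int:
    """Regexp length if each typed character expands to `fragment_len` chars.

    Assumption: all characters expand to same-sized fragments. Real
    fragments vary by character, so this is only an estimate.
    """
    return len(search) * fragment_len


# Size of the fragment generated for the single character "a".
per_char = len(JOINED_A)

# Total regexp length for search strings of 1, 10 and 100 characters.
growth = [expanded_length("a" * n, per_char) for n in (1, 10, 100)]
```

The growth is linear in the length of the search string, so shrinking the per-character fragment only delays the point at which the regexp becomes too long.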

> Even if you can reduce that to, say 19 characters, that's only winning a
> factor of 2 in the slide towards a too long regexp.  It might well be
> that for a very long regexp, you might have to divide it into shorter
> sections (a typical long RE will be a sequence of sub expressions,
> rather than lots of alternatives inside \(...\|........\)).

I don't understand what you mean. Could you elaborate?




This bug report was last modified 9 years and 171 days ago.


