GNU bug report logs -
#31817
Possible bug in regex search
Previous Next
To reply to this bug, email your comments to 31817 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#31817
; Package
emacs
.
(Wed, 13 Jun 2018 17:54:03 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Michele Pes <mp81ss <at> rambler.ru>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Wed, 13 Jun 2018 17:54:03 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hi,I'm working with regex, and I isolated an unexpected behaviour.If I paste
attached code code in scratch buffer and evaluate it, emacs hangs (cpu stays at
100% until I kill emacs)Im running emacs for windows x64, in particular this
package:
https://sourceforge.net/projects/emacsbinw64/files/release/emacs-w64-25.3-O2-with-modules.7z/download
The variable case-fold-search is set to t.
Can you help me?
Thank you very much,
Michele Pes
[Message part 2 (text/html, inline)]
[bug.el (application/octet-stream, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#31817
; Package
emacs
.
(Thu, 14 Jun 2018 00:38:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 31817 <at> debbugs.gnu.org (full text, mbox):
"Michele Pes" <mp81ss <at> rambler.ru> writes:
> Hi,I'm working with regex, and I isolated an unexpected behaviour.If I paste
> attached code code in scratch buffer and evaluate it, emacs hangs (cpu stays at
> 100% until I kill emacs)
This is because the current regexp engine is implemented with
backtracking, so certain patterns trigger an exponential amount of
searching.
> (let (
> (regex
> (concat "[ \f\t\n\r\v]" "*\\(" "_*[[:alpha:]]+[A-Z0-9a-z_]*"
So here, [[:alpha:]]+ and [A-Z0-9a-z_]* have a lot of overlap, so when
matching against "void", for example, the matcher could match
[[:alpha:]]+ with "v", "vo", "voi", or "void". And for each of these it
goes and matches the rest of the pattern (with additional backtracking)
all of which takes a long time.
It's better to use non-overlapping matches, like
(let ((regex
(concat "[ \f\t\n\r\v]*"
"\\(" "[_[:alpha:]][A-Z0-9a-z_]*" "[ \f\t\n\r\v\\*]*" "\\)+"
"(" "[ \f\t\n\r\v]*" "\\*?" "[ \f\t\n\r\v]*"
"\\(" "[_[:alpha:]][A-Z0-9a-z_]*" "\\)"
"[ \f\t\n\r\v]*" "[()]"))
(x "void Vsdk_Init(void) { /* Open watchdog iwdt driver, watchdog is running now */ iwdt.p_api->open"))
(if (and (string-match "(" x)
(string-match regex x))
(match-string 2 x)
(message "did not match")))
Also note, if you meant the regex to match starting from the beginning
of the string, you should prefix it with "\\`", so that the string-match
won't try other positions.
Further reading:
https://en.wikipedia.org/wiki/ReDoS
https://swtch.com/~rsc/regexp/regexp1.html
Merged 6640 20230 31817.
Request was from
Noam Postavsky <npostavs <at> gmail.com>
to
control <at> debbugs.gnu.org
.
(Sat, 16 Jun 2018 13:56:02 GMT)
Full text and
rfc822 format available.
This bug report was last modified 6 years and 73 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.