GNU bug report logs - #76731
C-style comment regexp example in (info "(elisp)Rx Notation") is not correct

Previous Next

Package: emacs;

Reported by: "Yue Yi" <include_yy <at> qq.com>

Date: Tue, 4 Mar 2025 04:09:02 UTC

Severity: wishlist

Done: Mattias EngdegÄrd <mattias.engdegard <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #21 received at 76731-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias EngdegÄrd <mattias.engdegard <at> gmail.com>
To: Yue Yi <include_yy <at> qq.com>
Cc: Stefan Kangas <stefankangas <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, 76731-done <at> debbugs.gnu.org
Subject: Re: bug#76731: C-style comment regexp example in (info "(elisp)Rx
 Notation") is not correct
Date: Sat, 17 May 2025 12:21:52 +0200
16 maj 2025 kl. 17.12 skrev Yue Yi <include_yy <at> qq.com>:

> I'm not an expert in regular expressions, but it seems that cases like C
> block comments are hard to handle without introducing
> backtracking.

I see no fundamental reason why they should be, as the C comment syntax can be parsed efficiently by a tiny state machine. The first "/*" encountered is always the beginning of the comment on matter what is found later, and the first "*/" after that is always the end. There is never any reason to go back and try a different parse.

Non-DFA regexp engines such as the one in Emacs need some hacks and/or carefully formulated regexps to avoid consuming stack space but that's a different matter. I still think we should be able to do better with either your or my regexps.

I kept your proposed fix instead of switching to a different example. The quoted-string case is simpler but the amount of backslashes detracted from the point of the exercise.

Fix pushed to master. Thank you again!





This bug report was last modified 52 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.