GNU bug report logs - #76731
C-style comment regexp example in (info "(elisp)Rx Notation") is not correct
Reported by: "Yue Yi" <include_yy <at> qq.com>
Date: Tue, 4 Mar 2025 04:09:02 UTC
Severity: wishlist
Done: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Bug is archived. No further changes may be made.
Hello Emacs,

In the Elisp Manual's Rx Notation section, we have

-------------------------------------------------------------------
Here is an ‘rx’ regexp(1) that matches a block comment in the C
programming language:

     (rx "/*"                    ; Initial /*
         (zero-or-more
          (or (not "*")          ; Either non-*,
              (seq "*"           ; or * followed by
                   (not "/"))))  ; non-/
         (one-or-more "*")       ; At least one star,
         "/")                    ; and the final /

or, using shorter synonyms and written more compactly,

     (rx "/*"
         (* (| (not "*")
               (: "*" (not "/"))))
         (+ "*") "/")

In conventional string syntax, it would be written

     "/\\*\\(?:[^*]\\|\\*[^/]\\)*\\*+/"
-------------------------------------------------------------------

Sadly, this regexp is not correct, as demonstrated by this simple
example:

(Try M-x isearch-forward-regexp with /\*\(?:[^*]\|\*[^/]\)*\*+/)

/***/ 123 /* anything else */

As you can see, the search highlights the entire line above, meaning
that the whole line has been matched rather than just the first
comment. In fact, this issue occurs whenever the number of asterisks
in /*(nstar)*/ is odd.

The correct regular expression is:

/\*\(?:[^*]\|\*+[^*/]\)*\*+/

The corresponding rx expression in the original document could be:

(rx "/*"
    (zero-or-more
     (or (not "*")
         (seq (one-or-more "*")
              (not (or "*" "/")))))
    (one-or-more "*")
    "/")

Or:

(rx "/*"
    (* (| (not "*")
          (: (1+ "*") (not (or "*" "/")))))
    (1+ "*") "/")

BTW, using the non-greedy `*?', the simplest way might be:

(rx "/*" (*? anything) "*/")

"/\\*[^z-a]*?\\*/" or "/\\*\\(?:.\\|\n\\)*?\\*/"

Regards.
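
The behaviour described in the report can also be checked outside of isearch. The following is only a minimal sketch: the `bug76731-*` names are hypothetical and introduced purely for illustration, and the regexps are the string forms quoted in the message. It uses `string-match` and `match-string` to compare what each regexp matches on the sample line. With the manual's regexp, `*` followed by non-`/` can consume the two asterisks after the opening `/*`, so the following `/` is then accepted as an ordinary non-`*` character and the match runs on to the last comment.

```elisp
;; Sketch only: the bug76731-* names are hypothetical, not part of Emacs.
(defconst bug76731-sample "/***/ 123 /* anything else */"
  "Sample line from the report.")

(defconst bug76731-manual-re "/\\*\\(?:[^*]\\|\\*[^/]\\)*\\*+/"
  "Regexp as printed in the manual at the time of the report.")

(defconst bug76731-fixed-re "/\\*\\(?:[^*]\\|\\*+[^*/]\\)*\\*+/"
  "Corrected regexp proposed in the report.")

(defconst bug76731-lazy-re "/\\*\\(?:.\\|\n\\)*?\\*/"
  "Non-greedy alternative proposed in the report.")

(defun bug76731-first-match (regexp text)
  "Return the first substring of TEXT matched by REGEXP, or nil."
  (when (string-match regexp text)
    (match-string 0 text)))

;; Evaluate to compare the three regexps on the sample line.
(mapcar (lambda (re) (bug76731-first-match re bug76731-sample))
        (list bug76731-manual-re bug76731-fixed-re bug76731-lazy-re))
;; The manual regexp returns the whole sample line;
;; the other two return only "/***/".
```

Evaluating the last form (for example with C-x C-e in the *scratch* buffer) should show that only the manual's regexp runs past the first comment.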
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.