GNU bug report logs - #58727
29.0.50; rx doc: Semantics of RX...
Reported by: Michael Heerdegen <michael_heerdegen <at> web.de>
Date: Sun, 23 Oct 2022 02:33:02 UTC
Severity: normal
Found in version 29.0.50
Done: Michael Heerdegen <michael_heerdegen <at> web.de>
Bug is archived. No further changes may be made.
Hello,
Please document the semantics of multiple RX arguments for the `rx'
repetition operators (and maybe the grouping operators, too).
The resulting regexps are concatenated, as with an implicit `seq'.
This is not obvious, though: in string regexps the repetition
operators are strictly unary, and more than one interpretation would
make sense for `rx' (implicit `seq', implicit `or').
The docstring of `rx' doesn't say anything about this. The manual has
sentences like
| ‘(zero-or-more RX...)’
| ‘(0+ RX...)’
| Match the RXs zero or more times. Greedy by default.
| Corresponding string regexp: ‘A*’ (greedy), ‘A*?’ (non-greedy)
but that suffers from the same problem: the semantics of A are not
clear. Is A == (seq RX...)?
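To illustrate the two readings, here is a minimal sketch. The variable
names and test strings are made up for this example, and the comments
describe the behaviour I would expect from the implicit `seq' reading
described above, not anything the manual states.

```elisp
(require 'rx)

;; Hypothetical names, for illustration only.
;; Several RXs under a repetition operator, with no explicit wrapper.
(defconst my-implicit (rx bos (zero-or-more "a" "b") eos))
;; The same arguments wrapped in an explicit `seq'.
(defconst my-explicit-seq (rx bos (zero-or-more (seq "a" "b")) eos))
;; The alternative reading: the arguments as alternatives.
(defconst my-explicit-or (rx bos (zero-or-more (or "a" "b")) eos))

(string-match-p my-implicit "abab")      ; non-nil: "ab" repeated twice
(string-match-p my-explicit-seq "abab")  ; non-nil: same behaviour
(string-match-p my-implicit "aab")       ; nil: not a repetition of "ab"
(string-match-p my-explicit-or "aab")    ; non-nil: the `or' reading differs
```

The string "aab" is what separates the two interpretations: it matches
only under the `or' reading.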
Oh, and maybe let's also make it clearer that `rx' always takes care of
implicit grouping when necessary. For example, in
(info "(elisp) Rx Constructs") it's not obvious that, e.g., in
‘(seq RX...)’
‘(sequence RX...)’
‘(: RX...)’
‘(and RX...)’
Match the RXs in sequence. Without arguments, the expression
matches the empty string.
Corresponding string regexp: ‘AB...’ (subexpressions in sequence).
`rx' silently adds a shy group to the result, so the corresponding string
regexp in this case is more precisely \(?:AB...\). I think it is enough
to mention this implicit grouping feature once, but it is important to
spell it out.
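A small sketch of why the implicit grouping matters. The names and the
hand-written regexps are hypothetical, chosen only to show what goes
wrong when the group is left out; the exact string `rx' generates may
vary between Emacs versions, so only matching behaviour is compared.

```elisp
(require 'rx)

;; Hypothetical names, for illustration only.
;; Repetition of a sequence: rx has to bracket "ab" before applying `*'.
(defconst my-rx-rep (rx bos (zero-or-more (seq "a" "b")) eos))
;; Hand-written without a group: `*' applies to "b" alone.
(defconst my-naive-rep "\\`ab*\\'")

(string-match-p my-rx-rep "abab")     ; non-nil
(string-match-p my-naive-rep "abab")  ; nil
(string-match-p my-naive-rep "abbb")  ; non-nil: "a" followed by b's

;; Alternation inside a sequence: rx brackets the `or' form itself.
(defconst my-rx-alt (rx bos "x" (or "ab" "cd") "y" eos))
;; Hand-written without a group: `\|' splits the whole regexp in two.
(defconst my-naive-alt "\\`xab\\|cdy\\'")

(string-match-p my-rx-alt "xcdy")     ; non-nil
(string-match-p my-rx-alt "xab")      ; nil: the trailing "y" is missing
(string-match-p my-naive-alt "xab")   ; non-nil: first branch is just "\\`xab"
```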
TIA,
Michael.
This bug report was last modified 2 years and 212 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.