GNU bug report logs - #37849: composable character alternatives in rx
Message #11 received at 37849 <at> debbugs.gnu.org (full text, mbox):
This patch adds `union' and `intersection' to rx. Both take zero or more charsets as arguments. A charset is an `any' form that contains no character classes, a `union' or `intersection' form, or a `not' form with a charset argument.
Example:
(rx (union (any "a-f") (any "b-m")))
=> "[a-m]"
(rx (intersection (any "a-f") (any "b-m")))
=> "[b-f]"
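To make the set semantics concrete, here is a minimal sketch in plain Emacs Lisp that models a charset as a sorted list of disjoint (FROM . TO) character intervals. It is only an illustration and not the code in the attached patch; the names `my-charset-union' and `my-charset-intersection' are hypothetical.

```elisp
;; Illustrative model only: a charset is a sorted list of disjoint
;; (FROM . TO) intervals of character codes.  The function names are
;; hypothetical and do not come from the patch.

(defun my-charset-union (a b)
  "Return the union of interval lists A and B, merged and sorted."
  (let ((sorted (sort (append (copy-tree a) (copy-tree b))
                      (lambda (x y) (< (car x) (car y)))))
        (result nil))
    (dolist (iv sorted)
      (if (and result (<= (car iv) (1+ (cdr (car result)))))
          ;; Overlapping or adjacent: extend the most recent interval.
          (setcdr (car result) (max (cdr (car result)) (cdr iv)))
        (push (cons (car iv) (cdr iv)) result)))
    (nreverse result)))

(defun my-charset-intersection (a b)
  "Return the intersection of sorted disjoint interval lists A and B."
  (let ((result nil))
    (dolist (x a)
      (dolist (y b)
        (let ((lo (max (car x) (car y)))
              (hi (min (cdr x) (cdr y))))
          ;; Keep only the non-empty overlap of X and Y.
          (when (<= lo hi)
            (push (cons lo hi) result)))))
    (nreverse result)))

(my-charset-union '((?a . ?f)) '((?b . ?m)))
;; => ((97 . 109)), i.e. a-m
(my-charset-intersection '((?a . ?f)) '((?b . ?m)))
;; => ((98 . 102)), i.e. b-f
```

The two results correspond to the "[a-m]" and "[b-f]" outputs in the example above.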
The character class limitation stems from the inability to complement or intersect classes in general. It would be possible to partially lift this restriction for `union'; it is clear that
(rx (union (any "ab" space) (any "bc" space digit)))
=> "[abc[:space:][:digit:]]"
but it makes the facility harder to explain to the user in a way that makes sense. Still, it could be a future extension.
A `difference' operator was not included but could be added; it is trivially defined in rx as
(rx-define difference (a b)
  (intersection a (not b)))
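As a usage sketch, the definition could be exercised as follows. This assumes an Emacs whose rx provides both `rx-define' and the `intersection' form proposed here; the variable name `consonant' is only illustrative.

```elisp
(require 'rx)

;; Same definition as above, repeated so the snippet is self-contained.
(rx-define difference (a b)
  (intersection a (not b)))

;; Lowercase ASCII letters minus the vowels.
(let ((consonant (rx (difference (any "a-z") (any "aeiou")))))
  (list (string-match-p consonant "b")    ; non-nil: "b" is a letter, not a vowel
        (string-match-p consonant "e")))  ; nil: vowels are subtracted
```

Since `difference' expands to an `intersection' with a `not' form, its arguments are subject to the same restriction on character classes.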
The names `union' and `intersection' are verbose, but they should be used rarely enough that descriptive names are preferable.
SRE, from which the concept was taken, uses `|' and `&' respectively, with `~' for complement and `-' for difference.
[0001-Add-union-and-intersection-to-rx-bug-37849.patch (application/octet-stream, attachment)]
This bug report was last modified 5 years and 161 days ago.