GNU bug report logs - #29343
Match data doesn't contain elements for trailing non-matched subgroups
Reported by: Philipp Stephani <p.stephani2 <at> gmail.com>
Date: Fri, 17 Nov 2017 20:12:01 UTC
Severity: minor
Found in version 27.0.50
Fixed in version 29.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #24 received at 29343 <at> debbugs.gnu.org:
On Fri, 19 Apr 2019 at 20:29, Noam Postavsky <npostavs <at> gmail.com> wrote:
>
> Philipp Stephani <p.stephani2 <at> gmail.com> writes:
>
> > On Sat, 17 Mar 2018 at 01:37, Noam Postavsky <npostavs <at> gmail.com> wrote:
> >>
> >> Philipp Stephani <p.stephani2 <at> gmail.com> writes:
> >>
> >> > $ emacs -Q -batch -eval '(progn (string-match "^\\(a\\)?\\(b\\)\\(c\\)?$" "b") (print (match-data)))'
> >> > (0 1 nil nil 0 1)
> >> >
> >> > Note that neither the `a` nor the `c` group matched, yet there are
> >> > entries for `a` in `match-data` but none for `c`. This makes working
> >> > with the match data unnecessarily hard because its length depends on
> >> > whether certain optional groups have matched or not. I haven't seen any
> >> > discussion about this behavior in either the manual or the docstring. I
> >> > think the match data in this case should be (0 1 nil nil 0 1 nil nil).
> >>
> >> You can get that result by passing a list of the expected length as the
> >> REUSE argument to match-data:
> >
> > True, but that also requires knowing the expected length. In the most
> > general case this should work for unknown regular expressions.
>
> I don't understand how the general case you describe could occur. If
> you don't know the expected length, that means you don't know what groups are
> in the regexp, so you can only rely on group 0 existing, i.e., you only
> care about the first two elements in the match-data.
>
The context here is https://github.com/magnars/s.el/pull/117. Normally
you'd expect something like Python's Match.group
(https://docs.python.org/3/library/re.html#re.Match.group), i.e. a
group match per defined group, even if the group didn't match. That
Emacs doesn't behave this way is surprising and should at least be
documented.
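
For comparison, here is a minimal Python sketch of the behavior referred to
above, using the same regular expression and input as the original report.
The helper `flat_match_data` is hypothetical (it is not part of Python's `re`
module or of Emacs); it only flattens the Python match into a list shaped like
Emacs match data, so the result can be set side by side with the value
proposed in the report.

```python
import re

# Same pattern and subject string as in the original report,
# written in Python regex syntax.
pattern = re.compile(r"^(a)?(b)(c)?$")
match = pattern.match("b")


def flat_match_data(m):
    """Hypothetical helper: flatten a match into a list shaped like
    Emacs match data (start, end, start, end, ...), with one pair per
    group defined in the pattern, including group 0."""
    data = []
    for i in range(m.re.groups + 1):
        start, end = m.span(i)
        if start == -1:
            # Python reports (-1, -1) for a group that did not participate.
            data.extend([None, None])
        else:
            data.extend([start, end])
    return data


print(match.groups())          # (None, 'b', None)
print(flat_match_data(match))  # [0, 1, None, None, 0, 1, None, None]
```

Python yields one entry per defined group, so the length of the result
depends only on the pattern and not on which optional groups happened to
match.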
This bug report was last modified 3 years and 117 days ago.