GNU bug report logs -
#74666
31.0.50; Regression in replace-match with empty-adjacent groups
Message #14 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> (defun test-me (is-forward)
>   (let ((result ""))
>     (with-temp-buffer
>       (insert "__B_\n")
>       (save-match-data
>         (set-match-data (list 2 4 2 2 2 4))
>         (cond
>          (is-forward
>           (replace-match "HELLO" t t nil 1)
>           (replace-match "WORLD" t t nil 2))
>          (t
>           (replace-match "WORLD" t t nil 2)
>           (replace-match "HELLO" t t nil 1))))
>       (setq result (buffer-substring-no-properties (point-min)
>                                                    (point-max))))
>     result))
[...]
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_
The problem is that `set-match-data` gives us no information about the
intended inclusion relationship between the subgroups: from the
positions alone we cannot tell whether group 1 is nested inside group 2
or merely adjacent to it.
I agree that the behavior you see is not the one you want if the match
data is the result of:

    (goto-char (point-min))
    (looking-at "_\\(\\)\\(_B\\)")

But OTOH it is the one we want if it is the result of:

    (goto-char (point-min))
    (looking-at "_\\(?2:\\(?1:\\)_B\\)")
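
To make the ambiguity concrete, here is a minimal sketch that runs both
regexps against the buffer contents from the report and compares the
resulting match data. The helper name `my-match-positions` is hypothetical
(it is not part of Emacs); it only wraps `looking-at` and `match-data`.

```elisp
;; Hypothetical helper, for illustration only: match REGEXP at the start
;; of the test buffer from the report and return the match data.
(defun my-match-positions (regexp)
  (with-temp-buffer
    (insert "__B_\n")
    (goto-char (point-min))
    (when (looking-at regexp)
      ;; Passing t asks `match-data' for integers rather than markers,
      ;; so the result stays meaningful after the temp buffer is killed.
      (match-data t))))

;; Group 1 adjacent to group 2, versus group 1 nested inside group 2.
;; Both calls are expected to return (1 4 2 2 2 4), so this yields t.
(equal (my-match-positions "_\\(\\)\\(_B\\)")
       (my-match-positions "_\\(?2:\\(?1:\\)_B\\)"))
```

Since the two lists are identical, code that only sees the match data
cannot recover which of the two group structures produced it.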
We could try to guess the inclusion relationship based on circumstantial
evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
"_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
`update_search_regs` tricky, with various heuristics.
And we would still never handle all cases correctly unless we made
significant changes to the match-data (and the regexp compiler) to keep
track of inclusion relationships.
Could you give us some information about the larger context in which you
bumped into this problem?
Stefan
This bug report was last modified 176 days ago.