GNU bug report logs -
#74666
31.0.50; Regression in replace-match with empty-adjacent groups
Previous Next
To reply to this bug, email your comments to 74666 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Tue, 03 Dec 2024 10:57:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Campbell Barton <ideasman42 <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Tue, 03 Dec 2024 10:57:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Run the following file as an executable script:
emacs -Q --batch -l ./emacs-test.el
;; BEGIN SCRIPT:
#!/usr/bin/env -S emacs --batch -l
;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
(defun printf (&rest args)
(princ (apply #'format args) #'external-debugging-output))
(defun test-me (is-forward)
(let ((result ""))
(with-temp-buffer
(insert "__B_\n")
(save-match-data
(set-match-data (list 2 4 2 2 2 4))
(cond
(is-forward
(replace-match "HELLO" t t nil 1)
(replace-match "WORLD" t t nil 2))
(t
(replace-match "WORLD" t t nil 2)
(replace-match "HELLO" t t nil 1))))
(setq result (buffer-substring-no-properties (point-min)
(point-max))))
result))
(printf "A: %s" (test-me t))
(printf "B: %s" (test-me nil))
;; END SCRIPT:
In emacs 29.4 this prints:
A: _HELLOWORLD_
B: _HELLOWORLD_
In emacs 31.0.50 this prints:
A: _WORLD_
B: _HELLOWORLD_
This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
In GNU Emacs 31.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
3.24.43, cairo version 1.18.2) of 2024-12-03 built on
campbell-trx40aorusmaster
Repository revision: 04d035acd72823e82dd61c4894f9c09113e65dd6
Repository branch: master
System Description: Arch Linux
Configured using:
'configure --prefix=/opt/emacs --with-sound=no --disable-acl
--with-file-notification=no --with-xpm --with-native-compilation
--with-jpeg --with-tiff --with-gif --with-png
--without-compress-install --without-rsvg --without-lcms2
--without-libsystemd --without-gsettings --without-harfbuzz
--without-m17n-flt --without-libotf --without-gpm --without-dbus
--without-gsettings --without-gconf --without-selinux --without-gnutls
--without-makeinfo --without-libgmp --with-zlib --with-threads
--with-x-toolkit=gtk3 --with-xft --with-pgtk --with-tree-sitter
--with-modules 'CFLAGS=-O3 -mtune=native -march=native -pipe
-fomit-frame-pointer''
Configured features:
CAIRO FREETYPE GIF GLIB JPEG LIBXML2 MODULES NATIVE_COMP PDUMPER PGTK
PNG SECCOMP SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP
XIM GTK3 ZLIB
Important settings:
value of $LC_MONETARY: en_US.UTF-8
value of $LC_NUMERIC: en_US.UTF-8
value of $LC_TIME: en_US.UTF-8
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
minibuffer-regexp-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date subr-x mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/pgtk-win pgtk-win term/common-win touch-screen pgtk-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dynamic-setting
font-render-setting cairo gtk pgtk multi-tty move-toolbar
make-network-process native-compile emacs)
Memory information:
((conses 16 53518 10389) (symbols 48 5294 0) (strings 32 18752 1367)
(string-bytes 1 607905) (vectors 16 9151)
(vector-slots 8 128466 8940) (floats 8 22 2) (intervals 56 268 3)
(buffers 992 10))
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Tue, 03 Dec 2024 14:06:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 3 Dec 2024 21:56:02 +1100
> From: Campbell Barton <ideasman42 <at> gmail.com>
>
>
> Run the following file as an executable script:
>
> emacs -Q --batch -l ./emacs-test.el
>
> ;; BEGIN SCRIPT:
> #!/usr/bin/env -S emacs --batch -l
> ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
>
> (defun printf (&rest args)
> (princ (apply #'format args) #'external-debugging-output))
>
> (defun test-me (is-forward)
> (let ((result ""))
> (with-temp-buffer
> (insert "__B_\n")
> (save-match-data
> (set-match-data (list 2 4 2 2 2 4))
> (cond
> (is-forward
> (replace-match "HELLO" t t nil 1)
> (replace-match "WORLD" t t nil 2))
> (t
> (replace-match "WORLD" t t nil 2)
> (replace-match "HELLO" t t nil 1))))
> (setq result (buffer-substring-no-properties (point-min)
> (point-max))))
> result))
>
> (printf "A: %s" (test-me t))
> (printf "B: %s" (test-me nil))
> ;; END SCRIPT:
>
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_
>
> This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
Stefan, could you please take a look?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Sat, 14 Dec 2024 09:44:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 74666 <at> debbugs.gnu.org (full text, mbox):
Ping!
> Cc: 74666 <at> debbugs.gnu.org
> Date: Tue, 03 Dec 2024 16:05:40 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > Date: Tue, 3 Dec 2024 21:56:02 +1100
> > From: Campbell Barton <ideasman42 <at> gmail.com>
> >
> >
> > Run the following file as an executable script:
> >
> > emacs -Q --batch -l ./emacs-test.el
> >
> > ;; BEGIN SCRIPT:
> > #!/usr/bin/env -S emacs --batch -l
> > ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
> >
> > (defun printf (&rest args)
> > (princ (apply #'format args) #'external-debugging-output))
> >
> > (defun test-me (is-forward)
> > (let ((result ""))
> > (with-temp-buffer
> > (insert "__B_\n")
> > (save-match-data
> > (set-match-data (list 2 4 2 2 2 4))
> > (cond
> > (is-forward
> > (replace-match "HELLO" t t nil 1)
> > (replace-match "WORLD" t t nil 2))
> > (t
> > (replace-match "WORLD" t t nil 2)
> > (replace-match "HELLO" t t nil 1))))
> > (setq result (buffer-substring-no-properties (point-min)
> > (point-max))))
> > result))
> >
> > (printf "A: %s" (test-me t))
> > (printf "B: %s" (test-me nil))
> > ;; END SCRIPT:
> >
> > In emacs 29.4 this prints:
> >
> > A: _HELLOWORLD_
> > B: _HELLOWORLD_
> >
> > In emacs 31.0.50 this prints:
> >
> > A: _WORLD_
> > B: _HELLOWORLD_
> >
> > This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
>
> Stefan, could you please take a look?
>
>
>
>
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Sat, 14 Dec 2024 16:12:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> (defun test-me (is-forward)
> (let ((result ""))
> (with-temp-buffer
> (insert "__B_\n")
> (save-match-data
> (set-match-data (list 2 4 2 2 2 4))
> (cond
> (is-forward
> (replace-match "HELLO" t t nil 1)
> (replace-match "WORLD" t t nil 2))
> (t
> (replace-match "WORLD" t t nil 2)
> (replace-match "HELLO" t t nil 1))))
> (setq result (buffer-substring-no-properties (point-min)
> (point-max))))
> result))
[...]
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_
The problem is that the `set-match-data` doesn't give us any information
about the intended inclusion relationship between the subgroups.
I agree that the behavior you see is not the one you want if it's the
result of:
(goto-char (point-min))
(looking-at "_\\(\\)\\(_B\\)")
But OTOH it is the one we want if it is the result of:
(goto-char (point-min))
(looking-at "_\\(?2:\\(?1:\\)_B\\)")
We can try and guess the inclusion relationship based on circumstantial
evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
"_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
`update_search_regs` tricky, with various heuristics.
And we'll never handle all cases right unless we make significant
changes to the match-data (and the regexp compiler) to keep track of
inclusion relationships.
Could you give us some information about the larger context in which you
bumped into this problem?
Stefan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Sun, 15 Dec 2024 01:12:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 74666 <at> debbugs.gnu.org (full text, mbox):
On 24-12-15 3:11 AM, Stefan Monnier wrote:
>> (defun test-me (is-forward)
>> (let ((result ""))
>> (with-temp-buffer
>> (insert "__B_\n")
>> (save-match-data
>> (set-match-data (list 2 4 2 2 2 4))
>> (cond
>> (is-forward
>> (replace-match "HELLO" t t nil 1)
>> (replace-match "WORLD" t t nil 2))
>> (t
>> (replace-match "WORLD" t t nil 2)
>> (replace-match "HELLO" t t nil 1))))
>> (setq result (buffer-substring-no-properties (point-min)
>> (point-max))))
>> result))
> [...]
>> In emacs 29.4 this prints:
>>
>> A: _HELLOWORLD_
>> B: _HELLOWORLD_
>>
>> In emacs 31.0.50 this prints:
>>
>> A: _WORLD_
>> B: _HELLOWORLD_
>
> The problem is that the `set-match-data` doesn't give us any information
> about the intended inclusion relationship between the subgroups.
>
> I agree that the behavior you see is not the one you want if it's the
> result of:
>
> (goto-char (point-min))
> (looking-at "_\\(\\)\\(_B\\)")
>
> But OTOH it is the one we want if it is the result of:
>
> (goto-char (point-min))
> (looking-at "_\\(?2:\\(?1:\\)_B\\)")
>
> We can try and guess the inclusion relationship based on circumstantial
> evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
> "_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
> `update_search_regs` tricky, with various heuristics.
> And we'll never handle all cases right unless we make significant
> changes to the match-data (and the regexp compiler) to keep track of
> inclusion relationships.
>
> Could you give us some information about the larger context in which you
> bumped into this problem?
On the user side - I ran into this bug when decrementing numbers broke
for me in the evil-numbers package [0]. Numbers would fail to become
negative. Decrementing 0 would become 1.
In this case, the match data is set with `set-match-data' using
calculated ranges.
Since this used to work I think it's reasonable to consider it a regression.
I've since committed a workaround to evil-numbers [1], although I'd
suspect this would impact others.
[0]: https://melpa.org/#/evil-numbers
[1]:
https://github.com/juliapath/evil-numbers/commit/f93258b706fa5cf9259e815c2d8258fcc6262804
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Tue, 17 Dec 2024 03:20:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> In this case, the match data is set with `set-match-data' using
> calculated ranges.
I guess we can take this as good thing: it means complexifying the
regexp code would be wasted. 🙂
> Since this used to work I think it's reasonable to consider it a regression.
I was not trying to say it's not a regression. Just pouting because the
old behavior was just a lucky accident and recovering it without losing
the other improvement isn't completely straightforward.
> I've since committed a workaround to evil-numbers [1], although I'd suspect
> this would impact others.
[ And the workaround relies on another lucky accident: the "easiest"
fix for the problem would break your workaround. 🙁 ]
Stefan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Sat, 28 Dec 2024 11:39:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> Cc: 74666 <at> debbugs.gnu.org
> Date: Mon, 16 Dec 2024 22:18:57 -0500
> From: Stefan Monnier via "Bug reports for GNU Emacs,
> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>
> > In this case, the match data is set with `set-match-data' using
> > calculated ranges.
>
> I guess we can take this as good thing: it means complexifying the
> regexp code would be wasted. 🙂
>
> > Since this used to work I think it's reasonable to consider it a regression.
>
> I was not trying to say it's not a regression. Just pouting because the
> old behavior was just a lucky accident and recovering it without losing
> the other improvement isn't completely straightforward.
>
> > I've since committed a workaround to evil-numbers [1], although I'd suspect
> > this would impact others.
>
> [ And the workaround relies on another lucky accident: the "easiest"
> fix for the problem would break your workaround. 🙁 ]
What should we do with this bug report?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Sat, 28 Dec 2024 15:01:01 GMT)
Full text and
rfc822 format available.
Message #26 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> What should we do with this bug report?
I'd leave it open for now.
Stefan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#74666
; Package
emacs
.
(Sat, 28 Dec 2024 16:36:01 GMT)
Full text and
rfc822 format available.
Message #29 received at 74666 <at> debbugs.gnu.org (full text, mbox):
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: ideasman42 <at> gmail.com, 74666 <at> debbugs.gnu.org
> Date: Sat, 28 Dec 2024 10:00:33 -0500
>
> > What should we do with this bug report?
>
> I'd leave it open for now.
OK, but when should I ask again?
This bug report was last modified 175 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.