GNU bug report logs - #77942
31.0.50; replace-region-contents gets stuck

Previous Next

Package: emacs;

Reported by: Gerd Möllmann <gerd.moellmann <at> gmail.com>

Date: Sun, 20 Apr 2025 15:30:07 UTC

Severity: normal

Found in version 31.0.50

To reply to this bug, email your comments to 77942 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Sun, 20 Apr 2025 15:30:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Gerd Möllmann <gerd.moellmann <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 20 Apr 2025 15:30:08 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 31.0.50; replace-region-contents gets stuck
Date: Sun, 20 Apr 2025 17:28:51 +0200
To reproduce with emacs -Q, define this function

(defun elb-replace-region-contents-entry ()
  (with-temp-buffer
    (let ((step (apply #'concat (make-list 2000 "🙂été👶🏿 "))))
      (dotimes (_ (/ 10000000 (length step)))
        (insert step)))

    (dotimes (_ 100)
      (let* ((a (1+ (random (point-max))))
	     (b (1+ (random (point-max))))
	     (beg (min a b))
	     (end (max a b)))
	(replace-region-contents beg end "🙂été👶🏿 🙂été👶🏿 ")))))

and evaluate

  /elb-replace-region-contents-entry)

This enters an (infinite?) loop that cannot be interrupted with C-g in
the GUI version. If started with -nw, C-g eventually crashes Emacs,
without printing anything on stdout or stderr.


In GNU Emacs 31.0.50 (build 1, aarch64-apple-darwin24.4.0, NS
 appkit-2575.50 Version 15.4.1 (Build 24E263)) of 2025-04-20 built on
 pro2
Repository revision: 6fb2a4691f4d53473c0a326d3a6c23df9008b6e8
Repository branch: master
System Description:  macOS 15.4.1

Configured using:
 'configure --cache-file
 /var/folders/1d/k_6t25f94sl83szqbf8gpkrh0000gn/T//config.cache.master
 --enable-checking=yes --with-native-compilation=no CC=clang
 'CFLAGS=-Wgnu-imaginary-constant -Wunused-result -g -g -O0 -F
 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks
 -Wno-ignored-attributes -Wno-flag-enum -Wno-missing-method-return-type
 -Wno-variadic-macros -Wno-strict-prototypes -Wno-availability
 -Wno-nullability-completeness''

Configured features:
ACL DBUS GLIB GNUTLS LCMS2 LIBXML2 MODULES NOTIFY KQUEUE NS PDUMPER PNG
RSVG SQLITE3 THREADS TOOLKIT_SCROLL_BARS TREE_SITTER WEBP XIM ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Sun, 20 Apr 2025 16:07:05 GMT) Full text and rfc822 format available.

Message #8 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Sun, 20 Apr 2025 19:06:44 +0300
> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> Date: Sun, 20 Apr 2025 17:28:51 +0200
> 
> To reproduce with emacs -Q, define this function
> 
> (defun elb-replace-region-contents-entry ()
>   (with-temp-buffer
>     (let ((step (apply #'concat (make-list 2000 "🙂été👶🏿 "))))
>       (dotimes (_ (/ 10000000 (length step)))
>         (insert step)))
> 
>     (dotimes (_ 100)
>       (let* ((a (1+ (random (point-max))))
> 	     (b (1+ (random (point-max))))
> 	     (beg (min a b))
> 	     (end (max a b)))
> 	(replace-region-contents beg end "🙂été👶🏿 🙂été👶🏿 ")))))
> 
> and evaluate
> 
>   /elb-replace-region-contents-entry)
> 
> This enters an (infinite?) loop that cannot be interrupted with C-g in
> the GUI version. If started with -nw, C-g eventually crashes Emacs,
> without printing anything on stdout or stderr.

Adding Stefan.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 17:24:02 GMT) Full text and rfc822 format available.

Message #11 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 13:23:34 -0400
> To reproduce with emacs -Q, define this function
>
> (defun elb-replace-region-contents-entry ()
>   (with-temp-buffer
>     (let ((step (apply #'concat (make-list 2000 "🙂été👶🏿 "))))
>       (dotimes (_ (/ 10000000 (length step)))
>         (insert step)))
>
>     (dotimes (_ 100)
>       (let* ((a (1+ (random (point-max))))
> 	       (b (1+ (random (point-max))))

[ Side note: This should use (+ (point-min) (random (buffer-size)))  ]

> 	     (beg (min a b))
> 	     (end (max a b)))

And this should not be necessary because `replace-region-contents`
should accept positions in any order, like `delete-region`.

> 	(replace-region-contents beg end "🙂été👶🏿 🙂été👶🏿 ")))))
>
> and evaluate
>
>   (elb-replace-region-contents-entry)
>
> This enters an (infinite?) loop

AFAICT it's not infinite.  At least, it does end for me when the buffer
is smaller (100000) and it does take more time (but still end) with
a buffer of 500000, so I think it would end *eventually*.

> that cannot be interrupted with C-g in the GUI version.

Apparently the MAX-SECS argument doesn't help either.  🙁

> If started with -nw, C-g eventually crashes Emacs,
> without printing anything on stdout or stderr.

That's even worse.

Eli wrote:
> Adding Stefan.

FWIW, this problem also shows up in Emacs-28 with:

    (require 'subr-x)
    
    (defun elb-replace-region-contents-entry ()
      (with-temp-buffer
        (let ((step (apply #'concat (make-list 2000 "🙂été👶🏿 "))))
          (dotimes (_ (/ 10000000 (length step)))
            (insert step)))
    
        (dotimes (_ 100)
          (let* ((a (+ (point-min) (random (buffer-size))))
                 (b (+ (point-min) (random (buffer-size)))))
            (message "replace-region-contents %S %S ..." a b)
            (replace-region-contents a b (lambda () "🙂été👶🏿 🙂été👶🏿 "))
            (message "replace-region-contents...done")))))



- Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 18:02:01 GMT) Full text and rfc822 format available.

Message #14 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 20:01:43 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> This enters an (infinite?) loop
>
> AFAICT it's not infinite.  At least, it does end for me when the buffer
> is smaller (100000) and it does take more time (but still end) with
> a buffer of 500000, so I think it would end *eventually*.

Just as an observation: I started with 1 run and that seemed reasonably
fast, a few seconds, so I made it 100, which had this effect.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 20:32:02 GMT) Full text and rfc822 format available.

Message #17 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 16:31:45 -0400
Gerd Möllmann [2025-04-21 20:01:43] wrote:
> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>>> This enters an (infinite?) loop
>>
>> AFAICT it's not infinite.  At least, it does end for me when the buffer
>> is smaller (100000) and it does take more time (but still end) with
>> a buffer of 500000, so I think it would end *eventually*.
>
> Just as an observation: I started with 1 run and that seemed reasonably
> fast, a few seconds, so I made it 100, which had this effect.

For the first round, how fast is it depends on your luck with the random
number generator since the time depends on the size of the region on
which you perform the replace.

Try just

    (require 'subr-x)
    
    (defun elb-replace-region-contents-entry ()
      (with-temp-buffer
        (let ((step (apply #'concat (make-list 2000 "🙂été👶🏿 "))))
          (dotimes (_ (/ 1000000 (length step)))
            (insert step)))
    
        (replace-region-contents 123 (- (point-max) 123)
                                 (lambda () "🙂été👶🏿 🙂été👶🏿 "))))


- Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 20:40:01 GMT) Full text and rfc822 format available.

Message #20 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 22:39:39 +0200
Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:

> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
>>> This enters an (infinite?) loop
>>
>> AFAICT it's not infinite.  At least, it does end for me when the buffer
>> is smaller (100000) and it does take more time (but still end) with
>> a buffer of 500000, so I think it would end *eventually*.
>
> Just as an observation: I started with 1 run and that seemed reasonably
> fast, a few seconds, so I made it 100, which had this effect.

Just found that replace-region-contents apparently uses a diff
algorithm from diffseq.h, and that says

   The basic algorithm is described in:
   "An O(ND) Difference Algorithm and its Variations", Eugene W. Myers,
   Algorithmica Vol. 1, 1986, pp. 251-266,

If N is the buffer size and D is the diff size, and both are large, good
luck with that. And indeed, reducing the buffer size by 2 orders of
magnitude makes things acceptable for my use case.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 20:43:03 GMT) Full text and rfc822 format available.

Message #23 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 22:42:16 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Gerd Möllmann [2025-04-21 20:01:43] wrote:
>> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>>>> This enters an (infinite?) loop
>>>
>>> AFAICT it's not infinite.  At least, it does end for me when the buffer
>>> is smaller (100000) and it does take more time (but still end) with
>>> a buffer of 500000, so I think it would end *eventually*.
>>
>> Just as an observation: I started with 1 run and that seemed reasonably
>> fast, a few seconds, so I made it 100, which had this effect.
>
> For the first round, how fast is it depends on your luck with the random
> number generator since the time depends on the size of the region on
> which you perform the replace.
>
> Try just
>
>     (require 'subr-x)
>     
>     (defun elb-replace-region-contents-entry ()
>       (with-temp-buffer
>         (let ((step (apply #'concat (make-list 2000 "🙂été👶🏿 "))))
>           (dotimes (_ (/ 1000000 (length step)))
>             (insert step)))
>     
>         (replace-region-contents 123 (- (point-max) 123)
>                                  (lambda () "🙂été👶🏿 🙂été👶🏿 "))))
>

Yep, I've read the code now, and the O(ND) diffing was, well, a bit
suprising :-).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 21:13:03 GMT) Full text and rfc822 format available.

Message #26 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 17:11:57 -0400
> If N is the buffer size and D is the diff size, and both are large, good
> luck with that. And indeed, reducing the buffer size by 2 orders of
> magnitude makes things acceptable for my use case.

And the fun thing is that in your test case, the "inner function"
`buffer_chars_equal` which compares two chars (which is the part where
the diff code hands us back control so we can check the time against
MAX_SECS, and that's also the place where we probably spend too much
time doing charpos->bytepos conversion) is called only 74 times, at the
very beginning, afterwards there's just a very long processing inside
the diff code itself without looking at the buffer contents (or the
replacement string for that matter).

I actually haven't checked the way the algorithm works, but I get the
impression that this is a case that can be optimized further.  🙂
Paul, IIUC you've played with this algorithm.  Would you be tempted to
try and improve that situation where the replacement text is *much*
smaller the text it replaces?

Gerd, IIUC you're trying to use this to benchmark the charpos->bytepos
code, in which case you should be better served by a call where the text
and its replacement are of fairly similar size.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Mon, 21 Apr 2025 22:33:01 GMT) Full text and rfc822 format available.

Message #29 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>,
 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 15:32:17 -0700
On 4/21/25 14:11, Stefan Monnier wrote:
> Paul, IIUC you've played with this algorithm.  Would you be tempted to
> try and improve that situation where the replacement text is*much*
> smaller the text it replaces?

My quick reaction is that you'd need a different algorithm than the 
Myers-Ukkonen algorithm used by GNU diff etc. A good place to start 
might be here:

A. Andoni, R. Krauthgamer and K. Onak, "Polylogarithmic Approximation 
for Edit Distance and the Asymmetric Query Complexity," FOCS 2010, 
377-386, <https://doi.org/10.1109/FOCS.2010.43>.

... though this is just the tip of an iceberg that I haven't had time to 
look into. If there's real interest in this (it'd be some work) I can 
ask a colleague who's more of an expert....




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 02:48:02 GMT) Full text and rfc822 format available.

Message #32 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>,
 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Mon, 21 Apr 2025 22:47:22 -0400
>> Paul, IIUC you've played with this algorithm.  Would you be tempted to
>> try and improve that situation where the replacement text is*much*
>> smaller the text it replaces?
>
> My quick reaction is that you'd need a different algorithm than the
> Myers-Ukkonen algorithm used by GNU diff etc. A good place to start might be
> here:
>
> A. Andoni, R. Krauthgamer and K. Onak, "Polylogarithmic Approximation for
>  Edit Distance and the Asymmetric Query Complexity," FOCS 2010, 377-386,
> <https://doi.org/10.1109/FOCS.2010.43>.
>
> ... though this is just the tip of an iceberg that I haven't had time to
>  look into. If there's real interest in this (it'd be some work) I can ask
> a colleague who's more of an expert....

I don't think I have the time&energy to look into it.

But the issue at hand is that in Gerd's example, the diff code takes
a *really* long time (even though it has already found an optimal
answer, AFAICT), and more importantly: without calling the
XVECREF_YVECREF_EQUAL function, which prevents us from aborting the
operation with a timeout.

For example, if I interrupt the execution I found it is in `diag` in the
following loop:

      for (d = fmax; d >= fmin; d -= 2)
        {
          OFFSET x;
          OFFSET y;
          OFFSET tlo = fd[d - 1];
          OFFSET thi = fd[d + 1];
          OFFSET x0 = tlo < thi ? thi : tlo + 1;

          for (x = x0, y = x0 - d;
               x < xlim && y < ylim && XREF_YREF_EQUAL (x, y);
               x++, y++)
            continue;
          if (x - x0 > SNAKE_LIMIT)
            big_snake = true;
          fd[d] = x;
          if (odd && bmin <= d && d <= bmax && bd[d] <= x)
            {
              part->xmid = x;
              part->ymid = y;
              part->lo_minimal = part->hi_minimal = true;
              return;
            }
        }

where (in our case when we replace a very large region with
a short string), the `y < ylim` is apparently always false (once the
first 74 calls to `XVECREF_YVECREF_EQUAL` took place):

    (gdb) p x
    $1 = 51823
    (gdb) p xlim
    $2 = 489755
    (gdb) p y
    $3 = <optimized out>
    (gdb) p ylim
    $4 = 14
    (gdb) p x0
    $5 = 51823
    (gdb) p d
    $6 = 41094
    (gdb) p x0 - d
    $7 = 10729

I don't understand this code enough, but I get the impression that we
might be able to streamline this loop once we get to that state because
every iteration seems to do very little useful work that does not change
between iterations (clearly this depends on being able to detect "that
state", which depends on the `tlo` and `thi` we read from the array, so
it might require keeping track somewhere of the min/max values stored in
the `fd` array?).

The same probably happens for the loop that follows (and does the same
but for "the bottom-up search", according to the comment).

I'm hoping you understand enough of the algorithm and of what I'm saying
to be able to tell me if I'm making sense.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 03:49:02 GMT) Full text and rfc822 format available.

Message #35 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 05:48:14 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Gerd, IIUC you're trying to use this to benchmark the charpos->bytepos
> code, in which case you should be better served by a call where the text
> and its replacement are of fairly similar size.

That's right, it's for a bencnmark that helps me to get an impression if
avoiding the binary search in the charpos -> bytepos case is worth it.
Now that I know what replace-region-contents does, that's no problem.

But honestly, what the heck? Does someone know what this diffing is
about in the first place? I mean, that replacing a region takes O(ND) is
not something that I would have thought possible. 





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 11:35:02 GMT) Full text and rfc822 format available.

Message #38 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 14:34:30 +0300
> Cc: 77942 <at> debbugs.gnu.org
> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> Date: Mon, 21 Apr 2025 22:39:39 +0200
> 
> Just found that replace-region-contents apparently uses a diff
> algorithm from diffseq.h, and that says
> 
>    The basic algorithm is described in:
>    "An O(ND) Difference Algorithm and its Variations", Eugene W. Myers,
>    Algorithmica Vol. 1, 1986, pp. 251-266,
> 
> If N is the buffer size and D is the diff size, and both are large, good
> luck with that. And indeed, reducing the buffer size by 2 orders of
> magnitude makes things acceptable for my use case.

You should be able to cut your losses by passing a sufficiently small
number for MAX-SEC, perhaps even zero.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 11:38:02 GMT) Full text and rfc822 format available.

Message #41 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 14:37:13 +0300
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> Date: Tue, 22 Apr 2025 05:48:14 +0200
> 
> But honestly, what the heck? Does someone know what this diffing is
> about in the first place?

It's about leaving the unchanged portions of text intact, including
their properties, overlays, markers, etc.  I believe the doc string
says that much.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 11:43:02 GMT) Full text and rfc822 format available.

Message #44 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 13:42:15 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
>> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
>> Date: Tue, 22 Apr 2025 05:48:14 +0200
>> 
>> But honestly, what the heck? Does someone know what this diffing is
>> about in the first place?
>
> It's about leaving the unchanged portions of text intact, including
> their properties, overlays, markers, etc.  I believe the doc string
> says that much.

What I meant is why perform an O(ND) diff for that purpose. That's not
at all obvious. (And it would be nice if the doc string talked about
that O(ND) a bit to warn users.)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 11:45:01 GMT) Full text and rfc822 format available.

Message #47 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 13:43:51 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Cc: 77942 <at> debbugs.gnu.org
>> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
>> Date: Mon, 21 Apr 2025 22:39:39 +0200
>> 
>> Just found that replace-region-contents apparently uses a diff
>> algorithm from diffseq.h, and that says
>> 
>>    The basic algorithm is described in:
>>    "An O(ND) Difference Algorithm and its Variations", Eugene W. Myers,
>>    Algorithmica Vol. 1, 1986, pp. 251-266,
>> 
>> If N is the buffer size and D is the diff size, and both are large, good
>> luck with that. And indeed, reducing the buffer size by 2 orders of
>> magnitude makes things acceptable for my use case.
>
> You should be able to cut your losses by passing a sufficiently small
> number for MAX-SEC, perhaps even zero.

MAX-SECS isn't working as expected said Stef elsewhere.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 13:07:02 GMT) Full text and rfc822 format available.

Message #50 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 16:05:53 +0300
> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> Cc: monnier <at> iro.umontreal.ca,  eggert <at> cs.ucla.edu,  77942 <at> debbugs.gnu.org
> Date: Tue, 22 Apr 2025 13:42:15 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
> >> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> >> Date: Tue, 22 Apr 2025 05:48:14 +0200
> >> 
> >> But honestly, what the heck? Does someone know what this diffing is
> >> about in the first place?
> >
> > It's about leaving the unchanged portions of text intact, including
> > their properties, overlays, markers, etc.  I believe the doc string
> > says that much.
> 
> What I meant is why perform an O(ND) diff for that purpose. That's not
> at all obvious.

AFAIU, it's because the Diff algorithm makes the minimum number of
changes.

> (And it would be nice if the doc string talked about
> that O(ND) a bit to warn users.)

It does:

  Because this function can be very slow if there is a large number of
  differences between the two buffers, there are two optional arguments
  mitigating this issue.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 13:07:02 GMT) Full text and rfc822 format available.

Message #53 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 16:06:34 +0300
> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> Cc: monnier <at> iro.umontreal.ca,  77942 <at> debbugs.gnu.org
> Date: Tue, 22 Apr 2025 13:43:51 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> Cc: 77942 <at> debbugs.gnu.org
> >> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
> >> Date: Mon, 21 Apr 2025 22:39:39 +0200
> >> 
> >> Just found that replace-region-contents apparently uses a diff
> >> algorithm from diffseq.h, and that says
> >> 
> >>    The basic algorithm is described in:
> >>    "An O(ND) Difference Algorithm and its Variations", Eugene W. Myers,
> >>    Algorithmica Vol. 1, 1986, pp. 251-266,
> >> 
> >> If N is the buffer size and D is the diff size, and both are large, good
> >> luck with that. And indeed, reducing the buffer size by 2 orders of
> >> magnitude makes things acceptable for my use case.
> >
> > You should be able to cut your losses by passing a sufficiently small
> > number for MAX-SEC, perhaps even zero.
> 
> MAX-SECS isn't working as expected said Stef elsewhere.

If you set it to zero, it must work as expected.  If it doesn't, it's
a bug.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 13:15:02 GMT) Full text and rfc822 format available.

Message #56 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 15:14:07 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
>> Cc: monnier <at> iro.umontreal.ca,  eggert <at> cs.ucla.edu,  77942 <at> debbugs.gnu.org
>> Date: Tue, 22 Apr 2025 13:42:15 +0200
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>> >> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
>> >> From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
>> >> Date: Tue, 22 Apr 2025 05:48:14 +0200
>> >> 
>> >> But honestly, what the heck? Does someone know what this diffing is
>> >> about in the first place?
>> >
>> > It's about leaving the unchanged portions of text intact, including
>> > their properties, overlays, markers, etc.  I believe the doc string
>> > says that much.
>> 
>> What I meant is why perform an O(ND) diff for that purpose. That's not
>> at all obvious.
>
> AFAIU, it's because the Diff algorithm makes the minimum number of
> changes.
>
>> (And it would be nice if the doc string talked about
>> that O(ND) a bit to warn users.)
>
> It does:
>
>   Because this function can be very slow if there is a large number of
>   differences between the two buffers, there are two optional arguments
>   mitigating this issue.

I find that pretty vague. If it said something about the O(ND), one
could figure out that that can be morally equivalent to O(N^2) which I
would find a fair warning.

But whatever. Funny function that one.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 13:27:02 GMT) Full text and rfc822 format available.

Message #59 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 09:26:25 -0400
Gerd Möllmann [2025-04-22 05:48:14] wrote:
> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>> Gerd, IIUC you're trying to use this to benchmark the charpos->bytepos
>> code, in which case you should be better served by a call where the text
>> and its replacement are of fairly similar size.
>
> That's right, it's for a bencnmark that helps me to get an impression if
> avoiding the binary search in the charpos -> bytepos case is worth it.
> Now that I know what replace-region-contents does, that's no problem.
>
> But honestly, what the heck? Does someone know what this diffing is
> about in the first place? I mean, that replacing a region takes O(ND) is
> not something that I would have thought possible. 

That's why I wanted (well, still want) to have a `replace-region` that
does just the plain/stupid/naive replacement, and then keep
`replace-region-contents` as a separate function (ideally even renamed
to something like `replace-region-slowly` or
`replace-region-carefully`).

One place where the diff's extra work is worthwhile is when you do
reindent/prettyprint via an external process.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 13:34:03 GMT) Full text and rfc822 format available.

Message #62 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 15:33:10 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Gerd Möllmann [2025-04-22 05:48:14] wrote:
>> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>>> Gerd, IIUC you're trying to use this to benchmark the charpos->bytepos
>>> code, in which case you should be better served by a call where the text
>>> and its replacement are of fairly similar size.
>>
>> That's right, it's for a bencnmark that helps me to get an impression if
>> avoiding the binary search in the charpos -> bytepos case is worth it.
>> Now that I know what replace-region-contents does, that's no problem.
>>
>> But honestly, what the heck? Does someone know what this diffing is
>> about in the first place? I mean, that replacing a region takes O(ND) is
>> not something that I would have thought possible. 
>
> That's why I wanted (well, still want) to have a `replace-region` that
> does just the plain/stupid/naive replacement, and then keep
> `replace-region-contents` as a separate function (ideally even renamed
> to something like `replace-region-slowly` or
> `replace-region-carefully`).

100% agreement. +ND

> One place where the diff's extra work is worthwhile is when you do
> reindent/prettyprint via an external process.

Thanks. 





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 14:34:01 GMT) Full text and rfc822 format available.

Message #65 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>,
 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 10:33:09 -0400
>> MAX-SECS isn't working as expected said Stef elsewhere.
> If you set it to zero, it must work as expected.  If it doesn't, it's
> a bug.

He does want to trigger the diff code, tho.  The problem is that in his
test case, the diff code consults checks for a timeout too infrequently
(e.g. not at all for a whole hour).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 14:43:01 GMT) Full text and rfc822 format available.

Message #68 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: gerd.moellmann <at> gmail.com, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 17:42:35 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>,
>   77942 <at> debbugs.gnu.org
> Date: Tue, 22 Apr 2025 10:33:09 -0400
> 
> >> MAX-SECS isn't working as expected said Stef elsewhere.
> > If you set it to zero, it must work as expected.  If it doesn't, it's
> > a bug.
> 
> He does want to trigger the diff code, tho.

Why is that?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 15:21:02 GMT) Full text and rfc822 format available.

Message #71 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: gerd.moellmann <at> gmail.com, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 11:20:44 -0400
>> >> MAX-SECS isn't working as expected said Stef elsewhere.
>> > If you set it to zero, it must work as expected.  If it doesn't, it's
>> > a bug.
>> He does want to trigger the diff code, tho.
> Why is that?

Because it's the diff code which does loads of conversions from charpos
to bytepos.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 15:32:02 GMT) Full text and rfc822 format available.

Message #74 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: gerd.moellmann <at> gmail.com, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 18:31:08 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: gerd.moellmann <at> gmail.com,  77942 <at> debbugs.gnu.org
> Date: Tue, 22 Apr 2025 11:20:44 -0400
> 
> >> >> MAX-SECS isn't working as expected said Stef elsewhere.
> >> > If you set it to zero, it must work as expected.  If it doesn't, it's
> >> > a bug.
> >> He does want to trigger the diff code, tho.
> > Why is that?
> 
> Because it's the diff code which does loads of conversions from charpos
> to bytepos.

But then waiting for a long time is actually what the doctor
prescribed, as it collects a lot of statistics, no?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#77942; Package emacs. (Tue, 22 Apr 2025 18:24:02 GMT) Full text and rfc822 format available.

Message #77 received at 77942 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: gerd.moellmann <at> gmail.com, 77942 <at> debbugs.gnu.org
Subject: Re: bug#77942: 31.0.50; replace-region-contents gets stuck
Date: Tue, 22 Apr 2025 14:22:51 -0400
>> >> >> MAX-SECS isn't working as expected said Stef elsewhere.
>> >> > If you set it to zero, it must work as expected.  If it doesn't, it's
>> >> > a bug.
>> >> He does want to trigger the diff code, tho.
>> > Why is that?
>> 
>> Because it's the diff code which does loads of conversions from charpos
>> to bytepos.
>
> But then waiting for a long time is actually what the doctor
> prescribed, as it collects a lot of statistics, no?

It would be if that time was spent in Emacs's side of the code, but in
that case it's all spent in `diffseq.h` without calling us back to look
at the buffer (which is what triggers the charpos->bytepos conversions
and lets us check the timeout as well).


        Stefan





This bug report was last modified 56 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.