GNU bug report logs - #73885
[PATCH] Fix internal_equal so that it uses at most one hash table

Previous Next

Package: emacs;

Reported by: Ethan Kong <ek.ethan.kong <at> gmail.com>

Date: Sat, 19 Oct 2024 14:26:04 UTC

Severity: normal

Tags: patch

Merged with 73883

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 73885 in the body.
You can then email your comments to 73885 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#73885; Package emacs. (Sat, 19 Oct 2024 14:26:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ethan Kong <ek.ethan.kong <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 19 Oct 2024 14:26:05 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ethan Kong <ek.ethan.kong <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] Fix internal_equal so that it uses at most one hash table
Date: Sat, 19 Oct 2024 21:12:41 +0800
[Message part 1 (text/plain, inline)]
Tags: patch

Hi,

When I use 'equal' on large, cyclic objects, emacs freezes.

Reproduce:

    ;; emacs -Q
    (require 'org-element)

    ;; get two identical `org-element' trees of this file
    ;; https://github.com/emacs-mirror/emacs/blob/master/doc/misc/modus-themes.org
    (let ((file (expand-file-name "doc/misc/modus-themes.org"
                                  source-directory)))
      (with-current-buffer (find-file-noselect file)
        (setq org-o1 (org-element-parse-buffer))
        (org-element-cache-reset)
        (setq org-o2 (org-element-parse-buffer))))

    ;; the test
    (equal org-o1 org-o2)

This equal call freezes emacs. And the memory usage of emacs grows
quickly, until I quit. Reproduced with emacs 29.4 and the master branch.


The cause:

Current code on the master branch:

    static bool
    internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
                    int depth, Lisp_Object ht)
    {
     tail_recurse:
      if (depth > 10)
        {
          // ...
          if (NILP (ht))
            ht = CALLN (Fmake_hash_table, QCtest, Qeq);
          // ...
        }
      // ...
            if (! internal_equal (XCAR (o1), XCAR (o2),
                                  equal_kind, depth + 1, ht))
              return false;
            // ...
    }

When internal_equal calls itself, it passes the hash table argument 'ht'
by value. As a result, each internal_equal(depth=11) call initializes
its own hash table, separately recording the objects it encounters in
further recursive calls. This leads to excessive 'hash_put' and
significant memory allocation.


Fix:

Make internal_equal pass the hash table by pointer (see the patch).

With this patch, the test above returns quickly:

    (benchmark-run 100 (equal org-o1 org-o2))

(2.562444 20 0.30933900000000014)

This will also improve the performance for smaller cyclic objects:

    ;; emacs -Q
    (defun make-cyclic (n m)
      (let* ((l1 (make-list n 1))
             (l2 (make-list m 2)))
        (setf (nth (1- m) l2) l1)
        (setf (nth (1- n) l1) l2)
        l1))

    (let ((a (make-cyclic 100 7))
          (b (make-cyclic 100 7)))
      (cl-assert (equal a b))
      (benchmark-run 10000 (equal a b)))

Current:
(0.530081 32 0.49294199999999977)

With the patch:
(0.036401 1 0.0173350000000001)

And it passes all the tests in 'make fns-tests'.


P.S. Could I get the copyright assignment paperwork please?

Best,
Ethan Kong


In GNU Emacs 29.4 (build 2, x86_64-w64-mingw32) of 2024-07-05 built on
 AVALON
Windowing system distributor 'Microsoft Corp.', version 10.0.22631
System Description: Microsoft Windows 10 Pro (v10.0.2009.22631.4317)

Configured using:
 'configure --with-modules --without-dbus --with-native-compilation=aot
 --without-compress-install --with-sqlite3 --with-tree-sitter
 CFLAGS=-O2'

[0001-Fix-internal_equal-so-that-it-uses-at-most-one-hash-.patch (text/patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73885; Package emacs. (Sat, 19 Oct 2024 15:11:02 GMT) Full text and rfc822 format available.

Message #8 received at 73885 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ethan Kong <ek.ethan.kong <at> gmail.com>
Cc: 73885 <at> debbugs.gnu.org
Subject: Re: bug#73885: [PATCH] Fix internal_equal so that it uses at most one
 hash table
Date: Sat, 19 Oct 2024 18:09:49 +0300
merge 73885 73883
thanks

> From: Ethan Kong <ek.ethan.kong <at> gmail.com>
> Date: Sat, 19 Oct 2024 21:12:41 +0800
> 
> Tags: patch
> 
> Hi,
> 
> When I use 'equal' on large, cyclic objects, emacs freezes.

Thanks, this is an exact duplicate of bug#73883 that you submitted not
long ago, so I'm merging them.




Merged 73883 73885. Request was from Eli Zaretskii <eliz <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 19 Oct 2024 15:11:03 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 07 Dec 2024 12:24:13 GMT) Full text and rfc822 format available.

This bug report was last modified 195 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.