Package: emacs;
Reported by: Daniel Mendler <mail <at> daniel-mendler.de>
Date: Wed, 30 Apr 2025 10:04:02 UTC
Severity: normal
Tags: patch
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
Message #14 received at 78162 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Daniel Mendler <mail <at> daniel-mendler.de> Cc: 78162 <at> debbugs.gnu.org, Stefan Kangas <stefankangas <at> gmail.com> Subject: Re: bug#78162: [PATCH] hash-table-contains-p: Avoid creating a symbol on every call Date: Thu, 01 May 2025 09:38:04 +0000
"Daniel Mendler" <mail <at> daniel-mendler.de> writes: > Pip Cet <pipcet <at> protonmail.com> writes: > >> "Daniel Mendler via \"Bug reports for GNU Emacs, the Swiss army knife of text editors\"" <bug-gnu-emacs <at> gnu.org> writes: >> >>> Tags: patch >>> >>> For performance reasons avoid creating a symbol on every call to >>> `hash-table-contains-p'. See also https://github.com/emacs-compat/compat/issues/71. >> >> If the performance of hash-table-contains-p is an issue, we should >> reimplement it in C, which can use magic values which aren't accessible >> from Lisp. > > For a proper hash table I expect this function to be efficient and not > allocate. C would work, then. As it avoids both your performance concerns and my concerns about leaking magic values to Lisp, how about this? From 2a0103a8e67268eed0860baca397a0713335ba0d Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH] Implement 'hash-table-contains-p' in C (bug#78162) * lisp/emacs-lisp/byte-opt.el (side-effect-free-fns): Add 'hash-table-contains-p'. * lisp/subr.el (hash-table-contains-p): Function removed. * src/fns.c (Fhash_table_contains_p): New function. (syms_of_fns): Register it. * test/lisp/subr-tests.el (hash-table-contains-p): Move... * test/src/fns-tests.el (test-hash-table-contains-p): ...here. --- lisp/emacs-lisp/byte-opt.el | 2 +- lisp/subr.el | 7 ------- src/fns.c | 12 ++++++++++++ test/lisp/subr-tests.el | 12 ------------ test/src/fns-tests.el | 12 ++++++++++++ 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/lisp/emacs-lisp/byte-opt.el b/lisp/emacs-lisp/byte-opt.el index 652c79e9c93..f93946635a0 100644 --- a/lisp/emacs-lisp/byte-opt.el +++ b/lisp/emacs-lisp/byte-opt.el @@ -1761,7 +1761,7 @@ byte-optimize-set compare-strings concat copy-alist copy-hash-table copy-sequence elt equal equal-including-properties featurep get - gethash hash-table-count hash-table-rehash-size + gethash hash-table-contains-p hash-table-count hash-table-rehash-size hash-table-rehash-threshold hash-table-size hash-table-test hash-table-weakness length length< length= length> diff --git a/lisp/subr.el b/lisp/subr.el index a5c850ebe5e..6a0487e96ee 100644 --- a/lisp/subr.el +++ b/lisp/subr.el @@ -7436,13 +7436,6 @@ string-trim (declare (important-return-value t)) (string-trim-left (string-trim-right string trim-right) trim-left)) -(defsubst hash-table-contains-p (key table) - "Return non-nil if TABLE has an element with KEY." - (declare (side-effect-free t) - (important-return-value t)) - (let ((missing (make-symbol "missing"))) - (not (eq (gethash key table missing) missing)))) - ;; The initial anchoring is for better performance in searching matches. (defconst regexp-unmatchable "\\`a\\`" "Standard regexp guaranteed not to match any string at all.") diff --git a/src/fns.c b/src/fns.c index 21916b6fb46..a3b45232700 100644 --- a/src/fns.c +++ b/src/fns.c @@ -5866,6 +5866,17 @@ DEFUN ("gethash", Fgethash, Sgethash, 2, 3, 0, } +DEFUN ("hash-table-contains-p", Fhash_table_contains_p, Shash_table_contains_p, + 2, 2, 0, + doc: /* Return non-nil if TABLE has an element with KEY. */) + (Lisp_Object key, Lisp_Object table) +{ + struct Lisp_Hash_Table *h = check_hash_table (table); + ptrdiff_t i = hash_find (h, key); + return i >= 0 ? Qt : Qnil; +} + + DEFUN ("puthash", Fputhash, Sputhash, 3, 3, 0, doc: /* Associate KEY with VALUE in hash table TABLE. If KEY is already present in table, replace its current value with @@ -6686,6 +6697,7 @@ syms_of_fns (void) defsubr (&Shash_table_p); defsubr (&Sclrhash); defsubr (&Sgethash); + defsubr (&Shash_table_contains_p); defsubr (&Sputhash); defsubr (&Sremhash); defsubr (&Smaphash); diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el index 024cbe85bba..f3adfcaed62 100644 --- a/test/lisp/subr-tests.el +++ b/test/lisp/subr-tests.el @@ -1493,17 +1493,5 @@ subr--subst-char-in-string (props-out (object-intervals out))) (should (equal props-out props-in)))))))) -(ert-deftest hash-table-contains-p () - (let ((h (make-hash-table))) - (should-not (hash-table-contains-p 'problems h)) - (should-not (hash-table-contains-p 'cookie h)) - (should-not (hash-table-contains-p 'milk h)) - (puthash 'problems 99 h) - (puthash 'cookie nil h) - (puthash 'milk 'missing h) - (should (hash-table-contains-p 'problems h)) - (should (hash-table-contains-p 'cookie h)) - (should (hash-table-contains-p 'milk h)))) - (provide 'subr-tests) ;;; subr-tests.el ends here diff --git a/test/src/fns-tests.el b/test/src/fns-tests.el index e6abcdc34e7..d3e9512434a 100644 --- a/test/src/fns-tests.el +++ b/test/src/fns-tests.el @@ -1149,6 +1149,18 @@ test-hash-table-wrong-keywords (should (make-hash-table :rehash-threshold 123)) ; obsolete and ignored (should-error (make-hash-table :some-random-keyword 123))) +(ert-deftest test-hash-table-contains-p () + (let ((h (make-hash-table))) + (should-not (hash-table-contains-p 'problems h)) + (should-not (hash-table-contains-p 'cookie h)) + (should-not (hash-table-contains-p 'milk h)) + (puthash 'problems 99 h) + (puthash 'cookie nil h) + (puthash 'milk 'missing h) + (should (hash-table-contains-p 'problems h)) + (should (hash-table-contains-p 'cookie h)) + (should (hash-table-contains-p 'milk h)))) + (ert-deftest test-remhash () (let ((h (make-hash-table)) (val "anything")) -- 2.48.1 > A magic value becomes a problem if it is actually used in Lisp. Is > this likely? This should not happen by accident, only if > `hash-table--missing` is used intentionally. I think it's reasonable to build a hash table associating each symbol's name to its value, in order to detect which symbol values change as code is executed. With your patch, this code will result in a hash table which has a false negative for (hash-table-contains-p "hash-table--missing" h): (defun dump-obarray () (let ((h (make-hash-table :test 'equal))) (mapatoms (lambda (sym) (when (boundp sym) (puthash (symbol-name sym) (symbol-value sym) h)))) h)) (let ((old (dump-obarray)) (new (dump-obarray))) (maphash (lambda (key value) (if (hash-table-contains-p key old) (unless (eq (gethash key old) (gethash key new)) (message "value of %S changed" key)) (message "%S newly defined to be %S" key value))) new)) With the C implementation, this would work, of course. > Note that the actual symbol which is used for the check is uninterned. I understand that it's traditional to use uninterned symbols for cases like this one, but strings or conses (depending on whether you want extra information) would work just as well, and be a bit cheaper. Floats would work and be cheapest, but might break if our representation of them changes, as it has in my local tree. In any case, IMHO, it's not the magic value that's the problem here, it's that it's stored as a global symbol's value and thus becomes reachable via the obarray. Pip
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.