Package: emacs;
Reported by: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Date: Wed, 15 Feb 2017 19:20:02 UTC
Severity: normal
Found in version 26.0.50
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 25743 in the body.
You can then email your comments to 25743 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-gnu-emacs <at> gnu.org
:bug#25743
; Package emacs
.
(Wed, 15 Feb 2017 19:20:02 GMT) Full text and rfc822 format available.Stefan Monnier <monnier <at> IRO.UMontreal.CA>
:bug-gnu-emacs <at> gnu.org
.
(Wed, 15 Feb 2017 19:20:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Stefan Monnier <monnier <at> IRO.UMontreal.CA> To: bug-gnu-emacs <at> gnu.org Subject: rehash-size ignored Date: Wed, 15 Feb 2017 14:18:54 -0500
Package: Emacs Version: 26.0.50 Looking accidentally at maybe_resize_hash_table, I noticed that we don't obey resize_hash, although we do all the work needed for that. Worse, we make dangerous assumptions about the behavior of larger_vector: maybe_resize_hash_table takes `old_size' from `ASIZE (h->next)' and then uses rehash_size to compute the desired new_size. The problem comes here: set_hash_key_and_value (h, larger_vector (h->key_and_value, 2 * (new_size - old_size), -1)); set_hash_next (h, larger_vector (h->next, new_size - old_size, -1)); This says, that h->next and h->key_and_value are replaced by new vectors that are larger than the previous one so that they are large enough to accomodate new_size. The problem is that larger_vector's promise is to return a vector at least as large as requested, but sometimes larger. In practice the new vector is twice the size of the old vector. This means that rehash_size is ignored when it means to something smaller than twice the size. It *also* means that h->next may end up larger than planned. But `ASIZE (h->next)' is the definition of the hash-table's size, so if it's larger than planned, then it's indispensible to make sure that all other related tables are equally larger: h->key_and_value must always be at least twice the size of h->next. But the call above only makes sure the new h->key_and_value is at least twice the desired new_size but not twice the size of the new h->next. I think the right fix is to change the calls to larger_vector so that instead of requesting vectors "at least as large as new_size", we should request vectors "of size exactly new_size". That will fix both problems at the same time. The patch below implements it. Any objection? Stefan diff --git a/src/fns.c b/src/fns.c index 760a017dedf..62c834fac60 100644 --- a/src/fns.c +++ b/src/fns.c @@ -3536,19 +3535,19 @@ get_key_arg (Lisp_Object key, ptrdiff_t nargs, Lisp_Object *args, char *used) Lisp_Object larger_vector (Lisp_Object vec, ptrdiff_t incr_min, ptrdiff_t nitems_max) { + eassert (VECTORP (vec)); struct Lisp_Vector *v; - ptrdiff_t incr, incr_max, old_size, new_size; + ptrdiff_t old_size = ASIZE (vec); ptrdiff_t C_language_max = min (PTRDIFF_MAX, SIZE_MAX) / sizeof *v->contents; - ptrdiff_t n_max = (0 <= nitems_max && nitems_max < C_language_max + ptrdiff_t n_max = (nitems_max == 0 ? old_size + incr_min + : 0 < nitems_max && nitems_max < C_language_max ? nitems_max : C_language_max); - eassert (VECTORP (vec)); eassert (0 < incr_min && -1 <= nitems_max); - old_size = ASIZE (vec); - incr_max = n_max - old_size; - incr = max (incr_min, min (old_size >> 1, incr_max)); + ptrdiff_t incr_max = n_max - old_size; + ptrdiff_t incr = max (incr_min, min (old_size >> 1, incr_max)); if (incr_max < incr) memory_full (SIZE_MAX); - new_size = old_size + incr; + ptrdiff_t new_size = old_size + incr; v = allocate_vector (new_size); memcpy (v->contents, XVECTOR (vec)->contents, old_size * sizeof *v->contents); memclear (v->contents + old_size, incr * word_size); @@ -3828,10 +3827,10 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h) message ("Growing hash table to: %"pI"d", new_size); #endif - set_hash_key_and_value (h, larger_vector (h->key_and_value, - 2 * (new_size - old_size), -1)); - set_hash_next (h, larger_vector (h->next, new_size - old_size, -1)); - set_hash_hash (h, larger_vector (h->hash, new_size - old_size, -1)); + set_hash_key_and_value (h, larger_vector (h->key_and_value, + 2 * (new_size - old_size), 0)); + set_hash_next (h, larger_vector (h->next, new_size - old_size, 0)); + set_hash_hash (h, larger_vector (h->hash, new_size - old_size, 0)); set_hash_index (h, Fmake_vector (make_number (index_size), Qnil)); /* Update the free list. Do it so that new entries are added at In GNU Emacs 26.0.50.1 (x86_64-unknown-linux-gnu, X toolkit, Xaw3d scroll bars) of 2017-01-30 built on lechazo Repository revision: 5ddca31bf707686aafaf891c216702b64657c516 Windowing system distributor 'The X.Org Foundation', version 11.0.11900000 System Description: Debian GNU/Linux 9.0 (stretch) Recent messages: Wrote /home/monnier/src/emacs/work/src/fns.c Saving file /home/monnier/src/emacs/work/src/fns.c... Wrote /home/monnier/src/emacs/work/src/fns.c previous-line: Beginning of buffer [2 times] Hunk already applied Mark set Saving file /home/monnier/src/emacs/work/src/fns.c... Wrote /home/monnier/src/emacs/work/src/fns.c Saving file /home/monnier/src/emacs/work/src/fns.c... Wrote /home/monnier/src/emacs/work/src/fns.c Configured using: 'configure -C --enable-checking --enable-check-lisp-object-type 'CFLAGS=-Wall -g3 -Og -Wno-pointer-sign' PKG_CONFIG_PATH=/home/monnier/lib/pkgconfig' Configured features: XAW3D XPM JPEG TIFF GIF PNG SOUND GPM DBUS GSETTINGS NOTIFY GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS LUCID X11 Important settings: value of $LANG: fr_CH.UTF-8 locale-coding-system: utf-8-unix Major mode: InactiveMinibuffer Minor modes in effect: csv-field-index-mode: t diff-auto-refine-mode: t shell-dirtrack-mode: t electric-pair-mode: t global-reveal-mode: t reveal-mode: t auto-insert-mode: t savehist-mode: t minibuffer-electric-default-mode: t global-compact-docstrings-mode: t url-handler-mode: t global-eldoc-mode: t electric-indent-mode: t mouse-wheel-mode: t global-prettify-symbols-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t line-number-mode: t transient-mark-mode: t Load-path shadows: /home/monnier/src/emacs/elpa/packages/svg/svg hides /home/monnier/src/emacs/work/lisp/svg /home/monnier/src/emacs/elpa/packages/ada-mode/ada-xref hides /home/monnier/src/emacs/work/lisp/progmodes/ada-xref /home/monnier/src/emacs/elpa/packages/ada-mode/ada-stmt hides /home/monnier/src/emacs/work/lisp/progmodes/ada-stmt /home/monnier/src/emacs/elpa/packages/ada-mode/ada-prj hides /home/monnier/src/emacs/work/lisp/progmodes/ada-prj /home/monnier/src/emacs/elpa/packages/ada-mode/ada-mode hides /home/monnier/src/emacs/work/lisp/progmodes/ada-mode /home/monnier/src/emacs/elpa/packages/landmark/landmark hides /home/monnier/src/emacs/work/lisp/obsolete/landmark /home/monnier/src/emacs/elpa/packages/crisp/crisp hides /home/monnier/src/emacs/work/lisp/obsolete/crisp Features: (shadow mail-extr emacsbug message puny rfc822 mml mml-sec epa derived epg mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail grep csv-mode smerge-mode whitespace gud debug completion tar-mode dabbrev vc vc-dispatcher caml tuareg_indent tuareg caml-help caml-types caml-emacs map sm-c-mode smie ox-latex ox-icalendar ox-html ox-ascii ox-publish ox org-protocol org-mouse org-mobile org-agenda org-indent org-feed org-crypt org-capture org-attach org-id org-element org-rmail org-mhe org-irc org-info org-gnus gnus-util rmail rmail-loaddefs rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils org-docview org-bibtex org-bbdb org-w3m org org-macro org-footnote org-pcomplete org-list org-faces org-entities org-version ob-emacs-lisp ob ob-tangle ob-ref ob-lob ob-table ob-exp org-src ob-keys ob-comint ob-core ob-eval org-compat org-macs org-loaddefs bibtex-style bibtex rect skeleton eieio-opt speedbar sb-image ezimage dframe find-func help-fns radix-tree html5-schema rng-xsd xsd-regexp rng-cmpct rng-nxml rng-valid rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt rng-util rng-pttrn nxml-ns nxml-mode nxml-outln nxml-rap sgml-mode dom nxml-util nxml-enc xmltok sort mpc view cal-china lunar solar cal-dst cal-bahai cal-islam cal-hebrew holidays hol-loaddefs warnings cal-french diary-lib diary-loaddefs cal-move cal-menu calendar cal-loaddefs misearch multi-isearch cus-edit cus-start cus-load wid-edit autorevert filenotify doc-view subr-x jka-compr image-mode dired dired-loaddefs format-spec executable copyright vc-git diff-mode filecache reftex-dcr reftex reftex-loaddefs reftex-vars tex-mode compile shell pcomplete comint ansi-color ring latexenc server time-date noutline outline easy-mmode flyspell ispell checkdoc thingatpt load-dir elec-pair reveal autoinsert proof-site proof-autoloads cl pg-vars savehist minibuf-eldef disp-table compact-docstrings kotl-loaddefs advice info realgud-recursive-autoloads finder-inf url-auth package epg-config url-handlers url-parse auth-source eieio eieio-core cl-macs eieio-loaddefs password-cache url-vars seq byte-opt gv bytecomp byte-compile cl-extra help-mode easymenu cconv cl-loaddefs pcase cl-lib bbdb-loaddefs mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote dbusbind inotify dynamic-setting system-font-setting font-render-setting x-toolkit x multi-tty make-network-process emacs) Memory information: ((conses 8 523925 99482) (symbols 24 37808 0) (miscs 20 5334 863) (strings 16 112634 17461) (string-bytes 1 4377921) (vectors 8 68327) (vector-slots 4 2220644 87010) (floats 8 1387 1336) (intervals 28 30482 0) (buffers 520 88))
bug-gnu-emacs <at> gnu.org
:bug#25743
; Package emacs
.
(Fri, 26 Jul 2019 13:56:02 GMT) Full text and rfc822 format available.Message #8 received at 25743 <at> debbugs.gnu.org (full text, mbox):
From: Lars Ingebrigtsen <larsi <at> gnus.org> To: Stefan Monnier <monnier <at> IRO.UMontreal.CA> Cc: 25743 <at> debbugs.gnu.org Subject: Re: bug#25743: rehash-size ignored Date: Fri, 26 Jul 2019 15:55:07 +0200
Stefan Monnier <monnier <at> IRO.UMontreal.CA> writes: > Looking accidentally at maybe_resize_hash_table, I noticed that we don't > obey resize_hash, although we do all the work needed for that. > Worse, we make dangerous assumptions about the behavior of > larger_vector: > > maybe_resize_hash_table takes `old_size' from `ASIZE (h->next)' and then > uses rehash_size to compute the desired new_size. The problem comes > here: > > set_hash_key_and_value (h, larger_vector (h->key_and_value, > 2 * (new_size - old_size), -1)); > set_hash_next (h, larger_vector (h->next, new_size - old_size, -1)); > > This says, that h->next and h->key_and_value are replaced by new vectors > that are larger than the previous one so that they are large enough to > accomodate new_size. I did not follow the recent thread about hash table resizing closely, but I do seem to remember somebody saying that they'd fixed something in the hash resizing code, and the commits in fns.c seem to back that up: commit 49e80e765b693736a8bb97ae5bfa341d25bf4f02 Author: Paul Eggert <eggert <at> cs.ucla.edu> Date: Sat Jul 20 23:21:14 2019 -0700 Tweak recent hash-table fix * src/fns.c (maybe_resize_hash_table): Completely initialize the new ‘next’ vector before allocating more vectors, as this preserves locality a bit better and it’s safer not to leave an uninitialized Lisp object around. Use next_size instead of new_size to compute new index size. So is the issue discussed in this bug report fixed now? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no
Stefan Monnier <monnier <at> iro.umontreal.ca>
:Stefan Monnier <monnier <at> IRO.UMontreal.CA>
:Message #13 received at 25743-done <at> debbugs.gnu.org (full text, mbox):
From: Stefan Monnier <monnier <at> iro.umontreal.ca> To: Lars Ingebrigtsen <larsi <at> gnus.org> Cc: 25743-done <at> debbugs.gnu.org Subject: Re: bug#25743: rehash-size ignored Date: Fri, 26 Jul 2019 13:16:10 -0400
> * src/fns.c (maybe_resize_hash_table): Completely initialize the > new ‘next’ vector before allocating more vectors, as this > preserves locality a bit better and it’s safer not to leave an > uninitialized Lisp object around. Use next_size instead of > new_size to compute new index size. > > So is the issue discussed in this bug report fixed now? This patch fixes the most severe problem (which was that the table was larger than requested and we didn't take it into account in the rest of the code). But it doesn't fix the problem in the title: by default rehash-size is 1.5 yet in practice the code just doubles the size (because larger_vecalloc only allocates something smaller than twice the original size when we're reaching the maximum possible size). So I installed the patch below to finish it. Thanks, Stefan diff --git a/src/fns.c b/src/fns.c index f4f3b95ac6..c45f455646 100644 --- a/src/fns.c +++ b/src/fns.c @@ -4190,7 +4190,7 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h) avoid problems if memory is exhausted. larger_vecalloc finishes computing the size of the replacement vectors. */ Lisp_Object next = larger_vecalloc (h->next, new_size - old_size, - PTRDIFF_MAX / 2); + new_size); ptrdiff_t next_size = ASIZE (next); for (ptrdiff_t i = old_size; i < next_size - 1; i++) gc_aset (next, i, make_fixnum (i + 1));
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Sat, 24 Aug 2019 11:24:08 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.