#33014 - 26.1.50; 27.0.50; Fatal error after re-evaluating a thread's function

GNU bug report logs - #33014
26.1.50; 27.0.50; Fatal error after re-evaluating a thread's function

Package: emacs;

Reported by: Gemini Lasswell <gazally <at> runbox.com>

Date: Thu, 11 Oct 2018 05:32:01 UTC

Severity: normal

Tags: fixed

Found in version 26.1.50

Fixed in version 27.1

Done: Gemini Lasswell <gazally <at> runbox.com>

Bug is archived. No further changes may be made.

Message #11 received at 33014 <at> debbugs.gnu.org (full text, mbox):

From: Gemini Lasswell <gazally <at> runbox.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 33014 <at> debbugs.gnu.org
Subject: Re: bug#33014: 26.1.50; 27.0.50; Fatal error after re-evaluating a thread's function
Date: Fri, 12 Oct 2018 13:02:56 -0700

Eli Zaretskii <eliz <at> gnu.org> writes:

> Can you please make a smaller stand-alone test case, which doesn't
> require patching Emacs?  That will make it much easier to try
> reproducing the problem.

I've tried to do that without success.  The bug won't reproduce if I put all the code added to thread.el by the patch into its own file and load it with C-u M-x byte-compile-file, and it also doesn't work to put the resulting .elc on my load-path and load it with require.

I've determined today that having -O2 in CFLAGS is necessary to reproduce the bug, and that -O1 or -O0 won't do it.

> Can you show the Lisp backtrace of this thread?  Also, what is the
> offending object 'a' in this frame:

The Lisp backtrace is really short:

Thread 7 (Thread 0x7f1cd4dec700 (LWP 21837)):
"erb--benchmark-monitor-func" (0x158ec58)

>> #2  0x00000000006122b5 in XHASH_TABLE (a=...) at lisp.h:2241
>
> and what was its parent object in the calling frame?

Those are both optimized out with -O2.  I recompiled bytecode.c with "volatile" on the declaration of jmp_table, and got this:

(gdb) up 3
#3  exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=0, args=<optimized out>, args@entry=0x16eacf8 <bss_sbrk_buffer+9926232>) at bytecode.c:1403
1403	            struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
(gdb) p jmp_table
$1 = make_number(514)
(gdb) p *top
$3 = XIL(0x42b4d0)
(gdb) pp *top
remove

Then I started looking at other variables in exec_byte_code, and found this which didn't look right:

(gdb) p *vectorp
$13 = XIL(0x7f4934009523)
(gdb) pr
(((help-menu "Help" keymap (emacs-tutorial menu-item "Emacs Tutorial" help-with-tutorial :help "Lear
?\207" [yank-menu kill-ring buffer-read-only gui-backend-selection-exists-p CLIPBOARD featurep ns] 2
\205^Q^@ÅÆ!\207" [visual-line-mode word-wrap truncate-lines 0 nil toggle-truncate-lines -1] 2 nil ni

(I've truncated the result of printing *vectorp since each line is over 5000 characters long.)

Since that looked like it was unlikely to be the original value of *vectorp, I started a new debugging session and stepped through Thread 7's call to exec_byte_code for erb--benchmark-monitor-func, and determined that *vectorp's initial value was erb--status-updates, which matches the first element of the constants vector in (symbol-function 'erb--benchmark-monitor-func).  The value of vectorp was 0x16eac38 so I set a watchpoint on *(EMACS_INT *) 0x16eac38 and continued, and then during the execution of eval-region it triggered here:

Thread 1 "monitor" hit Hardware watchpoint 7: *(EMACS_INT *) 0x16eac38

Old value = 60897760
New value = 24075314
setup_on_free_list (v=v@entry=0x16eac30 <bss_sbrk_buffer+9926032>, nbytes=nbytes@entry=272) at alloc.c:3060
3060	  total_free_vector_slots += nbytes / word_size;
(gdb) bt 10
#0  setup_on_free_list (v=v@entry=0x16eac30 <bss_sbrk_buffer+9926032>, nbytes=nbytes@entry=272) at alloc.c:3060
#1  0x00000000005a9a24 in sweep_vectors () at alloc.c:3297
#2  0x00000000005adb2e in gc_sweep () at alloc.c:6872
#3  garbage_collect_1 (end=<optimized out>) at alloc.c:5860
#4  Fgarbage_collect () at alloc.c:5989
#5  0x00000000005ca478 in maybe_gc () at lisp.h:4804
#6  Ffuncall (nargs=4, args=args@entry=0x7fff210a3bc8) at eval.c:2838
#7  0x0000000000611e00 in exec_byte_code (bytestr=..., vector=..., maxdepth=..., args_template=..., nargs=nargs@entry=2, args=<optimized out>, args@entry=0x9bd128 <pure+781288>) at bytecode.c:632
#8  0x00000000005cdd32 in funcall_lambda (fun=XIL(0x7fff210a3bc8), nargs=nargs@entry=2, arg_vector=0x9bd128 <pure+781288>, arg_vector@entry=0x7fff210a3f00) at eval.c:3057
#9  0x00000000005ca54b in Ffuncall (nargs=3, args=args@entry=0x7fff210a3ef8) at eval.c:2870
(More stack frames follow...)

Note that just as was happening when we were working through bug#32357, the thread names which gdb prints are wrong, which I verified with:

(gdb) p current_thread
$21 = (struct thread_state *) 0xd73480 <main_thread>
(gdb) p current_thread->name
$22 = XIL(0)

Am I correct that the next step is to figure out why the garbage collector is not marking this vector?  Presumably it's no longer attached to the function definition for erb--benchmark-monitor-func by the time the garbage collector runs, but it's supposed to be found by mark_stack when called from mark_one_thread for Thread 7, right?
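
The question above can be illustrated with a toy model.  The following C sketch is not Emacs code: toy_vector, mark_roots, sweep, collect and first_slot_survives are hypothetical names, and the toy marker deliberately recognizes only a vector's base address.  It shows how a vector whose only surviving reference on a thread's stack is a pointer into its contents can be swept, with the free-list link then overwriting its first slot, which has the same shape as the watchpoint hit in setup_on_free_list.  Whether this is what actually happens in exec_byte_code under -O2 is an assumption, not something established in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NSLOTS 4
#define HEAP_SIZE 3

/* Hypothetical toy vector: a mark bit followed by word-sized slots. */
struct toy_vector {
    bool marked;
    uintptr_t contents[NSLOTS];
};

static struct toy_vector heap[HEAP_SIZE];
static struct toy_vector *free_list;

/* Conservative marking: any root word equal to the BASE address of a
   heap vector keeps that vector alive.  Pointers into the middle of a
   vector are not recognized (an assumption of this toy model). */
static void mark_roots(const uintptr_t *roots, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < HEAP_SIZE; j++)
            if (roots[i] == (uintptr_t)&heap[j])
                heap[j].marked = true;
}

/* Sweep: every unmarked vector is pushed onto the free list, and the
   link to the next free vector is stored in contents[0]. */
static void sweep(void)
{
    for (size_t j = 0; j < HEAP_SIZE; j++) {
        if (heap[j].marked) {
            heap[j].marked = false;      /* survivor: clear for next cycle */
        } else {
            heap[j].contents[0] = (uintptr_t)free_list;
            free_list = &heap[j];
        }
    }
}

static void collect(const uintptr_t *roots, size_t n)
{
    mark_roots(roots, n);
    sweep();
}

/* Runs one collection while a hypothetical thread's stack holds either
   the vector's base address or only a pointer to its first slot.
   Returns true if the first slot still has its original value. */
bool first_slot_survives(bool root_is_base_pointer)
{
    /* Reset the toy heap so the function can be called repeatedly. */
    memset(heap, 0, sizeof heap);
    free_list = NULL;

    struct toy_vector *v = &heap[1];
    const uintptr_t original = 0xC0FFEE; /* stands in for the first constant */
    v->contents[0] = original;

    /* The only root: one word of the hypothetical thread's stack. */
    uintptr_t thread_stack[1];
    thread_stack[0] = root_is_base_pointer
        ? (uintptr_t)v
        : (uintptr_t)&v->contents[0];

    collect(thread_stack, 1);
    return v->contents[0] == original;
}
```

In this toy model first_slot_survives(true) returns true and first_slot_survives(false) returns false: with only the interior pointer on the stack, the vector is swept and its first slot is replaced by the free-list link.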

This bug report was last modified 6 years and 257 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
