GNU bug report logs - #55441
Use of 'primitive-fork' in (guix inferior) leads to hangs in 'cuirass evaluate'
Message #40 received at 55441@debbugs.gnu.org:
Hi!
Ludovic Courtès <ludo@gnu.org> writes:
> Fixed in Guix commit a4994d739306abcf3f36706012fb88b35a970e6b with a
> test that reproduces the issue.
>
> Commit d02b7abe24fac84ef1fb1880f51d56fc9fb6cfef updates the ‘guix’
> package so we should be able to reconfigure berlin now and hopefully
> (crossing fingers!) be done with it.
An update: Cuirass is now up-to-date on berlin.guix, built from Guix
commit adf5ae5a412ed13302186dd4ce8e2df783d4515d.
Unfortunately, while evaluations now run to completion, child processes
of ‘cuirass evaluate’ stick around at the end:
--8<---------------cut here---------------start------------->8---
(gdb) bt
#0 futex_wait (private=0, expected=2, futex_word=0x7f5b1d054f08) at ../sysdeps/nptl/futex-internal.h:146
#1 __lll_lock_wait (futex=futex@entry=0x7f5b1d054f08, private=0) at lowlevellock.c:52
#2 0x00007f5b1d873ef3 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f5b1d054f08) at ../nptl/pthread_mutex_lock.c:80
#3 0x00007f5b1d995303 in scm_c_weak_set_remove_x (pred=<optimized out>, closure=0x7f5b13dd8d00, raw_hash=1824276156261873434, set=#<weak-set 7f5b156772f0>) at weak-set.c:794
#4 scm_weak_set_remove_x (obj=#<port #<port-type file 7f5b1567ab40> 7f5b13dd8d00>, set=#<weak-set 7f5b156772f0>) at weak-set.c:817
#5 close_port (explicit=<optimized out>, port=#<port #<port-type file 7f5b1567ab40> 7f5b13dd8d00>) at ports.c:891
#6 close_port (port=#<port #<port-type file 7f5b1567ab40> 7f5b13dd8d00>, explicit=<optimized out>) at ports.c:874
#7 0x00007f5af3a7df82 in ?? ()
#8 0x0000000000dbd860 in ?? ()
#9 0x00007f5af3a7df60 in ?? ()
#10 0x0000000000db82b8 in ?? ()
#11 0x00007f5b1d972ccc in scm_jit_enter_mcode (thread=0x7f5b157bf240, mcode=0xdbd86c "\034\217\003") at jit.c:6038
#12 0x00007f5b1d9c7f3c in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:360
#13 0x00007f5b1d9d55e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=0) at vm.c:1608
#14 0x00007f5b1d939a0e in scm_call_with_unblocked_asyncs (proc=#<program 7f5aebcd7f40>) at async.c:406
#15 0x00007f5b1d9c8336 in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:972
#16 0x00007f5b1d9d55e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=0) at vm.c:1608
#17 0x00007f5b1d9c4be6 in really_launch (d=0x7f5aebccac80) at threads.c:778
#18 0x00007f5b1d93b85a in c_body (d=0x7f5aea691d80) at continuations.c:430
#19 0x00007f5aeeb118c2 in ?? ()
#20 0x00007f5b1553d7e0 in ?? ()
#21 0x00007f5b138a7370 in ?? ()
#22 0x0000000000000048 in ?? ()
#23 0x00007f5b1d972ccc in scm_jit_enter_mcode (thread=0x7f5b157bf240, mcode=0xdbc874 "\034<\003") at jit.c:6038
#24 0x00007f5b1d9c7f3c in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:360
#25 0x00007f5b1d9d55e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=2) at vm.c:1608
#26 0x00007f5b1d93d09a in scm_call_2 (proc=<optimized out>, arg1=<optimized out>, arg2=<optimized out>) at eval.c:503
#27 0x00007f5b1d9f3752 in scm_c_with_exception_handler.constprop.0 (type=#t, handler_data=handler_data@entry=0x7f5aea691d10, thunk_data=thunk_data@entry=0x7f5aea691d10,
thunk=<optimized out>, handler=<optimized out>) at exceptions.c:170
#28 0x00007f5b1d9c588f in scm_c_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>,
pre_unwind_handler=<optimized out>, pre_unwind_handler_data=0x7f5b156b2040) at throw.c:168
#29 0x00007f5b1d93de66 in scm_i_with_continuation_barrier (pre_unwind_handler=0x7f5b1d93db80 <pre_unwind_handler>, pre_unwind_handler_data=0x7f5b156b2040, handler_data=0x7f5aea691d80,
handler=0x7f5b1d9448b0 <c_handler>, body_data=0x7f5aea691d80, body=0x7f5b1d93b850 <c_body>) at continuations.c:368
#30 scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:464
#31 0x00007f5b1d9c4b39 in with_guile (base=0x7f5aea691e08, data=0x7f5aea691e30) at threads.c:645
#32 0x00007f5b1d89b0ba in GC_call_with_stack_base () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#33 0x00007f5b1d9bd16d in scm_i_with_guile (dynamic_state=<optimized out>, data=0x7f5aebccac80, func=0x7f5b1d9c4b70 <really_launch>) at threads.c:688
#34 launch_thread (d=0x7f5aebccac80) at threads.c:787
#35 0x00007f5b1d871d7e in start_thread (arg=0x7f5aea692640) at pthread_create.c:473
#36 0x00007f5b1d46feff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info threads
Id Target Id Frame
* 1 process 53801 "guile" futex_wait (private=0, expected=2, futex_word=0x7f5b1d054f08) at ../sysdeps/nptl/futex-internal.h:146
--8<---------------cut here---------------end--------------->8---
Notice there’s a single thread: it very much looks like the
unpredictable state one gets when forking a multithreaded process (in
this case, the one thread is a finalization thread, except it’s running
in a child process that no longer has the other Guile threads).  Since
fork(2) preserves only the calling thread, any mutex that another
thread happened to hold at fork time stays locked forever in the child;
presumably that is why this thread sits in ‘futex_wait’ on the weak-set
mutex.  The fork+threads problem is manifesting after all.
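To make the hazard concrete, here is a minimal sketch (hypothetical
code, not taken from (guix inferior); the helper thread stands in for
the finalization thread): if some thread holds a mutex at the moment
‘primitive-fork’ is called, the child inherits the locked mutex but not
the thread that would unlock it.

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 threads))

(define lock (make-mutex))

;; Helper thread that repeatedly takes and releases LOCK.
(call-with-new-thread
 (lambda ()
   (let loop ()
     (with-mutex lock (usleep 100))
     (loop))))

(let ((pid (primitive-fork)))
  (if (zero? pid)
      (begin
        ;; Child: only this thread survives the fork.  If the helper
        ;; held LOCK at fork time, ‘lock-mutex’ never returns, just
        ;; like the ‘futex_wait’ in the backtrace above.
        (lock-mutex lock)
        (primitive-exit 0))
      (waitpid pid)))
--8<---------------cut here---------------end--------------->8---

Whether the child actually hangs depends on the thread interleaving at
fork time, which would also explain why the symptom depends on the
machine and its load.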
I’ll try to come up with a solution to that, if nobody beats me to it.
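For illustration, here is a sketch of the classic POSIX mitigation (an
assumption on my part, not necessarily the fix (guix inferior) will
adopt): have the child do nothing between fork and exec, so that it
never touches a lock inherited from the parent’s other threads.

--8<---------------cut here---------------start------------->8---
;; Hypothetical helper: fork, then exec immediately.  The child never
;; runs Scheme code that could block on an inherited lock.
(define (spawn-child program . args)
  (let ((pid (primitive-fork)))
    (if (zero? pid)
        (begin
          ;; Child: exec right away; if exec fails, exit without
          ;; running finalizers or other cleanup.
          (false-if-exception (apply execlp program program args))
          (primitive-exit 127))
        pid)))
--8<---------------cut here---------------end--------------->8---

This mirrors the POSIX rule that a multithreaded process may only call
async-signal-safe functions between fork(2) and exec(2).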
What’s annoying is that it’s not easy to test: the problem doesn’t
manifest on my 4-core laptop, but it does on the 96-core berlin.
To be continued…
Ludo’.