GNU bug report logs - #39266
Heap corruption leads to random crashes

Package: guile;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Fri, 24 Jan 2020 15:15:02 UTC

Severity: important

Merged with 36811, 36812, 39208, 39241, 39988

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Ludovic Courtès <ludo <at> gnu.org>
Subject: bug#39988: closed (Re: bug#39266: Finalization thread hits
 wrong-type-arg on weak vector (AArch64))
Date: Tue, 10 Mar 2020 17:27:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#39266: [3.0.1] Segfault in GC

which was filed against the guile package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 39988 <at> debbugs.gnu.org.

-- 
39266: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=39266
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: Pierre Langlois <pierre.langlois <at> gmx.com>
Cc: 39266-done <at> debbugs.gnu.org
Subject: Re: bug#39266: Finalization thread hits wrong-type-arg on weak vector
 (AArch64)
Date: Tue, 10 Mar 2020 18:25:55 +0100
Hi Pierre,

Pierre Langlois <pierre.langlois <at> gmx.com> skribis:

> I've tested it on AArch64 and it's looking good, I'm running Guile 3
> finally! I've tested by running 'guix pull --branch=wip-guile-3.0.1' on
> a rockpro64 running the Guix system, I've then reconfigured and rebooted
> and it's all good.

Thanks for testing!

> Thanks so much for the fix! Hopefully it'll work on every platform and
> that can be the end of it :-).

Yup, I’ve tested ‘guix pull --branch=wip-guile-3.0.1’ and ‘guix build
guile3.0-guix’ on all 4 architectures that Guix supports, and everything
is fine.

I’ve now pushed the upgrade to 3.0.1 + patch to Guix.

Closing!  \o/

The bug appears to be rare for Guile workloads not as intensive as a
Guix build (never reported, never seen), but we should still probably do
a bug-fix 3.0.2 release in the coming weeks, I guess.

Ludo’.

[Message part 3 (message/rfc822, inline)]
From: Ludovic Courtès <ludo <at> gnu.org>
To: bug-guile <at> gnu.org
Subject: [3.0.1] Segfault in GC
Date: Sun, 08 Mar 2020 22:50:21 +0100
Hello,

Building ‘guile3.0-guix’ on x86_64-linux from Guix commit
1a30351bf37930222f077cdbcbb6659372f1ea2d leads to a GC segfault.  This
can be reproduced with:

  guix pull --commit=1a30351bf37930222f077cdbcbb6659372f1ea2d
  guix build -K guile3.0-guix

‘guix pull’ also segfaults similarly on x86_64-linux.

The build log for ‘guile3.0-guix’ goes like this:

--8<---------------cut here---------------start------------->8---
[ 43%] LOAD     gnu/services.scm
[ 43%] LOAD     gnu/services/admin.scm
[ 43%] LOAD     gnu/services/audio.scm
[ 44%] LOAD     gnu/services/auditd.scm
[ 44%] LOAD     gnu/services/avahi.scm
[ 44%] LOAD     gnu/services/base.scm
[ 44%] LOAD     gnu/services/certbot.scm
/gnu/store/29jhbbg1hf557x8j53f9sxd9imlmf02a-bash-minimal-5.0.7/bin/bash: line 7: 26114 Segmentation fault      XDG_CACHE_HOME=/nowhere host=x86_64-unknown-linux-gnu srcdir="." ./pre-inst-env /gnum
make[2]: *** [Makefile:5785: make-go] Error 139
make[2]: Leaving directory '/tmp/guix-build-guile3.0-guix-1.0.1-14.c2f9ea2.drv-0/source'
--8<---------------cut here---------------end--------------->8---

The backtrace:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  GC_clear_fl_marks (q=<optimized out>) at alloc.c:880
#1  0x00007f92ec27d331 in GC_finish_collection () at alloc.c:987
#2  0x00007f92ec27d705 in GC_try_to_collect_inner (
    stop_func=0x7f92ec27c8e0 <GC_never_stop_func>) at alloc.c:502
#3  0x00007f92ec27e314 in GC_collect_or_expand (needed_blocks=needed_blocks@entry=1, 
    ignore_off_page=ignore_off_page@entry=0, retry=retry@entry=0) at alloc.c:1353
#4  0x00007f92ec27e50f in GC_allocobj (gran=gran@entry=1, kind=1) at alloc.c:1445
#5  0x00007f92ec28413f in GC_generic_malloc_inner (lb=lb@entry=16, k=k@entry=1)
    at malloc.c:143
#6  0x00007f92ec2854f6 in GC_generic_malloc_many (lb=lb@entry=16, k=k@entry=1, 
    result=result@entry=0x7f92ec2b0458 <first_thread+312>) at mallocx.c:445
#7  0x00007f92ec290623 in GC_malloc_kind (bytes=16, knd=1) at thread_local_alloc.c:184
#8  0x00007f92ec28463a in GC_malloc (lb=<optimized out>) at malloc.c:294
#9  0x00007f92ec353b73 in scm_cell (cdr=772, car=140268964272160)
    at ../libguile/gc.h:161
#10 scm_cons (y=0x304, x=0x7f92e9c9d820) at pairs.h:155
#11 scm_append (args=<optimized out>) at list.c:255
#12 0x00007f92e9a2c7a0 in ?? ()
#13 0x00007f92eb99fd80 in ?? ()
#14 0x00007f92ec418880 in ?? ()
   from /gnu/store/gjr8c5qibb1v8clbafsr3a1xn9h4wb9y-guile-next-3.0.1/lib/libguile-3.0.so.1
#15 0x00007f92eb99fd80 in ?? ()
#16 0x00007f92ec352f0b in scm_jit_enter_mcode (thread=0x7f92eb99fd80, 
    mcode=0x7f92e8b66e70 "H\203\350\060I\211\314I)\304I\203\374@\017\205\263\006")
    at jit.c:5777
#17 0x00007f92ec3ae4b9 in vm_regular_engine (thread=0x7f92caf2e5e0) at vm-engine.c:360
#18 0x00007f92ec3af155 in scm_call_n (proc=<optimized out>, 
    argv=argv@entry=0x7ffec93ceb48, nargs=nargs@entry=1) at vm.c:1600
#19 0x00007f92ec32d207 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
#20 0x00007f92ec354bcb in scm_primitive_load (filename=filename@entry=0x7f92cfa84f40)
    at load.c:131
#21 0x00007f92ec356078 in scm_primitive_load_path (args=<optimized out>) at load.c:1267
#22 0x00007f92e96ceef0 in ?? ()
#23 0x00007f92eb99fd80 in ?? ()
#24 0x00007f92ec418880 in ?? ()
   from /gnu/store/gjr8c5qibb1v8clbafsr3a1xn9h4wb9y-guile-next-3.0.1/lib/libguile-3.0.so.1
#25 0x00007f92eb99fd80 in ?? ()
#26 0x00007f92ec352f0b in scm_jit_enter_mcode (thread=0x7f92eb99fd80, 
    mcode=0x7f92e8b66e70 "H\203\350\060I\211\314I)\304I\203\374@\017\205\263\006")
    at jit.c:5777
#27 0x00007f92ec3ae4b9 in vm_regular_engine (thread=0x7f92cf24c850) at vm-engine.c:360
#28 0x00007f92ec3af155 in scm_call_n (proc=<optimized out>, 
    argv=argv@entry=0x7ffec93ceed8, nargs=nargs@entry=1) at vm.c:1600
#29 0x00007f92ec32d207 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
#30 0x00007f92ec354bcb in scm_primitive_load (filename=filename@entry=0x7f92cfa57440)
    at load.c:131
#31 0x00007f92ec356078 in scm_primitive_load_path (args=<optimized out>) at load.c:1267
#32 0x00007f92e96ceef0 in ?? ()
#33 0x00007f92eb99fd80 in ?? ()
#34 0x00007f92ec418880 in ?? ()
   from /gnu/store/gjr8c5qibb1v8clbafsr3a1xn9h4wb9y-guile-next-3.0.1/lib/libguile-3.0.so.1
#35 0x00007f92eb99fd80 in ?? ()
#36 0x00007f92ec352f0b in scm_jit_enter_mcode (thread=0x7f92eb99fd80, 
    mcode=0x7f92e39387f0 "I\211\314I)\304I\203\374\020\017\217", <incomplete sequence \344>) at jit.c:5777
#37 0x00007f92ec3ae7a8 in vm_regular_engine (thread=0x7f92cf219410) at vm-engine.c:374
#38 0x00007f92ec3af155 in scm_call_n (proc=<optimized out>, 
    argv=argv@entry=0x7ffec93cf268, nargs=nargs@entry=1) at vm.c:1600
#39 0x00007f92ec32d207 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
#40 0x00007f92ec354bcb in scm_primitive_load (filename=<optimized out>) at load.c:131
#41 0x00007f92ec3add1c in vm_regular_engine (thread=0x7f92eb99fd80) at vm-engine.c:972
#42 0x00007f92ec3af155 in scm_call_n (proc=<optimized out>, 
    argv=argv@entry=0x7ffec93cf438, nargs=nargs@entry=1) at vm.c:1600
#43 0x00007f92ec32d207 in scm_primitive_eval (exp=<optimized out>, 
    exp@entry=0x7f92e98c4fe0) at eval.c:671
#44 0x00007f92ec32d263 in scm_eval (exp=0x7f92e98c4fe0, 
    module_or_state=module_or_state@entry=0x7f92e98a7f00) at eval.c:705
#45 0x00007f92ec385080 in scm_shell (argc=834, argv=0x7ffec93cfa98) at script.c:357
#46 0x00007f92ec344c0d in invoke_main_func (body_data=0x7ffec93cf940) at init.c:308
#47 0x00007f92ec327e5a in c_body (d=0x7ffec93cf880) at continuations.c:430
#48 0x00007f92ec3add1c in vm_regular_engine (thread=0x7f92eb99fd80) at vm-engine.c:972
#49 0x00007f92ec3af155 in scm_call_n (proc=<optimized out>, 
    argv=argv@entry=0x7ffec93cf640, nargs=nargs@entry=2) at vm.c:1600
#50 0x00007f92ec32c09a in scm_call_2 (proc=<optimized out>, arg1=<optimized out>, 
    arg2=<optimized out>) at eval.c:503
#51 0x00007f92ec32d89a in scm_c_with_exception_handler (type=type@entry=0x404, 
    handler=handler@entry=0x7f92ec3a4580 <catch_post_unwind_handler>, 
    handler_data=handler_data@entry=0x7ffec93cf7b0, 
    thunk=thunk@entry=0x7f92ec3a46c0 <catch_body>, 
    thunk_data=thunk_data@entry=0x7ffec93cf7b0) at exceptions.c:170
#52 0x00007f92ec3a48bd in scm_c_catch (tag=tag@entry=0x404, 
    body=body@entry=0x7f92ec327e50 <c_body>, body_data=body_data@entry=0x7ffec93cf880, 
    handler=handler@entry=0x7f92ec3280f0 <c_handler>, 
    handler_data=handler_data@entry=0x7ffec93cf880, 
    pre_unwind_handler=pre_unwind_handler@entry=0x7f92ec327f50 <pre_unwind_handler>, 
    pre_unwind_handler_data=0x7f92e9c763c0) at throw.c:168
#53 0x00007f92ec328403 in scm_i_with_continuation_barrier (
    body=body@entry=0x7f92ec327e50 <c_body>, body_data=body_data@entry=0x7ffec93cf880, 
    handler=handler@entry=0x7f92ec3280f0 <c_handler>, 
    handler_data=handler_data@entry=0x7ffec93cf880, 
    pre_unwind_handler=pre_unwind_handler@entry=0x7f92ec327f50 <pre_unwind_handler>, 
    pre_unwind_handler_data=0x7f92e9c763c0) at continuations.c:368
#54 0x00007f92ec328495 in scm_c_with_continuation_barrier (func=<optimized out>, 
    data=<optimized out>) at continuations.c:464
#55 0x00007f92ec3a335f in with_guile (base=base@entry=0x7ffec93cf8e8, 
    data=data@entry=0x7ffec93cf910) at threads.c:645
#56 0x00007f92ec289a68 in GC_call_with_stack_base (
    fn=fn@entry=0x7f92ec3a3310 <with_guile>, arg=arg@entry=0x7ffec93cf910)
    at misc.c:1941
#57 0x00007f92ec3a3678 in scm_i_with_guile (dynamic_state=<optimized out>, 
    data=data@entry=0x7ffec93cf910, func=func@entry=0x7f92ec344bf0 <invoke_main_func>)
    at threads.c:688
#58 scm_with_guile (func=func@entry=0x7f92ec344bf0 <invoke_main_func>, 
    data=data@entry=0x7ffec93cf940) at threads.c:694
#59 0x00007f92ec344d82 in scm_boot_guile (argc=argc@entry=834, 
    argv=argv@entry=0x7ffec93cfa98, main_func=main_func@entry=0x401240 <inner_main>, 
    closure=closure@entry=0x0) at init.c:291
#60 0x0000000000401100 in main (argc=834, argv=0x7ffec93cfa98) at guile.c:95
(gdb) info threads
  Id   Target Id                        Frame 
* 1    Thread 0x7f92ebcc3b80 (LWP 6259) GC_clear_fl_marks (q=<optimized out>) at alloc.c:880
  2    Thread 0x7f92ea67b700 (LWP 6266) 0x00007f92ec257efc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  3    Thread 0x7f92e79fd700 (LWP 6276) 0x00007f92ec25b344 in read () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  4    Thread 0x7f92e9623700 (LWP 6271) 0x00007f92ec25b344 in read () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  5    Thread 0x7f92eb00c700 (LWP 6265) 0x00007f92ec257efc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  6    Thread 0x7f92eb99d700 (LWP 6264) 0x00007f92ec257efc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
--8<---------------cut here---------------end--------------->8---

Setting GUILE_JIT_THRESHOLD=-1, the thing goes further without
segfaulting, but then it hangs with:

--8<---------------cut here---------------start------------->8---
[ 50%] LOAD     guix/store/ssh.scm
[ 50%] LOAD     guix/scripts/offload.scm
Backtrace:
[ 50%] LOAD     guix/store/database.scm
[ 50%] LOAD     guix/store/deduplication.scm
[ 50%] LOAD     guix/store/roots.scm
[ 50%] LOAD     guix/config.scm
[ 50%] LOAD     guix/tests.scm
[ 50%] LOAD     guix/tests/http.scm
--8<---------------cut here---------------end--------------->8---

Apparently a deadlock on ‘all_weak_tables_lock’:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  0x00007f9a51bc00bc in __lll_lock_wait () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
#1  0x00007f9a51bb9674 in pthread_mutex_lock () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
#2  0x00007f9a51d1624f in scm_c_make_weak_table (k=<optimized out>, kind=SCM_WEAK_TABLE_KIND_KEY) at weak-table.c:505
#3  0x00007f9a51d12d1c in vm_regular_engine (thread=0x7f9a51304d80) at vm-engine.c:972
#4  0x00007f9a51d14155 in scm_call_n (proc=<optimized out>, argv=argv@entry=0x7ffc439d27d8, nargs=nargs@entry=1) at vm.c:1600
#5  0x00007f9a51c92207 in scm_primitive_eval (exp=<optimized out>) at eval.c:671
#6  0x00007f9a51cb9bcb in scm_primitive_load (filename=<optimized out>) at load.c:131
#7  0x00007f9a51d12d1c in vm_regular_engine (thread=0x7f9a51304d80) at vm-engine.c:972
#8  0x00007f9a51d14155 in scm_call_n (proc=<optimized out>, argv=argv@entry=0x7ffc439d29a8, nargs=nargs@entry=1) at vm.c:1600
#9  0x00007f9a51c92207 in scm_primitive_eval (exp=<optimized out>, exp@entry=0x7f9a4f269fe0) at eval.c:671
#10 0x00007f9a51c92263 in scm_eval (exp=0x7f9a4f269fe0, module_or_state=module_or_state@entry=0x7f9a4f24cf00) at eval.c:705
#11 0x00007f9a51cea080 in scm_shell (argc=834, argv=0x7ffc439d3008) at script.c:357
#12 0x00007f9a51ca9c0d in invoke_main_func (body_data=0x7ffc439d2eb0) at init.c:308
#13 0x00007f9a51c8ce5a in c_body (d=0x7ffc439d2df0) at continuations.c:430
#14 0x00007f9a51d12d1c in vm_regular_engine (thread=0x7f9a51304d80) at vm-engine.c:972
#15 0x00007f9a51d14155 in scm_call_n (proc=<optimized out>, argv=argv@entry=0x7ffc439d2bb0, nargs=nargs@entry=2) at vm.c:1600
#16 0x00007f9a51c9109a in scm_call_2 (proc=<optimized out>, arg1=<optimized out>, arg2=<optimized out>) at eval.c:503
#17 0x00007f9a51c9289a in scm_c_with_exception_handler (type=type@entry=0x404, handler=handler@entry=0x7f9a51d09580 <catch_post_unwind_handler>, handler_data=handler_data@entry=0x7ffc439d2d20, 
    thunk=thunk@entry=0x7f9a51d096c0 <catch_body>, thunk_data=thunk_data@entry=0x7ffc439d2d20) at exceptions.c:170
#18 0x00007f9a51d098bd in scm_c_catch (tag=tag@entry=0x404, body=body@entry=0x7f9a51c8ce50 <c_body>, body_data=body_data@entry=0x7ffc439d2df0, handler=handler@entry=0x7f9a51c8d0f0 <c_handler>, 
    handler_data=handler_data@entry=0x7ffc439d2df0, pre_unwind_handler=pre_unwind_handler@entry=0x7f9a51c8cf50 <pre_unwind_handler>, pre_unwind_handler_data=0x7f9a4f5db3c0) at throw.c:168
#19 0x00007f9a51c8d403 in scm_i_with_continuation_barrier (body=body@entry=0x7f9a51c8ce50 <c_body>, body_data=body_data@entry=0x7ffc439d2df0, handler=handler@entry=0x7f9a51c8d0f0 <c_handler>, 
    handler_data=handler_data@entry=0x7ffc439d2df0, pre_unwind_handler=pre_unwind_handler@entry=0x7f9a51c8cf50 <pre_unwind_handler>, pre_unwind_handler_data=0x7f9a4f5db3c0) at continuations.c:368
#20 0x00007f9a51c8d495 in scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:464
#21 0x00007f9a51d0835f in with_guile (base=base@entry=0x7ffc439d2e58, data=data@entry=0x7ffc439d2e80) at threads.c:645
#22 0x00007f9a51beea68 in GC_call_with_stack_base (fn=fn@entry=0x7f9a51d08310 <with_guile>, arg=arg@entry=0x7ffc439d2e80) at misc.c:1941
#23 0x00007f9a51d08678 in scm_i_with_guile (dynamic_state=<optimized out>, data=data@entry=0x7ffc439d2e80, func=func@entry=0x7f9a51ca9bf0 <invoke_main_func>) at threads.c:688
#24 scm_with_guile (func=func@entry=0x7f9a51ca9bf0 <invoke_main_func>, data=data@entry=0x7ffc439d2eb0) at threads.c:694
#25 0x00007f9a51ca9d82 in scm_boot_guile (argc=argc@entry=834, argv=argv@entry=0x7ffc439d3008, main_func=main_func@entry=0x401240 <inner_main>, closure=closure@entry=0x0) at init.c:291
#26 0x0000000000401100 in main (argc=834, argv=0x7ffc439d3008) at guile.c:95
(gdb) info threads
  Id   Target Id                                Frame 
* 1    Thread 0x7f9a51628b80 (LWP 7003) "guile" 0x00007f9a51bc00bc in __lll_lock_wait () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  2    Thread 0x7f9a51302700 (LWP 7006) "guile" 0x00007f9a51bbcefc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  3    Thread 0x7f9a50971700 (LWP 7007) "guile" 0x00007f9a51bbcefc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  4    Thread 0x7f9a4ffe0700 (LWP 7008) "guile" 0x00007f9a51bbcefc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  5    Thread 0x7f9a4f008700 (LWP 7009) "guile" 0x00007f9a51bbcefc in pthread_cond_wait@@GLIBC_2.3.2 () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
  6    Thread 0x7f9a4d522700 (LWP 7010) "guile" 0x00007f9a51bc0344 in read () from /gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libpthread.so.0
--8<---------------cut here---------------end--------------->8---
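
For reference, the GUILE_JIT_THRESHOLD=-1 run mentioned above could be set up roughly as follows.  This is only a sketch: the report does not say how the variable was passed to the build, the helper name run_without_jit is hypothetical, and the commented-out usage assumes the build tree kept by ‘guix build -K’ together with the ‘make-go’ target shown in the build log.

```shell
# Run any command with Guile's JIT compiler disabled.
# GUILE_JIT_THRESHOLD=-1 is the setting quoted in the report;
# the helper name run_without_jit is made up for this sketch.
run_without_jit() {
    env GUILE_JIT_THRESHOLD=-1 "$@"
}

# Possible usage (not executed here, paths taken from the build log):
#   cd /tmp/guix-build-guile3.0-guix-1.0.1-14.c2f9ea2.drv-0
#   . ./environment-variables      # file left behind by 'guix build -K'
#   cd source && run_without_jit make make-go
```

Going through ‘env’ keeps the setting scoped to the one command, so later commands in the same shell still run with the JIT enabled.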

To be continued…

Ludo’.



This bug report was last modified 5 years and 73 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.