GNU bug report logs - #39266
Heap corruption leads to random crashes


Package: guile

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Fri, 24 Jan 2020 15:15:02 UTC

Severity: important

Merged with 36811, 36812, 39208, 39241, 39988

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.



Message #22 received at 39266 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: 39266 <at> debbugs.gnu.org
Subject: Re: bug#39266: Finalization thread hits wrong-type-arg on weak vector
 (AArch64)
Date: Mon, 09 Mar 2020 15:38:45 +0100

Ludovic Courtès <ludo <at> gnu.org> writes:

> While building the “guix-system.drv” derivation on AArch64, I got this
> crash (not fully deterministic but quite frequent).  Here the
> finalization thread gets a wrong-type-arg in ‘scm_i_weak_car’ (i.e.,
> accessing a one-element weak vector):

With 3.0.1, I can reproduce the bug on x86_64.  With rr (thanks, Andy!),
I found this (starting from the point where the type cell of the weak
vector is zeroed, and reverse-continuing until it gets its original
value of 0x10f):

--8<---------------cut here---------------start------------->8---
(rr) frame 40
#40 0x00007ffff7f2e66d in scm_i_weak_car (pair=0x7fffe15af690) at ../libguile/pairs.h:190
190	  return SCM_CAR (x);
(rr) down
#39 0x00007ffff7f2f576 in scm_c_weak_vector_ref (wv=<optimized out>, k=k@entry=0) at weak-vector.c:193
193	  SCM_VALIDATE_WEAK_VECTOR (1, wv);
(rr) 
#38 0x00007ffff7ea7ba0 in scm_wrong_type_arg_msg (
    subr=subr@entry=0x7ffff7f56f00 <s_scm_weak_vector_ref> "weak-vector-ref", pos=pos@entry=1, 
    bad_value=0x7fffec472b90, szMessage=szMessage@entry=0x7ffff7f56e80 "weak vector") at error.c:282
282	      scm_error (scm_arg_type_key,
(rr) p *((void**)0x7fffec472b90)
$1 = (void *) 0x0
(rr) watch *((void**)0x7fffec472b90)
Hardware watchpoint 1: *((void**)0x7fffec472b90)
(rr) reverse-cont
Continuing.

Thread 1 received signal SIGCONT, Continued.
[Switching to Thread 27074.27074]
__lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:101
101	../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
(rr) 
Continuing.

Thread 1 hit Hardware watchpoint 1: *((void**)0x7fffec472b90)

Old value = (void *) 0x0
New value = (void *) 0x10f
__memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
259	../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(rr) bt
#0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:259
#1  0x00007ffff7f1d499 in set_vtable_access_fields (vtable=vtable@entry=0x7fffeb48ee80) at struct.c:143
#2  0x00007ffff7f1dd8d in scm_i_struct_inherit_vtable_magic (vtable=vtable@entry=0x7ffff4e32fa0, 
    obj=obj@entry=0x7fffeb48ee80) at struct.c:215
#3  0x00007ffff7f1dfea in scm_c_make_structv (vtable=0x7ffff4e32fa0, n_tail=<optimized out>, n_init=8, 
    init=0x7fffffff50d0) at struct.c:364
#4  0x00007ffff7f1e0b9 in scm_make_struct_no_tail (vtable=0x7ffff4e32fa0, init=0x304) at struct.c:491
--8<---------------cut here---------------end--------------->8---

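Bingo!  There’s a size mismatch in struct.c: the buffer is allocated
with ‘bitmask_size’ bytes, but ‘memset’ clears
‘bitmask_size * sizeof(*unboxed_fields)’ bytes, so it writes past the
end of the allocation: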

--8<---------------cut here---------------start------------->8---
  bitmask_size = (nfields + 31U) / 32U;
  unboxed_fields = scm_gc_malloc_pointerless (bitmask_size, "unboxed fields");
  memset (unboxed_fields, 0, bitmask_size * sizeof(*unboxed_fields));
--8<---------------cut here---------------end--------------->8---

Pushed a fix as 7c17655cd3d859bf0c5a86d9782a7788205fc05a.
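
For illustration, here is a minimal standalone sketch of that size
mismatch and of one way to make the two sizes agree.  It is not the
content of the commit above: the helper names are hypothetical, plain
‘malloc’ stands in for ‘scm_gc_malloc_pointerless’, and the bitmask is
assumed to be an array of 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of 32-bit words needed for a bitmask with one bit per field
   (same arithmetic as the struct.c snippet).  */
static size_t
bitmask_words (size_t nfields)
{
  return (nfields + 31U) / 32U;
}

/* Bytes requested by the mismatched allocation: the word count is
   passed where a byte count is expected.  */
static size_t
buggy_alloc_bytes (size_t nfields)
{
  return bitmask_words (nfields);
}

/* Bytes cleared by the memset; this is also the size the allocation
   actually needs.  */
static size_t
cleared_bytes (size_t nfields)
{
  return bitmask_words (nfields) * sizeof (uint32_t);
}

/* Allocation and memset using the same byte count.  malloc is a
   stand-in (assumption) for the GC allocator used in Guile.  */
static uint32_t *
make_unboxed_fields (size_t nfields)
{
  size_t size = cleared_bytes (nfields);
  uint32_t *unboxed_fields = malloc (size);

  if (unboxed_fields != NULL)
    memset (unboxed_fields, 0, size);
  return unboxed_fields;
}
```

With 8 fields, for instance, the mismatched call asks for 1 byte while
‘memset’ clears 4, so up to 3 bytes beyond the request get zeroed.
Whether that lands on a neighboring object depends on the allocator’s
granularity, but it is consistent with a type word being overwritten
with zero.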

Thanks, rr!  You made my day!  :-)

Now testing Guix builds on x86_64, i686, ARMv7, and AArch64 to see if
that addresses seemingly related issues.

Ludo’.




This bug report was last modified 5 years and 74 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.