GNU bug report logs - #25387
guile-2.2 multi-thread segfault in SCM_VALIDATE_WEAK_TABLE

Package: guile;

Reported by: linasvepstas <at> gmail.com

Date: Sun, 8 Jan 2017 00:19:01 UTC

Severity: normal

Done: Andy Wingo <wingo <at> pobox.com>

Bug is archived. No further changes may be made.



Report forwarded to bug-guile <at> gnu.org:
bug#25387; Package guile. (Sun, 08 Jan 2017 00:19:01 GMT)

Acknowledgement sent to linasvepstas <at> gmail.com:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sun, 08 Jan 2017 00:19:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: bug-guile <at> gnu.org
Subject: guile-2.2 multi-thread segfault in SCM_VALIDATE_WEAK_TABLE
Date: Sat, 7 Jan 2017 18:18:20 -0600

The following program crashes immediately (within a fraction of a second)
in guile-2.2, current git version (as of 29 Dec 2016
a0656ad4cf976b3845e9b9663a90b46b4cf9fc5a )

It runs fine in guile-2.0. It's doing something slightly squonky:
referencing the variable 'cnt' in a thread.  Note that the use of
the variable appears before its definition.

It's deterministic: it always crashes in the same place.

(define junk 0)
(define halt #f)

(define (wtf-thr)
   (define start (- (current-time) 0.1))

   ; Create thread that does junk and exits.  Yes, the increment
   ; of `junk` is not protected, and it's racy, but so what.
   (define (mkthr v) (call-with-new-thread (lambda ()
      (if (eq? 0 (modulo cnt 30)) (gc))   ;;;; <<<<  crashes here!!!
      (set! junk (+ junk 1)))))

   ; thread arguments
   (define thrarg (make-list 10 0))

   (define cnt 0)
   (define (mke)
      ; Create a limited number of threads
      (define thr-list (map mkthr thrarg))
      ; (display (length (all-threads)))
      (map join-thread thr-list)

      ; Some handy debug printing.
      (set! cnt (+ cnt 1))
      (if (eq? 0 (modulo cnt 500))
         (begin
            (display "rate=")
            (display (/ cnt (- (current-time) start))) (newline)
            (display "cnt=") (display cnt) (newline)
            (display (gc-stats)) (newline) (newline)
         )))

   ; tail recursive infinite loop.
   (define (aloop) (mke) (if (not halt) (aloop)))

   ; while forever.
   (aloop)
)

; Run elsewhere, so that we have a shell prompt
; (not required for the bug)
(call-with-new-thread wtf-thr)

; halt if desired.
; (set! halt #t)


Thread 621 "guile" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffedbe1700 (LWP 10504)]
0x00007ffff7b78af1 in scm_c_weak_table_ref (table=0x0,
    raw_hash=2738445758486295669, pred=0x7ffff7b77bb0 <assq_predicate>,
    closure=0x5555558fff00, dflt=0x904) at ../../libguile/weak-table.c:862
warning: Source file is more recent than executable.
862  SCM_VALIDATE_WEAK_TABLE (1, table);
(gdb) bt
#0  0x00007ffff7b78af1 in scm_c_weak_table_ref (table=0x0,
    raw_hash=2738445758486295669, pred=0x7ffff7b77bb0 <assq_predicate>,
    closure=0x5555558fff00, dflt=0x904) at ../../libguile/weak-table.c:862
#1  0x00007ffff7b02fa4 in fluid_ref (dynamic_state=0x555555f8ce60,
    fluid=0x5555558fff00) at ../../libguile/fluids.c:287
#2  0x00007ffff7b0325f in scm_fluid_ref (fluid=0x5555558fff00)
    at ../../libguile/fluids.c:308
#3  0x00007ffff7b34424 in scm_i_default_port_conversion_strategy ()
    at ../../libguile/ports.c:1015
#4  0x00007ffff7b5e4df in scm_i_default_string_failed_conversion_handler ()
    at ../../libguile/strings.c:1619
#5  scm_from_locale_stringn (
    str=0x7ffff7b88d50 "Wrong type argument in position ~A: ~S",
    len=len@entry=18446744073709551615) at ../../libguile/strings.c:1626
#6  0x00007ffff7b5e51c in scm_from_locale_string (str=<optimized out>)
    at ../../libguile/strings.c:1613
#7  0x00007ffff7af76c6 in scm_error (key=0x5555558fa960,
    subr=subr@entry=0x7ffff7b8a080 <s_scm_set_current_dynamic_state> "set-current-dynamic-state",
    message=<optimized out>, args=0x555555c6ce30,
    rest=rest@entry=0x555555c6ce50) at ../../libguile/error.c:59
#8  0x00007ffff7af7968 in scm_wrong_type_arg (
    subr=subr@entry=0x7ffff7b8a080 <s_scm_set_current_dynamic_state> "set-current-dynamic-state",
    pos=pos@entry=1, bad_value=bad_value@entry=0x555555c6c3b0)
    at ../../libguile/error.c:251
#9  0x00007ffff7b03096 in scm_set_current_dynamic_state (
    state=state@entry=0x555555c6c3b0) at ../../libguile/fluids.c:496
#10 0x00007ffff7b6351a in guilify_self_2 (
    dynamic_state=dynamic_state@entry=0x555555c6c3b0)
    at ../../libguile/threads.c:466
#11 0x00007ffff7b63e0c in scm_i_init_thread_for_guile (base=0x7fffedbe0ec0,
    dynamic_state=0x555555c6c3b0) at ../../libguile/threads.c:595
#12 0x00007ffff7b63e59 in with_guile (base=base@entry=0x7fffedbe0ec0,
    data=data@entry=0x7fffedbe0ef0) at ../../libguile/threads.c:638
#13 0x00007ffff6c71812 in GC_call_with_stack_base (
    fn=fn@entry=0x7ffff7b63e40 <with_guile>, arg=arg@entry=0x7fffedbe0ef0)
    at misc.c:1925
#14 0x00007ffff7b635cc in scm_i_with_guile (dynamic_state=<optimized out>,
    data=0x555555c6c410, func=0x7ffff7b635e0 <really_launch>)
    at ../../libguile/threads.c:688
#15 launch_thread (d=0x555555c6c410) at ../../libguile/threads.c:750
#16 0x00007ffff735f464 in start_thread (arg=0x7fffedbe1700)
    at pthread_create.c:333
#17 0x00007ffff70a29df in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105




Information forwarded to bug-guile <at> gnu.org:
bug#25387; Package guile. (Sun, 08 Jan 2017 00:23:01 GMT)

Message #8 received at 25387 <at> debbugs.gnu.org:

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: 25387 <at> debbugs.gnu.org
Subject: also crashes in guile-2.0
Date: Sat, 7 Jan 2017 18:21:53 -0600

It also crashes in guile-2.0, but that takes much longer: 5 minutes.

--linas




Information forwarded to bug-guile <at> gnu.org:
bug#25387; Package guile. (Mon, 09 Jan 2017 22:09:02 GMT)

Message #11 received at 25387 <at> debbugs.gnu.org:

From: Andy Wingo <wingo <at> pobox.com>
To: Linas Vepstas <linasvepstas <at> gmail.com>
Cc: 25387 <at> debbugs.gnu.org
Subject: Re: bug#25387: guile-2.2 multi-thread segfault in
 SCM_VALIDATE_WEAK_TABLE
Date: Mon, 09 Jan 2017 23:08:18 +0100

On Sun 08 Jan 2017 01:18, Linas Vepstas <linasvepstas <at> gmail.com> writes:

> Following program crashes immediately (fraction of a second)
> in guile-2.2, current git version (as of 29 Dec 2016
> a0656ad4cf976b3845e9b9663a90b46b4cf9fc5a )

Nice bug, thank you!  I will have a look.

Andy




Information forwarded to bug-guile <at> gnu.org:
bug#25387; Package guile. (Tue, 10 Jan 2017 17:39:01 GMT)

Message #14 received at 25387 <at> debbugs.gnu.org:

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: 25387 <at> debbugs.gnu.org
Subject: better but still an issue.
Date: Tue, 10 Jan 2017 11:37:58 -0600

Retested with today's version of git. It still crashes, but not instantly;
it now takes 20 seconds to 5 minutes to reproduce.

guile (GNU Guile) 2.1.5.19-7e9395




Reply sent to Andy Wingo <wingo <at> pobox.com>:
You have taken responsibility. (Wed, 11 Jan 2017 21:25:02 GMT)

Notification sent to linasvepstas <at> gmail.com:
bug acknowledged by developer. (Wed, 11 Jan 2017 21:25:02 GMT)

Message #19 received at 25387-done <at> debbugs.gnu.org:

From: Andy Wingo <wingo <at> pobox.com>
To: Linas Vepstas <linasvepstas <at> gmail.com>
Cc: 25387-done <at> debbugs.gnu.org
Subject: Re: bug#25387: guile-2.2 multi-thread segfault in
 SCM_VALIDATE_WEAK_TABLE
Date: Wed, 11 Jan 2017 22:24:03 +0100

On Mon 09 Jan 2017 23:08, Andy Wingo <wingo <at> pobox.com> writes:

> On Sun 08 Jan 2017 01:18, Linas Vepstas <linasvepstas <at> gmail.com> writes:
>
>> Following program crashes immediately (fraction of a second)
>> in guile-2.2, current git version (as of 29 Dec 2016
>> a0656ad4cf976b3845e9b9663a90b46b4cf9fc5a )
>
> Nice bug, thank you!  I will have a look.

Fixed in master, I think.  Have a look!

  commit 63bf6ffa0d3cdddf8151cc80ac18fe5dfb614587
  Author: Andy Wingo <wingo <at> pobox.com>
  Date:   Wed Jan 11 22:17:24 2017 +0100

      Protect call-with-new-thread data from GC.
      
      * libguile/threads.c (struct launch_data): Add prev/next pointers.
        (protected_launch_data, protected_launch_data_lock): New static vars.
        (protect_launch_data, unprotect_launch_data): New functions.
        (really_launch, scm_sys_call_with_new_thread): Preserve launch data
        from GC.  Thanks to Linas Vepstas for the report!

Cheers,

Andy
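
Editorial sketch (not part of the original message, and not the code from the commit): the commit log above names a launch_data struct with prev/next pointers, a static list head, a lock, and protect/unprotect functions. A minimal illustration of that pattern might look like the following. It rests on the assumption that the collector scans static variables as roots, as a conservative collector such as the Boehm GC typically does, so a record linked from a static list stays reachable until it is unlinked. The field names dynamic_state and thunk, their void * types, and all function bodies are hypothetical; only the identifiers listed in the commit log come from the source.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread launch record.  In libguile
   the payload would hold SCM values; opaque pointers are used here. */
struct launch_data {
  struct launch_data *prev;
  struct launch_data *next;
  void *dynamic_state;  /* assumed field: state handed to the new thread */
  void *thunk;          /* assumed field: procedure the new thread runs */
};

/* Static list head and lock.  Records linked from here remain reachable
   from a static variable (assumed to be a GC root). */
static struct launch_data *protected_launch_data = NULL;
static pthread_mutex_t protected_launch_data_lock = PTHREAD_MUTEX_INITIALIZER;

/* Push DATA onto the front of the protected list. */
static void protect_launch_data(struct launch_data *data)
{
  assert(data != NULL);
  pthread_mutex_lock(&protected_launch_data_lock);
  data->prev = NULL;
  data->next = protected_launch_data;
  if (protected_launch_data)
    protected_launch_data->prev = data;
  protected_launch_data = data;
  pthread_mutex_unlock(&protected_launch_data_lock);
}

/* Unlink DATA from the protected list; it must currently be on it. */
static void unprotect_launch_data(struct launch_data *data)
{
  assert(data != NULL);
  pthread_mutex_lock(&protected_launch_data_lock);
  if (data->next)
    data->next->prev = data->prev;
  if (data->prev)
    data->prev->next = data->next;
  else
    protected_launch_data = data->next;
  data->prev = NULL;
  data->next = NULL;
  pthread_mutex_unlock(&protected_launch_data_lock);
}
```

Under these assumptions, the parent thread would call protect_launch_data before creating the new thread, and the new thread would call unprotect_launch_data once it has copied what it needs onto its own stack. A doubly linked list keeps the unlink step constant-time, which matters when many threads start and finish concurrently.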




Information forwarded to bug-guile <at> gnu.org:
bug#25387; Package guile. (Wed, 11 Jan 2017 22:31:03 GMT)

Message #22 received at 25387-done <at> debbugs.gnu.org:

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: 25387-done <at> debbugs.gnu.org
Subject: Re: bug#25387: guile-2.2 multi-thread segfault in
 SCM_VALIDATE_WEAK_TABLE
Date: Wed, 11 Jan 2017 16:29:38 -0600

Hi Andy: I just code-reviewed the change, and it looks like a good fix;
you're saying that the dynamic state was being accidentally collected
when it shouldn't have been.

Tested, and it tests OK: after 40 minutes of CPU time, it's still running.

--linas




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 09 Feb 2017 12:24:03 GMT)

