GNU bug report logs - #73167
[PATCH] Fix setjmp/longjmp-related crashes on Windows


Package: guile

Reported by: Michael Käppler <xmichael-k <at> web.de>

Date: Tue, 10 Sep 2024 12:47:02 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.


Report forwarded to bug-guile <at> gnu.org:
bug#73167; Package guile. (Tue, 10 Sep 2024 12:47:02 GMT)

Acknowledgement sent to Michael Käppler <xmichael-k <at> web.de>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Tue, 10 Sep 2024 12:47:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Michael Käppler <xmichael-k <at> web.de>
To: bug-guile <at> gnu.org
Subject: [PATCH] Fix setjmp/longjmp-related crashes on Windows
Date: Tue, 10 Sep 2024 14:45:40 +0200
Hi all,
recently a user reported a bug to the LilyPond project: he tried to
print a data structure with (pretty-print), which silently failed at a
certain point inside the data structure:

https://gitlab.com/lilypond/lilypond/-/issues/6737#note_2049515997

The error only affected LilyPond mingw builds compiled against Guile
3.0.10; 3.0.9 was (seemingly) fine.
Bisecting showed that the error first appeared with commit
https://git.savannah.gnu.org/cgit/guile.git/commit/?id=29a9f26a36035d5425b173d101628ecc62f5a46b

which substantially reworked the pretty-print implementation to use
delimited continuations.
As it turned out, the problem only arises with the JIT turned on.

Further debugging revealed that the crash happens inside the function
'RtlUnwindEx', called from MSVCRT's 'longjmp', which detects a stack
corruption and raises "STATUS_BAD_STACK". The reason is that 'setjmp'
on mingw expands to '_setjmp', which takes the current frame address as
its second parameter. After a 'longjmp' call it tries to unwind the
stack up to this particular frame address. IIUC, this fails because the
JIT'ed code does not follow the Windows x64 calling conventions.
Setting the second parameter to NULL prevents unwinding and fixes the
issue.
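
To make the NULL workaround concrete, here is a minimal sketch in C.
The names `demo_setjmp` and `run_guarded`, and the exact preprocessor
guard, are assumptions made for illustration only; the attached patch
instead supplies a custom `setjmp` macro in a new header. The sketch
assumes a mingw <setjmp.h> that declares the two-argument `_setjmp`.
On other platforms it falls back to plain `setjmp`, so it should build
anywhere.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical wrapper (not the name used in the patch).  On 64-bit
   mingw, pass NULL instead of the current frame address so that
   longjmp does not try to unwind the stack.  */
#if defined (__MINGW32__) && defined (__x86_64__)
# define demo_setjmp(env) _setjmp ((env), NULL)
#else
# define demo_setjmp(env) setjmp (env)
#endif

static jmp_buf demo_env;
static volatile int demo_code;

/* Returns 0 if no non-local exit happened, otherwise the code that
   was recorded just before the longjmp.  */
int
run_guarded (int code)
{
  if (demo_setjmp (demo_env) != 0)
    return demo_code;          /* we arrive here via longjmp */

  if (code != 0)
    {
      demo_code = code;
      longjmp (demo_env, 1);   /* jump back to the check above */
    }
  return 0;
}
```

Calling `run_guarded (0)` takes the normal path, while a nonzero
argument goes through `longjmp`. In this sketch no JIT-generated
frames are involved, so it only shows the shape of the macro, not the
crash itself.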

Guile is not the only project using JIT compilation that faces this
Windows peculiarity.
See
https://blog.lazym.io/2020/09/21/Unicorn-Devblog-setjmp-longjmp-on-Windows/
for a nice summary.

Please consider the attached patch.

Michael
[0001-Fix-setjmp-longjmp-related-crashes-on-Windows.patch (text/plain, attachment)]

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Sun, 20 Oct 2024 11:08:02 GMT)

Notification sent to Michael Käppler <xmichael-k <at> web.de>:
bug acknowledged by developer. (Sun, 20 Oct 2024 11:08:02 GMT)

Message #10 received at 73167-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Michael Käppler <xmichael-k <at> web.de>
Cc: 73167-done <at> debbugs.gnu.org
Subject: Re: bug#73167: [PATCH] Fix setjmp/longjmp-related crashes on Windows
Date: Sun, 20 Oct 2024 13:05:14 +0200
Hi Michael,

Michael Käppler <xmichael-k <at> web.de> skribis:

> From f9222ec96209c59c9a9a409c019ff59c0c20917c Mon Sep 17 00:00:00 2001
> From: Michael Käppler <xmichael-k <at> web.de>
> Date: Sat, 7 Sep 2024 22:52:22 +0200
> Subject: [PATCH] Fix setjmp/longjmp-related crashes on Windows
>
> * libguile/Makefile.am: add new header file setjump-win.h
> * libguile/continuations.h, libguile/dynstack.c, libguile/dynstack.h,
>   libguile/intrinsics.h, libguile/vm.h:
>   supply custom `setjmp` macro on Windows
>
> Mingw implements `setjmp (env)` as a macro that expands to
>
>  _setjmp (env, faddr)
>
> where `faddr` is set to the current frame address.
>
> This address is then stored as the first element in the jump buffer `env`.
> When `longjmp` is called, it tries to unwind the stack up
> to the saved address by calling `RtlUnwindEx` from MSVCRT,
> which will fail if the stack frames are interwoven with
> JIT-generated code that violates the Windows x64 calling conventions.
>
> Thus implement the macro ourselves as
>
>  _setjmp (env, NULL)
>
> which will select a code path in `longjmp` that does no unwinding.
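
The decision described in the commit message can be modelled with a toy
jump buffer. This is only a sketch under stated assumptions: the struct
`toy_jmp_buf` and both functions are hypothetical names, the field
layout is illustrative, and nothing here is the real MSVCRT
implementation. It just shows the idea that the saved frame address
sits in the first slot and that a zero value means "do not unwind".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a jump buffer (hypothetical layout).  */
struct toy_jmp_buf
{
  uintptr_t frame;          /* first element: saved frame address, or 0 */
  uintptr_t saved_regs[8];  /* placeholder for saved registers */
};

/* Models what a two-argument _setjmp records: the frame address
   passed as second parameter ends up in the first slot.  */
void
toy_setjmp_save (struct toy_jmp_buf *buf, void *frame_addr)
{
  buf->frame = (uintptr_t) frame_addr;
}

/* Models the branch taken on longjmp: unwinding is only attempted
   when a nonzero frame address was saved.  */
bool
toy_longjmp_would_unwind (const struct toy_jmp_buf *buf)
{
  return buf->frame != 0;
}
```

Saving a real address makes the model report that unwinding would be
attempted; saving NULL makes it report the no-unwinding path, which is
the path the custom macro relies on.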

Applied, thanks!

Ludo’.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 17 Nov 2024 12:24:12 GMT)
