GNU bug report logs -
#72145
rare Emacs screwups on x86 due to GCC bug 58416
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Tue, 16 Jul 2024 23:27:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #31 received at 72145 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert <eggert <at> cs.ucla.edu> writes:
> While testing GNU Emacs built on Fedora 40 with gcc (GCC) 14.1.1
> 20240607 (Red Hat 14.1.1-5) with -m32 for x86 and configured
> --with-wide-int, I discovered that Emacs misbehaved in a hard-to-debug
> way due to GCC bug 58416. This bug causes GCC to generate wrong x86
> machine instructions when a C program accesses a union containing a
> 'double'.
>
> The bug I observed is that if you have something like this:
>
> union u { double d; long long int i; } u;
>
> then GCC sometimes generates x86 instructions that copy u.i by using
> fldl/fstpl instruction pairs to push the 64-bit quantity onto the 387
> floating point stack, and then pop the stack into another memory
> location. Unfortunately the fldl/fstpl trick fails in the unusual case
> when the bit pattern of u.i, when interpreted as a double, is a NaN,
> as that can cause the fldl/fstpl pair to store a different NaN with a
> different bit pattern, which means the destination integer disagrees
> with u.i.
>
> The bug is obscure, since the bug's presence depends on the GCC
> version, on the optimization options used, on the exact source code,
> and on the exact integer value at runtime (the value is typically
> copied correctly even when GCC has generated the incorrect machine
> code, since most long long int values don't alias with NaNs).
>
> In short the bug appears to be rare.
>
> Here are some possible courses of action:
>
> * Do nothing and hope x86 users won't run into this rare bug.
>
> * Have the GCC folks fix the bug. However, given that the bug has been
> reported multiple times over more than a decade without a fix, it
> seems that fixing it is too difficult and/or too low priority for
> this aging platform. Also, even if a future GCC fixes the bug, it
> will still affect people using older GCC versions.
>
> * Build with Clang or some other compiler instead. We should be
> encouraging GCC, though.
>
> * Rewrite Emacs to never use 'double' (or 'float' or 'long double')
> inside a union. This could be painful and hardly seems worthwhile.
>
> * When using GCC to build Emacs on x86, compile with safer options
> that make the bug impossible. The attached proposed patch does that,
> by telling GCC not to use the 387 stack. (This patch fixed the Emacs
> misbehavior in my experimental build.) The downside is that the
> resulting Emacs executables need SSE2, introduced for the Pentium 4
> in 2000 <https://en.wikipedia.org/wiki/SSE2>. Nowadays few users
> need to run Emacs on non-SSE2 x86, so this may be good enough. Also,
> the proposed patch gives the builder an option to compile Emacs
> without the safer options, for people who want to build for older
> Intel-compatible platforms and who don't mind an occasional wrong
> answer or crash.
Mmmh nice one :)
I asked GCC people if they have a suggestion on how to work around this
bug <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58416#c9>.
Thanks
Andrea
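
The failure mode described in the quoted report can be sketched in portable C. The block below is only an illustrative model, not code from Emacs or from the proposed patch: is_snan_pattern, model_x87_round_trip and may_use_x87_for_double are hypothetical helper names. It assumes IEEE 754 binary64 doubles and the default x87 environment (invalid-operation exception masked), in which loading a signaling NaN with fldl yields a quiet NaN, i.e. bit 51 of the pattern gets set. The last helper relies on the __SSE2_MATH__ macro, which GCC predefines when double arithmetic uses SSE2 (for example with -msse2 -mfpmath=sse); whether the patch uses exactly those options is an assumption, as the patch is not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the union in the report.  */
union u { double d; long long int i; };

static_assert (sizeof (long long int) == sizeof (uint64_t),
               "this sketch assumes a 64-bit long long int");

/* Return 1 if BITS, reinterpreted as an IEEE 754 binary64 value, is a
   signaling NaN: exponent all ones, nonzero fraction, and the quiet
   bit (bit 51) clear.  Otherwise return 0.  */
int
is_snan_pattern (uint64_t bits)
{
  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t fraction = bits & ((UINT64_C (1) << 52) - 1);
  uint64_t quiet_bit = UINT64_C (1) << 51;
  return exponent == 0x7ff && fraction != 0 && !(fraction & quiet_bit);
}

/* Hypothetical model of copying BITS with an fldl/fstpl pair when the
   invalid-operation exception is masked: a signaling NaN comes back
   with its quiet bit set; every other pattern is returned unchanged.
   This models the hardware behavior; it does not execute x87 code.  */
uint64_t
model_x87_round_trip (uint64_t bits)
{
  return is_snan_pattern (bits) ? bits | (UINT64_C (1) << 51) : bits;
}

/* Return 1 if this translation unit targets 32-bit x86 without SSE2
   floating-point math, the configuration in which the compiler may
   use the 387 stack for doubles.  Return 0 otherwise.  */
int
may_use_x87_for_double (void)
{
#if defined __i386__ && !defined __SSE2_MATH__
  return 1;
#else
  return 0;
#endif
}
```

In this model the integer 0x7ff0000000000001 comes back as 0x7ff8000000000001, while an ordinary value such as 42 is unchanged. This matches the report's observation that most long long int values are copied correctly even by the miscompiled code.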
This bug report was last modified 276 days ago.