GNU bug report logs -
#57789
Emacs 28.1 clone build with native compilation crashes on s390x
Message #82 received at 57789 <at> debbugs.gnu.org:
"Rob Browning" <rlb <at> defaultvalue.org> writes:
> Stefan Kangas <stefankangas <at> gmail.com> writes:
>
>> Thanks. I guess not a lot of us have access to an s390x machine, so I
>> don't think anyone has been able to test it.
>
> Hmm, I think I've heard there may be (or were?) some public instances
> that provide short-term dev access, but have never looked into it.
I have cfarm access, but cfarm doesn't have an s390 machine :-(
> I was also going to outline an easy way to test in a vm at least on a
> Debian system via debvm/mmdebstrap, but after doing that, I wasn't able
> to reproduce the problem there. (Happy to provide instructions for
> anyone interested, otherwise.)
Same compiler? Is ASLR in use? In any case, I'm always interested in
weird machines, even if they're virtual, so I'd appreciate such
instructions.
> In any case, I just tried both the current Debian package and an
> upstream emacs-29.4 checkout on zelenka.debian.org, and both fail.
>
> The emacs-29.4 tree fails like this:
>
> make[3]: Entering directory '/home/rlb/emacs/admin/unidata'
> make[3]: Nothing to be done for 'charscript.el'.
> make[3]: Leaving directory '/home/rlb/emacs/admin/unidata'
> make -C ../admin/unidata emoji-zwj.el
> make[3]: Entering directory '/home/rlb/emacs/admin/unidata'
> make[3]: Nothing to be done for 'emoji-zwj.el'.
> make[3]: Leaving directory '/home/rlb/emacs/admin/unidata'
> ELC+ELN ../lisp/emacs-lisp/eldoc.elc
>
> Error: wrong-type-argument ("../lisp/emacs-lisp/eldoc.el" hash-table-p
> [unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound unbound unbound unbound unbound unbound unbound
> unbound unbound])
> Fatal error 11: Segmentation fault
Two random guesses:

1. purespace overflow. This causes erratic behavior of pretty much
every description. The tell-tale sign would be a "Pure Lisp storage
overflowed" message at some point in the "make bootstrap" log, maybe a
very long time before we crash.

2. GC problem. One possible problem is that Emacs currently relies on
__builtin_unwind_init to do the right thing. If __builtin_unwind_init
isn't implemented on s390, but is necessary (the second part is very
likely), we'll fail to mark some objects on the stack.

(2) seems more likely.
> Backtrace:
> ../src/bootstrap-emacs(emacs_backtrace+0x46) [0x2aa1c2f12f6]
> ../src/bootstrap-emacs(terminate_due_to_signal+0x9e) [0x2aa1c18fb76]
> ../src/bootstrap-emacs(+0x8fdde) [0x2aa1c18fdde]
> ../src/bootstrap-emacs(+0x1ef45a) [0x2aa1c2ef45a]
> ../src/bootstrap-emacs(+0x1ef4a2) [0x2aa1c2ef4a2]
> linux-vdso64.so.1(__kernel_rt_sigreturn+0x0) [0x3ffdc0e5480]
> ../src/bootstrap-emacs(+0x2433a4) [0x2aa1c3433a4]
> ../src/bootstrap-emacs(visit_static_gc_roots+0x196) [0x2aa1c342dae]
> ../src/bootstrap-emacs(garbage_collect+0x1e6) [0x2aa1c3445d6]
> ../src/bootstrap-emacs(eval_sub+0x54c) [0x2aa1c370244]
> ../src/bootstrap-emacs(eval_sub+0x4ac) [0x2aa1c3701a4]
> ../src/bootstrap-emacs(Fcond+0x84) [0x2aa1c3711f4]
> ../src/bootstrap-emacs(eval_sub+0x8d2) [0x2aa1c3705ca]
> ../src/bootstrap-emacs(Fwhile+0x6e) [0x2aa1c370fb6]
Can you disassemble the Fwhile, eval_sub, and visit_static_gc_roots
functions? I assume s390 disassembled code isn't too hard to read...

Random aside: is 0x2aa1c3705ca a likely s390 program counter? The
number looks familiar because it looks similar to a Lisp_Object
representing a symbol on x86-64 without ASLR (an example would be
0x2aaa8dac00e8). I guess it's just a coincidence though.
Thanks!
Pip
This bug report was last modified 157 days ago.