GNU bug report logs - #58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Reported by: Ludovic Courtès <ludo <at> gnu.org>
Date: Wed, 5 Oct 2022 21:02:01 UTC
Severity: normal
Tags: wontfix
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
Message #29 received at 58320 <at> debbugs.gnu.org (full text, mbox):
Hi!
Ludovic Courtès <ludo <at> gnu.org> skribis:
> $ addr2line -e /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x1000 0x11627 0x11bb
> ??:0
> /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333
> :?
>
>
> That’s ‘_dl_fatal_printf’ calling ‘_exit’; it’s trying to tell us
> something.
>
> I’ll try and rebuild the system with the debugging patches at
> <https://lists.gnu.org/archive/html/bug-hurd/2011-11/msg00038.html>, to
> get early ld.so output, for lack of a better solution…
I adapted the patches above and tried them, but it seems that
‘_dl_sysdep_start’ isn’t even reached. For example, I set a breakpoint
on ‘mach_task_self’ (called from ‘__mach_init’, called from
‘_dl_sysdep_start’), but that’s never reached (I’m assuming ‘break/tu’
is reliable, is it?).
The user-space backtrace upon trap remains unhelpful:
--8<---------------cut here---------------start------------->8---
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap,
eip 0xc1030d5b
Breakpoint at task_resume: pushl %ebp
db> debug traps /on
db> b task_terminate
set breakpoint #2
db> c
Kernel Debug trap trap, eip 0xc1030d5b
execkernel: Page fault (14), code=6
Stopped at 0x1000: pushl 0x4(%ebx)
>>>>> user space <<<<<
0x1000(bfffff24,0,0,1160b,0)
0x11627(bfffff9c,0,0,0,2)
0x11bb()
--8<---------------cut here---------------end--------------->8---
… where:
--8<---------------cut here---------------start------------->8---
$ addr2line -e /gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x11627 0x11bb
/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333
:?
$ objdump -S /gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 --start-address=0x000011b0 |head -40
/gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1: file format elf32-i386
Disassembly of section .text:
000011b0 <_start>:
11b0: 89 e0 mov %esp,%eax
11b2: 83 ec 0c sub $0xc,%esp
11b5: 50 push %eax
11b6: e8 b5 0a 00 00 call 1c70 <_dl_start>
11bb: 83 c4 10 add $0x10,%esp
--8<---------------cut here---------------end--------------->8---
So it would seem that ‘_dl_start’ is called and then, somehow, a tail
call to ‘_dl_fatal_printf’ is made.
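As a side note on the ‘Page fault (14), code=6’ line in the trace: assuming the kernel prints the raw x86 page-fault error code (an assumption, not something verified against the GNU Mach source), the value can be decoded with the architectural bit layout. The sketch below does that; ‘struct pf_info’ and ‘decode_pf’ are illustrative names, not part of any existing API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the low three bits of an x86 page-fault error code.
   Hypothetical helper type, for illustration only.  */
struct pf_info
{
  bool present;   /* bit 0: set if the fault was a protection violation,
                     clear if the page was not present */
  bool write;     /* bit 1: set if the faulting access was a write */
  bool user;      /* bit 2: set if the access came from user mode */
};

/* Split CODE into its architectural flags.  */
struct pf_info
decode_pf (uint32_t code)
{
  struct pf_info info;
  info.present = (code & 0x1) != 0;
  info.write = (code & 0x2) != 0;
  info.user = (code & 0x4) != 0;
  return info;
}
```

Under that assumption, ‘decode_pf (6)’ yields a user-mode write to a page that is not present, which would be consistent with the fault being raised from user space while executing the ‘pushl’ at 0x1000.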
By bisecting, I tried to see how far it gets. What I know so far is
that ld.so errors out at elf/rtld.c:563 (line 565 is not
reached):
--8<---------------cut here---------------start------------->8---
558: if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
559: {
560: /* Relocate ourselves so we can do normal function calls and
561: data access using the global offset table. */
562:
563: ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
564: }
565: bootstrap_map.l_relocated = 1;
...
578: __rtld_malloc_init_stubs ();
--8<---------------cut here---------------end--------------->8---
It’s hard to be more precise because ELF_DYNAMIC_RELOCATE is a macro
that expands to quite a lot of code.
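To give a rough idea of what a small part of that expansion does, here is a simplified sketch of applying relative relocations to an i386 image. It is not glibc's code: ‘rel32’ and ‘apply_relative’ are made-up names, the real macro also handles symbol-based relocations, lazy binding and more, and the only facts relied upon are that i386 uses REL-style entries (addend stored in place) and that R_386_RELATIVE has the value 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf32_Rel; redefined here so the sketch is
   self-contained.  Hypothetical name.  */
typedef struct
{
  uint32_t r_offset;   /* offset of the word to patch */
  uint32_t r_info;     /* symbol index (high 24 bits) and type (low 8 bits) */
} rel32;

#define REL_TYPE(info)   ((info) & 0xff)
#define TYPE_RELATIVE    8   /* value of R_386_RELATIVE */

/* Add LOAD_ADDR to every word of IMAGE that is targeted by a relative
   relocation in REL[0..N-1].  Other relocation types are skipped.
   Returns the number of relocations applied.  */
size_t
apply_relative (uint8_t *image, uint32_t load_addr,
                const rel32 *rel, size_t n)
{
  size_t applied = 0;

  for (size_t i = 0; i < n; i++)
    {
      uint32_t word;

      if (REL_TYPE (rel[i].r_info) != TYPE_RELATIVE)
        continue;

      /* The addend lives in the relocated word itself (REL, not RELA).
         memcpy avoids unaligned or aliasing-unsafe accesses.  */
      memcpy (&word, image + rel[i].r_offset, sizeof word);
      word += load_addr;
      memcpy (image + rel[i].r_offset, &word, sizeof word);
      applied++;
    }

  return applied;
}
```

In the real loader this kind of loop runs before the global offset table is usable, which is why the comment at lines 560-561 talks about relocating "so we can do normal function calls".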
I don’t see the code path that would lead to a ‘_dl_fatal_printf’ call
though.
Ideas? :-)
Ludo’.
This bug report was last modified 270 days ago.