GNU bug report logs - #58320
Hurd VM fails to boot on AMD EPYC (kvm-amd)


Package: guix;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Wed, 5 Oct 2022 21:02:01 UTC

Severity: normal

Tags: wontfix

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

Full log


Message #38 received at 58320 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 58320 <at> debbugs.gnu.org
Cc: bug-hurd <at> gnu.org
Subject: Re: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Date: Mon, 17 Oct 2022 14:51:01 +0200
Hi,

Ludovic Courtès <ludo <at> gnu.org> skribis:

> … so ‘exec_load’ is doing its job, it seems.

Turns out that may not be the case.

Here’s a *bad* mapping on the second ‘task_resume’ breakpoint (when
‘exec’ is about to start):

--8<---------------cut here---------------start------------->8---
  db> show all threads
      TASK        THREADS
    0 gnumach (f5f7cf00): 7 threads:
                0 (f5f7be18) .W..N. 0xc11dac04
                1 (f5f7bcd0) R..O..(idle_thread_continue)
                2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
                3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
                4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
                5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
                6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
    1 ext2fs (f5f7ce40): 6 threads:
                0 (f5f7b520) R....F
                1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
                2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
                3 (f5f7b000) .W.O..(mach_msg_continue) 0
                4 (f67d3e20) .W.O..(mach_msg_receive_continue) 0
                5 (f67d3cd8) .W.O..(mach_msg_continue) 0
    2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
   db> trace
  task_resume(f593e010,fb7d9010,f5f73e80,c106972a)
  ipc_kobject_server(f593e000,3,18,0)+0x1eb
  mach_msg_trap(bffff4c0,3,18,20,8)+0x1703
  >>>>> user space <<<<<
  db> x/tbx 0xcbc 0xf5f7b3d8

  no memory is assigned to address 00000cbc
  0
  db> show map $map2
    Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
  size=290816,resident:225280,wired=0
  version=13
     map entry 0xf625ec08: start=0x0, end=0x1000
     prot=1/7/copy, object=0x0, offset=0x0
     map entry 0xf625ebb0: start=0x1000, end=0x26000
     prot=5/7/copy, object=0xf5f6ad70, offset=0x0
      Object 0xf5f6ad70: size=0x25000, 1 references
      37 resident pages, 0 absent pages, 0 paging ops
       memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
       uninitialized,temporary    internal,copy_strategy=0
       shadow=0x0 (offset=0x0),copy=0x0
     map entry 0xf625eb58: start=0x26000, end=0x34000
     prot=1/7/copy, object=0xf5f6ad20, offset=0x0
      Object 0xf5f6ad20: size=0xe000, 1 references
      14 resident pages, 0 absent pages, 0 paging ops
       memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
       uninitialized,temporary    internal,copy_strategy=0
       shadow=0x0 (offset=0x0),copy=0x0
     map entry 0xf625eb00: start=0x34000, end=0x37000
     prot=3/7/copy, object=0xf5f6acd0, offset=0x0
      Object 0xf5f6acd0: size=0x3000, 1 references
      3 resident pages,--db_more--
--8<---------------cut here---------------end--------------->8---

Compare with what a “good” mapping looks like at that same moment:

--8<---------------cut here---------------start------------->8---
  start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap,
   eip 0xc1030d5b
  Breakpoint at  task_resume:     pushl   %ebp
  db> show all threads
      TASK        THREADS
    0 gnumach (f5f7cf00): 7 threads:
                0 (f5f7be18) .W..N. 0xc11dac04
                1 (f5f7bcd0) R..O..(idle_thread_continue)
                2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
                3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
                4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
                5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
                6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
    1 ext2fs (f5f7ce40): 6 threads:
                0 (f5f7b520) R....F
                1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
                2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
                3 (f5f7b000) .W.O..(mach_msg_continue) 0
                4 (f67d2e20) .W.O..(mach_msg_receive_continue) 0
                5 (f67d2cd8) .W.O..(mach_msg_continue) 0
    2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
  db> x/tbx 0xcbc 0xf5f7b3d8
                  8
  db> show map $map2
  Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
  size=290816,resident:229376,wired=0
  version=14
   map entry 0xf625ec08: start=0x0, end=0x1000
   prot=1/7/copy, object=0xf5f6ad70, offset=0x0
    Object 0xf5f6ad70: size=0x1000, 1 references
    1 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625ebb0: start=0x1000, end=0x26000
   prot=5/7/copy, object=0xf5f6ad20, offset=0x0
    Object 0xf5f6ad20: size=0x25000, 1 references
    37 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eb58: start=0x26000, end=0x34000
   prot=1/7/copy, object=0xf5f6acd0, offset=0x0
    Object 0xf5f6acd0: size=0xe000, 1 references
    14 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eb00: start=0x34000, end=0x37000
   prot=3/7/copy, object=0xf5f6ac80, offset=0x0
    Object 0xf5f6ac80: size=0x3000, 1 references
    3 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000
   prot=3/7/copy, object=0xf5f6ac30, offset=0x0
    Object 0xf5f6ac30: size=0x10000, 1 references
    1 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82640
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
--8<---------------cut here---------------end--------------->8---

Notice that in the “good” case the byte read at 0xcbc is a valid
relocation type: 8 = R_386_RELATIVE.
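
As a rough illustration of why a single byte is enough to tell: in an
ELF32 “Rel” entry the relocation type is the low 8 bits of ‘r_info’,
which on little-endian x86 is the byte at offset 4 of the entry.  The
sketch below uses local, hypothetical names (‘rel32’, ‘REL32_TYPE’,
‘RELOC_386_RELATIVE’) that mirror the standard ELF definitions, and it
assumes, without having checked, that 0xcbc is where the ‘r_info’ of
the first relocation lives:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the standard ELF32 definitions; the names are
   hypothetical, chosen to avoid depending on <elf.h>.  */
struct rel32
{
  uint32_t r_offset;   /* where the relocation applies */
  uint32_t r_info;     /* symbol index (high 24 bits), type (low 8 bits) */
};

#define REL32_TYPE(info)    ((info) & 0xffu)
#define REL32_SYM(info)     ((info) >> 8)
#define RELOC_386_RELATIVE  8u

/* Return the relocation type from the raw 8 bytes of a little-endian
   entry: the low byte of r_info sits at byte offset 4.  */
static uint8_t
rel32_type_from_bytes (const uint8_t *entry)
{
  assert (entry != NULL);
  return entry[4];
}

/* Non-zero if REL is a "relative" relocation (type 8 on i386).  */
static int
rel32_is_relative (const struct rel32 *rel)
{
  assert (rel != NULL);
  return REL32_TYPE (rel->r_info) == RELOC_386_RELATIVE;
}
```

A byte-sized read such as ‘x/tbx’ returning 8 would then correspond to
‘rel32_type_from_bytes’ returning ‘RELOC_386_RELATIVE’.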

In the “bad” case, the first map entry (0x0–0x1000) is empty: it has no
associated memory object (object=0x0) and thus zero resident pages.
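
The numbers in the two dumps can be cross-checked with a few lines of
C.  This is only a sketch: the ranges are copied from the “good” dump,
the names (‘exec_map’, ‘range_index’, ‘total_mapped’, ‘resident_pages’)
are made up for the illustration, and the 4 KiB page size is inferred
from the dump (a 0x25000-byte object with 37 resident pages):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u   /* 4 KiB, inferred from the dump */

struct map_range { uint32_t start, end; };

/* Ranges copied from the "good" `show map' output.  */
static const struct map_range exec_map[] = {
  { 0x0,        0x1000 },
  { 0x1000,     0x26000 },
  { 0x26000,    0x34000 },
  { 0x34000,    0x37000 },
  { 0xbfff0000, 0xc0000000 },
};
#define N_RANGES (sizeof exec_map / sizeof exec_map[0])

/* Return the index of the range containing ADDR, or -1 if none does.  */
static int
range_index (uint32_t addr)
{
  for (size_t i = 0; i < N_RANGES; i++)
    if (addr >= exec_map[i].start && addr < exec_map[i].end)
      return (int) i;
  return -1;
}

/* Sum of the sizes of all ranges, in bytes.  */
static uint32_t
total_mapped (void)
{
  uint32_t total = 0;
  for (size_t i = 0; i < N_RANGES; i++)
    {
      assert (exec_map[i].end > exec_map[i].start);
      total += exec_map[i].end - exec_map[i].start;
    }
  return total;
}

/* Convert a "resident:" byte count into a page count.  */
static uint32_t
resident_pages (uint32_t resident_bytes)
{
  assert (resident_bytes % PAGE_SIZE == 0);
  return resident_bytes / PAGE_SIZE;
}
```

With these, ‘range_index (0xcbc)’ is 0 (the first entry),
‘total_mapped ()’ matches the reported size=290816, and the resident
counts of the two dumps (229376 vs. 225280 bytes) differ by exactly one
page.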

My reading of ‘read_exec’ is that the page is supposed to be populated
eagerly by the ‘copyout’ call here:

--8<---------------cut here---------------start------------->8---
static int
read_exec(void *handle, vm_offset_t file_ofs, vm_size_t file_size,
		     vm_offset_t mem_addr, vm_size_t mem_size,
		     exec_sectype_t sec_type)
{
  struct multiboot_module *mod = handle;

[...]

	err = vm_allocate(user_map, &start_page, end_page - start_page, FALSE);
	assert(err == 0);
	assert(start_page == trunc_page(mem_addr));

	if (file_size > 0)
	{
		err = copyout((char *)phystokv (mod->mod_start) + file_ofs,
			      (void *)mem_addr, file_size);
		assert(err == 0);
	}

[...]

	return 0;
}
--8<---------------cut here---------------end--------------->8---
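
For comparison, here is a user-space analogy of that
allocate-then-copy pattern.  It is not gnumach code: ‘calloc’ stands in
for ‘vm_allocate’, ‘memcpy’ for ‘copyout’, all ‘sketch_’ names and
‘struct segment’ are hypothetical, and since the computation of
‘end_page’ is elided above, rounding ‘mem_addr + mem_size’ up to a page
boundary is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 0x1000u
#define sketch_trunc_page(x) ((x) & ~(SKETCH_PAGE_SIZE - 1u))
#define sketch_round_page(x) \
  (((x) + SKETCH_PAGE_SIZE - 1u) & ~(SKETCH_PAGE_SIZE - 1u))

struct segment
{
  uint32_t start_page;   /* page-aligned base of the region */
  uint32_t length;       /* page-aligned length in bytes */
  uint8_t *bytes;        /* zero-filled backing store; caller frees */
};

/* Allocate a zero-filled, page-aligned region covering
   [MEM_ADDR, MEM_ADDR + MEM_SIZE) and eagerly copy FILE_SIZE bytes
   from FILE + FILE_OFS to MEM_ADDR.  Return 0 on success, -1 if the
   allocation fails.  */
static int
load_segment (struct segment *seg, const uint8_t *file,
              uint32_t file_ofs, uint32_t file_size,
              uint32_t mem_addr, uint32_t mem_size)
{
  assert (mem_size > 0 && file_size <= mem_size);

  seg->start_page = sketch_trunc_page (mem_addr);
  /* Assumption: the end of the region is rounded up to a page.  */
  seg->length = sketch_round_page (mem_addr + mem_size) - seg->start_page;

  seg->bytes = calloc (seg->length, 1);
  if (seg->bytes == NULL)
    return -1;

  if (file_size > 0)
    memcpy (seg->bytes + (mem_addr - seg->start_page),
            file + file_ofs, file_size);
  return 0;
}
```

In this analogy, bytes beyond ‘file_size’ stay zero and the copied
bytes are present as soon as ‘load_segment’ returns, which is the
property the first page of the “bad” mapping lacks.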

There are interesting tricks in ‘copyout_retry’ to fake a page fault so
the copy can actually be made, IIUC.

Could it be that this bit isn’t quite working?

Ideas?

The problem with debugging this is that setting a breakpoint on
‘exec_load’ causes the system to boot fine (breaking on ‘task_resume’
is fine though, go figure…).

Ludo’.




This bug report was last modified 270 days ago.
