GNU bug report logs - #59069
`guix shell -CN' failed to access GPU


Package: guix

Reported by: dan <i <at> dan.games>

Date: Sun, 6 Nov 2022 06:47:02 UTC

Severity: normal

Merged with 59166

To reply to this bug, email your comments to 59069 AT debbugs.gnu.org.



Report forwarded to bug-guix <at> gnu.org:
bug#59069; Package guix. (Sun, 06 Nov 2022 06:47:02 GMT)

Acknowledgement sent to dan <i <at> dan.games>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 06 Nov 2022 06:47:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: dan <i <at> dan.games>
To: bug-guix <at> gnu.org
Subject: `guix shell -CN' failed to access GPU
Date: Sun, 06 Nov 2022 14:11:08 +0800


I was trying to run some GUI software in a Guix container and wanted it to have GPU access.  However, I found that when I give the container network access, it seems unable to find the GPU.  Here are the commands I ran and the output I got:

------------------------------without-network-access------------------------------

$ guix shell -C mesa-utils --expose=/tmp/.X11-unix --expose=$XAUTHORITY --expose=/dev/dri --expose=/etc/udev -E "DISPLAY|XAUTHORITY" -- glxinfo -B

name of display: :1
display: :1  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: AMD RENOIR (DRM 3.47.0, 5.19.15, LLVM 11.0.0) (0x1638)
    Version: 21.3.8
    Accelerated: yes
    Video memory: 1024MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
    VBO free memory - total: 655 MB, largest block: 655 MB
    VBO free aux. memory - total: 15305 MB, largest block: 15305 MB
    Texture free memory - total: 655 MB, largest block: 655 MB
    Texture free aux. memory - total: 15305 MB, largest block: 15305 MB
    Renderbuffer free memory - total: 655 MB, largest block: 655 MB
    Renderbuffer free aux. memory - total: 15305 MB, largest block: 15305 MB
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 1024 MB
    Total available memory: 16487 MB
    Currently available dedicated video memory: 655 MB
OpenGL vendor string: AMD
OpenGL renderer string: AMD RENOIR (DRM 3.47.0, 5.19.15, LLVM 11.0.0)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 21.3.8
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.6 (Compatibility Profile) Mesa 21.3.8
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 21.3.8
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
------------------------------with-network-access------------------------------

$ guix shell -CN mesa-utils --expose=/tmp/.X11-unix --expose=$XAUTHORITY --expose=/dev/dri --expose=/etc/udev -E "DISPLAY|XAUTHORITY" -- glxinfo -B

name of display: :1
libGL error: MESA-LOADER: failed to retrieve device information
libGL error: MESA-LOADER: failed to open amdgpu: /gnu/store/83kzrpczis5s8hn3ly9y89mij7ngq4bw-mesa-21.3.8/lib/dri/amdgpu_dri.so: cannot open shared object file: No such file or directory (search paths /gnu/store/83kzrpczis5s8hn3ly9y89mij7ngq4bw-mesa-21.3.8/lib/dri, suffix _dri)
libGL error: failed to load driver: amdgpu
libGL error: MESA-LOADER: failed to retrieve device information
libGL error: MESA-LOADER: failed to open amdgpu: /gnu/store/83kzrpczis5s8hn3ly9y89mij7ngq4bw-mesa-21.3.8/lib/dri/amdgpu_dri.so: cannot open shared object file: No such file or directory (search paths /gnu/store/83kzrpczis5s8hn3ly9y89mij7ngq4bw-mesa-21.3.8/lib/dri, suffix _dri)
libGL error: failed to load driver: amdgpu
display: :1  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Mesa/X.org (0xffffffff)
    Device: llvmpipe (LLVM 11.0.0, 256 bits) (0xffffffff)
    Version: 21.3.8
    Accelerated: no
    Video memory: 30926MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 4.5
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL vendor string: Mesa/X.org
OpenGL renderer string: llvmpipe (LLVM 11.0.0, 256 bits)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 21.3.8
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.5 (Compatibility Profile) Mesa 21.3.8
OpenGL shading language version string: 4.50
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 21.3.8
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
------------------------------paste-ends-here------------------------------

The only difference between the two invocations is the `-N' flag.  I also looked at the relevant Guile code, and the `-N' flag seems to do only two things:
  1. bind-mount several network-related files into /etc
  2. share the host's network namespace with the container

I asked a few other Guix users to test the commands, and they reported similar results.

Some info about my environment:
kernel version: 6.0.7
mesa   version: 21.3.8

-- 
dan




Information forwarded to bug-guix <at> gnu.org:
bug#59069; Package guix. (Thu, 10 Nov 2022 09:46:02 GMT)

Message #8 received at 59069 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: dan <i <at> dan.games>
Cc: 59069 <at> debbugs.gnu.org
Subject: Re: bug#59069: `guix shell -CN' failed to access GPU
Date: Thu, 10 Nov 2022 10:45:08 +0100
Hi,

dan <i <at> dan.games> skribis:

> I was trying to run some GUI software in a Guix container and wanted it to have GPU access.  However, I found that when I give the container network access, it seems unable to find the GPU.  Here are the commands I ran and the output I got:

Could you check with strace what it’s trying to access, both with and
without ‘-N’?

  guix shell mesa-utils strace … -C -- strace -o /tmp/log.strace glxinfo

It might be a /dev node, or it might be simply talking to the X server,
which requires network access.

Thanks,
Ludo’.
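
One way to narrow down such a log is to list only the paths whose lookups failed.  The sketch below assumes the default strace output format, where each line starts with the syscall name (optionally preceded by a PID when `-f' is used); the function name `failed_paths' is made up for illustration.

```shell
# List the distinct paths that file-lookup syscalls failed on with ENOENT.
# Usage: failed_paths /tmp/log.strace
failed_paths() {
    # Keep only file-lookup syscalls, allowing for an optional PID prefix.
    grep -E '^([0-9]+ +)?(open|openat|stat|lstat|newfstatat|access|readlink|readlinkat)\(' "$1" |
        # Keep only the calls that reported a missing file or directory.
        grep 'ENOENT' |
        # Extract the first double-quoted argument, which is the path.
        sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' |
        sort -u
}
```

Running it on the log from each container and comparing the two lists (for example with `diff') shows which paths are missing in only one of them.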




Information forwarded to bug-guix <at> gnu.org:
bug#59069; Package guix. (Thu, 10 Nov 2022 15:20:01 GMT)

Message #11 received at 59069 <at> debbugs.gnu.org:

From: dan <i <at> dan.games>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 59069 <at> debbugs.gnu.org
Subject: Re: bug#59069: `guix shell -CN' failed to access GPU
Date: Thu, 10 Nov 2022 23:08:41 +0800
Ludovic Courtès <ludo <at> gnu.org> writes:

> Could you check with strace what it’s trying to access, both with and
> without ‘-N’?
>
>   guix shell mesa-utils strace … -C -- strace -o /tmp/log.strace glxinfo

I looked into the strace logs and found that it is actually having
trouble accessing /sys, which is not available in a '-N' container.
I ran the following commands to check:
> $ guix shell -C coreutils -- ls /
> bin  dev  etc  gnu  home  proc  sys  tmp
whereas with the '-N' flag:
> $ guix shell -CN coreutils -- ls /
> bin  dev  etc  gnu  home  proc  tmp

I have put the strace logs in a paste bin; the links point to the
lines that show the problem [1][2].

[1]: https://paste.sr.ht/~lizog/950ef117109fb0d34e70a813852cf7cbf04919a6#log-cn.strace-L585
[2]: https://paste.sr.ht/~lizog/950ef117109fb0d34e70a813852cf7cbf04919a6#log-c.strace-L552
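
For context, Linux indexes character devices in sysfs as /sys/dev/char/MAJOR:MINOR, and device-discovery code commonly resolves a DRM node through that directory, which would explain why a missing /sys leads to "failed to retrieve device information".  A minimal sketch of that mapping, assuming GNU stat; the function name `sysfs_path_for' and the node /dev/dri/card0 in the usage note are illustrative assumptions:

```shell
# Print the sysfs directory that corresponds to a character device node.
# Usage: sysfs_path_for /dev/null
sysfs_path_for() {
    node=$1
    # Only character devices are indexed under /sys/dev/char.
    [ -c "$node" ] || return 1
    # GNU stat prints the device major and minor numbers in hexadecimal.
    major=$(stat -c '%t' "$node") || return 1
    minor=$(stat -c '%T' "$node") || return 1
    # Convert both numbers to decimal, as used in the sysfs directory name.
    echo "/sys/dev/char/$((0x$major)):$((0x$minor))"
}
```

Inside a container, `ls "$(sysfs_path_for /dev/dri/card0)"' should fail whenever /sys is not mounted, even though the device node itself is exposed.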

-- 
dan




Information forwarded to bug-guix <at> gnu.org:
bug#59069; Package guix. (Thu, 10 Nov 2022 15:50:03 GMT)

Message #14 received at 59069 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: dan <i <at> dan.games>
Cc: 59069 <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#59069: `guix shell -CN' failed to access GPU
Date: Thu, 10 Nov 2022 16:49:00 +0100
Hi!

(Cc: Dave Thompson, the original author of this code.)

As you pointed out on IRC, the problem is that ‘guix shell -C’ provides
/sys whereas ‘guix shell -CN’ doesn’t.

This stems from this call in (gnu build linux-container), which has
always been there:

    (mount-file-systems root mounts
                        #:mount-/proc? (memq 'pid namespaces)
                        #:mount-/sys?  (memq 'net
                                             namespaces))

This is explained a few lines above:

  ;; A sysfs mount requires the user to have the CAP_SYS_ADMIN capability in
  ;; the current network namespace.
  (when mount-/sys?
    (mount* "none" (scope "/sys") "sysfs"
            (logior MS_NOEXEC MS_NOSUID MS_NODEV MS_RDONLY)))

As you noticed with ‘--expose=/sys’, bind-mounting /sys doesn’t work
either (‘mount’ fails with EINVAL).
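
The effect of the two keyword arguments quoted above can be modelled in a few lines of shell.  This is only an illustration of the condition, not Guix code: the function name `mounted_pseudo_fs' and the namespace lists passed to it are assumptions.

```shell
# Toy model of the conditions quoted above (not Guix code):
# /proc is mounted only when 'pid is among the unshared namespaces,
# /sys only when 'net is.
mounted_pseudo_fs() {
    namespaces=" $* "   # pad with spaces so whole words can be matched
    case $namespaces in *" pid "*) echo /proc ;; esac
    case $namespaces in *" net "*) echo /sys ;; esac
}

# Without -N the container gets its own network namespace:
mounted_pseudo_fs mnt pid ipc uts user net
# With -N the host's network namespace is shared, so 'net is absent:
mounted_pseudo_fs mnt pid ipc uts user
```

The first call prints both /proc and /sys, the second only /proc, which matches the `ls /' output reported earlier in this thread.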

Not sure what to do.  Thoughts?

Ludo’.




Merged 59069 59166. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 12 Nov 2022 17:25:02 GMT)

This bug report was last modified 2 years and 218 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.