GNU bug report logs - #76998
Guix Home leaves user shepherd on logout, starts new instance on login


Package: guix;

Reported by: dannym <at> friendly-machines.com

Date: Thu, 13 Mar 2025 19:11:02 UTC

Severity: important

Merged with 67863, 74912


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Tomas Volf <~@wolfsden.cz>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#76998: closed (Guix Home leaves user shepherd on logout,
 starts new instance on login)
Date: Sun, 15 Jun 2025 13:41:03 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sun, 15 Jun 2025 15:40:33 +0200
with message-id <87wm9dazse.fsf <at> wolfsden.cz>
and subject line Re: bug#76998: Guix Home leaves user shepherd on logout, starts new instance on login
has caused the debbugs.gnu.org bug report #76998,
regarding Guix Home leaves user shepherd on logout, starts new instance on login
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
76998: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=76998
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: dannym <at> friendly-machines.com
To: Bug Guix <bug-guix <at> gnu.org>
Subject: user shepherd stays around with some zombies
Date: Thu, 13 Mar 2025 20:10:36 +0100
Steps to reproduce:

1. Log into the console using your regular user
2. Log into GUI using your regular user
3. Log out of GUI
4. Switch to logged-in console
5. Run "px --tree" there
6. Observe the following:

shepherd(1)
  accounts-daemon(1110)
  avahi-daemon:(2443)
    avahi-daemon:(2446)
  bluetoothd(1026)
  colord(25587)
  cupsd(2440)
  dbus-daemon(769)
  dnsmasq(1845)
    dnsmasq(1846)
  earlyoom(744)
  elogind(1024)
  gdm(1038)
  guix-daemon(740)
  libvirtd(1023)
  login(26536)
    -bash(6739)
  mcron(747)
  mingetty... (5×)
  ModemManager(1276)
  NetworkManager(1256)
  nginx:(797)
    nginx:(798)
  nscd(2177)
  polkitd(1231)
  postgres(852)
    postgres:... (6×)
  rasdaemon(796)
  rpc.idmapd(2447)
  rpc.mountd(2501)
  rpc.statd(2444)
  rpcbind(2441)
  shepherd(6395) <--- also dannym
    [dbus-daemon](6397)
    [ssh-agent](6444)
    [xdg-permission-](6411)
    wireplumber(6399)
  shepherd(26114) <--- dannym
    dbus-daemon(6881)
    pipewire(6882)
    pipewire-pulse(6883)
    ssh-agent(6880)
    wireplumber(6888)
    xdg-permission-store(7259)
  udevd(330)
  upowerd(1025)
  virtlogd(742)
  wpa_supplicant(1045)

The entries shown in brackets ("[...]") are processes that have exited but
were not reaped (that is, they are defunct).
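
For readers who want to check for the same symptom without "px", here is a minimal Python sketch that scans /proc for processes in state "Z" (defunct). It assumes a Linux-style /proc; the function names parse_stat and find_zombies are illustrative only and are not part of Guix or the Shepherd.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state, ppid) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = stat_line.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    return int(left), comm, fields[0], int(fields[1])


def find_zombies(proc_root="/proc"):
    """List (pid, comm, ppid) for every process whose state is 'Z'."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except OSError:
            continue  # the process went away while we were scanning
        if state == "Z":
            zombies.append((pid, comm, ppid))
    return zombies


if __name__ == "__main__":
    for pid, comm, ppid in find_zombies():
        print(f"[{comm}]({pid}) not reaped by parent {ppid}")
```

The parent PID in each result shows which process failed to reap its child, which in the tree above would be the older user shepherd.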

What the hell?

$ guix describe
Generation 194	Mar 13 2025 19:11:33	(current)
  guix 678b3dd
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: 678b3dddfe442e643fe5cff7730d4f9690c3e2c2


[Message part 3 (message/rfc822, inline)]
From: Tomas Volf <~@wolfsden.cz>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 74912 <at> debbugs.gnu.org, 76998 <at> debbugs.gnu.org, 76998-done <at> debbugs.gnu.org,
 Jake <jforst.mailman <at> gmail.com>, Daniel Littlewood <dan <at> danielittlewood.xyz>,
 Danny Milosavljevic <dannym <at> friendly-machines.com>
Subject: Re: bug#76998: Guix Home leaves user shepherd on logout, starts new
 instance on login
Date: Sun, 15 Jun 2025 15:40:33 +0200
Hello :)

Ludovic Courtès <ludo <at> gnu.org> writes:

>> In any case, what shepherd 1.0.4 does is stop the bleeding, but not fix the problem:
>> It prevents two (or 100) user shepherds for the same user from running in parallel.
>> It does not stop shepherd when a user closed all their sessions.
>
> Yes.  It just occurred to me that we probably just got it wrong from the
> start: ‘XDG_RUNTIME_DIR’ (/run/user/$UID) is specified as having limited
> lifetime.  Quoth
> <https://specifications.freedesktop.org/basedir-spec/latest/>:
>
>   The lifetime of the directory MUST be bound to the user being logged
>   in.  It MUST be created when the user first logs in and if the user
>   fully logs out the directory MUST be removed.
>
> So it was probably a bad idea in the first place for shepherd to store
> its socket in /run/user/$UID (even more so given that this directory doesn’t
> exist on systems without elogind/systemd).  GnuPG avoids
> $XDG_RUNTIME_DIR for exactly this reason (there’s a comment in
> ‘homedir.c’).

Minor correction here.  Looking at the source code, GnuPG avoids the
XDG_RUNTIME_DIR environment variable, but it still tries to use the
/run/user/$UID directory, if it exists.
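
To make the distinction concrete, here is a simplified Python sketch of that kind of lookup: it ignores the XDG_RUNTIME_DIR environment variable and instead probes for an existing per-user directory. This is not GnuPG's code; the function name pick_socket_dir, its parameters, and the list of base directories are assumptions made for illustration.

```python
import os


def pick_socket_dir(app_name, home_fallback, uid=None,
                    run_bases=("/run", "/var/run")):
    """Return the directory in which to place a control socket.

    Prefer <base>/user/<uid>/<app_name> when the per-user runtime
    directory already exists and is writable; otherwise fall back to
    home_fallback.  The XDG_RUNTIME_DIR environment variable is
    deliberately never consulted.
    """
    if uid is None:
        uid = os.getuid()
    for base in run_bases:
        user_dir = os.path.join(base, "user", str(uid))
        if os.path.isdir(user_dir) and os.access(user_dir, os.W_OK):
            return os.path.join(user_dir, app_name)
    return home_fallback


if __name__ == "__main__":
    # Hypothetical fallback location, used here only for illustration.
    print(pick_socket_dir("example-daemon", os.path.expanduser("~/.example-daemon")))
```

With this approach the socket still lives under /run/user/$UID on systems that provide it, so it is still affected when that directory is removed on logout.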

> So, what can we do?
>
> In the Shepherd 1.1, we could default to $XDG_STATE_HOME instead; we
> probably shouldn’t change that in 1.0.x.

Not sure here, the specification says the following about this location:

> The $XDG_STATE_HOME contains state data that *should persist between
> (application) restarts*, but that is not important or portable enough
> to the user that it should be stored in $XDG_DATA_HOME.

So... a control socket does not seem to fit that description.

> Any other idea?

Well, since you have mentioned GnuPG as an example, we could just
mirror what it does, which is what I have suggested before.

--8<---------------cut here---------------start------------->8---
$ mkdir /tmp/xxx && cd /tmp/xxx
$ guix shell -u test -C findutils gnupg coreutils bash procps -- env HOME=/tmp/xxx GNUPGHOME=/tmp/xxx bash
test <at> xx ~ [env]$ gpg-agent --daemon
gpg-agent[2]: directory '/tmp/xxx/private-keys-v1.d' created
gpg-agent[3]: gpg-agent (GnuPG) 2.4.7 started
test <at> xx ~ [env]$ find /run/user
/run/user
/run/user/1000
/run/user/1000/gnupg
/run/user/1000/gnupg/d.j1yiifhhjrep9xunazyff54c
/run/user/1000/gnupg/d.j1yiifhhjrep9xunazyff54c/S.gpg-agent.ssh
/run/user/1000/gnupg/d.j1yiifhhjrep9xunazyff54c/S.gpg-agent.browser
/run/user/1000/gnupg/d.j1yiifhhjrep9xunazyff54c/S.gpg-agent.extra
/run/user/1000/gnupg/d.j1yiifhhjrep9xunazyff54c/S.gpg-agent
test <at> xx ~ [env]$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
test         1  0.0  0.0   5136  4068 ?        S    13:32   0:00 bash
test         3  0.0  0.0   5516  2400 ?        Ss   13:32   0:00 gpg-agent --daemon
test         5  0.0  0.0   5224  3852 ?        R+   13:32   0:00 ps aux
test <at> xx ~ [env]$ rm -r /run/user/1000/gnupg
gpg-agent[3]: socket file has been removed - shutting down
gpg-agent[3]: gpg-agent (GnuPG) 2.4.7 stopped
test <at> xx ~ [env]$ find /run/user
/run/user
/run/user/1000
test <at> xx ~ [env]$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
test         1  0.0  0.0   5136  4068 ?        S    13:32   0:00 bash
test         8  0.0  0.0   5224  3776 ?        R+   13:33   0:00 ps aux
--8<---------------cut here---------------end--------------->8---

So my suggestion is that when the socket is deleted, the shepherd
process stops itself.
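
To illustrate the idea (not the Shepherd's actual implementation, which is written in Guile), here is a minimal Python sketch of a server that listens on a Unix socket and returns once the socket file has been removed or replaced. The polling approach, the function names, and the poll_interval parameter are assumptions chosen for brevity; a real daemon might use inotify instead.

```python
import os
import socket
import threading


def inode_of(path):
    """Return the inode number of path, or None if it no longer exists."""
    try:
        return os.stat(path).st_ino
    except FileNotFoundError:
        return None


def serve_until_socket_removed(path, stop_event=None, poll_interval=0.2):
    """Listen on a Unix socket; return once its file is removed or replaced."""
    if stop_event is None:
        stop_event = threading.Event()
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)
        server.listen()
        # The accept() timeout doubles as the polling period.
        server.settimeout(poll_interval)
        inode = inode_of(path)
        while inode is not None and not stop_event.is_set():
            try:
                conn, _ = server.accept()
            except socket.timeout:
                conn = None
            if conn is not None:
                conn.close()  # a real daemon would handle the request here
            if inode_of(path) != inode:
                return "socket removed"
        return "socket removed" if inode is None else "stopped"
    finally:
        server.close()
```

Comparing inode numbers rather than only checking that the path exists also catches the case where another instance has replaced the socket file with its own.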

Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

