Happy Sunday,

> Yes, but only if ‘pid’ hasn’t been cached before, which I think would
> mean that not a single line was logged before stopping the service.

The pid doesn't get cached if no output is read while the service is 'running.

fork+exec-command: if the pid file doesn't show up immediately, there is an
entire 1-second sleep. The logger can easily read the output while the service
is still in 'starting.

Also: if the service doesn't flush stdout, we don't get its output until it
dies ('stopping).

> Could you explain exactly how that happens (sequence of actions leading
> to the deadlock) and share the relevant /var/log/messages excerpt?

  ./shepherd --socket /tmp/s2/mysocket --config 
  GNU Shepherd 1.0.3 (Guile 3.0.9, x86_64-unknown-linux-gnu)
  Starting service root...
  Service root started.
  Service root running with value #<<process> id: 18114 command: #f>.
  Service root has been started.
  starting services...
  Configuration successfully loaded from ''.
  Starting service myservice...
  Service myservice has been started.
  Service myservice started.
  Successfully started 1 service in the background.
  Service myservice running with value #<<process> id: 18132 command: ("/tmp/a.out")>.

In another terminal:

  ./herd -s /tmp/s2/mysocket status myservice
  ./herd -s /tmp/s2/mysocket stop myservice

This works fine. More shepherd output:

  Stopping service myservice...
  Service myservice stopped.
  Service myservice is now stopped.

After that, in the other terminal, all of these hang:

  ./herd -s /tmp/s2/mysocket status myservice
  ./herd -s /tmp/s2/mysocket stop myservice
  ./herd -s /tmp/s2/mysocket start myservice
  ./herd -s /tmp/s2/mysocket status
  ./herd -s /tmp/s2/mysocket stop root

This one does not hang:

  ./herd -s /tmp/s2/mysocket status aaaaa
  herd: error: service 'aaaaa' could not be found

I have to kill -9 shepherd.

C source code for the test program is attached.
I mentioned two possibilities above, and this is scenario #2: stdout not
flushed. I also had what is probably scenario #1 with a different program.

On 3/30/25 2:44 PM, Ludovic Courtès wrote:
> Hi nathan,
>
> nathan skribis:
>
>> I definitely have a deadlock problem with Shepherd and I do believe I've found it.
>> shepherd 1.0.3
>
> Could you explain exactly how that happens (sequence of actions leading
> to the deadlock) and share the relevant /var/log/messages excerpt?
>
>> This is in service-controller when the service has been stopped:
>>   (when logger
>>     (put-message logger 'terminate))
>> But in service-builtin-logger, this is called every time a line is read:
>>   (or pid
>>       (and service
>>            (eq? 'running (service-status service))
>>            (match (service-running-value service)
>>              ((? process? process)
>>               (process-id process))
>>              (value
>>               value))))
>>
>> service-status -> service-control-message -> put-message to the service
>> The fibers documentation says put-message is blocking. Surely this is a deadlock.
>
> Yes, but only if ‘pid’ hasn’t been cached before, which I think would
> mean that not a single line was logged before stopping the service.
>
> I’ll take a closer look.
>
> Thanks for reporting it and for investigating!
>
> Ludo’.