GNU bug report logs - #36731
shepherd lost track of nginx

Package: guix

Reported by: Robert Vollmert <rob <at> vllmrt.net>

Date: Fri, 19 Jul 2019 16:50:02 UTC

Severity: normal

To reply to this bug, email your comments to 36731 AT debbugs.gnu.org.

Report forwarded to bug-guix <at> gnu.org:
bug#36731; Package guix. (Fri, 19 Jul 2019 16:50:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Robert Vollmert <rob <at> vllmrt.net>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Fri, 19 Jul 2019 16:50:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Robert Vollmert <rob <at> vllmrt.net>
To: bug-guix <at> gnu.org
Subject: shepherd lost track of nginx
Date: Fri, 19 Jul 2019 18:49:32 +0200
Not sure who’s at fault here, but without doing anything weird,
I ended up with a system where shepherd thought that nginx was
stopped, while there was still an nginx process around. I
certainly didn’t start it by hand.

The result was this:

$ sudo herd restart nginx
Service nginx is not running.
herd: exception caught while executing 'start' on service 'nginx':
Throw to key `srfi-34' with args `("#<condition &invoke-error [program: \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\" arguments: (\"-c\" \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\" \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f] 147e000>")'.

That error message could also be clearer about what’s going on. At any
rate, after I killed the nginx process, “herd start nginx” worked fine.

I should add that nginx was still doing its job fine before I killed it.





Information forwarded to bug-guix <at> gnu.org:
bug#36731; Package guix. (Fri, 19 Jul 2019 22:50:02 GMT) Full text and rfc822 format available.

Message #8 received at 36731 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Robert Vollmert <rob <at> vllmrt.net>
Cc: 36731 <at> debbugs.gnu.org
Subject: Re: bug#36731: shepherd lost track of nginx
Date: Sat, 20 Jul 2019 00:49:03 +0200
Hello,

Robert Vollmert <rob <at> vllmrt.net> skribis:

> Not sure who’s at fault here, but without doing anything weird,
> I ended up with a system where shepherd thought that nginx was
> stopped, while there was still an nginx process around. I
> certainly didn’t start it by hand.

Did you try “herd status nginx” to see shepherd’s notion of the nginx
process?

> The result was this:
>
> $ sudo herd restart nginx
> Service nginx is not running.
> herd: exception caught while executing 'start' on service 'nginx':
> Throw to key `srfi-34' with args `("#<condition &invoke-error [program: \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\" arguments: (\"-c\" \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\" \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f] 147e000>")'.

Do you use an “opaque” nginx config file, or do you use <nginx-...>
records?

In the former case, the ‘start’ method won’t attempt to read the PID
file (because it cannot be sure it’ll exist), so it’s effectively unable
to track the process.  See comment in ‘nginx-shepherd-service’.

> That error message could also be clearer about what’s going on. At any
> rate, after I killed the nginx process, “herd start nginx” worked fine.

I agree that we could and should improve the error message.  Redirecting
nginx’s stderr so that shepherd clients can see it would be best.

Thanks,
Ludo’.
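
To make the distinction in the message above concrete, here is a minimal sketch of the two configuration styles. The host name, document root, and the local file "nginx.conf" are placeholders that do not appear in this report. The `file` field is what would make a configuration "opaque" in the sense used above.

```scheme
;; Sketch only: two ways of declaring the nginx service in an
;; operating-system configuration.  All concrete values are placeholders.
(use-modules (gnu services)
             (gnu services web)
             (guix gexp))

;; Record-based (non-opaque): Guix generates nginx.conf from the records,
;; so, per the message above, the 'start' method can read the PID file.
(define record-based-nginx
  (service nginx-service-type
           (nginx-configuration
            (server-blocks
             (list (nginx-server-configuration
                    (server-name '("example.org"))        ;placeholder
                    (root "/srv/http/example.org")))))))  ;placeholder

;; Opaque: a ready-made configuration file is passed as-is.  Guix cannot
;; know where (or whether) that file declares a PID file, so the 'start'
;; method does not attempt to read one.
(define opaque-nginx
  (service nginx-service-type
           (nginx-configuration
            (file (local-file "nginx.conf")))))           ;hypothetical file
```

Either value would go in the `services` field of an `operating-system` declaration.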




Information forwarded to bug-guix <at> gnu.org:
bug#36731; Package guix. (Sat, 20 Jul 2019 07:43:01 GMT) Full text and rfc822 format available.

Message #11 received at 36731 <at> debbugs.gnu.org (full text, mbox):

From: Robert Vollmert <rob <at> vllmrt.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 36731 <at> debbugs.gnu.org
Subject: Re: bug#36731: shepherd lost track of nginx
Date: Sat, 20 Jul 2019 09:42:34 +0200

> On 20. Jul 2019, at 00:49, Ludovic Courtès <ludo <at> gnu.org> wrote:
> 
> Hello,
> 
> Robert Vollmert <rob <at> vllmrt.net> skribis:
> 
>> Not sure who’s at fault here, but without doing anything weird,
>> I ended up with a system where shepherd thought that nginx was
>> stopped, while there was still an nginx process around. I
>> certainly didn’t start it by hand.
> 
> Did you try “herd status nginx” to see shepherd’s notion of the nginx
> process?

Not at the time, no.

> 
>> The result was this:
>> 
>> $ sudo herd restart nginx
>> Service nginx is not running.
>> herd: exception caught while executing 'start' on service 'nginx':
>> Throw to key `srfi-34' with args `("#<condition &invoke-error [program: \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\" arguments: (\"-c\" \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\" \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f] 147e000>")'.
> 
> Do you use an “opaque” nginx config file, or do you use <nginx-...>
> records?

The latter I think:

     (service nginx-service-type
              (nginx-configuration
               (extra-content "…")))





Information forwarded to bug-guix <at> gnu.org:
bug#36731; Package guix. (Sat, 20 Jul 2019 13:52:02 GMT) Full text and rfc822 format available.

Message #14 received at 36731 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Robert Vollmert <rob <at> vllmrt.net>
Cc: 36731 <at> debbugs.gnu.org
Subject: Re: bug#36731: shepherd lost track of nginx
Date: Sat, 20 Jul 2019 15:51:46 +0200
Hi,

Robert Vollmert <rob <at> vllmrt.net> skribis:

>>> $ sudo herd restart nginx
>>> Service nginx is not running.
>>> herd: exception caught while executing 'start' on service 'nginx':
>>> Throw to key `srfi-34' with args `("#<condition &invoke-error [program: \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\" arguments: (\"-c\" \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\" \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f] 147e000>")'.
>> 
>> Do you use an “opaque” nginx config file, or do you use <nginx-...>
>> records?
>
> The latter I think:
>
>      (service nginx-service-type
>               (nginx-configuration
>                (extra-content "…")))

That’s actually the non-opaque variant, so shepherd should read the PID
file and it shouldn’t get it wrong.  Not sure what happened.

If you can reproduce it, it would be great to gather the output of “herd
status nginx” at the time shepherd is confused.

Thanks,
Ludo’.
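
When reproducing the problem, it may help to record, alongside the output of "herd status nginx", whether the PID file still names a live process. The following is a minimal Guile sketch of such a check. The helper `pid-file-process-alive?` is hypothetical, and the path "/var/run/nginx/pid" is an assumption based on the "-p /var/run/nginx" prefix seen in the error message; it is not confirmed in this thread.

```scheme
(use-modules (ice-9 rdelim))

(define (pid-file-process-alive? file)
  "Return #t if FILE holds the PID of a process that currently exists."
  (and (file-exists? file)
       (let* ((line (call-with-input-file file read-line))
              (pid  (and (string? line)
                         (string->number (string-trim-both line)))))
         (and pid (integer? pid) (exact? pid) (positive? pid)
              (catch 'system-error
                (lambda ()
                  (kill pid 0)          ;signal 0: existence check only
                  #t)
                (lambda args
                  ;; EPERM: the process exists but belongs to another user.
                  (= EPERM (system-error-errno args))))))))

;; Assumed location of the PID file; adjust to the actual configuration.
(display (pid-file-process-alive? "/var/run/nginx/pid"))
(newline)
```

If this prints #t while "herd status nginx" reports the service as stopped, that would match the situation described in the original report.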




Information forwarded to bug-guix <at> gnu.org:
bug#36731; Package guix. (Sat, 20 Jul 2019 23:11:01 GMT) Full text and rfc822 format available.

Message #17 received at 36731 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 36731 <at> debbugs.gnu.org, Robert Vollmert <rob <at> vllmrt.net>
Subject: Re: bug#36731: shepherd lost track of nginx
Date: Sat, 20 Jul 2019 19:07:58 -0400
Hello,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Robert Vollmert <rob <at> vllmrt.net> skribis:
>
>> The result was this:
>>
>> $ sudo herd restart nginx
>> Service nginx is not running.
>> herd: exception caught while executing 'start' on service 'nginx':
>> Throw to key `srfi-34' with args `("#<condition &invoke-error
>> [program:
>> \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\"
>> arguments: (\"-c\"
>> \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\"
>> \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f]
>> 147e000>")'.

[…]

>> That error message could also be clearer about what’s going on. At any
>> rate, after I killed the nginx process, “herd start nginx” worked fine.
>
> I agree that we could and should improve the error message.

On the subject of this error message, why was the &invoke-error
condition serialized to a string before apparently being embedded within
another exception?  In other words, why did it print:

  Throw to key `srfi-34' with args `("#<condition &invoke-error [program: \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\" arguments: (\"-c\" \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\" \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f] 147e000>")'.

instead of something closer to:

  Throw to key `srfi-34' with args `(#<condition &invoke-error [program: "/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx" arguments: ("-c" "/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf" "-p" "/var/run/nginx") exit-status: 1 term-signal: #f stop-signal: #f] 147e000>)'.

We may want to go further in this specific case to make a user-friendly
error message, but in the more general case of printing arbitrary
exceptions, eliminating that second layer of string serialization would
help make the error reports a bit nicer to read.

What do you think?

      Mark
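
The difference between the two forms quoted above can be reproduced in plain Guile. This is only an illustration of the second layer of string serialization, not the Shepherd's actual code. The condition type `&demo-error` is a hypothetical stand-in for `&invoke-error`.

```scheme
(use-modules (srfi srfi-35))

;; Hypothetical stand-in for the '&invoke-error' condition type.
(define-condition-type &demo-error &error
  demo-error?
  (program demo-error-program))

(define c (make-condition &demo-error 'program "/bin/false"))

;; First layer: the condition object is turned into a string.
(define serialized (object->string c))

;; Writing a list that contains that string escapes the inner double
;; quotes as \" (the form seen in the report).
(write (list serialized))
(newline)

;; Writing a list that contains the condition object itself has no such
;; escaping (the form suggested above).
(write (list c))
(newline)
```

The exact printed representation of the condition depends on the Guile version, so no expected output is shown here.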




Information forwarded to bug-guix <at> gnu.org:
bug#36731; Package guix. (Mon, 22 Jul 2019 10:32:01 GMT) Full text and rfc822 format available.

Message #20 received at 36731 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 36731 <at> debbugs.gnu.org, Robert Vollmert <rob <at> vllmrt.net>
Subject: Re: bug#36731: shepherd lost track of nginx
Date: Mon, 22 Jul 2019 12:31:06 +0200
Hi Mark,

Mark H Weaver <mhw <at> netris.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Robert Vollmert <rob <at> vllmrt.net> skribis:
>>
>>> The result was this:
>>>
>>> $ sudo herd restart nginx
>>> Service nginx is not running.
>>> herd: exception caught while executing 'start' on service 'nginx':
>>> Throw to key `srfi-34' with args `("#<condition &invoke-error
>>> [program:
>>> \"/gnu/store/mlg0xfbiq03s812rm3v7mrlhyngas4xp-nginx-1.17.1/sbin/nginx\"
>>> arguments: (\"-c\"
>>> \"/gnu/store/r6gl9n7pwf4npiri05qxr40vdihdm2yy-nginx.conf\" \"-p\"
>>> \"/var/run/nginx\") exit-status: 1 term-signal: #f stop-signal: #f]
>>> 147e000>")'.
>
> […]
>
>>> That error message could also be clearer about what’s going on. At any
>>> rate, after I killed the nginx process, “herd start nginx” worked fine.
>>
>> I agree that we could and should improve the error message.
>
> On the subject of this error message, why was the &invoke-error
> condition serialized to a string before apparently being embedded within
> another exception?

That serialization comes from the Shepherd when it talks to its clients
(see ‘write-reply’ in (shepherd comm)).

Normally service methods should write a human-readable message instead
of throwing an exception, but when that happens, shepherd serializes
those things so that one can at least diagnose the problem.

In this case we could use ‘report-invoke-error’ from (guix build utils)
on ‘core-updates’.

Thanks,
Ludo’.
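
As a rough sketch of the suggestion above, `report-invoke-error` can be used in an SRFI-34 `guard` to turn an `&invoke-error` condition into a readable message. This assumes a Guix revision whose (guix build utils) module exports `report-invoke-error` (at the time of this thread, only the 'core-updates' branch). The wrapper `start-or-report` is hypothetical, and whether the nginx service would be changed in this way is not stated in the thread.

```scheme
(use-modules (guix build utils)
             (srfi srfi-34))

(define (start-or-report program . args)
  "Run PROGRAM with ARGS.  Return #t on success; on a non-zero exit, print
a human-readable message to the current error port and return #f."
  (guard (c ((invoke-error? c)
             (report-invoke-error c)
             #f))
    (apply invoke program args)))

;; Example call with a command that is expected to fail.
(start-or-report "false")
```

With this approach the client would see a short description of the failed command and its exit status rather than a serialized condition object.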




This bug report was last modified 6 years and 5 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.