GNU bug report logs -
#35550
Installer: wpa_supplicant fails to start
Previous Next
Reported by: Ludovic Courtès <ludo <at> gnu.org>
Date: Fri, 3 May 2019 19:32:01 UTC
Severity: important
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
Full log
View this message in rfc822 format
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> skribis:
> So why is ‘wpa-supplicant’ marked as failing to start on the first
> attempt?
>
> The only reason I can think of is if ‘read-pid-file’ from (shepherd
> service) returns immediately and returns #f instead of a number. That
> can actually happen if the PID file exists but is empty (or contains
> garbage).
I’ve produced an ISO with the patch below and ran it on the bare metal
to get confirmation (too bad the bug doesn’t show in QEMU :-/). Indeed,
‘read-pid-file’ for /var/run/wpa_supplicant.pid systematically reads the
empty string the first time the ‘wpa-supplicant’ service is started.
(After that, if we kill the process and try to restart the service, the
problem doesn’t show up.)
[Message part 2 (text/x-patch, inline)]
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 53437b6..e21492e 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -717,9 +717,12 @@ otherwise return the number that was read (a PID)."
(let loop ()
(catch 'system-error
(lambda ()
- (string->number
- (string-trim-both
- (call-with-input-file file get-string-all))))
+ (define str
+ (call-with-input-file file get-string-all))
+
+ (local-output (l10n "read-pid-file ~s -> ~s")
+ file str)
+ (string->number (string-trim-both str)))
(lambda args
(let ((errno (system-error-errno args)))
(if (= ENOENT errno)
[Message part 3 (text/plain, inline)]
With the second patch below, I confirm that the ‘wpa-supplicant’ starts
correctly. We can see in /var/log/messages that ‘read-pid-file’ first
reads the empty string from /var/run/wpa_supplicant.pid, then tries
again, and gets a valid PID on the second attempt.
It’s surprising that the timing is always like that, and only on the
bare metal, but that’s the way it is.
It’d be great if you could do some testing with the patch below. Then I
guess we’ll push a Shepherd release with this fix.
I wonder if this could also explain
<https://issues.guix.info/issue/30993>.
Thanks,
Ludo’.
[Message part 4 (text/x-patch, inline)]
diff --git a/gnu/packages/admin.scm b/gnu/packages/admin.scm
index dfc3467bf8..e1dd248679 100644
--- a/gnu/packages/admin.scm
+++ b/gnu/packages/admin.scm
@@ -188,7 +188,8 @@ and provides a \"top-like\" mode (monitoring).")
version ".tar.gz"))
(sha256
(base32
- "1ys2w83vm62spr8bx38sccfdpy9fqmj7wfywm5k8ihsy2k61da2i"))))
+ "1ys2w83vm62spr8bx38sccfdpy9fqmj7wfywm5k8ihsy2k61da2i"))
+ (patches (search-patches "shepherd-debug.patch"))))
(build-system gnu-build-system)
(arguments
'(#:configure-flags '("--localstatedir=/var")))
diff --git a/gnu/packages/patches/shepherd-debug.patch b/gnu/packages/patches/shepherd-debug.patch
new file mode 100644
index 0000000000..2fd97cc578
--- /dev/null
+++ b/gnu/packages/patches/shepherd-debug.patch
@@ -0,0 +1,43 @@
+diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
+index 53437b6..bef8f42 100644
+--- a/modules/shepherd/service.scm
++++ b/modules/shepherd/service.scm
+@@ -715,21 +715,28 @@ number. Return #f if FILE was not created or does not contain a number;
+ otherwise return the number that was read (a PID)."
+ (define start (current-time))
+ (let loop ()
++ (define (retry)
++ (and (< (current-time) (+ start max-delay))
++ (begin
++ ;; FILE does not exist yet, so wait and try again.
++ ;; XXX: Ideally we would yield to the main event loop
++ ;; and/or use inotify.
++ (sleep 1)
++ (loop))))
++
+ (catch 'system-error
+ (lambda ()
+- (string->number
+- (string-trim-both
+- (call-with-input-file file get-string-all))))
++ (define str
++ (call-with-input-file file get-string-all))
++
++ (local-output (l10n "read-pid-file ~s -> ~s")
++ file str)
++ (or (string->number (string-trim-both str))
++ (retry)))
+ (lambda args
+ (let ((errno (system-error-errno args)))
+ (if (= ENOENT errno)
+- (and (< (current-time) (+ start max-delay))
+- (begin
+- ;; FILE does not exist yet, so wait and try again.
+- ;; XXX: Ideally we would yield to the main event loop
+- ;; and/or use inotify.
+- (sleep 1)
+- (loop)))
++ (retry)
+ (apply throw args)))))))
+
+ (define* (exec-command command
This bug report was last modified 6 years and 91 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.