GNU bug report logs - #56709
Channel opening failure with guix deploy


Package: guix

Reported by: Aleksandr Vityazev <avityazev <at> posteo.org>

Date: Fri, 22 Jul 2022 19:27:01 UTC

Severity: important

Merged with 58290

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.


From: "Thompson, David" <dthompson2 <at> worcester.edu>
To: 56709 <at> debbugs.gnu.org
Subject: bug#56709: Channel opening failure with guix deploy
Date: Wed, 18 Jan 2023 15:49:20 -0500
Hello,

This problem is strangely transient.  I've seen it happen to others
when it wasn't happening to me with the same remote machine.  Now I am
having this problem again on two different servers that I manage.  I dug
around a bit and found that calls to 'open-remote-pipe*' from
guile-ssh have some chance of failure even though the SSH session is
fine. This procedure is called many times during a deploy, so the odds
are high that one of them will fail.  I got lucky once today and had a
deploy finish but that was after many failures.  I was able to unblock
myself by hacking call sites to repeatedly call 'open-remote-pipe*' in
a loop, like this:

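    ;; Keep calling 'open-remote-pipe*' until it returns a pipe.  There is
    ;; no retry limit, so this loops forever if the call never succeeds.
    (let loop ()
      (or (false-if-exception
           (apply open-remote-pipe* session OPEN_BOTH repl-command))
          (loop)))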

I also added some 'pk' logging and found that 'open-remote-pipe*'
would typically succeed on the first or second try.  I think there
could be a bit more investigation done to better understand *why* this
happens in the first place, but as a resiliency tactic I think it
would be appropriate to write a wrapper procedure that retries a few
times before giving up.
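
As a rough illustration of such a wrapper, here is a minimal sketch in
plain Guile.  The name 'call-with-retries' and the #:max-attempts
keyword are hypothetical, made up for this example; nothing like it is
claimed to exist in Guix or guile-ssh.  It takes a thunk, so it does not
depend on guile-ssh itself:

```scheme
;; Sketch only: 'call-with-retries' is a hypothetical helper, not an
;; existing Guix or guile-ssh procedure.
(define* (call-with-retries thunk #:key (max-attempts 5))
  "Call THUNK until it returns a true value without raising an exception,
trying at most MAX-ATTEMPTS times.  Return THUNK's value on success and
raise an error once every attempt has failed."
  (let loop ((attempt 1))
    ;; 'false-if-exception' turns any exception into #f, so a failed
    ;; attempt falls through to the second branch of 'or'.
    (or (false-if-exception (thunk))
        (if (< attempt max-attempts)
            (loop (+ attempt 1))
            (error "giving up after repeated failures" max-attempts)))))
```

A call site could then presumably wrap the existing call in a thunk,
e.g. (call-with-retries (lambda () (apply open-remote-pipe* session
OPEN_BOTH repl-command))).  One trade-off of this sketch is that
'false-if-exception' discards the original error, so the final failure
does not say why the channel could not be opened.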

Thoughts?

- Dave




This bug report was last modified 2 years and 164 days ago.
