GNU bug report logs -
#53463
ci.guix.gnu.org not building the 'guix' job
Reported by: Leo Famulari <leo <at> famulari.name>
Date: Sun, 23 Jan 2022 00:57:01 UTC
Severity: important
Done: Mathieu Othacehe <othacehe <at> gnu.org>
Bug is archived. No further changes may be made.
Message #27 received at 53463 <at> debbugs.gnu.org (full text, mbox):
Hi,
Mathieu Othacehe <othacehe <at> gnu.org> skribis:
>> Oh! That indicates that it’s failing to offload to one of the
>> ‘localhost’ build machines specified in /etc/guix/machines.scm.
>> Normally there’s an SSH tunnel set up for those, but I guess it broke.
>>
>> Perhaps we can update /etc/guix/machines.scm to refer to armhf-linux
>> machines by their WireGuard IP?
>
> Seems like the right thing to do. This bit is also an unstaged change in
> the berlin maintenance repository, we should commit it. Tobias, could
> you have a look :) ?
>
> +(define powerpc64le
> + (list
> + ;; A VM donated/hosted by OSUOSL & administered by nckx.
> + ;; XXX: SSH tunnel via overdrive1:
> + ;; ssh -L 2224:p9.tobias.gr:22 hydra <at> 10.0.0.3
> + #;(build-machine
> + ;;(name "p9.tobias.gr")
> + (name "localhost")
> + (port 2224)
> + (user "hydra")
> + (systems '("powerpc64le-linux"))
> + (host-key "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJEbRxJ6WqnNLYEMNDUKFcdMtyZ9V/6oEfBFSHY8xE6A nckx"))))
IIRC this machine is now running WireGuard, Tobias? If so, could you
change this to refer to its WireGuard IP and commit it?
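For illustration, the requested change might look roughly like the fragment below. This is only a sketch: the address "10.0.0.N" is a placeholder, since the machine's actual WireGuard IP is not given in this thread. It also assumes the SSH daemon is reachable on the default port over WireGuard, so the 'port' field used for the tunnel is dropped.

```scheme
(define powerpc64le
  (list
   ;; A VM donated/hosted by OSUOSL & administered by nckx.
   ;; Sketch only: reached directly over WireGuard instead of through
   ;; the SSH tunnel on localhost:2224.
   (build-machine
    ;; ASSUMPTION: "10.0.0.N" is a placeholder, not the real address.
    (name "10.0.0.N")                   ;p9.tobias.gr
    ;; No 'port' field: assumes SSH listens on the default port (22).
    (user "hydra")
    (systems '("powerpc64le-linux"))
    (host-key "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJEbRxJ6WqnNLYEMNDUKFcdMtyZ9V/6oEfBFSHY8xE6A nckx"))))
```

After such a change, running 'guix offload test' on the head node should show whether the machine is actually reachable.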
> I also found that other machines were unreachable and commented them out:
>
> ;; CPU: 16 ARM Cortex-A72 cores
> ;; RAM: 32 GB
> - (list (build-machine
> + (list #;(build-machine
> ;;kreuzberg
> (name "10.0.0.9")
> (user "hydra")
Ricardo, could you check what’s wrong with kreuzberg?
> @@ -243,13 +256,13 @@
> ;; BeagleBoard X15 kindly hosted by Simon Josefsson.
> ;; CPU: Cortex A15 (2 cores)
> ;; RAM: 2 GB
> - (build-machine
> + #;(build-machine
> (name "10.0.0.5") ;guix-x15
> (user "hydra")
> (systems '("armhf-linux"))
> (host-key "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOfXjwCAFWeGiUoOVXEgtIeXxbtymjOTg7ph1ObMAcJ0 root <at> beaglebone"))
>
> - (build-machine
> + #;(build-machine
> (name "10.0.0.6") ;guix-x15b
> (user "hydra")
> (systems '("armhf-linux"))
Oops.
Note that it’s not necessary to comment them all out. As long as at
least one machine is available for a given system type, we’re fine:
‘guix offload’ will pick it up.
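As a sketch of what that means for the armhf-linux list: only the machine that is known to be down needs the '#;' datum comment, and any machine that still answers can stay active. Which of the two boards is really reachable is an assumption here, made purely for illustration, and the second host key is a placeholder because it does not appear in the quoted diff.

```scheme
(define armhf
  (list
   ;; BeagleBoard X15 kindly hosted by Simon Josefsson.
   ;; ASSUMPTION: treated as reachable for the sake of the example.
   (build-machine
    (name "10.0.0.5")                   ;guix-x15
    (user "hydra")
    (systems '("armhf-linux"))
    (host-key "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOfXjwCAFWeGiUoOVXEgtIeXxbtymjOTg7ph1ObMAcJ0 root <at> beaglebone"))

   ;; ASSUMPTION: treated as unreachable, so only this entry is
   ;; disabled.  '#;' makes the reader skip the next datum.
   #;(build-machine
      (name "10.0.0.6")                 ;guix-x15b
      (user "hydra")
      (systems '("armhf-linux"))
      ;; Placeholder: the real key is not shown in this thread.
      (host-key "HOST-KEY-NOT-SHOWN-IN-THREAD"))))
```

With one armhf-linux machine still listed, that system type keeps a machine to offload to.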
> Nevertheless we are hitting an offload issue here, maybe an occurrence
> of #24496. The offload mechanism should time out when a machine is
> unreachable instead of retrying over and over, causing all evaluation
> processes to hang.
Yes, though the problem here is that some architectures were left with
zero machines IIRC, so it would have failed one way or another.
Thanks!
Ludo’.
This bug report was last modified 2 years and 336 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.