GNU bug report logs - #35181
Hydra offloads often get stuck while exporting build requisites

Package: guix;

Reported by: Mark H Weaver <mhw <at> netris.org>

Date: Sun, 7 Apr 2019 16:44:02 UTC

Severity: normal

Merged with 34157

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.


Message #37 received at 35181 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 35181 <at> debbugs.gnu.org
Subject: Re: bug#35181: Hydra offloads often get stuck while exporting build
 requisites
Date: Tue, 09 Apr 2019 14:09:41 -0400
Hi Ludovic,

Ludovic Courtès <ludo <at> gnu.org> writes:

> The problem is that this is an ancient Guix.  In the meantime,
> offloading has seen relevant changes, in particular things like commit
> ed7b44370f71126087eb953f36aad8dc4c44109f which address stability issues
> with Guile-SSH (ssh dist node) that was previously used.
>
> I think we should upgrade Guix on hydra.gnu.org otherwise we’re likely
> to end up chasing old bugs.

Sure, that makes sense.  I also noticed the old Guix after writing my
last messages, so yesterday I tried updating Hydra's Guix to 0.16.0-11,
which at the time was the latest version built by Hydra.  After
updating, I quit and relaunched 'guix-daemon', as well as 'guix
publish', hydra-queue-runner, and hydra-evaluator.

With the new version of Guix, *all* offloads started failing in a
strange way: each one got stuck in a loop, endlessly printing messages
like this:

  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/0'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/0'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/1'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/2'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/0'

This is from memory because after killing the queue-runner and
cancelling the 'mozjs-60' jobs (which I had intended to start building
as a test), the nix output above is no longer visible on those pages,
and I'm not sure offhand where to look for it.

Anyway, in every offloaded build, it printed a line like the above every
few seconds, with the build slot number at the end varying.  I don't
remember if the process number varied.

This reminds me that I also ran into difficulties updating 'guix' on the
armhf build slaves, which are also currently stuck on an even more
ancient version of Guix (circa 0.12.0).

On both Hydra and its armhf build slaves, Guix is installed on top of a
Debian derivative, and both 'guix' and 'guix-daemon' are launched from
an environment without any Guix environment variable settings.  This
apparently works in ancient versions of Guix, but not recent ones.

So, could the problem simply be that the 'guix' wrapper is not
installing enough environment variable settings for offloading to work?

        Mark




This bug report was last modified 2 years and 38 days ago.

