GNU bug report logs -
#24496
offloading should fall back to local build after n tries
ng0 <ngillmann <at> runbox.com> writes:
> Ludovic Courtès <ludo <at> gnu.org> writes:
[...]
>> Like you say, on Hydra-style setup this could be a problem: the
>> front-end machine may have --max-jobs=0, meaning that it cannot perform
>> builds on its own.
>>
>> So I guess we would need a command-line option to select a different
>> behavior. I’m not sure how to do that because ‘guix offload’ is
>> “hidden” behind ‘guix-daemon’, so there’s no obvious place for such an
>> option.
>
> Could the daemon run with --enable-hydra-style or --disable-hydra-style
> and --disable-hydra-style would allow falling back to local build if
> after a defined time - keeping slow connections in mind - the machine
> did not reply.
That would be too ad-hoc IMO, and the problem mentioned above remains.
>> In the meantime, you could also hack up your machines.scm: it would
>> return a list where unreachable machines have been filtered out.
>
> How can I achieve this?
Something like:
  (define the-machine (build-machine …))

  (if (managed-to-connect-timely the-machine)
      (list the-machine)
      '())
… where ‘managed-to-connect-timely’ would try to connect to the
machine with a timeout.
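
To make the idea concrete, here is a minimal sketch of such a check for machines.scm, written in Guile. It is only an illustration under stated assumptions: the name ‘reachable-within?’, the default port 22, and the 5-second timeout are made up for this example and are not part of Guix. It takes a host name rather than the machine record, and it only tests that the TCP port accepts connections in time; it does not verify SSH authentication.

```scheme
;; Hypothetical helper for machines.scm; not part of Guix.
(define* (reachable-within? host #:key (port 22) (timeout 5))
  "Return #t if a TCP connection to HOST on PORT completes within TIMEOUT
seconds, and #f on timeout or on any error (DNS failure, refusal, ...)."
  (catch #t
    (lambda ()
      (let* ((ai   (car (getaddrinfo host (number->string port)
                                     0 AF_UNSPEC SOCK_STREAM)))
             (sock (socket (addrinfo:fam ai) SOCK_STREAM 0)))
        (dynamic-wind
          (const #t)
          (lambda ()
            ;; Make the connection attempt return immediately so that
            ;; 'select' can enforce the timeout.
            (fcntl sock F_SETFL
                   (logior O_NONBLOCK (fcntl sock F_GETFL)))
            (catch 'system-error
              (lambda ()
                (connect sock (addrinfo:addr ai)))
              (lambda args
                ;; EINPROGRESS is expected for a non-blocking connect;
                ;; anything else is a real failure.
                (unless (= EINPROGRESS (system-error-errno args))
                  (apply throw args))))
            ;; The socket becomes writable once the attempt has finished;
            ;; SO_ERROR then tells whether it succeeded.
            (let ((ready (select '() (list sock) '() timeout)))
              (and (pair? (cadr ready))
                   (zero? (getsockopt sock SOL_SOCKET SO_ERROR)))))
          (lambda ()
            (close-port sock)))))
    ;; Treat every error as "not reachable".
    (lambda _ #f)))

;; Possible use, with a placeholder host name and THE-MACHINE defined
;; as in the snippet above:
;;
;;   (if (reachable-within? "build1.example.org" #:timeout 5)
;;       (list the-machine)
;;       '())
```

Note that the check would run each time machines.scm is evaluated, so an unreachable machine could delay that evaluation by up to the timeout.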
> And to append to this bug: it seems to me that offloading requires 1
> lsh-key for each build-machine.
The main machine needs to be able to connect to each build machine over
SSH, so indeed, that requires proper SSH key registration (host keys and
authorized user keys).
> (https://lists.gnu.org/archive/html/help-guix/2016-10/msg00007.html)
> and that you can not directly address them (say I want to create some
> system where I want to build on machine 1 AND machine 2. Having 2
> x86_64 in machines.scm only selects one of them (if 2 were working,
> see linked thread) and builds on the one which is accessible first. If
> however the first machine is somehow blocked and it fails, therefore
> terminates lsh connection, the build does not happen at all.
The code that selects machines is in (guix scripts offload),
specifically ‘choose-build-machine’. It tries to choose the “best”
machine, which means, roughly, the fastest and least loaded one.
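
As a rough illustration of what “fastest and least loaded” can mean, here is a small self-contained Guile sketch. It is not the code of ‘choose-build-machine’: the ‘<candidate>’ record, the ‘candidate-score’ formula (speed divided by one plus load), and ‘choose-best’ are all assumptions made up for this example.

```scheme
(use-modules (srfi srfi-1)    ;reduce
             (srfi srfi-9))   ;define-record-type

;; Hypothetical record; Guix's own machine records are different.
(define-record-type <candidate>
  (make-candidate name speed load)
  candidate?
  (name  candidate-name)      ;string
  (speed candidate-speed)     ;relative speed factor, higher is faster
  (load  candidate-load))     ;current load, lower is better

(define (candidate-score c)
  ;; Assumed ranking: a fast, idle machine scores highest.
  (/ (candidate-speed c) (+ 1 (candidate-load c))))

(define (choose-best candidates)
  "Return the highest-scoring candidate, or #f if CANDIDATES is empty."
  (and (pair? candidates)
       (reduce (lambda (c best)
                 (if (> (candidate-score c) (candidate-score best))
                     c
                     best))
               #f
               candidates)))

;; Example with made-up numbers:
;;
;;   (candidate-name
;;    (choose-best (list (make-candidate "machine-1" 2.0 0.5)
;;                       (make-candidate "machine-2" 1.0 0.0))))
```

With a ranking like this, only one machine is picked per build, which matches the behavior described in the quoted message.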
HTH,
Ludo’.
This bug report was last modified 22 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.