GNU bug report logs - #61646
Bandwidth-induced offload timeout abort whole operating

Package: guix

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Mon, 20 Feb 2023 03:29:02 UTC

Severity: normal

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 61646 <at> debbugs.gnu.org
Subject: bug#61646: Bandwidth-induced offload timeout abort whole operating
Date: Fri, 24 Feb 2023 21:46:29 -0500
Hi Ludovic,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Maxim,
>
> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> writes:
>
>> I can reproduce this rather easily on my system:
>>
>> $ ./pre-inst-env guix build icedove
>> The following derivations will be built:
>>   /gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv
>>   /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv
>>   /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv
>> process 19542 acquired build slot '/var/guix/offload/localhost:6666/0'
>> normalized load on machine 'localhost' is 0.08
>> building /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv...
>> process 19548 acquired build slot '/var/guix/offload/localhost:6666/1'
>> normalized load on machine 'localhost' is 0.08
>> building /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv...
>> guix offload: sending 1 store item (558 MiB) to 'localhost'...
>> exporting path `/gnu/store/bwb5hcdyzgq16kmbsva7ax0zq6lzg78z-icedove-102.7.2.tar.xz'
>> guix offload: error: failed to connect to 'localhost': Timeout connecting to localhost
>> cannot build derivation
>> `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv': 1
>> dependencies couldn't be built
>> guix build: error: build of
>>   `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv' failed
>>
>> The third derivation tries to get a build slot and times out, because
>> the first two have already saturated the bandwidth of the link and it
>> takes more time than expected to get a reply.
>
> Weird.  Since it’s a timeout while connecting, I suppose the patch
> below would improve the situation:
>
> diff --git a/guix/scripts/offload.scm b/guix/scripts/offload.scm
> index 578b3b9888..90cf97401c 100644
> --- a/guix/scripts/offload.scm
> +++ b/guix/scripts/offload.scm
> @@ -220,7 +220,7 @@ (define* (open-ssh-session machine #:optional max-silent-time)
>          (session (make-session #:user (build-machine-user machine)
>                                 #:host (build-machine-name machine)
>                                 #:port (build-machine-port machine)
> -                               #:timeout 10       ;initial timeout (seconds)
> +                               #:timeout 30       ;initial timeout (seconds)
>                                 ;; #:log-verbosity 'protocol
>                                 #:identity (build-machine-private-key machine)

Hm, how can I test this again?

I tried launching a daemon both on the remote and locally, with
something like:

sudo -E ./pre-inst-env ./guix-daemon --build-users-group guixbuild \
  --max-silent-time 0 --timeout 0 --log-compression none --discover=yes \
  --substitute-urls "https://ci.guix.gnu.org https://bordeaux.guix.gnu.org" \
  --max-jobs=20

but the edited code doesn't seem to run: I put an (error 'hello) in
there and nothing happened.
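
To illustrate the failure mode outside of Guix, here is a minimal shell
sketch of the idea behind the patch: give a slow connection attempt a
longer deadline instead of aborting on the first timeout.  The helper
name retry_with_deadline and the tripling factor (chosen to mirror the
10 to 30 second change) are assumptions made for illustration; this is
not how guix offload is implemented.  It relies only on the coreutils
`timeout' command, which exits with status 124 when the deadline
expires.

```shell
#!/bin/sh
# Hypothetical helper, not part of Guix: run a command under a deadline
# and, if it times out, retry with a deadline three times longer.
# Usage: retry_with_deadline INITIAL_SECONDS MAX_ATTEMPTS COMMAND [ARGS...]
retry_with_deadline() {
    deadline=$1
    attempts=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        status=0
        # coreutils `timeout' exits with 124 when the deadline expires.
        timeout "$deadline" "$@" || status=$?
        if [ "$status" -ne 124 ]; then
            # The command finished in time; pass its own status through.
            return "$status"
        fi
        echo "attempt $i timed out after ${deadline}s" >&2
        deadline=$((deadline * 3))
        i=$((i + 1))
    done
    # Every attempt timed out.
    return 124
}
```

For example, `retry_with_deadline 10 2 ssh -p 6666 localhost true' would
allow 10 seconds for the first attempt and 30 seconds for the second,
using the host and port from the log above.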

-- 
Thanks,
Maxim



