GNU bug report logs - #41625
Sporadic guix-offload crashes due to EOF errors

Package: guix

Reported by: Marius Bakke <marius <at> gnu.org>

Date: Sun, 31 May 2020 09:52:01 UTC

Severity: normal

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Marius Bakke <marius <at> gnu.org>
Cc: 41625 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>
Subject: bug#41625: [PATCH v2] offload: Handle a possible EOF response from read-repl-response.
Date: Thu, 27 May 2021 07:51:01 -0400

Hi Marius,

Marius Bakke <marius <at> gnu.org> writes:

> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> writes:
>
>>> Is running ‘guix offload test /etc/guix/machines.scm overdrive1’ on
>>> berlin enough to reproduce the issue?  If so, we could monitor/strace
>>> sshd on overdrive1 to get a better understanding of what’s going on.
>>
>> It's actually difficult to trigger it; it seems to happen mostly on the
>> first try after a long time without connecting to the machine; on the
>> 2nd and later tries, everything is smooth.  Waiting a few minutes is not
>> enough to re-trigger the problem.
>>
>> I've managed to see the problem a few lucky times with:
>>
>> --8<---------------cut here---------------start------------->8---
>> while true; do guix offload test /etc/guix/machines.scm overdrive1; done
>> --8<---------------cut here---------------end--------------->8---
>
> I used to be able to reproduce it by inducing a high load on the target
> machine and just let Guix keep trying to connect.  But now I did that,
> and set overload threshold to 0.0 for good measure, and Guix has been
> waiting patiently for two hours without failure.
>
> So AFAICT this bug has been fixed.  Perhaps Berlin or the Overdrive
> simply needs to be updated?

Ah!  Do you have root access to overdrive1?  It'd be interesting to
reconfigure it to update the guix-daemon and see if the problem
vanishes.

Thanks for the information!

Maxim




This bug report was last modified 3 years and 54 days ago.
