GNU bug report logs - #55441
Use of 'primitive-fork' in (guix inferior) leads to hangs in 'cuirass evaluate'

Package: guix;

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Mon, 16 May 2022 03:50:02 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 55441 <at> debbugs.gnu.org, Mathieu Othacehe <othacehe <at> gnu.org>
Subject: bug#55441: [cuirass] hang in "In progress..."; runs out of pgsql connections
Date: Mon, 16 May 2022 13:32:26 -0400

Hello,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Ludovic Courtès <ludo <at> gnu.org> skribis:
>
>> I’ve added a missing call to ‘close-inferior’.  It’s a good idea, though
>> I’m not entirely convinced yet it’ll solve the problem:
>>
>>   https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=f087aaf685dbc7cc18f0254895f4a4b0dfaba631
>
> I tested it like this and it ran to completion without hanging:
>
> ludo <at> berlin ~$ guix build cuirass --with-branch=cuirass=master
> […]
> /gnu/store/sqsf2q3qf9d485mcw6lm14abwr54na01-cuirass-git.master
> ludo <at> berlin ~$  sudo su - cuirass -s /bin/sh -c "/gnu/store/sqsf2q3qf9d485mcw6lm14abwr54na01-cuirass-git.master/bin/cuirass evaluate dbname=cuirass 323183"
> Password:
> Computing Guix derivation for 'x86_64-linux'... \
> 2022-05-16T12:39:18 Registering builds for evaluation 323183.
> 2022-05-16T12:40:34 Registering builds for evaluation 323183.
> GC Warning: Repeated allocation of very large block (appr. size 14385152):
>         May lead to memory leak and poor performance
> 2022-05-16T12:41:25 Registering builds for evaluation 323183.
> 2022-05-16T12:44:35 Registering builds for evaluation 323183.
>
> Should we try and deploy this commit on berlin?
>
> Maxim, how can we proceed?

Berlin was reconfigured with this commit of Cuirass and is now running
the derivations with it, but so far the evaluation is still shown as
"In progress..." after more than 100 minutes [0].

[0]  https://ci.guix.gnu.org/eval/325592

Looking in /var/log/cuirass.log, I can see:

--8<---------------cut here---------------start------------->8---
2022-05-16 15:57:41 2022-05-16T15:57:41 next evaluation in 300 seconds
2022-05-16 15:59:02 Uncaught exception in fiber ##f:
2022-05-16 15:59:02 In cuirass/base.scm:
2022-05-16 15:59:02    726:13  3 (_)
2022-05-16 15:59:02 In ice-9/boot-9.scm:
2022-05-16 15:59:02   1752:10  2 (with-exception-handler _ _ #:unwind? _ # _)
2022-05-16 15:59:02   1685:16  1 (raise-exception _ #:continuable? _)
2022-05-16 15:59:02   1683:16  0 (raise-exception _ #:continuable? _)
2022-05-16 15:59:02 ice-9/boot-9.scm:1683:16: In procedure raise-exception:
2022-05-16 15:59:02 ERROR:
2022-05-16 15:59:02   1. &evaluation-error:
2022-05-16 15:59:02       name: "core"
2022-05-16 15:59:02       id: 325375
2022-05-16 16:02:41 2022-05-16T16:02:41 Fetching channels for spec 'core'.
--8<---------------cut here---------------end--------------->8---

But it seems this should only affect another job specification ('core')?

Earlier in the log, there were also:

--8<---------------cut here---------------start------------->8---
2022-05-16 15:41:34 Uncaught exception in fiber ##f:
2022-05-16 15:41:34 In cuirass/base.scm:
2022-05-16 15:41:34    726:13  3 (_)
2022-05-16 15:41:34 In ice-9/boot-9.scm:
2022-05-16 15:41:34   1752:10  2 (with-exception-handler _ _ #:unwind? _ # _)
2022-05-16 15:41:34   1685:16  1 (raise-exception _ #:continuable? _)
2022-05-16 15:41:34   1683:16  0 (raise-exception _ #:continuable? _)
2022-05-16 15:41:34 ice-9/boot-9.scm:1683:16: In procedure raise-exception:
2022-05-16 15:41:34 ERROR:
2022-05-16 15:41:34   1. &evaluation-error:
2022-05-16 15:41:34       name: "purge-python2-packages"
2022-05-16 15:41:34       id: 325332
--8<---------------cut here---------------end--------------->8---

and

--8<---------------cut here---------------start------------->8---
2022-05-16 15:05:27 Uncaught exception in fiber ##f:
2022-05-16 15:05:27 In cuirass/base.scm:
2022-05-16 15:05:27    726:13  3 (_)
2022-05-16 15:05:27 In ice-9/boot-9.scm:
2022-05-16 15:05:27   1752:10  2 (with-exception-handler _ _ #:unwind? _ # _)
2022-05-16 15:05:27   1685:16  1 (raise-exception _ #:continuable? _)
2022-05-16 15:05:27   1683:16  0 (raise-exception _ #:continuable? _)
2022-05-16 15:05:27 ice-9/boot-9.scm:1683:16: In procedure raise-exception:
2022-05-16 15:05:27 ERROR:
2022-05-16 15:05:27   1. &evaluation-error:
2022-05-16 15:05:27       name: "master"
2022-05-16 15:05:27       id: 324938
--8<---------------cut here---------------end--------------->8---

and

--8<---------------cut here---------------start------------->8---
2022-05-16 13:02:07 Uncaught exception in fiber ##f:
2022-05-16 13:02:07 In cuirass/base.scm:
2022-05-16 13:02:07    726:13  3 (_)
2022-05-16 13:02:07 In ice-9/boot-9.scm:
2022-05-16 13:02:07   1752:10  2 (with-exception-handler _ _ #:unwind? _ # _)
2022-05-16 13:02:07   1685:16  1 (raise-exception _ #:continuable? _)
2022-05-16 13:02:07   1683:16  0 (raise-exception _ #:continuable? _)
2022-05-16 13:02:07 ice-9/boot-9.scm:1683:16: In procedure raise-exception:
2022-05-16 13:02:07 ERROR:
2022-05-16 13:02:07   1. &evaluation-error:
2022-05-16 13:02:07       name: "guix"
2022-05-16 13:02:07       id: 324937
--8<---------------cut here---------------end--------------->8---

I don't know whether these are related; probably not, since their
timestamps are more than 3 hours old, while the last derivations
were started less than 1 hour 30 minutes ago.
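
To make it easier to compare these errors against a given evaluation,
the log excerpts can be condensed into (timestamp, spec name, id)
tuples.  Below is a minimal Python sketch, assuming the log lines look
exactly like the excerpts above; 'evaluation_errors' is a hypothetical
helper written for illustration and is not part of Cuirass.

```python
import re

# Start of a backtrace; captures the leading log timestamp.
START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Uncaught exception in fiber")
# Fields of the &evaluation-error condition, as printed in the log.
NAME = re.compile(r'name: "([^"]+)"')
ID = re.compile(r"\bid: (\d+)")


def evaluation_errors(lines):
    """Return (timestamp, spec name, evaluation id) per reported error."""
    errors = []
    timestamp = name = None
    for line in lines:
        m = START.match(line)
        if m:
            # A new backtrace begins; forget any half-parsed one.
            timestamp, name = m.group(1), None
            continue
        if timestamp is None:
            continue  # not inside a backtrace
        m = NAME.search(line)
        if m:
            name = m.group(1)
            continue
        m = ID.search(line)
        if m and name is not None:
            errors.append((timestamp, name, int(m.group(1))))
            timestamp = name = None
    return errors


# Sample taken from the first excerpt above (backtrace frames omitted).
sample = """\
2022-05-16 15:57:41 2022-05-16T15:57:41 next evaluation in 300 seconds
2022-05-16 15:59:02 Uncaught exception in fiber ##f:
2022-05-16 15:59:02 In cuirass/base.scm:
2022-05-16 15:59:02 ERROR:
2022-05-16 15:59:02   1. &evaluation-error:
2022-05-16 15:59:02       name: "core"
2022-05-16 15:59:02       id: 325375
2022-05-16 16:02:41 2022-05-16T16:02:41 Fetching channels for spec 'core'.
""".splitlines()

for entry in evaluation_errors(sample):
    print(entry)
```

Run against the whole of /var/log/cuirass.log, this should list every
failed evaluation together with its specification, which would show
directly whether the evaluation that is stuck appears among them.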

Thanks,

Maxim




This bug report was last modified 2 years and 121 days ago.
