GNU bug report logs -
#71238
Installer image consistently fails to run system init due to TLS error
Hi all,
On 29/05/2024 01:44, Richard Sent <richard <at> freakingpenguin.com> wrote:
> Richard Sent <richard <at> freakingpenguin.com> writes:
>
> > 1. There was a transient network issue for ~3 hours when I attempted to
> > install Guix ~4 times using different installation media that caused a
> > specific TLS handshake to fail.
> >
> > 2. A specific TLS handshake Guix undertakes during the installation
> > process fails to pass one of the built-in firewall rules shipped with
> > opnsense.
> >
> > 3. Some other odd aspect of my network messes things up for a specific
> > TLS handshake.
> >
> > My money is on 2 given how this is a seemingly common issue on
> > enterprise networks [1] and the rules I have added seem irrelevant. (I'd
> > rather not talk openly about my firewall rules in an archived public
> > forum, but can discuss off-list). However, there is another comment in
> > that thread that says IT didn't notice any firewall blocking.
>
> I ran the 1.4.0 installer again today behind my opnsense router and it
> completed successfully, which is horrifying. I was hoping starting from
> a constant image would make the error reproducible but that doesn't seem
> to be the case. Even with a consistent system image and network, it's
> only reproducible for somewhere between a few hours and one day. Perhaps
> server load plays a part?
>
> (Technically my process was a little bit different. Instead of fully
> completing the graphical installer I swapped to a TTY after activating
> the wired connection, mounted the root fs, and run $ guix system build
> /mnt/etc/config.scm, where config.scm was unmodified since initial
> installation. I'd be stunned if this caused the change in behavior but
> figured I'd mention for completeness.)
>
>
I've managed to reproduce this bug. First, I run `sudo guix gc delete-generations && guix gc -d 2w` to clear my store. Then I run `guix upgrade && sudo guix system -L /home/ada/dotfiles/guix/ reconfigure --fallback /home/ada/dotfiles/guix/ada/system/kissakoira.scm` to redownload all of the deleted store items. About 9 times out of 10, the process fails halfway through the upgrade. A retry then works without a hitch. Even garbage-collecting my system again does not let me reproduce the bug; I need to restart my system first. After that, the likelihood that it works is about 7/10 until the next day (just my perception). By the way, this is on my university's network.
I managed to capture the problem happening under strace using this command: `strace -ff -tt -o log_up.strace -s 500 guix upgrade && sudo strace -ff -tt -o log_sr.strace -s 500 sudo guix system -L /home/ada/dotfiles/guix/ reconfigure --fallback /home/ada/dotfiles/guix/ada/system/kissakoira.scm`. I've uploaded the logs to my Google Drive[1]. You can use `strace-log-merge log_up.strace` to view the merged logs.
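For anyone skimming the logs, here is a minimal sketch of one way to pull out failed network-related syscalls. It assumes the per-process files are named `PREFIX.PID`, as `strace -ff -o PREFIX` writes them. The function name `failed_net_calls` is hypothetical, and the list of syscalls is only a guess at what matters for a TLS failure. It is a starting point, not a diagnosis.

```shell
#!/bin/sh
# Sketch: list failed network-related syscalls from per-process strace logs.
# Assumption: log files are named PREFIX.PID (output of `strace -ff -o PREFIX`).
# Assumption: the syscall list below is a guess at what is relevant here.

failed_net_calls() {
    prefix=$1
    # Match lines such as:
    #   12:00:01.123456 connect(3, {...}, 16) = -1 ECONNRESET (Connection reset by peer)
    # EAGAIN, EINPROGRESS and EINTR are routine on non-blocking sockets, so
    # they are filtered out to reduce noise.
    grep -HE '(^| )(connect|sendto|recvfrom|sendmsg|recvmsg|read|write)\(.*= -1 E' "$prefix".* \
        | grep -vE '= -1 (EAGAIN|EINPROGRESS|EINTR) '
}
```

Running `failed_net_calls log_up.strace` prints each matching line prefixed with the file it came from, so the PID suffix shows which process hit the error.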
As I can reproduce this error fairly consistently now, please let me know if you want me to run any more tools to capture more data.
Warmly,
Ada
[1] https://drive.google.com/file/d/104DVqyMLGRi4imWzvFQ6TahAiRRKdR4_/view?usp=drive_link
This bug report was last modified 338 days ago.