GNU bug report logs - #74501
ntpd segfaults at boot (take 2)

Package: guix

Reported by: Fredrik Salomonsson <plattfot <at> posteo.net>

Date: Sun, 24 Nov 2024 00:33:01 UTC

Severity: normal

To reply to this bug, email your comments to 74501 AT debbugs.gnu.org.


Report forwarded to bug-guix <at> gnu.org:
bug#74501; Package guix. (Sun, 24 Nov 2024 00:33:02 GMT)

Acknowledgement sent to Fredrik Salomonsson <plattfot <at> posteo.net>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 24 Nov 2024 00:33:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Fredrik Salomonsson <plattfot <at> posteo.net>
To: bug-guix <at> gnu.org
Subject: ntpd segfaults at boot (take 2)
Date: Sun, 24 Nov 2024 00:32:34 +0000
Hi,

Similar to issue 73873 [0], I'm also seeing ntpd segfault at boot, and
it looks to be due to IPv6.
In /var/log/messages:
--8<---------------cut here---------------start------------->8---
Nov 23 16:13:41 localhost ntpd[1019]: ntpd 4.2.8p18 <at> 1.4062-o Thu Jan  1 00:00:01 UTC 1970 (1): Starting
Nov 23 16:13:41 localhost ntpd[1019]: Command line: /gnu/store/s4ra0g0ym1q1wh5jrqs60092x1nrb8h9-ntp-4.2.8p18/bin/ntpd -n -c /gnu/store/ghh3m9wzraszf7p4ynac006x96svddbq-ntpd.conf -u ntpd -g
Nov 23 16:13:41 localhost ntpd[1019]: ----------------------------------------------------
Nov 23 16:13:41 localhost ntpd[1019]: ntp-4 is maintained by Network Time Foundation,
Nov 23 16:13:41 localhost ntpd[1019]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
Nov 23 16:13:41 localhost ntpd[1019]: corporation.  Support and training for ntp-4 are
Nov 23 16:13:41 localhost ntpd[1019]: available at https://www.nwtime.org/support
Nov 23 16:13:41 localhost ntpd[1019]: ----------------------------------------------------
Nov 23 16:13:41 localhost ntpd[1019]: DEBUG behavior is enabled - a violation of any diagnostic assertion will cause ntpd to abort
Nov 23 16:13:41 localhost ntpd[1019]: proto: precision = 0.040 usec (-24)
Nov 23 16:13:41 localhost ntpd[1019]: baseday_set_day: invalid day (25556), UNIX epoch substituted
Nov 23 16:13:41 localhost ntpd[1019]: basedate set to 1970-01-01
Nov 23 16:13:41 localhost ntpd[1019]: gps base set to 1980-01-06 (week 0)
Nov 23 16:13:41 localhost ntpd[1019]: Listen and drop on 0 v6wildcard [::]:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 2 lo 127.0.0.1:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 3 enp37s0 192.168.1.8:123
Nov 23 16:13:41 localhost vmunix: [   22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 4 lo [::1]:123
Nov 23 16:13:41 localhost vmunix: [   22.649529] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
Nov 23 16:13:41 localhost ntpd[1019]: bind(21) AF_INET6 [2001:REDACTED:cedf]:123 flags 0x11 failed: Cannot assign requested address
Nov 23 16:13:41 localhost ntpd[1019]: unable to create socket on enp37s0 (5) for [2001:REDACTED:cedf]:123
Nov 23 16:13:41 localhost shepherd[1]: Service ntpd (PID 1019) terminated with signal 11. 
Nov 23 16:13:41 localhost shepherd[1]: Service ntpd has been disabled. 
Nov 23 16:13:41 localhost shepherd[1]:   (Respawning too fast.) 
--8<---------------cut here---------------end--------------->8---

And `sudo dmesg`:

--8<---------------cut here---------------start------------->8---
[   21.871447] ntpd[954]: segfault at 24 ip 000055abbdf0029b sp 00007ffebf673770 error 4 in ntpd[7f29b,55abbde93000+86000] likely on CPU 7 (core 9, socket 0)
[   21.871453] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.002809] ntpd[1005]: segfault at 24 ip 000055ac349d229b sp 00007fff8be14a00 error 4 in ntpd[7f29b,55ac34965000+86000] likely on CPU 12 (core 0, socket 0)
[   22.002863] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.131272] ntpd[1008]: segfault at 24 ip 0000556dc1ad529b sp 00007ffef46b9d50 error 4 in ntpd[7f29b,556dc1a68000+86000] likely on CPU 3 (core 3, socket 0)
[   22.132111] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.264012] ntpd[1011]: segfault at 24 ip 000055e02824f29b sp 00007fffa1e29970 error 4 in ntpd[7f29b,55e0281e2000+86000] likely on CPU 4 (core 4, socket 0)
[   22.264019] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.390893] ntpd[1014]: segfault at 24 ip 0000555b2757129b sp 00007ffe2d0ea050 error 4 in ntpd[7f29b,555b27504000+86000] likely on CPU 4 (core 4, socket 0)
[   22.390898] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.517794] ntpd[1016]: segfault at 24 ip 000056387455529b sp 00007ffde75cabf0 error 4 in ntpd[7f29b,5638744e8000+86000] likely on CPU 4 (core 4, socket 0)
[   22.518953] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)
[   22.649529] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
--8<---------------cut here---------------end--------------->8---
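
All of the segfault lines above share the same fault address, error code, and bracketed offset, which suggests the crash is deterministic. A minimal sketch for checking that, assuming the `segfault at ADDR ip IP sp SP error CODE in NAME[OFFSET,BASE+SIZE]` layout seen in these lines; the helper name `parse_segfault` and the field names are hypothetical:

```python
import re

# Field layout assumed from the dmesg lines in this report; all numbers are hex.
SEGFAULT_RE = re.compile(
    r"(?P<name>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+) "
    r"in (?P<object>[^\[]+)\[(?P<offset>[0-9a-f]+),"
    r"(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Parse one kernel segfault line into a dict of integers and names."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a segfault line")
    fields = match.groupdict()
    parsed = {
        "name": fields["name"],
        "pid": int(fields["pid"]),
        "object": fields["object"],
    }
    for key in ("addr", "ip", "sp", "error", "offset", "base", "size"):
        parsed[key] = int(fields[key], 16)
    # Position of the faulting instruction relative to the start of the mapping.
    parsed["ip_minus_base"] = parsed["ip"] - parsed["base"]
    return parsed


FIRST = (
    "[   21.871447] ntpd[954]: segfault at 24 ip 000055abbdf0029b "
    "sp 00007ffebf673770 error 4 in ntpd[7f29b,55abbde93000+86000] "
    "likely on CPU 7 (core 9, socket 0)"
)
LAST = (
    "[   22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b "
    "sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] "
    "likely on CPU 0 (core 0, socket 0)"
)

first, last = parse_segfault(FIRST), parse_segfault(LAST)
print(hex(first["ip_minus_base"]), hex(last["ip_minus_base"]))
```

In both lines `ip - base` comes out the same, and it differs from the bracketed `7f29b` by 0x12000. That would be consistent with the bracketed number being an offset into the file rather than into the mapping, but that reading is an assumption here.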

It has been doing that since around the time of issue 73873 [0].  I
double-checked, and it does use the 2.guix.pool.ntp.org pool.  I also
reverted to 0.guix.pool.ntp.org in case that would work for some reason.
ntpd segfaults with both.  Did 2.guix.pool.ntp.org stop supporting IPv6?

Thanks

[0] https://issues.guix.gnu.org/73873
-- 
s/Fred[re]+i[ck]+/Fredrik/g




Information forwarded to bug-guix <at> gnu.org:
bug#74501; Package guix. (Sun, 15 Dec 2024 00:53:02 GMT)

Message #8 received at 74501 <at> debbugs.gnu.org:

From: "Danny Milosavljevic" <dannym <at> scratchpost.org>
To: 74501 <at> debbugs.gnu.org
Subject: Problem confirmed
Date: Sun, 15 Dec 2024 01:52:46 +0100 (CET)
Hi,

I also have this problem on x86_64 znver3.

I disassembled my "Code:" block and I get:

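8b 04 25 28 00 00 00    mov    eax, DWORD PTR ds:0x28      ; dump starts mid-instruction; probably the tail of "mov rax, QWORD PTR fs:0x28" (stack canary load)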
48 89 44 24 08          mov    QWORD PTR [rsp+0x8], rax
31 c0                   xor    eax, eax
e8 dc 2d f9 ff          call   <relative_address>
44 8b 28                mov    r13d, DWORD PTR [rax]
48 89 c5                mov    rbp, rax
e8 61 9e ff ff          call   <relative_address>
49 89 c4                mov    r12, rax
48 85 db                test   rbx, rbx
0f 84 e5 00 00 00       je     <forward_jump>
<44> 0f b7 0b           movzx  r9d, WORD PTR [rbx]         ; <-- This is where <44> is
66 41 83 f9 02          cmp    r9w, 0x2
0f 84 f6 00 00 00       je     <forward_jump>
66 41 83 f9 0a          cmp    r9w, 0xa
74 57                   je     <forward_jump>

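The 0x44 byte is the REX prefix of this instruction (with REX.R set), which selects the extended register r9d as the destination. The kernel puts angle brackets around the byte at the faulting instruction pointer, so this movzx is the instruction that faulted.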
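
The marker and the prefix can be checked mechanically. A minimal sketch, assuming the space-separated hex format of the "Code:" lines above with one byte wrapped in angle brackets; the helper names `split_code_dump`, `decode_rex`, and `decode_modrm` are hypothetical:

```python
import re

# "Code:" dump copied from the kernel log lines in this report.
CODE_DUMP = (
    "8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 "
    "48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> "
    "0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57"
)


def split_code_dump(dump):
    """Split a dump into (bytes before the marked byte, bytes from it on)."""
    before, after = [], []
    seen_marker = False
    for token in dump.split():
        marked = re.fullmatch(r"<([0-9a-f]{2})>", token)
        if marked:
            seen_marker = True
            token = marked.group(1)
        (after if seen_marker else before).append(int(token, 16))
    if not seen_marker:
        raise ValueError("no <xx> marker in dump")
    return bytes(before), bytes(after)


def decode_rex(byte):
    """Decode a REX prefix (bit pattern 0100WRXB) into its four flag bits."""
    if byte & 0xF0 != 0x40:
        raise ValueError("not a REX prefix")
    return {"W": (byte >> 3) & 1, "R": (byte >> 2) & 1,
            "X": (byte >> 1) & 1, "B": byte & 1}


def decode_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


before, after = split_code_dump(CODE_DUMP)
rex = decode_rex(after[0])              # after starts with 44 0f b7 0b
mod, reg, rm = decode_modrm(after[3])   # ModRM byte of the movzx
dest_register = (rex["R"] << 3) | reg   # register number 9 is r9
base_register = (rex["B"] << 3) | rm    # register number 3 is rbx
print(len(before), rex, dest_register, base_register)
```

REX.R extends the ModRM reg field, which is how register number 1 becomes r9, while the base register stays rbx because REX.B is clear.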

The error code is a combination of several error bits defined in fault.c in the Linux kernel:

/*
 * Page fault error code bits:
 *
 *   bit 0 ==    0: no page found       1: protection fault
 *   bit 1 ==    0: read access         1: write access
 *   bit 2 ==    0: kernel-mode access  1: user-mode access
 *   bit 3 ==                           1: use of reserved bit detected
 *   bit 4 ==                           1: fault was an instruction fetch
 *   bit 5 ==                           1: protection keys block access
 *   bit 6 ==                           1: shadow stack access fault
 *   bit 15 =                           1: SGX MMU page-fault
 */
enum x86_pf_error_code {
        X86_PF_PROT     =               1 << 0,
        X86_PF_WRITE    =               1 << 1,
        X86_PF_USER     =               1 << 2,
        X86_PF_RSVD     =               1 << 3,
        X86_PF_INSTR    =               1 << 4,
        X86_PF_PK       =               1 << 5,
        X86_PF_SHSTK    =               1 << 6,
        X86_PF_SGX      =               1 << 15,
};

Since ntpd is a user-mode program, X86_PF_USER is set and the error code is at least 4.

If the error code is 4, then the faulty memory access is a read from user space.

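In total, error code 4 decodes to:

- User-mode access.
- Read access.
- No page found.

Combined with "segfault at 24" in the log, the faulting movzx read from address 0x24, so rbx held 0x24: a small non-NULL value that passes the preceding "test rbx, rbx" check but points into unmapped low memory.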
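
The decoding above can be sketched in a few lines. This is an illustration only: the constants mirror the first five entries of the quoted enum, and the function name `describe_page_fault` is hypothetical.

```python
# Bit values mirror the x86_pf_error_code enum quoted above.
X86_PF_PROT = 1 << 0
X86_PF_WRITE = 1 << 1
X86_PF_USER = 1 << 2
X86_PF_RSVD = 1 << 3
X86_PF_INSTR = 1 << 4


def describe_page_fault(error_code):
    """Return human-readable descriptions of the bits set in error_code."""
    parts = [
        "protection fault" if error_code & X86_PF_PROT else "no page found",
        "write access" if error_code & X86_PF_WRITE else "read access",
        "user-mode access" if error_code & X86_PF_USER else "kernel-mode access",
    ]
    # The remaining bits only add information when they are set.
    if error_code & X86_PF_RSVD:
        parts.append("use of reserved bit detected")
    if error_code & X86_PF_INSTR:
        parts.append("fault was an instruction fetch")
    return parts


# The log prints "error 4"; the value is parsed as hexadecimal here.
print(describe_page_fault(int("4", 16)))
```

For error code 4 only the user-mode bit is set, so the other two descriptions come from the "bit is 0" column of the kernel comment.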




This bug report was last modified 182 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.