Package: guix;
Reported by: Robert Vollmert <rob <at> vllmrt.net>
Date: Fri, 30 Aug 2019 17:33:02 UTC
Severity: important
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
Message #15 received at 37237-done <at> debbugs.gnu.org (full text, mbox):
From: Ludovic Courtès <ludo <at> gnu.org> To: Robert Vollmert <rob <at> vllmrt.net> Cc: 37237-done <at> debbugs.gnu.org Subject: Re: bug#37237: mcron randomly stops running jobs Date: Wed, 18 Mar 2020 12:10:27 +0100
Hi, Robert Vollmert <rob <at> vllmrt.net> skribis: > I have numerous mcron jobs configured, many of which run every 15 minutes, > and one which runs once a minute. Just now, at 19:22, I noticed the minutely > cron job didn’t seem to run, and saw that one every-15-minute job which always > logs to mcron.log had last run at 18:00. I restarted mcron, and jobs started > running again. There is no timezone confusion either: That every-15-minute job > logged its first run at 19:30. When mcron is stuck, it’s apparently stuck in ‘pthread_join’, waiting for the finalization thread to terminate: --8<---------------cut here---------------start------------->8--- (gdb) bt #0 0x00007f7f730d8478 in __GI___pthread_timedjoin_ex (threadid=140185229293312, thread_return=thread_return <at> entry=0x0, abstime=abstime <at> entry=0x0, block=block <at> entry=true) at pthread_join_common.c:84 #1 0x00007f7f730d82dc in __pthread_join (threadid=<optimized out>, thread_return=thread_return <at> entry=0x0) at pthread_join.c:24 #2 0x00007f7f731d7508 in stop_finalization_thread () at finalizers.c:265 #3 0x00007f7f731d7729 in scm_i_finalizer_pre_fork () at finalizers.c:290 #4 0x00007f7f73250426 in scm_fork () at posix.c:1220 #5 0x00007f7f7324979f in vm_regular_engine (thread=0x7f7f6acb79d0, vp=0x7f7f6b299f30, registers=0xafaf, resume=1930265720) at vm-engine.c:786 #6 0x00007f7f7324bfd7 in scm_call_n (proc=#<program 7f7f6aeaf5d0>, argv=argv <at> entry=0x7ffd9b10bed8, nargs=nargs <at> entry=1) at vm.c:1260 #7 0x00007f7f731cdf58 in scm_call_1 (proc=<optimized out>, arg1=<optimized out>) at eval.c:485 #8 0x000000000040130c in inner_main () #9 0x00007f7f731e5cfd in invoke_main_func (body_data=0x7ffd9b10c3b0) at init.c:341 #10 0x00007f7f731c835a in c_body (d=0x7ffd9b10c2f0) at continuations.c:422 #11 0x00007f7f7324979f in vm_regular_engine (thread=0x7f7f6acb79d0, vp=0x7f7f6b299f30, registers=0xafaf, resume=1930265720) at vm-engine.c:786 #12 0x00007f7f7324bfd7 in scm_call_n (proc=proc <at> entry=#<smob catch-closure 7f7f6b2a1460>, argv=argv <at> entry=0x0, nargs=nargs <at> entry=0) at vm.c:1260 #13 0x00007f7f731cdf39 in scm_call_0 (proc=proc <at> entry=#<smob catch-closure 7f7f6b2a1460>) at eval.c:479 #14 0x00007f7f7323ac12 in catch (tag=tag <at> entry=#t, thunk=#<smob catch-closure 7f7f6b2a1460>, handler=<error reading variable: ERROR: Cannot access memory at address 0x73776b3469667269>0x7f7f6b2a1420, pre_unwind_handler=<error reading variable: ERROR: Cannot access memory at address 0x73776b3469667269>0x7f7f6b2a1400) at throw.c:137 #15 0x00007f7f7323aef5 in scm_catch_with_pre_unwind_handler (key=key <at> entry=#t, thunk=<optimized out>, handler=<optimized out>, pre_unwind_handler=<optimized out>) at throw.c:254 #16 0x00007f7f7323b0bf in scm_c_catch (tag=tag <at> entry=#t, body=body <at> entry=0x7f7f731c8350 <c_body>, body_data=body_data <at> entry=0x7ffd9b10c2f0, handler=handler <at> entry=0x7f7f731c85e0 <c_handler>, handler_data=handler_data <at> entry=0x7ffd9b10c2f0, pre_unwind_handler=pre_unwind_handler <at> entry=0x7f7f731c8440 <pre_unwind_handler>, pre_unwind_handler_data=0x7f7f6b2939a0) at throw.c:377 #17 0x00007f7f731c8940 in scm_i_with_continuation_barrier (body=body <at> entry=0x7f7f731c8350 <c_body>, body_data=body_data <at> entry=0x7ffd9b10c2f0, handler=handler <at> entry=0x7f7f731c85e0 <c_handler>, handler_data=handler_data <at> entry=0x7ffd9b10c2f0, pre_unwind_handler=pre_unwind_handler <at> entry=0x7f7f731c8440 <pre_unwind_handler>, pre_unwind_handler_data=0x7f7f6b2939a0) at continuations.c:360 #18 0x00007f7f731c89d5 in scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:456 #19 0x00007f7f7323984c in with_guile (base=base <at> entry=0x7ffd9b10c358, data=data <at> entry=0x7ffd9b10c380) at threads.c:661 #20 0x00007f7f73127a68 in GC_call_with_stack_base (fn=fn <at> entry=0x7f7f73239800 <with_guile>, arg=arg <at> entry=0x7ffd9b10c380) at misc.c:1941 #21 0x00007f7f73239b68 in scm_i_with_guile (dynamic_state=<optimized out>, data=data <at> entry=0x7ffd9b10c380, func=func <at> entry=0x7f7f731e5ce0 <invoke_main_func>) at threads.c:704 #22 scm_with_guile (func=func <at> entry=0x7f7f731e5ce0 <invoke_main_func>, data=data <at> entry=0x7ffd9b10c3b0) at threads.c:710 #23 0x00007f7f731e5e92 in scm_boot_guile (argc=12, argv=0x7ffd9b10c4f8, main_func=0x4012a0 <inner_main>, closure=0x0) at init.c:324 #24 0x000000000040118b in main () --8<---------------cut here---------------end--------------->8--- The finalization thread lives its life as if it hadn’t received the “stop” message in its pipe: --8<---------------cut here---------------start------------->8--- (gdb) thread 17 [Switching to thread 17 (Thread 0x7f7f6acb7700 (LWP 44975))] #0 0x00007f7f730e0344 in __libc_read (fd=5, buf=buf <at> entry=0x7f7f6acb6a40, nbytes=nbytes <at> entry=1) at ../sysdeps/unix/sysv/linux/read.c:26 26 ../sysdeps/unix/sysv/linux/read.c: No such file or directory. (gdb) bt #0 0x00007f7f730e0344 in __libc_read (fd=5, buf=buf <at> entry=0x7f7f6acb6a40, nbytes=nbytes <at> entry=1) at ../sysdeps/unix/sysv/linux/read.c:26 #1 0x00007f7f731d7497 in read_finalization_pipe_data (data=0x7f7f6acb6a40) at finalizers.c:199 #2 0x00007f7f7312d503 in GC_do_blocking_inner (data=0x7f7f6acb6a00 "\200t\035s\177\177", context=<optimized out>) at pthread_support.c:1362 #3 0x00007f7f73121d62 in GC_with_callee_saves_pushed (fn=0x7f7f7312d4c0 <GC_do_blocking_inner>, arg=arg <at> entry=0x7f7f6acb6a00 "\200t\035s\177\177") at mach_dep.c:328 #4 0x00007f7f73127a9c in GC_do_blocking (fn=fn <at> entry=0x7f7f731d7480 <read_finalization_pipe_data>, client_data=client_data <at> entry=0x7f7f6acb6a40) at misc.c:2053 #5 0x00007f7f73239bba in scm_without_guile (func=0x7f7f731d7480 <read_finalization_pipe_data>, data=0x7f7f6acb6a40) at threads.c:722 #6 0x00007f7f731d784b in finalization_thread_proc (unused=<optimized out>) at finalizers.c:212 #7 0x00007f7f731c835a in c_body (d=0x7f7f6acb6e50) at continuations.c:422 #8 0x00007f7f7324979f in vm_regular_engine (thread=0x5, vp=0x7f7f6b299ea0, registers=0x1, resume=1930298180) at vm-engine.c:786 #9 0x00007f7f7324bfd7 in scm_call_n (proc=proc <at> entry=#<smob catch-closure 7f7f6b22fb80>, argv=argv <at> entry=0x0, nargs=nargs <at> entry=0) at vm.c:1260 #10 0x00007f7f731cdf39 in scm_call_0 (proc=proc <at> entry=#<smob catch-closure 7f7f6b22fb80>) at eval.c:479 #11 0x00007f7f7323ac12 in catch (tag=tag <at> entry=#t, thunk=#<smob catch-closure 7f7f6b22fb80>, handler=<error reading variable: ERROR: Cannot access memory at address 0x73776b3469667269>0x7f7f6b22f740, pre_unwind_handler=<error reading variable: ERROR: Cannot access memory at address 0x73776b3469667269>0x7f7f6b22f720) at throw.c:137 #12 0x00007f7f7323aef5 in scm_catch_with_pre_unwind_handler (key=key <at> entry=#t, thunk=<optimized out>, handler=<optimized out>, pre_unwind_handler=<optimized out>) at throw.c:254 #13 0x00007f7f7323b0bf in scm_c_catch (tag=tag <at> entry=#t, body=body <at> entry=0x7f7f731c8350 <c_body>, body_data=body_data <at> entry=0x7f7f6acb6e50, handler=handler <at> entry=0x7f7f731c85e0 <c_handler>, handler_data=handler_data <at> entry=0x7f7f6acb6e50, pre_unwind_handler=pre_unwind_handler <at> entry=0x7f7f731c8440 <pre_unwind_handler>, pre_unwind_handler_data=0x7f7f6b2939a0) at throw.c:377 #14 0x00007f7f731c8940 in scm_i_with_continuation_barrier (body=body <at> entry=0x7f7f731c8350 <c_body>, body_data=body_data <at> entry=0x7f7f6acb6e50, handler=handler <at> entry=0x7f7f731c85e0 <c_handler>, handler_data=handler_data <at> entry=0x7f7f6acb6e50, pre_unwind_handler=pre_unwind_handler <at> entry=0x7f7f731c8440 <pre_unwind_handler>, pre_unwind_handler_data=0x7f7f6b2939a0) at continuations.c:360 #15 0x00007f7f731c89d5 in scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:456 #16 0x00007f7f7323984c in with_guile (base=base <at> entry=0x7f7f6acb6eb8, data=data <at> entry=0x7f7f6acb6ee0) at threads.c:661 #17 0x00007f7f73127a68 in GC_call_with_stack_base (fn=fn <at> entry=0x7f7f73239800 <with_guile>, arg=arg <at> entry=0x7f7f6acb6ee0) at misc.c:1941 #18 0x00007f7f73239b68 in scm_i_with_guile (dynamic_state=<optimized out>, data=<optimized out>, func=<optimized out>) at threads.c:704 #19 scm_with_guile (func=<optimized out>, data=<optimized out>) at threads.c:710 #20 0x00007f7f730d7015 in start_thread (arg=0x7f7f6acb7700) at pthread_create.c:486 #21 0x00007f7f7300891f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 --8<---------------cut here---------------end--------------->8--- This is on 2.2.6 and very likely the same issue as <https://bugs.gnu.org/37757>: ‘finalization_thread_proc’ gets EINTR, by chance ‘data.byte’ is zero, and so it keeps going instead of exiting. Commit e002332b8a136638c2a4fc1a2f2a26541ce0211d moves mcron to 2.2.7, which should fix this problem. \o/ Ludo’.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.