GNU bug report logs - #75209
30.0.93; Emacs reader failed to read data in "/home/nlj/.cache/org-persist/gc-lock.eld"


Package: emacs;

Reported by: "N. Jackson" <njackson <at> posteo.net>

Date: Mon, 30 Dec 2024 18:49:01 UTC

Severity: normal

Found in version 30.0.93

To reply to this bug, email your comments to 75209 AT debbugs.gnu.org.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 30 Dec 2024 18:49:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to "N. Jackson" <njackson <at> posteo.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 30 Dec 2024 18:49:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 30 Dec 2024 18:48:31 +0000

In the Emacs 30 pretest I have been getting the following warning
every few days:

  Warning (emacs): Emacs reader failed to read data in
  "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
  file during parsing"

The `gc-lock' part suggests this might have something to do with
garbage collection, whereas `org-persist' suggests Org mode, but
I could find nothing in the Org manual about org-persist or about
gc-lock.

I don't know how to reproduce this.  The warning pops up seemingly
at random, often when the only visible window [before the *Warnings*
window appears] is NOT in Org mode.


In GNU Emacs 30.0.93 (build 1, x86_64-pc-linux-gnu, GTK+ Version
 3.24.43, cairo version 1.18.0) of 2024-12-20 built on fedora
Windowing system distributor 'The X.Org Foundation', version 11.0.12014000
System Description: Fedora Linux 40 (Xfce)

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ
JPEG LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES
NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3
THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3
ZLIB

Important settings:
  value of $LANG: en_CA.utf8
  value of $XMODIFIERS: @im=none
  locale-coding-system: utf-8-unix

Major mode: Text

Minor modes in effect:
  TeX-PDF-mode: t
  flyspell-mode: t
  recentf-mode: t
  yas-global-mode: t
  yas-minor-mode: t
  savehist-mode: t
  save-place-mode: t
  electric-pair-mode: t
  display-time-mode: t
  display-battery-mode: t
  desktop-save-mode: t
  delete-selection-mode: t
  cua-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  minibuffer-regexp-mode: t
  size-indication-mode: t
  column-number-mode: t
  line-number-mode: t
  global-visual-line-mode: t
  visual-line-mode: t
  transient-mark-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  temp-buffer-resize-mode: t
  abbrev-mode: t

Load-path shadows:
None found.

Features:
(shadow sort bbdb-message mail-extr emacsbug message puny rfc822 mml
mml-sec epa epg rfc6068 epg-config gnus-util mm-decode mm-bodies
mm-encode mail-parse rfc2231 gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils mule-util cdlatex
reftex reftex-loaddefs reftex-vars dired-aux dired dired-loaddefs
emacs-news-mode bug-reference display-fill-column-indicator
display-line-numbers tex-mode font-latex latexenc preview
latex-mode-expansions latex edmacro latex-flymake flymake warnings
tex-ispell tex-style tex texmathp auctex yank-media oc-basic bibtex
iso8601 org-habit vc-git diff-mode track-changes easy-mmode
vc-dispatcher flyspell ispell kmacro mines derived cookie1 gamegrid
transpar expand-region text-mode-expansions the-org-mode-expansions
python-el-fgallina-expansions er-basic-expansions expand-region-core
expand-region-custom hydra advice lv compile text-property-search
org-clock comp-run comp-common org-agenda org-element org-persist
xdg org-id org-element-ast inline avl-tree generator org-refile org
org-macro org-pcomplete org-list org-footnote org-faces org-entities
time-date noutline outline ob-shell shell pcomplete ob-R ob-python
python project compat ob-plantuml ob-org ob-gnuplot ob-ditaa ob-calc
calc-store calc-trail calc-ext calc calc-loaddefs rect calc-macs
ob-awk ob-dot ob-maxima ob ob-tangle org-src sh-script smie treesit
executable ob-ref ob-lob ob-table ob-exp ob-comint comint ansi-osc
ansi-color ring ob-emacs-lisp ob-core ob-eval org-cycle org-table
org-keys oc org-loaddefs thingatpt find-func ol org-fold
org-fold-core org-compat org-version org-macs bbdb-anniv diary-lib
diary-loaddefs cal-menu calendar cal-loaddefs bbdb-com crm
mailabbrev bbdb bbdb-site timezone recentf tree-widget cus-edit pp
wid-edit ido format-spec modus-vivendi-theme modus-themes
yasnippet-classic-snippets cl-extra yasnippet help-mode savehist
saveplace company pcase elec-pair time battery dbus xml desktop
frameset delsel cua-base cus-load ace-window-autoloads
auctex-autoloads tex-site avy-autoloads bbdb-autoloads
cdlatex-autoloads company-autoloads csv-mode-autoloads
debbugs-autoloads ess-autoloads expand-region-autoloads
geiser-autoloads info orderless-autoloads rx sql-indent-autoloads
yasnippet-autoloads package browse-url url url-proxy url-privacy
url-expand url-methods url-history url-cookie generate-lisp-file
url-domsuf url-util mailcap url-handlers url-parse auth-source
cl-seq eieio eieio-core cl-macs icons password-cache json subr-x map
byte-opt gv bytecomp byte-compile url-vars cl-loaddefs cl-lib rmc
iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook
vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win
term/common-win x-dnd touch-screen tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode
lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock
syntax font-core term/tty-colors frame minibuffer nadvice seq simple
cl-generic indonesian philippine cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite emoji-zwj charscript charprop case-table
epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded
button loaddefs theme-loaddefs faces cus-face macroexp files window
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget keymap hashtable-print-readable backquote threads
dbusbind inotify dynamic-setting system-font-setting
font-render-setting cairo gtk x-toolkit xinput2 x multi-tty
move-toolbar make-network-process native-compile emacs)

Memory information:
((conses 16 874702 109492) (symbols 48 33757 0) (strings 32 141824 6883)
 (string-bytes 1 4270475) (vectors 16 88842)
 (vector-slots 8 1784278 36799) (floats 8 367 91)
 (intervals 56 24167 1981) (buffers 984 43))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 30 Dec 2024 19:32:02 GMT) Full text and rfc822 format available.

Message #8 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>, Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 30 Dec 2024 21:31:36 +0200

> From: "N. Jackson" <njackson <at> posteo.net>
> Date: Mon, 30 Dec 2024 18:48:31 +0000
> 
> In the Emacs 30 pretest I have been getting the following warning
> every few days:
> 
>   Warning (emacs): Emacs reader failed to read data in
>   "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
>   file during parsing"
> 
> The `gc-lock' part suggests this might have something to do with
> garbage collection, whereas `org-persist' suggests Org mode, but
> I could find nothing in the Org manual about org-persist or about
> gc-lock.

Does the file exist?  If so, what is its content (assuming you can
post it here)?

> I don't know how to reproduce this.  The warning pops up seemingly
> at random, often when the only visible window [before the *Warnings*
> window appears] is NOT in Org mode.

Ihor, any ideas or suggestions?  Should this be reported to the Org
list first?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 30 Dec 2024 20:02:02 GMT) Full text and rfc822 format available.

Message #11 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 30 Dec 2024 22:01:30 +0200

> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: Ihor Radchenko <yantar92 <at> posteo.net>,  75209 <at> debbugs.gnu.org
> Date: Mon, 30 Dec 2024 19:53:42 +0000
> 
> At 21:31 +0200 on Monday 2024-12-30, Eli Zaretskii wrote:
> 
> >> From: "N. Jackson" <njackson <at> posteo.net>
> >> Date: Mon, 30 Dec 2024 18:48:31 +0000
> >> 
> >>   Warning (emacs): Emacs reader failed to read data in
> >>   "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End
> >>   of file during parsing"
> 
> > Does the file exist?  If so, what is its content (assuming you can
> > post it here)?
> 
> It exists and currently has the following contents:
> 
>   ;;   -*- mode: lisp-data; -*-
>   (((26482 57035 301257 992000) 26482 60639 74163 973000) ((26482 62694 821331 522000) 26482 62698 583212 450000))

This one seems okay.  I guess we need to wait for the warning and see
then?
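
The quoted contents can be decoded by hand. The following is a minimal sketch in Python, not part of Emacs or Org. It assumes each group of four integers is an Emacs Lisp timestamp in the (HIGH LOW USEC PSEC) list form, where seconds since the epoch are HIGH * 65536 + LOW, and that each record pairs a start time with a later time. The names `lisp_time_to_seconds` and `decode_record` are hypothetical.

```python
from datetime import datetime, timezone

def lisp_time_to_seconds(high, low, usec=0, psec=0):
    """Convert an Emacs (HIGH LOW USEC PSEC) timestamp to epoch seconds."""
    return high * 65536 + low + usec / 1e6 + psec / 1e12

def decode_record(record):
    """Split one record into (first_time, second_time) in epoch seconds.

    A dotted pair of two timestamp lists prints as ((H L U P) H L U P),
    modeled here as a Python list whose first element is the first time
    and whose remaining elements are the second time.
    """
    first, second = record[0], record[1:]
    return lisp_time_to_seconds(*first), lisp_time_to_seconds(*second)

# First record from the file contents quoted above.
record = [[26482, 57035, 301257, 992000], 26482, 60639, 74163, 973000]
start, last_seen = decode_record(record)
print(datetime.fromtimestamp(int(start), tz=timezone.utc).isoformat())
print(round(last_seen - start), "seconds between the two times")
```

Both times in that record fall on 2024-12-30 UTC, shortly before the file was posted, which is consistent with the file being well formed.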




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Tue, 31 Dec 2024 17:41:02 GMT) Full text and rfc822 format available.

Message #14 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, "N. Jackson" <njackson <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Tue, 31 Dec 2024 17:42:07 +0000

Eli Zaretskii <eliz <at> gnu.org> writes:

>> I don't know how to reproduce this.  The warning pops up seemingly
>> at random, often when the only visible window [before the *Warnings*
>> window appears] is NOT in Org mode.
>
> Ihor, any ideas or suggestions?  Should this be reported to the Org
> list first?

Sounds like Emacs being killed by force in the middle of writing to that
file. Or, alternatively, C-g in the middle of writing.

If that's kill -9 or similar, I would not call this unexpected.

gc-lock.eld is a file used to flag that cache dir is being worked on by
multiple emacs instances. GC here refers to garbage-collecting cache data.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Tue, 31 Dec 2024 19:02:02 GMT) Full text and rfc822 format available.

Message #17 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Tue, 31 Dec 2024 19:02:35 +0000

"N. Jackson" <njackson <at> posteo.net> writes:

>> Or, alternatively, C-g in the middle of writing.
>
> I use C-g very frequently.  I type `M-x' and then realise I want to
> do `C-h f' first instead, so I do `C-g' to exit from the M-x prompt.
> Or I do an isearch and then change my mind (or find and read
> whatever I was looking for) then I do `C-g' (twice, I think) to exit
> the isearch and get back to where I started.  Usages like that.
> From my (probably naive) point of view, if that messes up Org Mode,
> then Org Mode is doing something wrong.

That should not be a problem then.
Reading/writing the GC file is done using a timer and, AFAIK, Emacs should not
run timers while you are running a command.

>> gc-lock.eld is a file used to flag that cache dir is being worked
>> on by multiple emacs instances. GC here refers to
>> garbage-collecting cache data.
>
> I do run multiple (two) instances of Emacs.  One is my normal
> session where I use Org quite heavily.  The other is my Gnus session
> in which I never open an Org file and never (as far as I know) use
> any Org features.

Gnus may load Org. (AFAIU, it does it when viewing gnus articles)

Another possible scenario is two Org instances writing to the same file
at the same time.
If it is what is happening in your case, your problem may be similar to 
https://list.orgmode.org/orgmode/CAMJKaZxA_VmLdFP_u1rNiF2s0X2kVivjT31jEM_r3BYCHri1PQ <at> mail.gmail.com/





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Tue, 31 Dec 2024 19:57:01 GMT) Full text and rfc822 format available.

Message #20 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Tue, 31 Dec 2024 21:54:02 +0200

> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
> Date: Tue, 31 Dec 2024 19:02:35 +0000
> 
> "N. Jackson" <njackson <at> posteo.net> writes:
> 
> >> Or, alternatively, C-g in the middle of writing.
> >
> > I use C-g very frequently.  I type `M-x' and then realise I want to
> > do `C-h f' first instead, so I do `C-g' to exit from the M-x prompt.
> > Or I do an isearch and then change my mind (or find and read
> > whatever I was looking for) then I do `C-g' (twice, I think) to exit
> > the isearch and get back to where I started.  Usages like that.
> > From my (probably naive) point of view, if that messes up Org Mode,
> > then Org Mode is doing something wrong.
> 
> That should not be a problem then.
> Reading/writing GC file is done using timer and, AFAIK, Emacs should not
> run timers while you are running a command.

If this happens while the user types some command, then timers could
fire during that typing, since people rarely type fast enough to not
let timers run.

But all this is not relevant, because Emacs binds inhibit-quit to a
non-nil value while it runs the timer function.  So, unless the timer
in question somehow forcibly resets inhibit-quit to nil, C-g should
not be able to interrupt a timer.

> >> gc-lock.eld is a file used to flag that cache dir is being worked
> >> on by multiple emacs instances. GC here refers to
> >> garbage-collecting cache data.
> >
> > I do run multiple (two) instances of Emacs.  One is my normal
> > session where I use Org quite heavily.  The other is my Gnus session
> > in which I never open an Org file and never (as far as I know) use
> > any Org features.
> 
> Gnus may load Org. (AFAIU, it does it when viewing gnus articles)
> 
> Another possible scenario is two Org instances writing to the same file
> at the same time.
> If it is what is happening in your case, your problem may be similar to 
> https://list.orgmode.org/orgmode/CAMJKaZxA_VmLdFP_u1rNiF2s0X2kVivjT31jEM_r3BYCHri1PQ <at> mail.gmail.com/

Can't Org prevent more than one session writing to this file?  We have
file locks which can be used here, I think.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Wed, 01 Jan 2025 09:42:02 GMT) Full text and rfc822 format available.

Message #23 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Wed, 01 Jan 2025 09:42:28 +0000

Eli Zaretskii <eliz <at> gnu.org> writes:

>> Another possible scenario is two Org instances writing to the same file
>> at the same time.
>> If it is what is happening in your case, your problem may be similar to 
>> https://list.orgmode.org/orgmode/CAMJKaZxA_VmLdFP_u1rNiF2s0X2kVivjT31jEM_r3BYCHri1PQ <at> mail.gmail.com/
>
> Can't Org prevent more than one session writing to this file?  We have
> file locks which can be used here, I think.

That's exactly the idea I am trying in the linked thread to address the
issue.

It is not the biggest problem there, though. The bigger problem is a
race between Emacs processes writing to the same file one after
another (without any locking). The contents of the file may then
differ from what the other Emacs session expects.
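
The race described above is a read-modify-write on a shared file. As a rough sketch of how a lock can serialize it, here is a generic Python illustration. It is not Emacs's file-locking mechanism and not org-persist code: the `.lock` suffix, the JSON storage format, and the names `exclusive_lock` and `update_record` are all assumptions made for the example, and stale locks left by a killed process are not handled.

```python
import json
import os
import time
from contextlib import contextmanager

@contextmanager
def exclusive_lock(path, timeout=5.0, poll=0.05):
    """Hold a lock file next to PATH; O_EXCL makes its creation atomic."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            # Another process holds the lock; wait and retry until timeout.
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not lock {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)

def update_record(path, session_id, now):
    """Read, modify, and rewrite the shared file as one locked step."""
    with exclusive_lock(path):
        try:
            with open(path) as f:
                records = json.load(f)
        except FileNotFoundError:
            records = {}
        records[session_id] = now
        with open(path, "w") as f:
            json.dump(records, f)
    return records
```

Because the read and the write happen under the same lock, a second process cannot overwrite the file with a copy that is missing the first process's record.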





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Wed, 01 Jan 2025 12:16:02 GMT) Full text and rfc822 format available.

Message #26 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Wed, 01 Jan 2025 14:14:38 +0200

> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: njackson <at> posteo.net, 75209 <at> debbugs.gnu.org
> Date: Wed, 01 Jan 2025 09:42:28 +0000
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> Another possible scenario is two Org instances writing to the same file
> >> at the same time.
> >> If it is what is happening in your case, your problem may be similar to 
> >> https://list.orgmode.org/orgmode/CAMJKaZxA_VmLdFP_u1rNiF2s0X2kVivjT31jEM_r3BYCHri1PQ <at> mail.gmail.com/
> >
> > Can't Org prevent more than one session writing to this file?  We have
> > file locks which can be used here, I think.
> 
> That's exactly the idea I am trying in the linked thread to address the
> issue.
> 
> It is not the biggest problem there though. The problem is when there is
> a race between Emacs processes writing to the same file one after
> another (without any locking). Contents of the file may then become
> unexpected compared to other Emacs session.

Is the gc-lock.eld file supposed to be a singleton across all the
Emacs sessions?  Earlier you said:

> gc-lock.eld is a file used to flag that cache dir is being worked
> on by multiple emacs instances. GC here refers to
> garbage-collecting cache data.

Can you tell more about the purpose and use of this file?  What is
written to it, and how is it supposed to be used after being written?
And what bad things happen when the Lisp reader errors out because it
is unable to read the data for some reason?

I'm asking because I'd like to think about, and then suggest, some
suitable solutions, but I don't want to suggest nonsensical ones.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Wed, 01 Jan 2025 15:56:01 GMT) Full text and rfc822 format available.

Message #29 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Wed, 01 Jan 2025 15:54:54 +0000

At 19:02 +0000 on Tuesday 2024-12-31, Ihor Radchenko wrote:
>
> Gnus may load Org. (AFAIU, it does it when viewing gnus articles)

Yes, I suppose it might.  [When I read an email with Org markup in
Gnus (source blocks, for example), the email gets nicely fontified.
Gnus might be using Org to do that rather than rolling its own
parser.]

> Another possible scenario is two Org instances writing to the same
> file at the same time.

I don't think that's the case here.

Certainly not the same user-owned Org Mode data file.  (I'm careful
not to open my Org mode files in my Gnus instance of Emacs [and
consequently I have to live with the inconvenience of not being able
to Capture directly from email or News].)

However now that I've learned that Org has internal files that it
writes to, it seems quite possible that two instances of Org might
write to one of those files at the same time.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Wed, 01 Jan 2025 17:43:01 GMT) Full text and rfc822 format available.

Message #32 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org,
 "N. Jackson" <njackson <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Wed, 01 Jan 2025 17:41:57 +0000

"Eli Zaretskii" <eliz <at> gnu.org> writes:

>> From: "N. Jackson" <njackson <at> posteo.net>
>> Cc: Ihor Radchenko <yantar92 <at> posteo.net>,  75209 <at> debbugs.gnu.org
>> Date: Mon, 30 Dec 2024 19:53:42 +0000
>>
>> At 21:31 +0200 on Monday 2024-12-30, Eli Zaretskii wrote:
>>
>> >> From: "N. Jackson" <njackson <at> posteo.net>
>> >> Date: Mon, 30 Dec 2024 18:48:31 +0000
>> >>
>> >>   Warning (emacs): Emacs reader failed to read data in
>> >>   "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End
>> >>   of file during parsing"
>>
>> > Does the file exist?  If so, what is its content (assuming you can
>> > post it here)?
>>
>> It exists and currently has the following contents:
>>
>>   ;;   -*- mode: lisp-data; -*-
>>   (((26482 57035 301257 992000) 26482 60639 74163 973000) ((26482 62694 821331 522000) 26482 62698 583212 450000))
>
> This one seems okay.  I guess we need to wait for the warning and see
> then?

I'm assuming that the resolution was that the file was read before we
finished writing it.  I've run into the same issue a number of times
(interrupting and resuming Emacs builds leads to build failures, "make
bootstrap" makes them go away).

Can we consider modifying the .elc format to have a footer indicating
that the file is complete?  Ideally, it would also indicate the checksum
of the file as well as the fact that it is complete, but this would have
a performance impact which might be significant in some cases (very
large .elc files; of course, we could simply modify the footer to
indicate a "too large to checksum" condition has occurred, if the file
is large).

It's tempting to put this information in the header, the way we do for
pdumps (they are first written to start with "!UMPEDGNUEMACS", then the
last thing pdumper does is to rewrite the first character to be "D"),
but using a footer is more reliable: it detects truncation (or
modification) for whatever reason, and makes fewer assumptions about
data atomicity.
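
To make the footer idea concrete, here is a minimal sketch in Python. It is not a proposal for the actual .elc format: the footer marker, the choice of SHA-256, and both function names are assumptions made for illustration only.

```python
import hashlib

# Hypothetical marker; real .elc files have no such footer.
FOOTER_PREFIX = b"\n;; complete sha256="

def write_with_footer(path, payload: bytes):
    """Write PAYLOAD followed by a footer carrying its checksum."""
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    with open(path, "wb") as f:
        f.write(payload + FOOTER_PREFIX + digest + b"\n")

def read_checked(path):
    """Return the payload, or raise ValueError if truncated or modified."""
    with open(path, "rb") as f:
        data = f.read()
    # The footer is the last occurrence of the marker in the file.
    body, sep, tail = data.rpartition(FOOTER_PREFIX)
    if not sep:
        raise ValueError("footer missing: file is incomplete")
    if hashlib.sha256(body).hexdigest().encode("ascii") != tail.strip():
        raise ValueError("checksum mismatch: file was truncated or modified")
    return body
```

A file cut off at any point either lacks the marker or carries a digest that no longer matches, so a reader can tell an incomplete file from a complete one without relying on how the writer ordered its writes.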

While we're in there, let's indicate in the ELC header whether the
special circumstances of native compilation applied to the compilation
process of this file.  This is particularly important if we use
benchmarks defined in .elc files: using the wrong compiled version would
lead to unreliable benchmark results, and be somewhat difficult to
detect otherwise.  (I'm assuming it is still the case that
native-compiling a Lisp file leaves behind user-visible .elc artifacts.
If that has been fixed, please ignore this paragraph).

But, please, no timestamp.  Let's keep things reproducible where we can,
and not leak sensitive information by accident.

It may be necessary to bump the produced ELC version code for this.

The equivalent issues are less urgent, but ultimately identical, for
pdumper files (apparently, we don't detect truncation or modification)
and object files produced during the build (it's the job of the make
implementation and the compiler to avoid truncated .o files, but if they
don't do that, we might want to write x.o.tmp first, then rename it, in
the usual fashion of Makefiles).

Note that it is, of course, possible to usefully modify .elc (and .pdmp)
files after creation, so we shouldn't make detected modifications an
unconditional error.

Pip





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Wed, 01 Jan 2025 18:53:02 GMT) Full text and rfc822 format available.

Message #35 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Wed, 01 Jan 2025 20:52:03 +0200

> Date: Wed, 01 Jan 2025 17:41:57 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: "N. Jackson" <njackson <at> posteo.net>, yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
> 
> I'm assuming that the resolution was that the file was read before we
> finished writing it.  I've run into the same issue a number of times
> (interrupting and resuming Emacs builds leads to build failures, "make
> bootstrap" makes them go away).
> 
> Can we consider modifying the .elc format to have a footer indicating
> that the file is complete?

The .elc file is supposed to be created only when the compilation is
complete and successful.  If you look at byte-compile-file, you will
see that we first compile the Lisp code, then write the produced
bytecode to a temporary file, and only after that we rename the
temporary file into the target .elc file.  Renaming a file is an
atomic operation on Posix filesystems, so it either completely
succeeds or completely fails.

We only write directly to the target file if that file's directory is
unwritable.
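
The write-then-rename pattern described above can be sketched outside Emacs as well. The following Python is a generic illustration, not the code of byte-compile-file; the function name `atomic_write` and the `.tmp-` prefix are hypothetical.

```python
import os
import tempfile

def atomic_write(path, data: bytes):
    """Write DATA to a temp file in the target directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Atomic on POSIX when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # An interrupted write leaves PATH untouched; drop the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers of the target path see either the old complete file or the new complete file, never a partial one. If the process is killed outright, the cleanup does not run and the temporary file is what gets left behind.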

So I don't understand why you see incomplete .elc files when you
interrupt the build.  What happens in my case is that I see those
temporary files left around, but I don't think I've ever seen an
incomplete .elc file after interrupting the build.

Is it likely that the directory where you build Emacs is not writable
by your user?  That's the only way I could explain what you see.  Or
maybe there's some other factor at work here, in which case we should
find out what that factor is, before we consider how to fix it.

In any case, I think this is a separate issue, so I'd prefer to have a
separate bug report for it.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Wed, 01 Jan 2025 21:10:02 GMT) Full text and rfc822 format available.

Message #38 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Wed, 01 Jan 2025 21:09:24 +0000

"Eli Zaretskii" <eliz <at> gnu.org> writes:

>> Date: Wed, 01 Jan 2025 17:41:57 +0000
>> From: Pip Cet <pipcet <at> protonmail.com>
>> Cc: "N. Jackson" <njackson <at> posteo.net>, yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
>>
>> I'm assuming that the resolution was that the file was read before we
>> finished writing it.  I've run into the same issue a number of times
>> (interrupting and resuming Emacs builds leads to build failures, "make
>> bootstrap" makes them go away).
>>
>> Can we consider modifying the .elc format to have a footer indicating
>> that the file is complete?
>
> The .elc file is supposed to be created only when the compilation is
> complete and successful.  If you look at byte-compile-file, you will

You're right.  Sorry for the noise.  The most likely explanation is I
missed a "Pure Lisp storage overflowed" message which "explained" it,
because I just tried and that's what happened.

There's a bug in pin_string which assumes no purespace overflow, and
corrupts bytecode after one, so it's entirely possible that an .elc file
was truncated.  Probably not worth fixing at this point.

I'll try interrupting a few more builds when no-purespace is merged.

Pip





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Thu, 02 Jan 2025 13:35:01 GMT) Full text and rfc822 format available.

Message #41 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: 75209 <at> debbugs.gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, Ihor Radchenko <yantar92 <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Thu, 02 Jan 2025 13:34:10 +0000

At 18:48 +0000 on Monday 2024-12-30, N. Jackson wrote:
>
> In the Emacs 30 pretest I have been getting the following warning
> every few days:
>
>   Warning (emacs): Emacs reader failed to read data in
>   "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
>   file during parsing"

When I woke my system from suspend this morning and switched to the
Emacs session running Gnus, the warning had popped up in that
session -- possibly during suspend or resume.

I immediately looked at gc-lock.eld which contained this:

  ;;   -*- mode: lisp-data; -*-
  (((26485 51608 866710 15000) 26486 34929 321731 426000))

(which I guess is again unremarkable).

I checked my buffer list to be absolutely certain that I hadn't
opened any Org Mode files and the only buffers were *Group*,
*scratch*, *Messages*, .newsrc-dribble, and *Warnings*.

The session has one frame and in that frame there was just the Gnus
Group buffer open (until the *Warnings* buffer popped up below it).

With Gnus sitting idle with just its Group buffer open, I can't see
how it would use any Org Mode features -- or even _do_ anything at
all -- and as I said, no Org Mode (user) files were open in the
session.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Thu, 02 Jan 2025 17:26:02 GMT) Full text and rfc822 format available.

Message #44 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Thu, 02 Jan 2025 17:28:02 +0000

Eli Zaretskii <eliz <at> gnu.org> writes:

>> gc-lock.eld is a file used to flag that cache dir is being worked
>> on by multiple emacs instances. GC here refers to
>> garbage-collecting cache data.
>
> Can you tell more about the purpose and use of this file?  What is
> written to it, and how is it supposed to be used after being written?
> And what bad things happen when the Lisp readers errors out because it
> is unable to read the data for some reason?

Let me then describe briefly what org-persist does.

In a nutshell, it is a cache manager.
The main cache data consists of:
1. index describing everything stored in the cache and its expiry
   settings
2. cache data stored in individual files. Each file in the cache is
   mentioned in the index file

From time to time (before quitting Emacs), org-persist needs to do some
"garbage collection" and remove cache files that are expired or
unreferenced from the index, to keep the cache from growing indefinitely.

The GC process works well and helps keep the cache directory
clean. However, there are problems when multiple Emacs processes are
running simultaneously.

Consider Emacs A loading the cache index into memory and doing nothing.
Then, Emacs B also loads the cache index, but adds data to the cache.
If Emacs A is closed while Emacs B is still running (and Emacs B has
not yet updated the cache index on disk), Emacs A performs garbage
collection on exit. However, Emacs A has no knowledge of the cache data
written by Emacs B and may "garbage collect" this data. We do not want
that.

"gc-lock.eld" keeps track of the running Emacs processes: every Emacs
process regularly writes to "gc-lock.eld", putting a record in the form
of (before-init-time . <last known time that Emacs is running>). If
there are other known recently running Emacs processes (apart from
the current one), the garbage collection process is suppressed to avoid
removing cache data from other Emacsen.
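
The record format described above can be illustrated with a small sketch. This is not org-persist's actual code: the function name `my/other-sessions-active-p`, its arguments, and the freshness window passed as MAX-AGE are assumptions made for illustration. It takes a list of (INIT-TIME . LAST-SEEN) records, as they might be read from gc-lock.eld, and reports whether some session other than the current one refreshed its record recently.

```elisp
(require 'cl-lib)

;; Hypothetical helper, not part of org-persist.
(defun my/other-sessions-active-p (records self-init-time max-age now)
  "Return non-nil if RECORDS mention another recently active session.
RECORDS is a list of (INIT-TIME . LAST-SEEN) pairs of Lisp time values.
SELF-INIT-TIME identifies the current session, MAX-AGE is a number of
seconds, and NOW is the current time."
  (cl-some (lambda (record)
             (and
              ;; Skip the record that belongs to this session.
              (not (time-equal-p (car record) self-init-time))
              ;; The other session counts as active if it was seen
              ;; less than MAX-AGE seconds ago.
              (time-less-p (time-subtract now (cdr record)) max-age)))
           records))
```

Under these assumptions, a session would skip garbage collection whenever the predicate returns non-nil, for example `(my/other-sessions-active-p records before-init-time 3600 (current-time))`.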

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Thu, 02 Jan 2025 18:49:01 GMT)

Message #47 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Thu, 02 Jan 2025 20:48:42 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: njackson <at> posteo.net, 75209 <at> debbugs.gnu.org
> Date: Thu, 02 Jan 2025 17:28:02 +0000
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Can you tell more about the purpose and use of this file?  What is
> > written to it, and how is it supposed to be used after being written?
> > And what bad things happen when the Lisp readers errors out because it
> > is unable to read the data for some reason?
> 
> Let me then describe briefly what org-persist does.
> 
> In a nutshell, it is a cache manager.
> The main cache data consists of:
> 1. index describing everything stored in the cache and its expiry
>    settings
> 2. cache data stored in individual files. Each file in the cache is
>    mentioned in the index file
> 
> From time to time (before quitting Emacs), org-persist needs to do some
> "garbage collection" and remove cache files that are expired or
> unreferenced from index to avoid cache growing infinitely.
> 
> The GC process works well, and helps keeping the cache directory
> clean. However, there are problems when multiple Emacs processes are
> running simultaneously.
> 
> Consider Emacs A loading cache index into memory and doing nothing.
> Then, Emacs B also loads the cache index, but adds data to the cache.
> If Emacs A is closed while Emacs B is running (and Emacs B not yet
> updating cache index on disk), it also performs garbage
> collection. However, Emacs A has no knowledge about cache data written
> by Emacs B and may "garbage collect" this data. We do not want that.

Thanks.  I think I still don't have a clear idea of the usage of these
caches.  Are the caches supposed to be common to all Emacs sessions?
E.g., when a cache changes by one session, are other sessions supposed
to know about the change?  If the cache is for a single session, then
why are several sessions allowed to write to the cache simultaneously?
And if the cache is common to all sessions, then perhaps reading the
index before writing it would keep several sessions from stepping on
each other's toes?

> "gc-lock.eld" keeps track of the running Emacs processes - every Emacs
> process regularly write to "gc-lock.eld", putting a record in the form
> of (before-init-time . <last known time that Emacs is running>). If
> there are other known recently running Emacs processes (apart from
> the current one), the garbage collection process is suppressed to
> avoid removing cache data from other Emacsen.

One way of rewriting a file atomically is to write the stuff to a
temporary file, then rename it to the target name.  If Org doesn't
already do that, maybe you should try doing that (together with
reading the file before updating it)?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 05 Jan 2025 10:02:01 GMT)

Message #50 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 05 Jan 2025 10:03:49 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

> Thanks.  I think I still don't have a clear idea of the usage of these
> caches.  Are the caches supposed to be common to all Emacs sessions?

Yes.

> E.g., when a cache changes by one session, are other sessions supposed
> to know about the change?

Usually yes, except for cache entries that are supposed to live until
the end of Emacs session.

> And if the cache is common to all sessions, then perhaps reading the
> index before writing it should avoid several sessions step on each
> other's toes?

You are right. The only problem is short-lived caches that should be
cleared at the end of the Emacs session that created them.

> One way of rewriting a file atomically is to write the stuff to a
> temporary file, then rename it to the target name.  If Org doesn't
> already do that, maybe you should try doing that (together with
> reading the file before updating it)?

Org uses `with-temp-file'. Is there an alternative built-in and more
robust way to write a string to a file?

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 05 Jan 2025 11:16:01 GMT)

Message #53 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 05 Jan 2025 13:15:32 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: njackson <at> posteo.net, 75209 <at> debbugs.gnu.org
> Date: Sun, 05 Jan 2025 10:03:49 +0000
> 
> > And if the cache is common to all sessions, then perhaps reading the
> > index before writing it should avoid several sessions step on each
> > other's toes?
> 
> You are right. The only problem is short-living caches that should be
> cleared at the end of Emacs session that created it.

Does this mean you have ideas for solving this problem by reading the
file before it is written?  Or does this mean you already read the
file before writing to it?

> > One way of rewriting a file atomically is to write the stuff to a
> > temporary file, then rename it to the target name.  If Org doesn't
> > already do that, maybe you should try doing that (together with
> > reading the file before updating it)?
> 
> Org uses `with-temp-file'. Is there an alternative built-in and more
> robust way to write string to file?

Writing to a file is not atomic.  If you instead write to a temporary
file, then rename it to the final file name, the renaming is atomic on
POSIX filesystems.

This would mean you can still use with-temp-file, but with a temporary
file name as its argument, and you need to add a single rename-file
call afterwards.
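
A minimal sketch of the write-then-rename approach described above follows. The helper name `my/write-file-atomically` is hypothetical and this is not the code Org uses. One assumption is made explicit: the temporary file is created in the same directory as the target, because a rename is only atomic within a single filesystem.

```elisp
;; Hypothetical helper illustrating write-to-temp-then-rename.
(defun my/write-file-atomically (file string)
  "Write STRING to FILE by writing a temporary file and renaming it.
The temporary file lives in FILE's directory so that the rename
stays on one filesystem."
  (let* ((target (expand-file-name file))
         (tmp (make-temp-file
               (expand-file-name ".tmp-" (file-name-directory target)))))
    (unwind-protect
        (progn
          ;; Readers of TARGET never see this partially written file.
          (with-temp-file tmp
            (insert string))
          ;; Replace TARGET in one step; the third argument allows
          ;; overwriting an existing file.
          (rename-file tmp target t))
      ;; If anything failed before the rename, remove the leftover.
      (when (file-exists-p tmp)
        (delete-file tmp)))))
```

With this shape, a concurrent reader sees either the old contents or the new contents of the target, never a truncated file.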




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 05 Jan 2025 13:17:01 GMT)

Message #56 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 05 Jan 2025 13:18:53 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

>> You are right. The only problem is short-living caches that should be
>> cleared at the end of Emacs session that created it.
>
> Does this mean you have ideas for solving this problem by reading the
> file before it is written?  Or does this mean you already read the
> file before writing to it?

The problem is that not every piece of cached data is stored in the
index on disk. Some caches should just live for the duration of the
Emacs session that creates them.

Yet, other parallel Emacs sessions should not "GC" those transient
caches.

>> Org uses `with-temp-file'. Is there an alternative built-in and more
>> robust way to write string to file?
>
> Writing to a file is not atomic.  If you instead write to a temporary
> file, then rename it to the final file name, the renaming is atomic on
> Posix filesystems.
>
> This would mean you can still use with-temp-file, but with a temporary
> file name as its argument, and you need to add a single rename-file
> call afterwards.

That's fine. I just thought that there might be an existing function
doing exactly this. If not, I can do it manually, of course.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 05 Jan 2025 14:19:01 GMT)

Message #59 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: 75209 <at> debbugs.gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, Ihor Radchenko <yantar92 <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 05 Jan 2025 14:18:14 +0000
The following might be "normal", in which case I apologise for the
noise, but it seems odd to me and it might have some bearing on the
bug.

Running list-timers shows:

Idle                Next  Repeat Function
       -1d 15h 43m 30.2s      1h org-persist--refresh-gc-lock
               2.4s           1m battery-update-handler
               2.4s           5m savehist-autosave
               4.2s            - undo-auto--boundary-timer
              50.1s           1m display-time-event-handler
   *           0.1s            t show-paren-function
   *           0.5s            t #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_9>
   *           0.5s      :repeat blink-cursor-start
   *          30.0s            - desktop-auto-save

I see nothing in the manuals about what it means for a
relative timer to be negative.  (Or is org-persist--refresh-gc-lock
running on a timer set with an absolute time that list-timers is
merely displaying as a relative time?)  And it seems odd that this
time is before this Emacs session started (emacs-uptime shows 1 day,
1 hour, 16 minutes, 40 seconds).





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 05 Jan 2025 17:22:02 GMT)

Message #62 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 05 Jan 2025 19:21:08 +0200
> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  Ihor Radchenko <yantar92 <at> posteo.net>
> Date: Sun, 05 Jan 2025 14:18:14 +0000
> 
> The following might be "normal", in which case I apologise for the
> noise, but it seems odd to me and it might have some bearing on the
> bug.
> 
> Running list-timers shows:
> 
> Idle                Next  Repeat Function
>        -1d 15h 43m 30.2s      1h org-persist--refresh-gc-lock
>                2.4s           1m battery-update-handler
>                2.4s           5m savehist-autosave
>                4.2s            - undo-auto--boundary-timer
>               50.1s           1m display-time-event-handler
>    *           0.1s            t show-paren-function
>    *           0.5s            t #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_9>
>    *           0.5s      :repeat blink-cursor-start
>    *          30.0s            - desktop-auto-save
> 
> I see nothing in the manuals about what it means for a
> relative timer to be negative.  (Or is org-persist--refresh-gc-lock
> running on a timer set with an absolute time that list-timers is
> merely displaying as a relative time?)  And it seems odd that this
> time is before this Emacs session started (emacs-uptime shows 1 day,
> 1 hour, 16 minutes, 40 seconds).

This timer is disabled.  See bug#39824 for some related discussions,
in particular

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=39824#53




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 06 Jan 2025 00:59:01 GMT)

Message #65 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 06 Jan 2025 00:58:21 +0000
At 19:21 +0200 on Sunday 2025-01-05, Eli Zaretskii wrote:

>> From: "N. Jackson" <njackson <at> posteo.net>
>> 
>> Running list-timers shows:
>> 
>> Idle                Next  Repeat Function
>>        -1d 15h 43m 30.2s      1h org-persist--refresh-gc-lock
>
> This timer is disabled.  See bug#39824 for some related discussions,
> in particular
>
>   https://debbugs.gnu.org/cgi/bugreport.cgi?bug=39824#53

I have now read that bug report and I admit I don't fully understand
it[1].

IIUC, the user's timer had encountered an error, a backtrace had
been produced, and the user had said `q' to the debugger, leaving
the timer in a broken state.

Here, like in that bug report, the broken timer has `t' in the first
element:

  [t 26490 7240 117604 3600 org-persist--refresh-gc-lock nil nil 17000 nil]
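
A quick way to look for timers in this state is sketched below. It relies on an internal detail suggested by the vector printed above: that a timer is a vector whose first slot is the "triggered" flag. That layout is an implementation detail of timer.el and may change, and the function name `my/triggered-timers` is hypothetical.

```elisp
(require 'seq)
(require 'timer)

;; Hypothetical helper; depends on the internal vector layout of timers.
(defun my/triggered-timers ()
  "Return the timers in `timer-list' whose first slot is non-nil.
The first slot is assumed to be the triggered flag."
  (seq-filter (lambda (timer) (aref timer 0)) timer-list))
```

Evaluating `(my/triggered-timers)` with M-: in an affected session would list the candidate timers without going through list-timers.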

However, I have seen no error.  (If I were presented with a
backtrace, I would almost certainly make a copy of the buffer and
then hit `c' rather than `q', but in fact I haven't seen a backtrace
in a long time.  Indeed, debug-on-error is nil.  I have seen no
error messages in this run of Emacs and there are no errors (or
anything else unexpected) in *Messages*.)

I'm guessing (wildly) that what happened is this:

1.  I woke my system from suspend.

2.  All timers in both my instances of Emacs ran roughly
    simultaneously.

3.  Org Mode's locking mechanisms are not working properly when two
    copies of org-persist--refresh-gc-lock run at essentially the 
    same time, and it failed in one instance of Emacs.

4.  Org Mode (or something else) caught the failure and reported

      Warning (emacs): Emacs reader failed to read data in
      "/home/nlj/.cache/org-persist/gc-lock.eld". The error was:
      "End of file during parsing"

    and the running of the timer was aborted, leaving it in a broken
    state.

I think this wild conjecture would explain why sometimes (but by no
means always) I see this warning when I resume from suspend; why I
rarely see the warning at other times; and why sometimes I see the
warning in my regular Emacs session and sometimes in the instance in
which I'm running Gnus.


(One other observation: IIUC, it says in bug#39824 that the broken
timer moves farther into the past, but here my broken timer is
counting forward (it is now due in negative 1 day and 5 hours) so
presumably in a day or so it will no longer be negative.  Will it
then start running again I wonder?  If this behaviour is expected,
perhaps it should be mentioned in the documentation.  It seems a bit
peculiar to me.)

[1] I don't understand why bug#39824 was closed as Not A Bug when
the mystery of how the timers got in an incoherent state wasn't
fully clarified.  (But maybe it was well understood and the
mechanism was too trivial to record.) [And I don't think, just
because a timer fails once, that one necessarily wants that timer
disabled (because the problem might be transient).  Also it seems to
me that if a timer is going to be disabled then that should be done
explicitly rather than as a side-effect of an abort.]  But that is
all irrelevant here.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 06 Jan 2025 13:50:02 GMT)

Message #68 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 06 Jan 2025 15:49:37 +0200
> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: 75209 <at> debbugs.gnu.org,  yantar92 <at> posteo.net
> Date: Mon, 06 Jan 2025 00:58:21 +0000
> 
> I'm guessing (wildly) that what happened is this:
> 
> 1.  I woke my system from suspend.
> 
> 2.  All timers in both my instances of Emacs ran roughly
>     simultaneously.
> 
> 3.  Org Mode's locking mechanisms are not working properly when two
>     copies of org-persist--refresh-gc-lock run at essentially the 
>     same time, and it failed in one instance of Emacs.
> 
> 4.  Org Mode (or something else) caught the failure and reported
> 
>       Warning (emacs): Emacs reader failed to read data in
>       "/home/nlj/.cache/org-persist/gc-lock.eld". The error was:
>       "End of file during parsing"
> 
>     and the running of the timer was aborted, leaving it in a broken
>     state.
> 
> I think this wild conjecture would explain why sometimes (but by no
> means always) I see this warning when I resume from suspend; why I
> rarely see the warning at other times; and why sometimes I see the
> warning in my regular Emacs session and sometimes in the instance in
> which I'm running Gnus.

I think you are right.  I think the mechanisms involved in this
scenario should be audited to find possible problems and solutions.
For example, if the timer function could signal an error, it should
catch the error and handle it instead of leading to the timer being
disabled.
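
One way to do that is sketched here. It is only an illustration of catching errors inside a timer function, not what org-persist does: the wrapper name `my/call-safely` is hypothetical, and whether an uncaught error is really what disabled the timer in this report is not established.

```elisp
;; Hypothetical wrapper for functions that run from timers.
(defun my/call-safely (fn &rest args)
  "Call FN with ARGS and return its value.
If FN signals an error, log it with `message' and return nil
instead of letting the error escape to the timer machinery."
  (condition-case err
      (apply fn args)
    (error
     (message "Timer function %S failed: %s"
              fn (error-message-string err))
     nil)))
```

A repeating timer could then be set up as `(run-at-time nil 3600 #'my/call-safely #'some-refresh-function)`, where `some-refresh-function` stands for whatever function needs protecting.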

> [1] I don't understand why bug#39824 was closed as Not A Bug when
> the mystery of how the timers got in an incoherent state wasn't
> fully clarified.

Because the data for investigating it was not available.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 11 Jan 2025 14:04:01 GMT)

Message #71 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 11 Jan 2025 14:05:22 +0000
Ihor Radchenko <yantar92 <at> posteo.net> writes:

>> Writing to a file is not atomic.  If you instead write to a temporary
>> file, then rename it to the final file name, the renaming is atomic on
>> Posix filesystems.
>>
>> This would mean you can still use with-temp-file, but with a temporary
>> file name as its argument, and you need to add a single rename-file
>> call afterwards.
>
> That's fine. I just thought that there is some existing function doing
> exactly this. If not, I can do it manually, of course.

Done on Org main.
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=7999433067
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=2a620113c1

I will not risk applying this to the bugfix branch.

Hopefully, the bug is resolved now.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 11 Jan 2025 14:35:02 GMT)

Message #74 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 11 Jan 2025 16:34:11 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
> Date: Sat, 11 Jan 2025 14:05:22 +0000
> 
> Ihor Radchenko <yantar92 <at> posteo.net> writes:
> 
> >> Writing to a file is not atomic.  If you instead write to a temporary
> >> file, then rename it to the final file name, the renaming is atomic on
> >> Posix filesystems.
> >>
> >> This would mean you can still use with-temp-file, but with a temporary
> >> file name as its argument, and you need to add a single rename-file
> >> call afterwards.
> >
> > That's fine. I just thought that there is some existing function doing
> > exactly this. If not, I can do it manually, of course.
> 
> Done on Org main.
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=7999433067
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=2a620113c1
> 
> I will not risk bugfix.
> 
> Hopefully, the bug is resolved now.

Thanks, I hope the OP will be able to install the changes locally and
verify them.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 11 Jan 2025 15:20:02 GMT)

Message #77 received at 75209 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Ihor Radchenko <yantar92 <at> posteo.net>, 75209 <at> debbugs.gnu.org,
 "N. Jackson" <njackson <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 11 Jan 2025 10:19:25 -0500
>>   Warning (emacs): Emacs reader failed to read data in
>>   "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
>>   file during parsing"
>> 
>> The `gc-lock' part suggests this might have something to do with
>> garbage collection, whereas `org-persist' suggests Org mode, but
>> I could find nothing in the Org manual about org-persist or about
>> gc-lock.
>
> Does the file exist?  If so, what is its content (assuming you can
> post it here)?

FWIW, I've seen such errors enough times that I changed the code:

    @@ -444,8 +443,9 @@ org-persist--read-elisp-file
              (if (string-match-p "Invalid read syntax" (error-message-string err))
                  (message "Emacs reader failed to read data in %S. The error was: %S"
                           buffer-or-file (error-message-string err))
    -           (warn "Emacs reader failed to read data in %S. The error was: %S"
    -                 buffer-or-file (error-message-string err)))
    +           (warn "Emacs reader failed to read data in %S. The error was: %S\nFrom startpos %S in text (at %s):\n%S"
    +                 buffer-or-file (error-message-string err)
    +                 startpos (format-time-string "%F %T") (buffer-string)))
              nil)))))
     
     ;; FIXME: `pp' is very slow when writing even moderately large datasets

[ The "time" part is because it seems to happen mostly while I'm not
  using Emacs.  ]

The `buffer-string` is always empty in that error message, for me.
Maybe it's because the file doesn't exist, I have not investigated that far.
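
An empty buffer would be consistent with the reported message, since the Lisp reader signals `end-of-file` when it runs out of text before completing a form. The sketch below shows this on strings instead of files; the helper name `my/read-first-form` is hypothetical and this is not org-persist's reading code.

```elisp
;; Hypothetical helper showing how the reader reacts to missing text.
(defun my/read-first-form (string)
  "Read the first Lisp form from STRING.
Return the form, or a list (:error MESSAGE) if the reader hits
the end of the text before a complete form is read."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (condition-case err
        (read (current-buffer))
      (end-of-file
       (list :error (error-message-string err))))))
```

Under this sketch, an empty string, a string holding only the `;;   -*- mode: lisp-data; -*-` header line, and a truncated list such as "((26485 51608" all take the `end-of-file` branch, which is what a reader racing with a non-atomic writer could encounter.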


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 13 Jan 2025 15:37:01 GMT)

Message #80 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Ihor Radchenko <yantar92 <at> posteo.net>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 13 Jan 2025 15:36:04 +0000
At 16:34 +0200 on Saturday 2025-01-11, Eli Zaretskii wrote:
>
>> From: Ihor Radchenko <yantar92 <at> posteo.net>
>> Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
>> Date: Sat, 11 Jan 2025 14:05:22 +0000
>> 
>> Ihor Radchenko <yantar92 <at> posteo.net> writes:
>> 
>> Done on Org main.
>> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=7999433067
>> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=2a620113c1
>> 
>> I will not risk bugfix.
>> 
>> Hopefully, the bug is resolved now.
>
> Thanks, I hope the OP will be able to install the changes locally

I will do my best to do that.

The commits are against the Org "Main" branch whereas I'm running
Emacs 30.0.93, which is where I see the warnings, and the diffs don't
apply against this version of org-persist.el.

For example, in org-persist--write-elisp-file in Emacs 30.0.93 (at
line 481 of org-persist.el) -- in the thick of where these changes
are happening -- we have this horror:

    ;; Force writing even when the file happens to be opened by
    ;; another Emacs process.
    (cl-letf (((symbol-function #'ask-user-about-lock)
               ;; FIXME: Emacs 27 does not yet have `always'.
               (lambda (&rest _) t)))

This (thank goodness) had disappeared from the Org "Main" branch
before Ihor's two commits above.


IIUC Ihor's two commits make two independent fixes.

One fix (comprising all four changes in the first commit, and one of
the changes in the second commit) is to "Write index before writing
cache data.  This makes sure that index and data are always in
sync."

The other fix (comprising two of the three changes in the second
commit), is to ensure an atomic write by writing to a temp file and
then moving that into place.

I have insufficient knowledge of Elisp and no understanding, at all
really, of org-persist, so I am not able to attempt to make the
first fix here, and I'm not sure that it would make sense to even
try to do so.

The latter fix is simple and easy to think about so I have applied
those changes to my local version of the Emacs 30.0.93 org-persist
(patch below).  It seems quite plausible that this change (alone)
will fix the problem of the warnings in this bug report and the
broken timer.

> and verify them.

Verifying that any change resolves this bug is a bit tricky as it
involves waiting for nothing to happen!  That is, if the warning
reappears I'll be able to say that this change alone does not
resolve the problem, but if the warning doesn't reappear
in a day, a week, a month, I won't be able to say that it never
will reappear.

Anyway, I will run with this change and I will wait and see what
happens.  If the problem is not solved, I wouldn't expect it to show
itself for several days, given its usual frequency.  I'll try to
put my machine into suspend for over an hour (so the timers fire
"simultaneously" on resume) as often as possible to hurry things
along.  I'll report back if the problem is not solved.


This is the change I have made here for testing:

[org-persist-atomic-write.diff (text/x-patch, inline)]
--- Emacs-30.0.93/org-persist.el	2024-12-18 17:30:29.000000000 -0500
+++ atomic_write/org-persist.el	2025-01-12 18:58:58.230823402 -0500
@@ -475,7 +475,8 @@
         (print-escape-nonascii t)
         (print-continuous-numbering t)
         print-number-table
-        (start-time (float-time)))
+        (start-time (float-time))
+        (tmp-file (make-temp-file "org-persist-")))
     (unless (file-exists-p (file-name-directory file))
       (make-directory (file-name-directory file) t))
     ;; Force writing even when the file happens to be opened by
@@ -483,12 +484,19 @@
     (cl-letf (((symbol-function #'ask-user-about-lock)
                ;; FIXME: Emacs 27 does not yet have `always'.
                (lambda (&rest _) t)))
-      (with-temp-file file
+      ;; Do not write to FILE directly.  Another Emacs instance may be
+      ;; doing the same at the same time.  Instead, write to new
+      ;; temporary file and then rename it (renaming is atomic
+      ;; operation that does not create data races).
+      ;; See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75209#35
+      (with-temp-file tmp-file
         (insert ";;   -*- mode: lisp-data; -*-\n")
         (if pp
             (let ((pp-use-max-width nil)) ; Emacs bug#58687
               (pp data (current-buffer)))
-          (prin1 data (current-buffer)))))
+          (prin1 data (current-buffer))))
+      (rename-file tmp-file file 'overwrite))
+    
     (org-persist--display-time
      (- (float-time) start-time)
      "Writing to %S" file)))

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 13 Jan 2025 17:26:01 GMT)

Message #83 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 13 Jan 2025 17:27:58 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

>>> Hopefully, the bug is resolved now.
>>
>> Thanks, I hope the OP will be able to install the changes locally
>
> I will do my best to do that.
>
> The commits are against the Org "Main" branch whereas I'm running
> Emacs 30.0.93, which is where I see the warnings, and the diffs don't
> apply against this version of org-persist.el.
> ...

I recommend following https://orgmode.org/manual/Installation.html
In particular, the part about installing Org from git repository.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 25 Jan 2025 08:54:02 GMT)

Message #86 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 25 Jan 2025 10:53:29 +0200
Ping!  Any progress with this issue?  Any new data?

> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: Ihor Radchenko <yantar92 <at> posteo.net>,  75209 <at> debbugs.gnu.org
> Date: Mon, 13 Jan 2025 15:36:04 +0000
> 
> 
> At 16:34 +0200 on Saturday 2025-01-11, Eli Zaretskii wrote:
> >
> >> From: Ihor Radchenko <yantar92 <at> posteo.net>
> >> Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
> >> Date: Sat, 11 Jan 2025 14:05:22 +0000
> >> 
> >> Ihor Radchenko <yantar92 <at> posteo.net> writes:
> >> 
> >> Done on Org main.
> >> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=7999433067
> >> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=2a620113c1
> >> 
> >> I will not risk bugfix.
> >> 
> >> Hopefully, the bug is resolved now.
> >
> > Thanks, I hope the OP will be able to install the changes locally
> 
> I will do my best to do that.
> 
> The commits are against the Org "Main" branch whereas I'm running
> Emacs 30.0.93, which is where I see the warnings, and the diffs don't
> apply against this version of org-persist.el.
> 
> For example, in org-persist--write-elisp-file in Emacs 30.0.93 (at
> line 481 of org-persist.el) -- in the thick of where these changes
> are happening -- we have this horror:
> 
>     ;; Force writing even when the file happens to be opened by
>     ;; another Emacs process.
>     (cl-letf (((symbol-function #'ask-user-about-lock)
>                ;; FIXME: Emacs 27 does not yet have `always'.
>                (lambda (&rest _) t)))
> 
> This (thank goodness) had disappeared from the Org "Main" branch
> before Ihor's two commits above.
> 
> 
> IIUC Ihor's two commits make two independent fixes.
> 
> One fix (comprising all four changes in the first commit, and one of
> the changes in the second commit) is to "Write index before writing
> cache data.  This makes sure that index and data are always in
> sync."
> 
> The other fix (comprising two of the three changes in the second
> commit), is to ensure an atomic write by writing to a temp file and
> then moving that into place.
> 
> I have insufficient knowledge of Elisp and no understanding, at all
> really, of org-persist, so I am not able to attempt to make the
> first fix here, and I'm not sure that it would make sense to even
> try to do so.
> 
> The latter fix is simple and easy to think about so I have applied
> those changes to my local version of the Emacs 30.0.93 org-persist
> (patch below).  It seems quite plausible that this change (alone)
> will fix the problem of the warnings in this bug report and the
> broken timer.
> 
> > and verify them.
> 
> Verifying that any change resolves this bug is a bit tricky as it
> involves waiting for nothing to happen!  That is, if the warning
> reappears I'll be able to say that this change (alone) does not
> (by itself) resolve the problem, but if the warning doesn't reappear
> in a day, a week, a month, I won't be able to say that it never
> will reappear.
> 
> Anyway, I will run with this change and I will wait and see what
> happens.  If the problem is not solved, I wouldn't expect it to show
> itself for several days, given its usual frequency.  I'll try to
> put my machine into suspend for over an hour (so the timers fire
> "simultaneously" on resume) as often as possible to hurry things
> along.  I'll report back if the problem is not solved.
> 
> 
> This is the change I have made here for testing:
> 
> --- Emacs-30.0.93/org-persist.el	2024-12-18 17:30:29.000000000 -0500
> +++ atomic_write/org-persist.el	2025-01-12 18:58:58.230823402 -0500
> @@ -475,7 +475,8 @@
>          (print-escape-nonascii t)
>          (print-continuous-numbering t)
>          print-number-table
> -        (start-time (float-time)))
> +        (start-time (float-time))
> +        (tmp-file (make-temp-file "org-persist-")))
>      (unless (file-exists-p (file-name-directory file))
>        (make-directory (file-name-directory file) t))
>      ;; Force writing even when the file happens to be opened by
> @@ -483,12 +484,19 @@
>      (cl-letf (((symbol-function #'ask-user-about-lock)
>                 ;; FIXME: Emacs 27 does not yet have `always'.
>                 (lambda (&rest _) t)))
> -      (with-temp-file file
> +      ;; Do not write to FILE directly.  Another Emacs instance may be
> +      ;; doing the same at the same time.  Instead, write to new
> +      ;; temporary file and then rename it (renaming is atomic
> +      ;; operation that does not create data races).
> +      ;; See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75209#35
> +      (with-temp-file tmp-file
>          (insert ";;   -*- mode: lisp-data; -*-\n")
>          (if pp
>              (let ((pp-use-max-width nil)) ; Emacs bug#58687
>                (pp data (current-buffer)))
> -          (prin1 data (current-buffer)))))
> +          (prin1 data (current-buffer))))
> +      (rename-file tmp-file file 'overwrite))
> +    
>      (org-persist--display-time
>       (- (float-time) start-time)
>       "Writing to %S" file)))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 26 Jan 2025 00:43:01 GMT)

Message #89 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>, Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 26 Jan 2025 00:42:21 +0000
At 10:53 +0200 on Saturday 2025-01-25, Eli Zaretskii wrote:
>
> Ping!  Any progress with this issue?  Any new data?
>
>> From: "N. Jackson" <njackson <at> posteo.net>
>> Cc: Ihor Radchenko <yantar92 <at> posteo.net>,  75209 <at> debbugs.gnu.org
>> Date: Mon, 13 Jan 2025 15:36:04 +0000
>> 
>> This is the change I have made here for testing:
>> 
>> --- Emacs-30.0.93/org-persist.el	2024-12-18 17:30:29.000000000 -0500
>> +++ atomic_write/org-persist.el	2025-01-12 18:58:58.230823402 -0500
>> @@ -475,7 +475,8 @@
>>          (print-escape-nonascii t)
>>          (print-continuous-numbering t)
>>          print-number-table
>> -        (start-time (float-time)))
>> +        (start-time (float-time))
>> +        (tmp-file (make-temp-file "org-persist-")))
>>      (unless (file-exists-p (file-name-directory file))
>>        (make-directory (file-name-directory file) t))
>>      ;; Force writing even when the file happens to be opened by
>> @@ -483,12 +484,19 @@
>>      (cl-letf (((symbol-function #'ask-user-about-lock)
>>                 ;; FIXME: Emacs 27 does not yet have `always'.
>>                 (lambda (&rest _) t)))
>> -      (with-temp-file file
>> +      ;; Do not write to FILE directly.  Another Emacs instance may be
>> +      ;; doing the same at the same time.  Instead, write to new
>> +      ;; temporary file and then rename it (renaming is atomic
>> +      ;; operation that does not create data races).
>> +      ;; See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75209#35
>> +      (with-temp-file tmp-file
>>          (insert ";;   -*- mode: lisp-data; -*-\n")
>>          (if pp
>>              (let ((pp-use-max-width nil)) ; Emacs bug#58687
>>                (pp data (current-buffer)))
>> -          (prin1 data (current-buffer)))))
>> +          (prin1 data (current-buffer))))
>> +      (rename-file tmp-file file 'overwrite))
>> +    
>>      (org-persist--display-time
>>       (- (float-time) start-time)
>>       "Writing to %S" file)))

Hello Eli,

Yes.  Unfortunately I can report that when I woke my system from
suspend this morning I saw the bug -- after about eleven days.  So I
can say that the patch I showed above (that tests doing an atomic
write by renaming a temporary file) is not sufficient.

I'm not sure where to go from here.

[I have now advised the org-persist--refresh-gc-lock timer handler
to log (each time the timer is called and in separate log files,
one for each instance of Emacs) the exact time, the state of the
timer, and whether the handler returned cleanly.  Maybe this will
shed some more light on what is happening -- or at least suggest a
way to trigger the bug more quickly.]

At 17:27 +0000 on Monday 2025-01-13, Ihor Radchenko wrote:
>
> I recommend following https://orgmode.org/manual/Installation.html
> In particular, the part about installing Org from git repository.

If the presence of the bug were something that could be tested in a
few minutes, I would be happy to test with Org mainline.  But the
bug takes several days (or weeks) to manifest.  That would mean I
would have to run the bleeding edge version of Org for my everyday
tasks and I just wouldn't be comfortable not using a version that's
been released and tested, because Org is critical to organising and
scheduling everything in my world.

Also, even if I did run Org mainline and after a suitable period of
testing the bug seemed to be gone, where would that leave us in
Emacs 30?
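
One detail of the atomic-write patch quoted earlier in this message
may be worth checking.  A rename can only be atomic when the
temporary file and the target are on the same file system, and
`make-temp-file' with a relative prefix creates its file in
`temporary-file-directory', which need not be on the same file system
as the cache directory.  When it is not, `rename-file' has to fall
back to copying the file and deleting the original.  Whether that is
what happens on the reporter's system is not established in this
thread.  The following is only a minimal sketch of the variant that
creates the temporary file next to the target; the function name
`sketch-atomic-write' and the ".sketch-tmp-" prefix are hypothetical
and are not part of org-persist.

```elisp
;; Sketch only: `sketch-atomic-write' is a hypothetical helper,
;; not the code used by org-persist.
(defun sketch-atomic-write (file data)
  "Write DATA to FILE by renaming a temp file created next to FILE."
  (let* ((file (expand-file-name file))
         (dir (file-name-directory file))
         tmp)
    (make-directory dir t)
    ;; An absolute PREFIX makes `make-temp-file' create the file in
    ;; DIR rather than in `temporary-file-directory'.
    (setq tmp (make-temp-file (expand-file-name ".sketch-tmp-" dir)))
    (condition-case err
        (progn
          (with-temp-file tmp
            (insert ";;   -*- mode: lisp-data; -*-\n")
            (prin1 data (current-buffer)))
          ;; Same directory, hence same file system: a plain rename.
          (rename-file tmp file 'overwrite))
      (error
       ;; Do not leave the temporary file behind on failure.
       (when (file-exists-p tmp) (delete-file tmp))
       (signal (car err) (cdr err))))))
```

With this arrangement a concurrent reader should see either the old
file or the new one, assuming the underlying file system provides the
usual rename semantics.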





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 26 Jan 2025 03:32:02 GMT)

Message #92 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: 75209 <at> debbugs.gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, Ihor Radchenko <yantar92 <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 26 Jan 2025 03:31:12 +0000
The following is for completeness (with regards to bug#39824) and
not directly relevant to the current bug.

At 19:58 -0500 on Sunday 2025-01-05, N. Jackson wrote:
>
> it says in bug#39824 that the broken timer moves farther into the
> past, but here my broken timer is counting forward

I have to retract this assertion.  Further observations revealed
that sometimes it seems the broken timer counts forwards in time and
sometimes it jumps backwards by days.

This is because the information displayed by `list-timers' is bogus
because it prepares the information using `format-seconds' which
doesn't work properly for negative arguments (bug#75849 [1]).

[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75849





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 26 Jan 2025 07:03:01 GMT)

Message #95 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 26 Jan 2025 09:02:28 +0200
> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Ihor Radchenko <yantar92 <at> posteo.net>
> Date: Sun, 26 Jan 2025 03:31:12 +0000
> 
> The following is for completeness (with regards to bug#39824) and
> not directly relevant to the current bug.
> 
> At 19:58 -0500 on Sunday 2025-01-05, N. Jackson wrote:
> >
> > it says in bug#39824 that the broken timer moves farther into the
> > past, but here my broken timer is counting forward
> 
> I have to retract this assertion.  Further observations revealed
> that sometimes it seems the broken timer counts forwards in time and
> sometimes it jumps backwards by days.
> 
> This is because the information displayed by `list-timers' is bogus
> because it prepares the information using `format-seconds' which
> doesn't work properly for negative arguments (bug#75849 [1]).

I think this is a tangent here.  The problem in this bug is not caused
by timers, IMO.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Tue, 04 Feb 2025 18:04:01 GMT)

Message #98 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Tue, 04 Feb 2025 18:06:04 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

> Yes.  Unfortunately I can report that when I woke my system from
> suspend this morning I saw the bug -- after about eleven days.  So I
> can say that the patch I showed above (that tests doing an atomic
> write by renaming a temporary file) is not sufficient.

This is strange. Was it still "end of file while reading"?

>> I recommend following https://orgmode.org/manual/Installation.html
>> In particular, the part about installing Org from git repository.
>
> If the presence of the bug were something that could be tested in a
> few minutes, I would be happy to test with Org mainline.  But the
> bug takes several days (or weeks) to manifest.  That would mean I
> would have to run the bleeding edge version of Org for my everyday
> tasks and I just wouldn't be comfortable not using a version that's
> been released and tested, because Org is critical to organising and
> scheduling everything in my world.

Well. I was hoping that e095d269e2 could improve the situation for
you. At least, it should reduce the frequency of the observed problem.

> Also, even if I did run Org mainline and after a suitable period of
> testing the bug seemed to be gone, where would that leave us in
> Emacs 30?

Emacs 30 will have to live with this bug. It is not critical, but the
fixes are not exactly trivial. And we do not usually install non-trivial
fixes onto release branch unless those fixes are against critical bugs.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 09 Feb 2025 23:45:01 GMT)

Message #101 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 09 Feb 2025 23:44:41 +0000
At 18:06 +0000 on Tuesday 2025-02-04, Ihor Radchenko wrote:
>
> "N. Jackson" <njackson <at> posteo.net> writes:
>
>> Yes.  Unfortunately I can report that when I woke my system from
>> suspend this morning I saw the bug -- after about eleven days.  So I
>> can say that the patch I showed above (that tests doing an atomic
>> write by renaming a temporary file) is not sufficient.
>
> This is strange. Was it still "end of file while reading"?

No.  Yes, it was exactly the same error message, but the actual
message is this:

  Warning (emacs): Emacs reader failed to read data in
  "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
  file during parsing"

It comes from the error handler for the condition-case in
org-persist--read-elisp-file.

(To be absolutely clear, the changes I applied to my Emacs are the
ones I showed in my earlier post[1].  This only applies the atomic
write part of your patches, not the "write index before writing
cache data" part (which I didn't see how to easily integrate into
the org-persist.el in Emacs 30).)

[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75209#80





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Mon, 10 Feb 2025 18:00:03 GMT)

Message #104 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Mon, 10 Feb 2025 18:01:10 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

>> This is strange. Was it still "end of file while reading"?
>
> No.  Yes, it was exactly the same error message, but the actual
> message is this:
>
>   Warning (emacs): Emacs reader failed to read data in
>   "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
>   file during parsing"
>
> It comes from the error handler for the condition-case in
> org-persist--read-elisp-file.
>
> (To be absolutely clear, the changes I applied to my Emacs are the
> ones I showed in my earlier post[1].  This only applies the atomic
> write part of your patches, not the "write index before writing
> cache data" part (which I didn't see how to easily integrate into
> the org-persist.el in Emacs 30).)

Very strange.
How can it be if we do atomic writes?

Maybe try to install the attached diff that will also display the
contents of the file as an additional warning.
Maybe it can give us more clues.

[extra-warning.diff (text/x-patch, inline)]
diff --git a/lisp/org-persist.el b/lisp/org-persist.el
index a639699d93..58facc0b30 100644
--- a/lisp/org-persist.el
+++ b/lisp/org-persist.el
@@ -449,7 +449,9 @@ (defun org-persist--read-elisp-file (&optional buffer-or-file)
              (message "Emacs reader failed to read data in %S. The error was: %S"
                       buffer-or-file (error-message-string err))
            (warn "Emacs reader failed to read data in %S. The error was: %S"
-                 buffer-or-file (error-message-string err)))
+                 buffer-or-file (error-message-string err))
+           (warn "The problematic file contents is:\n-----\n%s\n------\n"
+                 (buffer-string)))
          nil)))))
 
 ;; FIXME: `pp' is very slow when writing even moderately large datasets
-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 14 Feb 2025 14:04:02 GMT)

Message #107 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 14 Feb 2025 14:02:55 +0000
At 18:01 +0000 on Monday 2025-02-10, Ihor Radchenko wrote:
>
> Very strange.
> How can it be if we do atomic writes?

I don't know enough of the details to be able to even begin to
answer that question.

Maybe(?) if insert-file-contents reads the file in chunks and if one
instance of Emacs runs org-persist--write-elisp-file at the same
time as the other instance is running org-persist--read-elisp-file,
perhaps the file could be replaced in the middle of the read.  Then
maybe chunks could be read in from both the old and the new files?

> Maybe try to install the attached diff that will also display the
> contents of the file as an additional warning.
> Maybe it can give us more clues.
>
> diff --git a/lisp/org-persist.el b/lisp/org-persist.el
> index a639699d93..58facc0b30 100644
> --- a/lisp/org-persist.el
> +++ b/lisp/org-persist.el
> @@ -449,7 +449,9 @@ (defun org-persist--read-elisp-file (&optional buffer-or-file)
>               (message "Emacs reader failed to read data in %S. The error was: %S"
>                        buffer-or-file (error-message-string err))
>             (warn "Emacs reader failed to read data in %S. The error was: %S"
> -                 buffer-or-file (error-message-string err)))
> +                 buffer-or-file (error-message-string err))
> +           (warn "The problematic file contents is:\n-----\n%s\n------\n"
> +                 (buffer-string)))
>           nil)))))
>  
>  ;; FIXME: `pp' is very slow when writing even moderately large datasets

I have applied your diff and will report back with details when I
next get the warning.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 14 Feb 2025 15:14:01 GMT)

Message #110 received at 75209 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 14 Feb 2025 17:12:48 +0200
> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  75209 <at> debbugs.gnu.org
> Date: Fri, 14 Feb 2025 14:02:55 +0000
> 
> Maybe(?) if insert-file-contents reads the file in chunks and if one
> instance of Emacs runs org-persist--write-elisp-file at the same
> time as the other instance is running org-persist--read-elisp-file,
> perhaps the file could be replaced in the middle of the read.  Then
> maybe chunks could be read in from both the old and the new files?

Emacs reads files in chunks of 16KB.  Is the file in question likely
to be larger than that?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 14 Feb 2025 16:35:01 GMT)

Message #113 received at 75209 <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, "N. Jackson" <njackson <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 14 Feb 2025 16:33:47 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

> Emacs reads files in chunks of 16KB.  Is the file in question likely
> to be larger than that?

The file size is around 60 bytes x number of Emacs instances started within
the last 24 hours (one record for each process until `org-persist-gc-lock-expiry').

For it to exceed 16kb, one needs to run Emacs >250 times per day.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 14 Feb 2025 22:34:01 GMT)

Message #116 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 14 Feb 2025 22:33:24 +0000
At 16:33 +0000 on Friday 2025-02-14, Ihor Radchenko wrote:
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
>> Emacs reads files in chunks of 16KB.  Is the file in question likely
>> to be larger than that?
>
> The file size is around 60 bytes x number of Emacs instances started within
> the last 24 hours (one record for each process until `org-persist-gc-lock-expiry').
>
> For it to exceed 16kb, one needs to run Emacs >250 times per day.

That's not it then.  (I only have two instances of Emacs open at any
time and I mostly keep them running until I get a kernel update and
restart my system -- which happens about once a week. In the last
few years, I doubt I've opened Emacs more than a dozen times in any
24 hour period -- I haven't done that since I stopped working mostly
from the command line and started working mostly from within Emacs.
(But I can imagine someone working from the command line potentially
opening Emacs more than 256 times a day.))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 21 Feb 2025 15:27:03 GMT)

Message #119 received at 75209 <at> debbugs.gnu.org:

From: "N. Jackson" <njackson <at> posteo.net>
To: 75209 <at> debbugs.gnu.org
Cc: Ihor Radchenko <yantar92 <at> posteo.net>, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 21 Feb 2025 15:26:23 +0000
At 09:02 -0500 on Friday 2025-02-14, N. Jackson wrote:
>
> At 18:01 +0000 on Monday 2025-02-10, Ihor Radchenko wrote:
>>
>> diff --git a/lisp/org-persist.el b/lisp/org-persist.el
>> index a639699d93..58facc0b30 100644
>> --- a/lisp/org-persist.el
>> +++ b/lisp/org-persist.el
>> @@ -449,7 +449,9 @@ (defun org-persist--read-elisp-file (&optional buffer-or-file)
>>               (message "Emacs reader failed to read data in %S. The error was: %S"
>>                        buffer-or-file (error-message-string err))
>>             (warn "Emacs reader failed to read data in %S. The error was: %S"
>> -                 buffer-or-file (error-message-string err)))
>> +                 buffer-or-file (error-message-string err))
>> +           (warn "The problematic file contents is:\n-----\n%s\n------\n"
>> +                 (buffer-string)))
>>           nil)))))
>>  
>>  ;; FIXME: `pp' is very slow when writing even moderately large datasets
>
> I have applied your diff and will report back with details when I
> next get the warning.

The bug occurred again this morning (when the system was waking up
from suspend).  The diagnostic information from the new warning was
this:

  ⛔ Warning (emacs): In org-persist--read-elisp-file: The problematic file contents is:
  -----
  ;;   -*- mode: lisp-data; -*-
  (((26548 34513 530425 770000) 26551 61114 665219 11000) ((26548 34583 367592 501000) 26552 26502 142724 470000)
  ------

Unfortunately however, I cannot say with certainty that this was the
"End of file during parsing" error again (but as I have seen no
other error with this bug, I feel fairly sure that it was).
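
As pasted, the data form in the warning above has five opening
parentheses but only four closing ones, which would be consistent
with an "End of file during parsing" error.  This can be checked with
a minimal sketch such as the following, where `sketch-reported-form'
and `sketch-read-result' are hypothetical names introduced only for
this illustration and the string is copied from the warning text
(without the leading comment line):

```elisp
;; Data form copied from the warning above; the names are hypothetical.
(defconst sketch-reported-form
  (concat "(((26548 34513 530425 770000) 26551 61114 665219 11000) "
          "((26548 34583 367592 501000) 26552 26502 142724 470000)"))

(defun sketch-read-result (text)
  "Return the object read from TEXT, or the error symbol on failure."
  (condition-case err
      (car (read-from-string text))
    (error (car err))))

;; The form as reported cannot be read completely:
(sketch-read-result sketch-reported-form)
;; => end-of-file

;; With one closing parenthesis appended it reads as a two-entry list:
(length (sketch-read-result (concat sketch-reported-form ")")))
;; => 2
```

This only shows that the text as displayed is one character short of
a complete form; it does not show how the file came to be in that
state.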


I apologise that I don't have this information.  My elisp is poor
and I am having difficulties with the above patch -- only the first
warning in the else clause gets displayed.

Initially I applied the diff using diff-mode.  It seemed to apply
cleanly.  Then I byte-compiled and loaded, and native-compiled and
loaded the file.  Fairly soon the bug occurred again but the only
warning shown in the *Warnings* buffer was the usual

  ⛔ Warning (emacs): Emacs reader failed to read data in "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of file during parsing"

There was no sign of the output from the new diagnostic.

I convinced myself that the change had applied to the right file and
that Emacs was using the modified code.

I find with Elisp that I have difficulty seeing easily which
statements are inside which block (and I find the asymmetry of the
then part and the else part(s) of if statements especially tricky),
but as far as I could tell the code looked right provided that the
`warn' function is expected to return.

Anyway, to see if poking it would help, I switched the order of the
two warning statements.  I now have this in the error handler in
org-persist--read-elisp-file:

  (error
   ;; Remove problematic file.
   (unless (bufferp buffer-or-file) (delete-file buffer-or-file))
   ;; Do not report the known error to user.
   (if (string-match-p "Invalid read syntax" (error-message-string err))
       (message "Emacs reader failed to read data in %S. The error was: %S"
                buffer-or-file (error-message-string err))
     (warn "In org-persist--read-elisp-file: The problematic file contents is:\n-----\n%s\n------\n"
           (buffer-string))
     (warn "In org-persist--read-elisp-file: Emacs reader failed to read data in %S. The error was: %S"
           buffer-or-file (error-message-string err)))
   nil)))))

Today was the first time the bug occurred with this version of the
diagnostic and, as I reported above, again I only got the output from
the first warning (which is now the new diagnostic information); I
didn't get the information from the second warning about what the
error was.

I'm not sure what is wrong with this else clause.


With regards to the timing of the triggering of the bug, I have
recently added an :around advice to org-persist--refresh-gc-lock
timer handler to log the exact time and the state of the timer
immediately before the function is entered and immediately after it
exits.  (If it exits.  It doesn't when the bug happens).

When the bug happened this morning, my logs show that my normal
instance of Emacs entered the org-persist--refresh-gc-lock timer
handler at 06:46:07.188999 and never returned, whereas my Gnus
instance of Emacs entered org-persist--refresh-gc-lock eight times
(the number of hours of suspend) starting at 06:46:07.295370 and
finishing at 06:46:14.192108.


FWIW, here are the details from my log, showing the last two
invocations of the timer before I put the system into suspend last
night, the invocations at the time when the system resumed from
suspend, and (in the case of the timer that didn't break) the first
regular invocation of the timer since I woke the system up.

(I apologise if this is just noise -- I realise that the breaking of
the timer is orthogonal to the problem with the Org persist cache.)

Note: The due time in seconds is nonsense when it is negative
because of the bug in float-time.
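
For reading the logs below: each parenthesized value is an Emacs Lisp
timestamp list of the form (HIGH LOW USEC PSEC), where the number of
seconds since the epoch is HIGH * 65536 + LOW.  A minimal sketch for
decoding one of them follows.  The helper name `sketch-describe-due'
is hypothetical, and the NOW value in the usage example is made up
for illustration (exactly 3600 whole seconds before the due time
taken from the first log entry).

```elisp
;; Hypothetical helper for decoding the timestamps in the log.
(defun sketch-describe-due (due now)
  "Return (UTC-STRING . SECONDS-UNTIL-DUE) for time values DUE and NOW."
  (cons (format-time-string "%F %T" due t)    ; t means format as UTC
        (float-time (time-subtract due now))))

;; Due time from the first log entry, with an illustrative NOW:
(sketch-describe-due '(26551 61114 615308 958000)
                     '(26551 57514 0 0))
;; car => "2025-02-21 03:10:50" (UTC)
```

Assuming the log times are in a UTC-5 zone (the diff headers earlier
in this thread carry a -0500 offset), that due time corresponds to
22:10:50 on 2025-02-20 local time, which matches the next "entering"
line of the normal instance.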

Normal Emacs instance (bug happened here):

2025-02-20 21:10:50.625671 Norm entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.989584552 s (26551 61114 615308 958000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-20 21:10:50.630169 Norm leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.985085596 s (26551 61114 615308 958000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-20 22:10:50.622359 Norm entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.992908008 s (26551 64714 615308 958000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-20 22:10:50.666588 Norm leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.948640987 s (26551 64714 615308 958000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:07.188999 Norm entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -23716.573737732 s (26552 2778 615308 958000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil


Gnus Emacs instance:

2025-02-20 21:11:55.035594 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.992966593 s (26551 61179 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-20 21:11:55.063103 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.965491277 s (26551 61179 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-20 22:11:55.043835 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.984760117 s (26551 64779 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-20 22:11:55.058196 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.970393228 s (26551 64779 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:07.295370 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -23652.266766201 s (26552 2843 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.143748 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -23659.115148904 s (26552 2843 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.148166 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -20059.119561876 s (26552 6443 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.179593 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -20059.151004079 s (26552 6443 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.180383 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -16459.151768947 s (26552 10043 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.181836 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -16459.153224063 s (26552 10043 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.182532 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -12859.153917435 s (26552 13643 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.183997 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -12859.155382942 s (26552 13643 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.184627 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -9259.156007449 s (26552 17243 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.186050 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -9259.157438955 s (26552 17243 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.186715 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -5659.158116718 s (26552 20843 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.188074 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -5659.159457743 s (26552 20843 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.188685 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -2059.160065622 s (26552 24443 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.190071 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In -2059.161456207 s (26552 24443 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 06:46:14.190744 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 1540.837851387 s (26552 28043 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 06:46:14.192108 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 1540.836489647 s (26552 28043 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

2025-02-21 07:11:55.051565 Gnus entering org-persist--refresh-gc-lock:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.97701699 s (26552 31643 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil

  2025-02-21 07:11:55.072352 Gnus leaving org-persist--refresh-gc-lock normally:
  Timer: org-persist--refresh-gc-lock (nil)
  Due: In 3599.956240079 s (26552 31643 28641 22000)
  Triggered: t		Integral Multiple: nil
  Repeat Delay: 3600	Idle Delay: nil





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sun, 23 Feb 2025 18:28:02 GMT) Full text and rfc822 format available.

Message #122 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 23 Feb 2025 18:26:52 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

> The bug occurred again this morning (when the system was waking up
> from suspend).  The diagnostic information from the new warning was
> this:

First of all, thanks a lot for the detailed investigation!

>   ⛔ Warning (emacs): In org-persist--read-elisp-file: The problematic file contents is:
>   -----
>   ;;   -*- mode: lisp-data; -*-
>   (((26548 34513 530425 770000) 26551 61114 665219 11000) ((26548 34583 367592 501000) 26552 26502 142724 470000)
>   ------
>
> Unfortunately however, I cannot say with certainty that this was the
> "End of file during parsing" error again (but as I have seen no
> other error with this bug, I feel fairly sure that it was).

That was it. I can see it from the unclosed parenthesis.

> I find with Elisp that I have difficulty seeing easily which
> statements are inside which block (and I find the asymmetry of the
> then part and the else part(s) of if statements especially tricky),
> but as far as I could tell the code looked right provided that the
> `warn' function is expected to return.

"Else" part may contain any number of sexps, while "then" part only one.
That's why there is an asymmetry.
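
A minimal sketch of that shape, using a hypothetical function
`my/sign-label' that is not part of Org or Emacs: `if' takes exactly one
"then" form, and every form after it belongs to the "else" branch.

```elisp
;; Hypothetical example, only to illustrate the shape of `if'.
(defun my/sign-label (n)
  "Return a symbol describing the sign of the number N."
  (if (> n 0)
      ;; "Then" part: exactly one form.
      'positive
    ;; "Else" part: any number of forms; the last one is the value.
    (setq n (abs n))
    (if (zerop n) 'zero 'negative)))
```

To run several forms in the "then" branch, wrap them in `progn'.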

> Today was the first time the bug occurred with this version of the
> diagnostic and as I reported above, again I only got the output from
> the first warning -- which is now the new diagnostic information but
> I didn't get the information from the second warning about what the
> error was.

I am also clueless why only a single warning is shown though.

> When the bug happened this morning, my logs show that my normal
> instance of Emacs entered the org-persist--refresh-gc-lock timer
> handler at 06:46:07.188999 and never returned, whereas my Gnus
> instance of Emacs entered org-persist--refresh-gc-lock eight times
> (the number of hours of suspend) starting at 06:46:07.295370 and
> finishing at 06:46:14.192108.

Ouch! It is actually expected (and customizable via
`timer-max-repeats'), but was not intended by me in the code logic.
It does not explain your bug though.

> FWIW, the details from my log (showing the last two invocations of
> the timer before I put the system into suspend last night, the
> invocations at the time when the system resumed from suspend, and
> (in the case of the timer that didn't break), the first regular
> invocation of the timer since I woke the system up.)

Very helpful!

> Normal Emacs instance (bug happened here):
> ...
> 2025-02-21 06:46:07.188999 Norm entering org-persist--refresh-gc-lock:
>   Timer: org-persist--refresh-gc-lock (nil)
>   Due: In -23716.573737732 s (26552 2778 615308 958000)
>   Triggered: t		Integral Multiple: nil
>   Repeat Delay: 3600	Idle Delay: nil

Never returned likely means that it threw an error.
Most likely because

(setf (alist-get before-init-time alist nil nil #'equal)
            (current-time))
will fail when ALIST is nil. I should fix this.
However, it does not solve the mystery of the incomplete data in the
lockfile.

> Gnus Emacs instance:
>
> 2025-02-20 22:11:55.043835 Gnus entering org-persist--refresh-gc-lock:
>   Timer: org-persist--refresh-gc-lock (nil)
>   Due: In 3599.984760117 s (26551 64779 28641 22000)
>   Triggered: t		Integral Multiple: nil
>   Repeat Delay: 3600	Idle Delay: nil
>
>   2025-02-20 22:11:55.058196 Gnus leaving org-persist--refresh-gc-lock normally:
>   Timer: org-persist--refresh-gc-lock (nil)
>   Due: In 3599.970393228 s (26551 64779 28641 22000)
>   Triggered: t		Integral Multiple: nil
>   Repeat Delay: 3600	Idle Delay: nil

This looks like the most recent write to the lock file.

The one at 6am happens after the failing read at 06:46:07.188999.

On the other hand, the two are just a fraction of a second apart.
Is `insert-file-contents' an atomic operation?

> 2025-02-21 06:46:07.295370 Gnus entering org-persist--refresh-gc-lock:
>   Timer: org-persist--refresh-gc-lock (nil)
>   Due: In -23652.266766201 s (26552 2843 28641 22000)
>   Triggered: t		Integral Multiple: nil
>   Repeat Delay: 3600	Idle Delay: nil
>
>   2025-02-21 06:46:14.143748 Gnus leaving org-persist--refresh-gc-lock normally:
>   Timer: org-persist--refresh-gc-lock (nil)
>   Due: In -23659.115148904 s (26552 2843 28641 22000)
>   Triggered: t		Integral Multiple: nil
>   Repeat Delay: 3600	Idle Delay: nil


-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 07 Mar 2025 16:52:01 GMT) Full text and rfc822 format available.

Message #125 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 07 Mar 2025 16:50:24 +0000
Ihor Radchenko <yantar92 <at> posteo.net> writes:

>>   2025-02-20 22:11:55.058196 Gnus leaving org-persist--refresh-gc-lock normally:
>>   Timer: org-persist--refresh-gc-lock (nil)
>>   Due: In 3599.970393228 s (26551 64779 28641 22000)
>>   Triggered: t		Integral Multiple: nil
>>   Repeat Delay: 3600	Idle Delay: nil
>
> This looks like the most recent write to the lock file.
>
> The one at 6am happens after the failing read at 06:46:07.188999.
>
> On the other hand, the two are just a fraction of a second apart.
> Is `insert-file-contents' an atomic operation?

Eli, do you have any idea if `insert-file-contents' may be affected by
the inserted file being written simultaneously?

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Fri, 07 Mar 2025 19:13:01 GMT) Full text and rfc822 format available.

Message #128 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 07 Mar 2025 21:12:10 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
> Date: Fri, 07 Mar 2025 16:50:24 +0000
> 
> Ihor Radchenko <yantar92 <at> posteo.net> writes:
> 
> >>   2025-02-20 22:11:55.058196 Gnus leaving org-persist--refresh-gc-lock normally:
> >>   Timer: org-persist--refresh-gc-lock (nil)
> >>   Due: In 3599.970393228 s (26551 64779 28641 22000)
> >>   Triggered: t		Integral Multiple: nil
> >>   Repeat Delay: 3600	Idle Delay: nil
> >
> > This looks like the most recent write to the lock file.
> >
> > The one at 6am happens after the failing read at 06:46:07.188999.
> >
> > On the other hand, the two are just a fraction of a second apart.
> > Is `insert-file-contents' an atomic operation?
> 
> Eli, do you have any idea if `insert-file-contents' may be affected by
> the inserted file being written simultaneously?

Emacs reads the file in chunks, so I think the answer depends on the
filesystem.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 11:10:02 GMT) Full text and rfc822 format available.

Message #131 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 13:09:31 +0200
Ping! How can we make some progress with this issue?

> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: 75209 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
> Date: Sun, 23 Feb 2025 18:26:52 +0000
> 
> "N. Jackson" <njackson <at> posteo.net> writes:
> 
> > The bug occurred again this morning (when the system was waking up
> > from suspend).  The diagnostic information from the new warning was
> > this:
> 
> First of all, thanks a lot for the detailed investigation!
> 
> >   ⛔ Warning (emacs): In org-persist--read-elisp-file: The problematic file contents is:
> >   -----
> >   ;;   -*- mode: lisp-data; -*-
> >   (((26548 34513 530425 770000) 26551 61114 665219 11000) ((26548 34583 367592 501000) 26552 26502 142724 470000)
> >   ------
> >
> > Unfortunately however, I cannot say with certainty that this was the
> > "End of file during parsing" error again (but as I have seen no
> > other error with this bug, I feel fairly sure that it was).
> 
> That was it. I can see it from the unclosed parenthesis.
> 
> > I find with Elisp that I have difficulty seeing easily which
> > statements are inside which block (and I find the asymmetry of the
> > then part and the else part(s) of if statements especially tricky),
> > but as far as I could tell the code looked right provided that the
> > `warn' function is expected to return.
> 
> "Else" part may contain any number of sexps, while "then" part only one.
> That's why there is an asymmetry.
> 
> > Today was the first time the bug occurred with this version of the
> > diagnostic and as I reported above, again I only got the output from
> > the first warning -- which is now the new diagnostic information but
> > I didn't get the information from the second warning about what the
> > error was.
> 
> I am also clueless why only a single warning is shown though.
> 
> > When the bug happened this morning, my logs show that my normal
> > instance of Emacs entered the org-persist--refresh-gc-lock timer
> > handler at 06:46:07.188999 and never returned, whereas my Gnus
> > instance of Emacs entered org-persist--refresh-gc-lock eight times
> > (the number of hours of suspend) starting at 06:46:07.295370 and
> > finishing at 06:46:14.192108.
> 
> Ouch! It is actually expected (and customizable via
> `timer-max-repeats'), but was not intended by me in the code logic.
> It does not explain your bug though.
> 
> > FWIW, the details from my log (showing the last two invocations of
> > the timer before I put the system into suspend last night, the
> > invocations at the time when the system resumed from suspend, and
> > (in the case of the timer that didn't break), the first regular
> > invocation of the timer since I woke the system up.)
> 
> Very helpful!
> 
> > Normal Emacs instance (bug happened here):
> > ...
> > 2025-02-21 06:46:07.188999 Norm entering org-persist--refresh-gc-lock:
> >   Timer: org-persist--refresh-gc-lock (nil)
> >   Due: In -23716.573737732 s (26552 2778 615308 958000)
> >   Triggered: t		Integral Multiple: nil
> >   Repeat Delay: 3600	Idle Delay: nil
> 
> Never returned likely means that it threw an error.
> Most likely because
> 
> (setf (alist-get before-init-time alist nil nil #'equal)
>             (current-time))
> will fail when ALIST is nil. I should fix this.
> However, it does not solve the mystery of the incomplete data in the
> lockfile.
> 
> > Gnus Emacs instance:
> >
> > 2025-02-20 22:11:55.043835 Gnus entering org-persist--refresh-gc-lock:
> >   Timer: org-persist--refresh-gc-lock (nil)
> >   Due: In 3599.984760117 s (26551 64779 28641 22000)
> >   Triggered: t		Integral Multiple: nil
> >   Repeat Delay: 3600	Idle Delay: nil
> >
> >   2025-02-20 22:11:55.058196 Gnus leaving org-persist--refresh-gc-lock normally:
> >   Timer: org-persist--refresh-gc-lock (nil)
> >   Due: In 3599.970393228 s (26551 64779 28641 22000)
> >   Triggered: t		Integral Multiple: nil
> >   Repeat Delay: 3600	Idle Delay: nil
> 
> This looks like the most recent write to the lock file.
> 
> The one at 6am happens after the failing read at 06:46:07.188999.
> 
> On the other hand, the two are just a fraction of a second apart.
> Is `insert-file-contents' an atomic operation?
> 
> > 2025-02-21 06:46:07.295370 Gnus entering org-persist--refresh-gc-lock:
> >   Timer: org-persist--refresh-gc-lock (nil)
> >   Due: In -23652.266766201 s (26552 2843 28641 22000)
> >   Triggered: t		Integral Multiple: nil
> >   Repeat Delay: 3600	Idle Delay: nil
> >
> >   2025-02-21 06:46:14.143748 Gnus leaving org-persist--refresh-gc-lock normally:
> >   Timer: org-persist--refresh-gc-lock (nil)
> >   Due: In -23659.115148904 s (26552 2843 28641 22000)
> >   Triggered: t		Integral Multiple: nil
> >   Repeat Delay: 3600	Idle Delay: nil
> 
> 
> -- 
> Ihor Radchenko // yantar92,
> Org mode maintainer,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>
> 




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 11:22:04 GMT) Full text and rfc822 format available.

Message #134 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 11:20:36 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

> Ping! How can we make some progress with this issue?

Any chance to make `insert-file-contents' (or a new function) atomic?

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 12:20:01 GMT) Full text and rfc822 format available.

Message #137 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 12:19:07 +0000
Ihor Radchenko <yantar92 <at> posteo.net> writes:

> Never returned likely means that it threw an error.
> Most likely because
>
> (setf (alist-get before-init-time alist nil nil #'equal)
>             (current-time))
> will fail when ALIST is nil. I should fix this.

I was wrong. If ALIST is nil, the above code should work just fine.
So, it looks like all the problems boil down to an incomplete read.
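
A quick check of that claim, as a sketch: the key and value below are
made-up timestamp lists standing in for `before-init-time' and
`(current-time)', not data from the actual lock file.

```elisp
;; The key and value are made-up stand-ins for `before-init-time' and
;; `(current-time)'; only the shape of the data matters here.
(let ((alist nil))
  (setf (alist-get '(26548 34513 530425 770000) alist nil nil #'equal)
        '(26551 61114 665219 11000))
  ;; Starting from nil, `setf' simply conses a new entry, so ALIST is
  ;; now (((26548 34513 530425 770000) 26551 61114 665219 11000)).
  alist)
```

The result has the same shape as one entry of the lock file contents
quoted earlier in this thread.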

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 12:34:01 GMT) Full text and rfc822 format available.

Message #140 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 14:33:26 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: njackson <at> posteo.net, 75209 <at> debbugs.gnu.org
> Date: Sat, 15 Mar 2025 11:20:36 +0000
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Ping! How can we make some progress with this issue?
> 
> Any chance to make `insert-file-contents' (or a new function) atomic?

As a temporary measure to investigate this bug, maybe.  Otherwise, no.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 12:41:01 GMT) Full text and rfc822 format available.

Message #143 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 12:39:23 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Any chance to make `insert-file-contents' (or a new function) atomic?
>
> As a temporary measure to investigate this bug, maybe.  Otherwise, no.

Then, I am kind of out of ideas.
The problem appears to be different Emacs processes writing and reading
the same file simultaneously.  The fact that Emacs reads the file
partially without ever notifying the Elisp caller that something is off
sounds like a fundamental problem that cannot be addressed at the Elisp
level.

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 12:46:01 GMT) Full text and rfc822 format available.

Message #146 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 14:45:44 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
> Date: Sat, 15 Mar 2025 12:19:07 +0000
> 
> Ihor Radchenko <yantar92 <at> posteo.net> writes:
> 
> > Never returned likely means that it threw an error.
> > Most likely because
> >
> > (setf (alist-get before-init-time alist nil nil #'equal)
> >             (current-time))
> > will fail when ALIST is nil. I should fix this.
> 
> I was wrong. If ALIST is nil, the above code should work just fine.
> So, it looks like all the problems boil down to an incomplete read.

Could the code which writes the file do it in atomic fashion,
i.e. write a temporary file, then rename the original file, then move
the new to the original's name, then delete the original file?

(On Posix filesystems moving a file to the original name is an atomic
operation, but not on MS-Windows, which is why I suggest a slightly
more complicated procedure.)
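
The procedure could be sketched roughly as follows.  This is only an
illustration under stated assumptions: `my/replace-file-via-rename' and
the "new-" and ".old" names are hypothetical and are not taken from
org-persist.

```elisp
;; Hypothetical helper; a sketch of the write/rename/move/delete steps.
(defun my/replace-file-via-rename (file string)
  "Replace the contents of FILE with STRING using temporary files."
  (let* ((dir (file-name-directory (expand-file-name file)))
         ;; Create the temporary file next to FILE so that the renames
         ;; stay within a single filesystem.
         (new (make-temp-file (expand-file-name "new-" dir)))
         (old (concat new ".old")))
    ;; 1. Write the complete new data to the temporary file.
    (with-temp-file new
      (insert string))
    ;; 2. Move the original out of the way, if there is one.
    (when (file-exists-p file)
      (rename-file file old))
    ;; 3. Move the new file to the original's name.  Between steps 2
    ;;    and 3 FILE briefly does not exist, so readers must tolerate a
    ;;    missing file.
    (rename-file new file 'overwrite)
    ;; 4. Delete the old copy.
    (when (file-exists-p old)
      (delete-file old))))
```

On a Posix filesystem, steps 2 and 4 can be dropped and step 3 alone
replaces the file.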




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 12:57:02 GMT) Full text and rfc822 format available.

Message #149 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 12:55:22 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

> Could the code which writes the file do it in atomic fashion,
> i.e. write a temporary file, then rename the original file, then move
> the new to the original's name, then delete the original file?
>
> (On Posix filesystems moving a file to the original name is an atomic
> operation, but not on MS-Windows, which is why I suggest a slightly
> more complicated procedure.)

See the attached diff.
It should be applied on top of previous patches (or on top of the latest
Org main).

[test.diff (text/x-patch, inline)]
diff --git a/lisp/org-persist.el b/lisp/org-persist.el
index a639699d93..202c5e645b 100644
--- a/lisp/org-persist.el
+++ b/lisp/org-persist.el
@@ -506,7 +506,12 @@ (defun org-persist--write-elisp-file
             (let ((pp-use-max-width nil)) ; Emacs bug#58687
               (pp data (current-buffer)))
           (prin1 data (current-buffer))))
-      (rename-file tmp-file file 'overwrite)
+      (let ((tmp-file-2 (make-temp-file "org-persist-")))
+        ;; Just renaming may still not be atomic on Windows, so we do
+        ;; a bit more complex juggle.
+        (rename-file file tmp-file-2)
+        (rename-file tmp-file file)
+        (delete-file tmp-file-2))
       (org-persist--display-time
        (- (float-time) start-time)
        "Writing to %S" file))))
[Message part 3 (text/plain, inline)]
-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 13:17:02 GMT) Full text and rfc822 format available.

Message #152 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 75209 <at> debbugs.gnu.org, njackson <at> posteo.net
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 15:16:33 +0200
> From: Ihor Radchenko <yantar92 <at> posteo.net>
> Cc: njackson <at> posteo.net, 75209 <at> debbugs.gnu.org
> Date: Sat, 15 Mar 2025 12:55:22 +0000
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Could the code which writes the file do it in atomic fashion,
> > i.e. write a temporary file, then rename the original file, then move
> > the new to the original's name, then delete the original file?
> >
> > (On Posix filesystems moving a file to the original name is an atomic
> > operation, but not on MS-Windows, which is why I suggest a slightly
> > more complicated procedure.)
> 
> See the attached diff.
> It should be applied on top of previous patches (or on top of the latest
> Org main).

Thanks, I hope the OP could test this and tell if it makes the problem
go away for good.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 14:41:02 GMT) Full text and rfc822 format available.

Message #155 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Ihor Radchenko <yantar92 <at> posteo.net>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 14:40:00 +0000
At 15:16 +0200 on Saturday 2025-03-15, Eli Zaretskii wrote:

>> From: Ihor Radchenko <yantar92 <at> posteo.net>
>> Cc: njackson <at> posteo.net, 75209 <at> debbugs.gnu.org
>> Date: Sat, 15 Mar 2025 12:55:22 +0000
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>> > Could the code which writes the file do it in atomic fashion,
>> > i.e. write a temporary file, then rename the original file, then move
>> > the new to the original's name, then delete the original file?
>> >
>> > (On Posix filesystems moving a file to the original name is an atomic
>> > operation, but not on MS-Windows, which is why I suggest a slightly
>> > more complicated procedure.)
>> 
>> See the attached diff.
>> It should be applied on top of previous patches (or on top of the latest
>> Org main).
>
> Thanks, I hope the OP could test this and tell if it makes the problem
> go away for good.

I have applied the new patch and will report back.

I don't think it will help as I only have GNU/Linux systems at the
moment and I think on such systems the earlier patch already made
the write atomic.

I think what is needed is for the read to be atomic.  I might be
completely wrong, but what it seems is happening is that one
instance of Emacs starts a read and reads the first part of the
existing file, then the other instance of Emacs writes the file
(atomically), and then the first instance continues its read,
getting the end of the new file.  So it gets the beginning of one
version of the file and the end of another.

Oughtn't there to be some sort of locking mechanism so that the
instance doing the read can lock the file and then the instance that
is about to write the file can see that it shouldn't write it then
and wait until the lock is cleared -- or something along those
lines?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75209; Package emacs. (Sat, 15 Mar 2025 15:53:04 GMT) Full text and rfc822 format available.

Message #158 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: yantar92 <at> posteo.net, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 17:52:03 +0200
> From: "N. Jackson" <njackson <at> posteo.net>
> Cc: Ihor Radchenko <yantar92 <at> posteo.net>,  75209 <at> debbugs.gnu.org
> Date: Sat, 15 Mar 2025 14:40:00 +0000
> 
> At 15:16 +0200 on Saturday 2025-03-15, Eli Zaretskii wrote:
> 
> >> See the attached diff.
> >> It should be applied on top of previous patches (or on top of the latest
> >> Org main).
> >
> > Thanks, I hope the OP could test this and tell if it makes the problem
> > go away for good.
> 
> I have applied the new patch and will report back.
> 
> I don't think it will help as I only have GNU/Linux systems at the
> moment and I think on such systems the earlier patch already made
> the write atomic.

Yes, I see that now, see below.

> I think what is needed is for the read to be atomic.

It is already atomic, because, as we've already established, the file
is smaller than 16KB, the size of the chunks read by
insert-file-contents in one go.

And if the replacement of the file is atomic, then how the file is
read cannot possibly matter.

Note: I'm not talking about writing to a file that is being read at
the same time, I'm talking about writing to a temporary file, and then
renaming that temporary file into the original name when all of the
new data has been written completely.  On Posix filesystems, as long
as the original file is open in some application, renaming another
file into that original one will NOT delete the original file or
replace it; instead, it will unlink the original file's data from its
directory entry, thus allowing the application that had it open to
keep reading from the original (now outdated) data.  Any application
that will attempt to open the original file's name will get the new
data.  IOW, the original data is still there, but it cannot be
accessed by any application that didn't have it open before the
rename.

> I might be
> completely wrong, but what it seems is happening is that one
> instance of Emacs starts a read and reads the first part of the
> existing file, then the other instance of Emacs writes the file
> (atomically), and then the first instance continues its read,
> getting the end of the new file.  So it gets the beginning of one
> version of the file and the end of another.

That cannot happen in the rename method described above.  So if the
problem still happens, there's some other factor at work here.
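
A minimal sketch of the write-to-temporary-file-then-rename method
described above.  `sketch-atomic-write' is a hypothetical helper, not
code from org-persist.el, and placing the temporary file in the
target's own directory is an assumption made here so that the rename
stays within one filesystem.

```elisp
;; Sketch only: `sketch-atomic-write' is a hypothetical helper, not
;; code taken from org-persist.el.
(defun sketch-atomic-write (file text)
  "Write TEXT to FILE by writing a temporary file and renaming it."
  (let* ((dir (file-name-directory (expand-file-name file)))
         ;; Assumption: create the temporary file next to FILE so the
         ;; rename does not cross a filesystem boundary.
         (tmp (make-temp-file (expand-file-name ".atomic-" dir))))
    (condition-case err
        (progn
          (let ((coding-system-for-write 'utf-8-unix))
            ;; A string as START makes `write-region' write that string.
            (write-region text nil tmp nil 'silent))
          ;; Readers that open FILE by name see either the old or the
          ;; new contents, never a partially written file.
          (rename-file tmp file 'overwrite))
      (error
       ;; Clean up the temporary file, then re-signal the error.
       (when (file-exists-p tmp)
         (delete-file tmp))
       (signal (car err) (cdr err))))
    file))
```

With this pattern a reader that already has the old file open keeps
reading the old data, and a reader that opens the name afterwards gets
the complete new data.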

> Oughtn't there to be some sort of locking mechanism so that the
> instance doing the read can lock the file and then the instance that
> is about to write the file can see that it shouldn't write it then
> and wait until the lock is cleared -- or something along those
> lines?

See above: Posix filesystems make that unnecessary.

Hmm... but now I see that the previous code already renamed the file
with the new data after the new data was completely written, is that
right?  If so, I don't think this last change will help, and we need
to understand how come rename-file doesn't already solve this problem.

Is the file being read and written a regular file or a symlink?

One thing to try is to let-bind write-region-inhibit-fsync to a nil
value around the code which writes the data to the org-persist file.
Maybe we have some tricky race condition between the filesystem
flushing its buffer after one instance of Emacs wrote the file, and
the other instance of Emacs that opens the file for reading.





Message #161 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75209 <at> debbugs.gnu.org, "N. Jackson" <njackson <at> posteo.net>
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 16:05:44 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

> Hmm... but now I see that the previous code already renamed the file
> with the new data after the new data was completely written, is that
> right?  If so, I don't think this last change will help, and we need
> to understand how come rename-file doesn't already solve this problem.

Yes, it is right.
You suggested the write-to-temporary-file -> rename method earlier and
I implemented it.

> Is the file being read and written a regular file or a symlink?

Regular file.

> One thing to try is to let-bind write-region-inhibit-fsync to a nil
> value around the code which writes the data to the org-persist file.
> Maybe we have some tricky race condition between the filesystem
> flushing its buffer after one instance of Emacs wrote the file, and
> the other instance of Emacs that opens the file for reading.

The attached diff does it. Need to test.

[enable-fsync.diff (text/x-patch, inline)]
diff --git a/lisp/org-persist.el b/lisp/org-persist.el
index a639699d93..c0a0dd53d6 100644
--- a/lisp/org-persist.el
+++ b/lisp/org-persist.el
@@ -474,7 +474,7 @@ (defun org-persist--write-elisp-file
   ;;
   ;; To read more about this, see the comments in Emacs's fileio.c, in
   ;; particular the large comment block in init_fileio.
-  (let ((write-region-inhibit-fsync t)
+  (let ((write-region-inhibit-fsync nil)
         ;; We set UTF-8 here and in `org-persist--read-elisp-file'
         ;; to avoid the overhead from `find-auto-coding'.
         (coding-system-for-write 'emacs-internal)
-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


Message #164 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 17:38:24 +0000
At 16:05 +0000 on Saturday 2025-03-15, Ihor Radchenko wrote:
>
> The attached diff does it. Need to test.
>
> diff --git a/lisp/org-persist.el b/lisp/org-persist.el
> index a639699d93..c0a0dd53d6 100644
> --- a/lisp/org-persist.el
> +++ b/lisp/org-persist.el
> @@ -474,7 +474,7 @@ (defun org-persist--write-elisp-file
>    ;;
>    ;; To read more about this, see the comments in Emacs's fileio.c, in
>    ;; particular the large comment block in init_fileio.
> -  (let ((write-region-inhibit-fsync t)
> +  (let ((write-region-inhibit-fsync nil)
>          ;; We set UTF-8 here and in `org-persist--read-elisp-file'
>          ;; to avoid the overhead from `find-auto-coding'.
>          (coding-system-for-write 'emacs-internal)

I will test this diff and report back.

By the way, the previous diff (for the atomic write juggle for
Windows systems) results in errors like these:

Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-1p8805")
Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-CkTLDc")
Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-K9qMAt")
Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-ho8aMe")
Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-2Kz2sK")

I think you need the 'overwrite flag in the call to rename-file.
I.e.:

      (let ((tmp-file-2 (make-temp-file "org-persist-")))
        ;; Just renaming may still not be atomic on Windows, so we do
        ;; a bit more complex juggle.
        (rename-file file tmp-file-2 'overwrite)
        (rename-file tmp-file file 'overwrite)
        (delete-file tmp-file-2)))







Message #167 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 17:43:49 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

> By the way, the previous diff (for the atomic write juggle for
> Windows systems) results in errors like these:
>
> Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-1p8805")
> ...
> I think you need the 'overwrite flag in the call to rename-file.
> I.e.:
>
>       (let ((tmp-file-2 (make-temp-file "org-persist-")))
>         ;; Just renaming may still not be atomic on Windows, so we do
>         ;; a bit more complex juggle.
>         (rename-file file tmp-file-2 'overwrite)
>         (rename-file tmp-file file 'overwrite)
>         (delete-file tmp-file-2)))

Ouch. Not 'overwrite flag, but `make-temp-name' instead of `make-temp-file'.






Message #170 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 18:03:00 +0000
At 17:43 +0000 on Saturday 2025-03-15, Ihor Radchenko wrote:
>
> "N. Jackson" <njackson <at> posteo.net> writes:
>
>> By the way, the previous diff (for the atomic write juggle for
>> Windows systems) results in errors like these:
>>
>> Error running timer `org-persist--refresh-gc-lock': (file-already-exists "File already exists" "/tmp/org-persist-1p8805")
>> ...
>> I think you need the 'overwrite flag in the call to rename-file.
>> I.e.:
>>
>>       (let ((tmp-file-2 (make-temp-file "org-persist-")))
>>         ;; Just renaming may still not be atomic on Windows, so we do
>>         ;; a bit more complex juggle.
>>         (rename-file file tmp-file-2 'overwrite)
>>         (rename-file tmp-file file 'overwrite)
>>         (delete-file tmp-file-2)))
>
> Ouch. Not 'overwrite flag, but `make-temp-name' instead of `make-temp-file'.

For tmp-file you are using make-temp-file and then you use
'overwrite.  I suppose it's probably best if tmp-file-2 is handled
similarly.  (Unless there's a reason not to that I'm missing.)
Also, in the doc string:

  There is a race condition between calling `make-temp-name' and
  later creating the file, which opens all kinds of security holes.
  For that reason, you should normally use `make-temp-file' instead.






Message #173 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 15 Mar 2025 18:14:29 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

>>>       (let ((tmp-file-2 (make-temp-file "org-persist-")))
>>>         ;; Just renaming may still not be atomic on Windows, so we do
>>>         ;; a bit more complex juggle.
>>>         (rename-file file tmp-file-2 'overwrite)
>>>         (rename-file tmp-file file 'overwrite)
>>>         (delete-file tmp-file-2)))
>>
>> Ouch. Not 'overwrite flag, but `make-temp-name' instead of `make-temp-file'.
>
> For tmp-file you are using make-temp-file and then you use
> 'overwrite.  I suppose it's probably best if tmp-file-2 is handled
> similarly.  (Unless there's a reason not to that I'm missing.)

The whole point (AFAIU) is that we do not want to overwrite the original
file; just move it.  `make-temp-file' won't work because it creates the
file, so the rename target already exists.

> Also, in the doc string:
>
>   There is a race condition between calling `make-temp-name' and
>   later creating the file, which opens all kinds of security holes.
>   For that reason, you should normally use `make-temp-file' instead.

Not sure. AFAIU, the race condition is someone else creating an actual
file in place of the generated file name. But we have no delay between
the operations here, so there is no difference.
That said, I have no preference about the best approach here.
The diff is just for testing.






Message #176 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 20 Apr 2025 17:24:58 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

>> -  (let ((write-region-inhibit-fsync t)
>> +  (let ((write-region-inhibit-fsync nil)
>>          ;; We set UTF-8 here and in `org-persist--read-elisp-file'
>>          ;; to avoid the overhead from `find-auto-coding'.
>>          (coding-system-for-write 'emacs-internal)
>
> I will test this diff and report back.

It has been a while since the last update in this thread.
Does it mean that the diff fixed the problem for good?






Message #179 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Thu, 24 Apr 2025 22:15:11 +0000
At 17:24 +0000 on Sunday 2025-04-20, Ihor Radchenko wrote:
>
> "N. Jackson" <njackson <at> posteo.net> writes:
>
>>> -  (let ((write-region-inhibit-fsync t)
>>> +  (let ((write-region-inhibit-fsync nil)
>>>          ;; We set UTF-8 here and in `org-persist--read-elisp-file'
>>>          ;; to avoid the overhead from `find-auto-coding'.
>>>          (coding-system-for-write 'emacs-internal)
>>
>> I will test this diff and report back.
>
> It has been a while since the last update in this thread.
> Does it mean that the diff fixed the problem for good?

I haven't seen the bug since installing the change quoted above that
let binds `write-region-inhibit-fsync' to nil.

Unfortunately, however, there is a confounding factor because around
the same time as I made this change, I switched from mainly using a
very slow desktop system to using a considerably faster laptop.


A slight tangent follows about the patch to do the atomic write
juggle for Windows systems.

At 12:55 +0000 on Saturday 2025-03-15, Ihor Radchenko wrote:
>
> diff --git a/lisp/org-persist.el b/lisp/org-persist.el
> index a639699d93..202c5e645b 100644
> --- a/lisp/org-persist.el
> +++ b/lisp/org-persist.el
> @@ -506,7 +506,12 @@ (defun org-persist--write-elisp-file
>              (let ((pp-use-max-width nil)) ; Emacs bug#58687
>                (pp data (current-buffer)))
>            (prin1 data (current-buffer))))
> -      (rename-file tmp-file file 'overwrite)
> +      (let ((tmp-file-2 (make-temp-file "org-persist-")))
> +        ;; Just renaming may still not be atomic on Windows, so we do
> +        ;; a bit more complex juggle.
> +        (rename-file file tmp-file-2)
> +        (rename-file tmp-file file)
> +        (delete-file tmp-file-2))
>        (org-persist--display-time
>         (- (float-time) start-time)
>         "Writing to %S" file))))

At 17:43 +0000 on Saturday 2025-03-15, Ihor Radchenko wrote:
>
> `make-temp-name' instead of `make-temp-file'.

After the fix using `make-temp-name' instead of `make-temp-file',
there are still two problems with the code:

Problem 1. `rename-file' produces an error if the file doesn't exist
-- which it doesn't the first time the code runs.

Problem 2. Unlike `make-temp-file', `make-temp-name' requires an
absolute file name otherwise the file ends up in the current
directory, whatever that happens to be at the time of the write.
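
A rough sketch of how the juggle could avoid both problems.
`sketch-juggle-rename' and the ".org-persist-old-" prefix are
hypothetical names used only for illustration, and keeping the
set-aside file next to the target is an assumption, not something the
patch does.

```elisp
;; Sketch only: `sketch-juggle-rename' is a hypothetical helper, not
;; the code from the patch under discussion.
(defun sketch-juggle-rename (tmp-file file)
  "Move TMP-FILE into place as FILE, setting any existing FILE aside first."
  (let ((backup (make-temp-name
                 ;; Problem 2: give `make-temp-name' an absolute prefix,
                 ;; here in FILE's own directory (an assumption).
                 (expand-file-name ".org-persist-old-"
                                   (file-name-directory
                                    (expand-file-name file))))))
    ;; Problem 1: only set FILE aside when it already exists.
    (when (file-exists-p file)
      (rename-file file backup))
    (rename-file tmp-file file)
    (when (file-exists-p backup)
      (delete-file backup))))
```

Note that between the two renames FILE briefly does not exist under
its own name, so a reader arriving in that window would find no file
at all rather than a partial one.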





Message #182 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Fri, 25 Apr 2025 17:14:06 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

>>>> -  (let ((write-region-inhibit-fsync t)
>>>> +  (let ((write-region-inhibit-fsync nil)
>>>>          ;; We set UTF-8 here and in `org-persist--read-elisp-file'
>>>>          ;; to avoid the overhead from `find-auto-coding'.
>>>>          (coding-system-for-write 'emacs-internal)
>>>
>>> I will test this diff and report back.
>>
>> It has been a while since the last update in this thread.
>> Does it mean that the diff fixed the problem for good?
>
> I haven't seen the bug since installing the change quoted above that
> let binds `write-region-inhibit-fsync' to nil.
>
> Unfortunately, however, there is a confounding factor because around
> the same time as I made this change, I switched from mainly using a
> very slow desktop system to using a considerably faster laptop.

Hmm. Then, could you go back to no patches at all?
If we find that `write-region-inhibit-fsync' has undesired side effects,
it will be an important piece of information - its default value has
been changed in the whole Emacs recently on the grounds that fsync makes
no difference on modern systems.

> A slight tangent follows about the patch to do the atomic write
> juggle for Windows systems.
> ...
> ...
> After the fix using `make-temp-name' instead of `make-temp-file',
> there are still two problems with the code:
>
> Problem 1. `rename-file' produces an error if the file doesn't exist
> -- which it doesn't the first time the code runs.
>
> Problem 2. Unlike `make-temp-file', `make-temp-name' requires an
> absolute file name otherwise the file ends up in the current
> directory, whatever that happens to be at the time of the write.

True. That was a very approximate patch.
However, I am not sure if that patch will be needed at all. Maybe all
the problems were related to your old system + fsync, for example.






Message #185 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: "N. Jackson" <njackson <at> posteo.net>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sat, 26 Apr 2025 22:10:19 +0000
At 17:14 +0000 on Friday 2025-04-25, Ihor Radchenko wrote:
>
> "N. Jackson" <njackson <at> posteo.net> writes:
>
>>>>> -  (let ((write-region-inhibit-fsync t)
>>>>> +  (let ((write-region-inhibit-fsync nil)
>>>>>          ;; We set UTF-8 here and in `org-persist--read-elisp-file'
>>>>>          ;; to avoid the overhead from `find-auto-coding'.
>>>>>          (coding-system-for-write 'emacs-internal)
>>
>> I haven't seen the bug since installing the change quoted above that
>> let binds `write-region-inhibit-fsync' to nil.
>>
>> Unfortunately, however, there is a confounding factor because around
>> the same time as I made this change, I switched from mainly using a
>> very slow desktop system to using a considerably faster laptop.
>
> Hmm. Then, could you go back to no patches at all?
>
> If we find that `write-region-inhibit-fsync' has undesired side
> effects, it will be an important piece of information - its
> default value has been changed in the whole Emacs recently on the
> grounds that fsync makes no difference on modern systems.

Yes, after I made my last post here I built a fresh Emacs 30.1 on
this faster system.  I wanted to see if the bug happens on this
system without any of the patches.

After lunch today I set the system into suspend and on resume this
evening I saw the bug, the same as I was seeing it on my other
system before the patches:

  ⛔ Warning (emacs): Emacs reader failed to read data in
  "/home/nlj/.cache/org-persist/gc-lock.eld". The error was: "End of
  file during parsing"

(This was after my main Emacs session had been up for about 1 day,
8 hours, 16 minutes, 25 seconds.)


FWIW, I still have my advice around org-persist--refresh-gc-lock and
below I report the timings of the timer.  The parts of the logs shown
cover the last regular firing before suspending and the catch-up
firings on resume, one of which caused the warning.

Of the two sessions reported below, in this case it was my normal
Emacs session ("Norm") in which the warning appeared, not the Gnus
session.  (Back when I was seeing the bug fairly often, sometimes
the warning would be in the Gnus session, sometimes in my normal
session, and once it was in both.)

In the Gnus session:

  2025-04-26 14:53:59.424097 Gnus entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 3599.9813794 s (26637 14807 405558 759000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 14:53:59.429168 Gnus leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 3599.976262068 s (26637 14807 405558 759000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

  2025-04-26 17:16:13.333838 Gnus entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In -1333.928314059 s (26637 18407 405558 759000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 17:16:13.335319 Gnus leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In -1333.929791253 s (26637 18407 405558 759000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

  2025-04-26 17:16:13.336025 Gnus entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 2266.069505628 s (26637 22007 405558 759000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 17:16:13.514180 Gnus leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 2265.89133824 s (26637 22007 405558 759000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

In the normal session:

  2025-04-26 14:03:01.561952 Norm entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 3599.998112807 s (26637 11749 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 14:03:01.564459 Norm leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 3599.9956178 s (26637 11749 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

  2025-04-26 17:16:13.443913 Norm entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In -4391.883804868 s (26637 15349 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 17:16:13.561083 Norm leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In -4392.000986789 s (26637 15349 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

  2025-04-26 17:16:13.561702 Norm entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In -792.001585113 s (26637 18949 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 17:16:13.562635 Norm leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In -792.002523338 s (26637 18949 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

  2025-04-26 17:16:13.563119 Norm entering org-persist--refresh-gc-lock:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 2807.996994882 s (26637 22549 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil
    2025-04-26 17:16:15.764233 Norm leaving org-persist--refresh-gc-lock normally:
    Timer: org-persist--refresh-gc-lock (nil)
    Due: In 2805.795855986 s (26637 22549 560131 555000)
    Triggered: t		Integral Multiple: nil
    Repeat Delay: 3600	Idle Delay: nil

I must have suspended between about 14:43 and 15:03ish so that on
resume at about 17:16 the normal session had three timers to catch
up but the Gnus session had only two -- so the discrepancy there is
easy to explain.
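
The catch-up counts can be checked with a little arithmetic.  The
helpers below are hypothetical and assume both times fall on the same
day and that the timer repeats every 3600 seconds, as in the logs.

```elisp
;; Sketch only: both helpers are hypothetical names for illustration.
(defun sketch-hms (h m s)
  "Return H:M:S as seconds since midnight."
  (+ (* 3600 h) (* 60 m) s))

(defun sketch-missed-firings (last-fired resumed repeat)
  "Count REPEAT-second firings that came due after LAST-FIRED, up to RESUMED.
All arguments are in seconds; both times must fall on the same day."
  (floor (- resumed last-fired) repeat))

;; Normal session: last regular firing 14:03:01, resumed 17:16:13.
(sketch-missed-firings (sketch-hms 14 3 1) (sketch-hms 17 16 13) 3600) ; => 3
;; Gnus session: last regular firing 14:53:59, resumed 17:16:13.
(sketch-missed-firings (sketch-hms 14 53 59) (sketch-hms 17 16 13) 3600) ; => 2
```

These counts match the three firings logged by the normal session and
the two logged by the Gnus session at 17:16:13.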

What I'm surprised by is that all the timers returned normally.
Previously, whenever I've seen the bug and checked this log, there
was a timer that failed to return.  (That is, my "after" advice in
org-persist--refresh-gc-lock failed to run.)
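
For anyone wanting to collect similar logs, here is a rough sketch of
advice that records how each call ends.  It is not the advice used for
the logs above; all names are hypothetical, and it uses a single
:around advice with `unwind-protect' so that abnormal exits are
recorded too.

```elisp
;; Sketch only: these names are hypothetical and this is not the
;; advice that produced the logs in this message.
(defvar sketch-timer-log nil
  "List of (EVENT TIME-STRING) entries, newest first.")

(defun sketch-log-around (orig-fun &rest args)
  "Call ORIG-FUN with ARGS, logging entry and how the call ended."
  (push (list 'enter (current-time-string)) sketch-timer-log)
  (let ((normal nil))
    (unwind-protect
        (prog1 (apply orig-fun args)
          ;; Reached only when ORIG-FUN returned without signaling.
          (setq normal t))
      (push (list (if normal 'leave-normally 'leave-abnormally)
                  (current-time-string))
            sketch-timer-log))))

;; To install on the real timer function (Org must be loaded first):
;; (advice-add 'org-persist--refresh-gc-lock :around #'sketch-log-around)
```

Evaluating `sketch-timer-log' afterwards shows the recorded events,
newest first.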

I hope some of this helps.


Going forward, now that it's established that this system _is_
susceptible to the bug (with an unpatched Emacs 30.1), what is the
best test to do next?  Should I restore the atomic write patch or
restore the patch that let binds `write-region-inhibit-fsync' to
nil?

Thanks.






Message #188 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 27 Apr 2025 11:23:37 +0000
"N. Jackson" <njackson <at> posteo.net> writes:

> Yes, after I made my last post here I built a fresh Emacs 30.1 on
> this faster system.  I wanted to see if the bug happens on this
> system without any of the patches.
> ....
> After lunch today I set the system into suspend and on resume this
> evening I saw the bug, the same as I was seeing it on my other
> system before the patches:

Thanks for testing!

> Going forward, now that it's established that this system _is_
> susceptible to the bug (with an unpatched Emacs 30.1), what is the
> best test to do next?  Should I restore the atomic write patch or
> restore the patch that let binds `write-region-inhibit-fsync' to
> nil?

Let's first try the `write-region-inhibit-fsync' patch and see if it
makes the bug disappear.






Message #191 received at 75209 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: "N. Jackson" <njackson <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 75209 <at> debbugs.gnu.org
Subject: Re: bug#75209: 30.0.93; Emacs reader failed to read data in
 "/home/nlj/.cache/org-persist/gc-lock.eld"
Date: Sun, 08 Jun 2025 09:28:09 +0000
Ihor Radchenko <yantar92 <at> posteo.net> writes:

>> Going forward, now that it's established that this system _is_
>> susceptible to the bug (with an unpatched Emacs 30.1), what is the
>> best test to do next?  Should I restore the atomic write patch or
>> restore the patch that let binds `write-region-inhibit-fsync' to
>> nil?
>
> Let's first try the `write-region-inhibit-fsync' patch and see if it
> makes the bug disappear.

It has been a while. Did you have a chance to do some testing?




