Package: emacs;
Reported by: Pip Cet <pipcet <at> protonmail.com>
Date: Wed, 22 Jan 2025 10:20:01 UTC
Severity: normal
Done: Pip Cet <pipcet <at> protonmail.com>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org Subject: bug#75754: styled_format stack usage/GC protection Date: Fri, 24 Jan 2025 15:02:20 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Fri, 24 Jan 2025 12:51:22 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, Michael Albinus <michael.albinus <at> gmx.de> >> >> Michael Albinus, could you please C-s ert-how in this message and >> comment briefly if appropriate? I think adding a new defstruct may help >> improve ert in at least three ways, including this one. Please indicate >> whether this would be *potentially* acceptable; if it is, I'll prepare a >> separate bug. If it isn't, I might have to think of something else. > > I think it will take time for Michael to respond, due to his personal > circumstances. So if you expect the answers soon, that might not > happen. Will file a new bug, with an initial message hopefully clear enough to save Michael some time for when he gets around to it. >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> >> Date: Thu, 23 Jan 2025 23:58:41 +0000 >> >> From: Pip Cet <pipcet <at> protonmail.com> >> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org >> >> >> >> 1. Reload format_start and format (and end, and format0) after every >> >> call which might have GC'd. If you think we should do this, please >> >> tell me whether lisp_string_width can GC. >> > >> > It can, if called with the last argument 'true' (because >> > find_automatic_composition calls into Lisp). Currently, we call it >> > with 'false', so it cannot. >> >> Is this documented anywhere? > > If you mean under what circumstances some internal function calls Lisp > (or can otherwise GC), then no, we don't document that anywhere. I > just looked at the source code to answer your question. (This is an honest question: do you believe Emacs would be worse off if we added an ASSUME_NO_GC macro like ASSUME_NO_GC (true); p = SDATA (str); 532 lines of code here use p for the last time ASSUME_NO_GC (false); This seems to me to be readable (better names/API welcome, of course), even if it's a nop. Faster than writing down these assumptions in English, and it does offer the ability to eventually find such cases automatically. That would have found this bug if Fprin1 contained /* Due to print-unreadable-function, all calls to prin1 may call Lisp, which may GC. Ensure we're not accidentally called from a critical section. */ #ifdef ENABLE_CHECK_NO_GC if (assume_no_gc >= 0) emacs_abort (); #endif It'd make life easier for relative Emacs newcomers like myself who do not know by heart which functions call GC and which don't.) >> Thank you very much for suggesting this: if I hadn't been wrong about >> the first test case being useful (it was), I'd probably have refused. >> While writing this test didn't uncover new bugs, it was an educational >> experience. >> >> New test: >> >> (ert-deftest editfns-tests-styled-print () >> (let* ((gc-me (copy-sequence "foo")) >> (print-unreadable-function >> (lambda (&rest args) >> (make-string 10 10 t) >> (garbage-collect) >> (make-string 10 10 t) >> (prog1 (make-string 100 ?Ā t) >> (setq gc-me nil) >> (make-string 10 10 t)))) >> (str "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\"")) >> (should (string= >> (format "%S" (format (copy-sequence "%S %S %S %S") [1] >> (symbol-function '+) >> (symbol-function '*) >> (symbol-function '-))) >> str)))) >> >> The "problem" writing this test is that string compaction is very often >> idempotent: the only way a string is moved again in a second GC is when >> another string which is *older* than the string moved in the first GC is >> freed. That's 'gc-me', which must be live during the first call to >> garbage-collect but unreachable for the second call. > > I suggest to have at least part of the above in comments to the test, > since these details are so important for the test to be effective. Yes, definitely, if it is to be useful. Without comments this is so much line noise. > What are the general ideas for that? This doesn't belong here; I'll file a new bug even if I can't provide such details now (unless there already is one), as a wishlist item. >> >> feature/igc avoids all of these problems by keeping SDATA pointers >> >> valid, even when the string is resized. >> > >> > Unless the string is GC'ed, right? >> >> Even if it is GC'd. >> >> The SDATA will stay around while there are live pointers to it. The >> SDATA doesn't keep the string alive, but there's no way to get a >> reference to the now-dead Lisp_String structure from the SDATA, so this >> won't cause bugs. > > Even if the code modifies the string after GC? If the GC free'd the string, but not the SDATA structure, we cannot regenerate the string to call resize_string_data. So, no, this cannot happen. Note that resizing a live string while you have a live SDATA pointer to it might result in the SDATA pointer referring to the pre-modification SDATA, not the current one. Thanks for continuing to ask such questions! It's important we don't forget about one such question and cause crashes, particularly on feature/igc where there's a shortage of testers (which is not anyone's fault but mine; I'm working on making it more testable)! Pip
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.