GNU bug report logs - #75754
styled_format stack usage/GC protection

Previous Next

Package: emacs;

Reported by: Pip Cet <pipcet <at> protonmail.com>

Date: Wed, 22 Jan 2025 10:20:01 UTC

Severity: normal

Done: Pip Cet <pipcet <at> protonmail.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: eggert <at> cs.ucla.edu, michael.albinus <at> gmx.de, 75754 <at> debbugs.gnu.org
Subject: bug#75754: styled_format stack usage/GC protection
Date: Fri, 24 Jan 2025 15:57:45 +0200
> Date: Fri, 24 Jan 2025 12:51:22 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org, Michael Albinus <michael.albinus <at> gmx.de>
> 
> Michael Albinus, could you please C-s ert-how in this message and
> comment briefly if appropriate? I think adding a new defstruct may help
> improve ert in at least three ways, including this one.  Please indicate
> whether this would be *potentially* acceptable; if it is, I'll prepare a
> separate bug.  If it isn't, I might have to think of something else.

I think it will take time for Michael to respond, due to his personal
circumstances.  So if you expect the answers soon, that might not
happen.

> "Eli Zaretskii" <eliz <at> gnu.org> writes:
> 
> >> Date: Thu, 23 Jan 2025 23:58:41 +0000
> >> From: Pip Cet <pipcet <at> protonmail.com>
> >> Cc: eggert <at> cs.ucla.edu, 75754 <at> debbugs.gnu.org
> >>
> >> 1. Reload format_start and format (and end, and format0) after every
> >> call which might have GC'd.  If you think we should do this, please
> >> tell me whether lisp_string_width can GC.
> >
> > It can, if called with the last argument 'true' (because
> > find_automatic_composition calls into Lisp).  Currently, we call it
> > with 'false', so it cannot.
> 
> Is this documented anywhere?

If you mean under what circumstances some internal function calls Lisp
(or can otherwise GC), then no, we don't document that anywhere.  I
just looked at the source code to answer your question.

> Thank you very much for suggesting this: if I hadn't been wrong about
> the first test case being useful (it was), I'd probably have refused.
> While writing this test didn't uncover new bugs, it was an educational
> experience.
> 
> New test:
> 
> (ert-deftest editfns-tests-styled-print ()
>    (let* ((gc-me (copy-sequence "foo"))
>           (print-unreadable-function
> 	   (lambda (&rest args)
>              (make-string 10 10 t)
>              (garbage-collect)
>              (make-string 10 10 t)
>              (prog1 (make-string 100 ?Ā t)
>                (setq gc-me nil)
>                (make-string 10 10 t))))
>           (str "\"[1] ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ ĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀĀ\""))
>      (should (string=
>               (format "%S" (format (copy-sequence "%S %S %S %S") [1]
>                                    (symbol-function '+)
>                                    (symbol-function '*)
>                                    (symbol-function '-)))
>               str))))
> 
> The "problem" writing this test is that string compaction is very often
> idempotent: the only way a string is moved again in a second GC is when
> another string which is *older* than the string moved in the first GC is
> freed.  That's 'gc-me', which must be live during the first call to
> garbage-collect but unreachable for the second call.

I suggest to have at least part of the above in comments to the test,
since these details are so important for the test to be effective.

> While I figured out how to do this for this specific test, it'd be nicer
> to just run make check-gc and stress-test the garbage collector in ways
> known to have caused problems.

Yes; patches welcome.

> (Brief proposal relevant to several issues: If we want to test GC bugs
> more thoroughly, we need a way to tell ert to run them in a special way.
> I proposed extending ert with an ert-how defstruct for the make
> benchmark target, indicating "how" a test is to be run, and implemented
> it, but I don't think I've sent the patch yet.  Using this structure to
> create conditions for testing for GC bugs would be possible, and it
> would also be an easier way to make crash tests print a message before
> they're run.
> 
> Of course, this would in no way imply a decision to use this structure,
> or ERT, for "make benchmark".  Using it for GC testing or indicating
> crash tests would be sufficient reason to add it, IMHO.
> 
> While this needs more thought, if it's obviously unacceptable for a
> simple reason I haven't thought of, please let me know so I won't waste
> time on it.  Of course the final decision will have to be made based on
> an actual patch, but if we can have a quick "no" right now I'd
> appreciate it.)

Well, for now I don't have even a vague idea what you have in mind, so
it is hard for me to answer the question.  How do you extend ERT to be
able to run tests in a way that makes testing GC-related bugs easier?
What are the general ideas for that?

> >> +  char *format_start = SAFE_ALLOCA (formatlen);
> >> +  memcpy (format_start, SSDATA (args[0]), formatlen);
> >>    bool fmt_props = !!string_intervals (args[0]);
> >
> > Does this assume anything about formatlen, like that it doesn't exceed
> > the alloca limit?  If so, should we have an assertion about that?
> 
> No.  We're already past the alloca limit at his point so SAFE_ALLOCA
> will always call xmalloc here (that's the "excessive stack usage" part
> of this bug, which remains unfixed but might be a performance issue in
> rare situations involving very many calls to styled_format which don't
> GC).

IMO, that fact should also be in comments to the code.

> >> feature/igc avoids all of these problems by keeping SDATA pointers
> >> valid, even when the string is resized.
> >
> > Unless the string is GC'ed, right?
> 
> Even if it is GC'd.
> 
> The SDATA will stay around while there are live pointers to it.  The
> SDATA doesn't keep the string alive, but there's no way to get a
> reference to the now-dead Lisp_String structure from the SDATA, so this
> won't cause bugs.

Even if the code modifies the string after GC?




This bug report was last modified 162 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.