Package: emacs
Reported by: Pip Cet <pipcet <at> protonmail.com>
Date: Wed, 22 Jan 2025 10:20:01 UTC
Severity: normal
Done: Pip Cet <pipcet <at> protonmail.com>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org:

From: Pip Cet <pipcet <at> protonmail.com>
To: bug-gnu-emacs <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Subject: styled_format stack usage/GC protection
Date: Wed, 22 Jan 2025 10:18:32 +0000
This popped up while trying to debug the feature/igc branch and looking into formatted output functions to replace sprintf/snprintf:

The function styled_format in editfns.c does this:

  enum
  {
    /* Maximum precision for a %f conversion such that the trailing
       output digit might be nonzero.  Any precision larger than this
       will not yield useful information.  */
    USEFUL_PRECISION_MAX = ((1 - LDBL_MIN_EXP)
                            * (FLT_RADIX == 2 || FLT_RADIX == 10 ? 1
                               : FLT_RADIX == 16 ? 4
                               : -1)),

    /* Maximum number of bytes (including terminating null) generated
       by any format, if precision is no more than USEFUL_PRECISION_MAX.
       On all practical hosts, %Lf is the worst case.  */
    SPRINTF_BUFSIZE = (sizeof "-." + (LDBL_MAX_10_EXP + 1)
                       + USEFUL_PRECISION_MAX)
  };
  char initial_buffer[1000 + SPRINTF_BUFSIZE];
  USE_SAFE_ALLOCA;
  sa_avail -= sizeof initial_buffer;

On my system, the relevant values are:

  USEFUL_PRECISION_MAX 16382
  SPRINTF_BUFSIZE      21318
  MAX_ALLOCA           16384

After the code above executes, sa_avail is -5934.

styled_format proceeds to allocate a structure using SAFE_ALLOCA:

  /* Information recorded for each format spec.  */
  struct info
  {
    /* The corresponding argument, converted to string if conversion
       was needed.  */
    Lisp_Object argument;
    /* The start and end bytepos in the output string.  */
    ptrdiff_t start, end;
    /* The start bytepos of the spec in the format string.  */
    ptrdiff_t fbeg;
    /* Whether the argument is a string with intervals.  */
    bool_bf intervals : 1;
  } *info;

  info = SAFE_ALLOCA (alloca_size);

While the alloca_size value is small, sa_avail is negative when we enter SAFE_ALLOCA, so SAFE_ALLOCA uses xmalloc to allocate the memory on the heap.

The structure contains a Lisp_Object. This Lisp_Object must be protected from GC by being present on the C stack if GC can ever happen in this function. SAFE_ALLOCA doesn't protect it.
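For reference, the arithmetic above can be checked with a small standalone sketch. The function names and the REPORTED_ constants are made up for illustration, the numbers are the ones reported for one system only, and the heap test is a simplified model of the SAFE_ALLOCA decision (stack if the request fits in the remaining budget, heap otherwise), not the macro's actual definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values reported above for one system; other hosts will differ.  */
enum
{
  REPORTED_MAX_ALLOCA = 16384,
  REPORTED_SPRINTF_BUFSIZE = 21318
};

/* Remaining SAFE_ALLOCA budget after styled_format subtracts
   sizeof (char[1000 + SPRINTF_BUFSIZE]) from it.
   Hypothetical helper, not an Emacs function.  */
static ptrdiff_t
budget_after_initial_buffer (ptrdiff_t max_alloca, ptrdiff_t sprintf_bufsize)
{
  assert (max_alloca >= 0 && sprintf_bufsize >= 0);
  return max_alloca - (1000 + sprintf_bufsize);
}

/* Simplified model of the SAFE_ALLOCA decision: a request goes to
   the heap whenever it does not fit in the remaining budget.  */
static bool
request_goes_to_heap (ptrdiff_t sa_avail, ptrdiff_t size)
{
  return sa_avail < size;
}
```

With the reported values the budget is 16384 - 22318 = -5934, so under this model every later request, however small, ends up on the heap.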
I'm not entirely sure this is a problem, but

  (let ((print-unreadable-function
         (lambda (&rest args) (garbage-collect))))
    (format "%S" (symbol-function '+)))

produces this backtrace:

  #0  garbage_collect () at alloc.c:6450
  #1  0x00005555557f7700 in Fgarbage_collect () at alloc.c:6697
  #2  0x000055555582e1ee in eval_sub (form=XIL(0x7ffff4dab073)) at eval.c:2584
  #3  0x0000555555828d9f in Fprogn (body=XIL(0)) at eval.c:439
  #4  0x0000555555830604 in funcall_lambda (fun=XIL(0x555556045c9d), nargs=2, arg_vector=0x7fffffff6b78) at eval.c:3339
  #5  0x000055555582f62d in funcall_general (fun=XIL(0x555556045c9d), numargs=2, args=0x7fffffff6b78) at eval.c:3033
  #6  0x000055555582f89a in Ffuncall (nargs=3, args=0x7fffffff6b70) at eval.c:3082
  #7  0x0000555555863e8a in print_vectorlike_unreadable (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true, buf=0x7fffffff6ce0 "\220m\377\377\377\177") at print.c:1683
  #8  0x0000555555866e5f in print_object (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:2647
  #9  0x0000555555862ec2 in print (obj=XIL(0x555555ee7765), printcharfun=XIL(0), escapeflag=true) at print.c:1296
  #10 0x0000555555861cef in Fprin1_to_string (object=XIL(0x555555ee7765), noescape=XIL(0), overrides=XIL(0)) at print.c:814
  #11 0x000055555581fd14 in styled_format (nargs=2, args=0x7fffffffca20, message=false) at editfns.c:3633
  #12 0x000055555581f444 in Fformat (nargs=2, args=0x7fffffffca20) at editfns.c:3370

indicating that GC can happen.

The code attempts to protect the current argument by keeping it in a redundant automatic variable:

  Lisp_Object arg = spec->argument;
  ...
  spec->argument = arg = Fprin1_to_string (arg, noescape, Qnil);

I don't know whether this approach works; maybe a very smart compiler could eliminate the automatic variable and keep only the copy on the heap for some of the time, but it's unlikely that that's valid while calling GC.
In any case, this protects only the current argument; if we detect a multibyte situation too late, we may restart the loop and, I think, reuse info->argument values which were unprotected during GC.

The problem on the feature/igc branch may explain some of the crashes we've seen during heavy elisp usage.

On the master branch, the problems are:

1. excessive stack usage: exceeding MAX_ALLOCA by creating a large temporary buffer on the stack seems unwise.

2. SAFE_ALLOCA is in use, but on many systems it can never actually use stack space, because sa_avail is already negative by the time it is called.

3. GC protection needs to be verified, or fixed if we accept that there is a possibility a Lisp_Object might be unprotected.

Note that fixing (2) is strictly optional; however, fixing only (2) would make (3) a latent but still real bug, which may be worse than the current situation.

On the feature/igc branch, where protection is definitely required, I'm thinking about a fix. A quick fix would be to replace all elements of struct info by Lisp_Object values and use SAFE_ALLOCA_LISP.
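A rough sketch of what the quick fix's data layout could look like, under stated assumptions: toy_object is a hypothetical stand-in for Lisp_Object so the code compiles outside Emacs, and the enum and helper names are invented here. In Emacs itself the integer fields would presumably be stored as fixnums and the array obtained through SAFE_ALLOCA_LISP; neither is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Lisp_Object.  */
typedef intptr_t toy_object;

/* One slot per former member of struct info.  */
enum info_slot
{
  INFO_ARGUMENT,
  INFO_START,
  INFO_END,
  INFO_FBEG,
  INFO_INTERVALS,
  INFO_NSLOTS
};

/* Number of object slots needed for NSPECS format specs.  */
static size_t
info_slot_count (size_t nspecs)
{
  return nspecs * INFO_NSLOTS;
}

/* Address of field SLOT of spec number SPEC in the flat array INFO.  */
static toy_object *
info_ref (toy_object *info, size_t spec, enum info_slot slot)
{
  assert (slot < INFO_NSLOTS);
  return &info[spec * INFO_NSLOTS + slot];
}
```

The point of the flat layout is that the whole array consists of object-sized slots, so it can be handed to an allocator that knows its contents are Lisp values, at the cost of converting the integer fields on every access.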