Package: emacs;
Reported by: Pip Cet <pipcet <at> protonmail.com>
Date: Mon, 26 May 2025 13:17:02 UTC
Severity: normal
Found in version 31.0.50
Message #20 received at 78590 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 78590 <at> debbugs.gnu.org Subject: Re: bug#78590: 31.0.50; print_object calls strout unsafely Date: Tue, 27 May 2025 18:46:24 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Tue, 27 May 2025 18:05:00 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: 78590 <at> debbugs.gnu.org >> >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> > I agree that the current code is buggy because it keeps referencing a >> > Lisp string's data via a 'char *' pointer while calling a Lisp >> > function PRINTCHARFUN. >> >> How about calling Fmapc (printcharfun, string) at that point, from >> print_string? > > If that works, fine. But didn't you say below that print_string does > something we don't want to do in this case? Or am I misunderstanding > what you mean? Sorry, that wasn't quite clear. I also forgot to include the proposed patch: From 43a27ab64584d98cf10a3f1616701ceae0cdb510 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH 1/2] Fix unsafe SDATA usage in print.c (bug#78590) * src/print.c (print_string_1): Renamed from 'print_string', with an extra argument to disable nonascii escaping. Call 'Fmapc' when passed a Lisp function. (print_string): New function. (print_object): Use 'print_string_1', not 'strout'. --- src/print.c | 33 ++++++++++++--------------------- test/src/print-tests.el | 18 ++++++++++++++++++ 2 files changed, 30 insertions(+), 21 deletions(-) diff --git a/src/print.c b/src/print.c index b17ec337f70..cd8316a4048 100644 --- a/src/print.c +++ b/src/print.c @@ -469,18 +469,18 @@ strout (const char *ptr, ptrdiff_t size, ptrdiff_t size_byte, because printing one char can relocate. */ static void -print_string (Lisp_Object string, Lisp_Object printcharfun) +print_string_1 (Lisp_Object string, Lisp_Object printcharfun, bool escape_nonascii) { if (EQ (printcharfun, Qt) || NILP (printcharfun)) { ptrdiff_t chars; - if (print_escape_nonascii) + if (escape_nonascii) string = string_escape_byte8 (string); if (STRING_MULTIBYTE (string)) chars = SCHARS (string); - else if (! print_escape_nonascii + else if (! escape_nonascii && (EQ (printcharfun, Qt) ? ! NILP (BVAR (&buffer_defaults, enable_multibyte_characters)) : ! NILP (BVAR (current_buffer, enable_multibyte_characters)))) @@ -524,25 +524,16 @@ print_string (Lisp_Object string, Lisp_Object printcharfun) } else { - /* Otherwise, string may be relocated by printing one char. - So re-fetch the string address for each character. */ - ptrdiff_t i; - ptrdiff_t size = SCHARS (string); - ptrdiff_t size_byte = SBYTES (string); - if (size == size_byte) - for (i = 0; i < size; i++) - printchar (SREF (string, i), printcharfun); - else - for (i = 0; i < size_byte; ) - { - /* Here, we must convert each multi-byte form to the - corresponding character code before handing it to PRINTCHAR. */ - int len, ch = string_char_and_length (SDATA (string) + i, &len); - printchar (ch, printcharfun); - i += len; - } + /* printcharfun is a Lisp function. */ + Fmapc (printcharfun, string); } } + +static void +print_string (Lisp_Object string, Lisp_Object printcharfun) +{ + print_string_1 (string, printcharfun, print_escape_nonascii); +} DEFUN ("write-char", Fwrite_char, Swrite_char, 1, 2, 0, doc: /* Output character CHARACTER to stream PRINTCHARFUN. @@ -2282,7 +2273,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) } else if (STRINGP (num)) { - strout (SSDATA (num), SCHARS (num), SBYTES (num), printcharfun); + print_string_1 (num, printcharfun, false); goto next_obj; } } diff --git a/test/src/print-tests.el b/test/src/print-tests.el index af57311135b..036248fd091 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -540,5 +540,23 @@ test-print-unreadable-function-buffer (should (eq callback-buffer buffer)) (should (equal str "tata")))) +(ert-deftest test-print-number-realloc () + ;; Test for bug#78590. Note that this may in rare cases crash unfixed + ;; Emacs versions. + (let ((print-circle t) + (print-number-table (make-hash-table)) + (print-continuous-numbering t) + (str "yy") + (outstr "")) + (garbage-collect) + (ignore (make-string 100 ?a)) + (puthash str (make-string 3 ?x) print-number-table) + (prin1 str + (lambda (c) + (setq outstr (concat outstr (string c))) + (garbage-collect) + (ignore (make-string 100 ?b)))) + (should (equal outstr "xxx")))) + (provide 'print-tests) ;;; print-tests.el ends here -- 2.48.1 >> It does, however, also transform the string by escaping some characters, >> which seems incorrect to me in these circumstances (and would mean >> changing existing behavior). > > So calling print_string as it is will not do what we want, right? The new function print_string_1 has an extra flag parameter; all existing callers set it to print_escape_nonascii by calling print_string as before, the new caller directly calls print_string_1 and forcibly disables byte8 escaping (and enables multibyte conversion where necessary). >> > If this requires adding an additional argument to strout, to pass the >> > Lisp string, which will be used only in the last 'else' clause there, >> > that's not difficult, right? >> >> But we already have print_string, which does almost exactly that, right? >> All it needs is an extra argument to avoid the byte8 escaping (unless >> I'm wrong and we actually want that to happen to our string here). > > Sure, that works as well, I think (though I took only a cursory look > at print_string, so maybe I missed something). > > The important part, I think, is to avoid duplicating the string if we > can avoid that. I believe you're right. (It's probably okay to copy strings when printcharfun is Qt, redisplay will be much slower than creating the copy, but when output isn't for the echo area there's no reason to believe the replacement strings are short). Pip
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.