Package: emacs;
Reported by: Daniel Colascione <dancol <at> dancol.org>
Date: Fri, 28 Mar 2025 01:08:02 UTC
Severity: normal
To reply to this bug, email your comments to 77325 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 01:08:02 GMT) Full text and rfc822 format available.Daniel Colascione <dancol <at> dancol.org>
:bug-gnu-emacs <at> gnu.org
.
(Fri, 28 Mar 2025 01:08:03 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Daniel Colascione <dancol <at> dancol.org> To: bug-gnu-emacs <at> gnu.org Subject: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Thu, 27 Mar 2025 21:07:02 -0400
Somehow, the buffer changes underneath json_parse. We pass an out-of-bounds position to SET_PT_BOTH (position, byte), which either asserts or crashes. Not sure how the buffer could have changed --- maybe a handler-bind? The JSON parser doesn't seem to do anything except allocate and signal. The buffer itself is plenty big enough --- it's just that the accessible region has somehow shrunk to nothing. (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1 frame #0: 0x000000010026d7a0 emacs`die(msg="charpos <= ZV && charpos >= BEGV", file="intervals.c", line=1873) at alloc.c:7450:12 frame #1: 0x000000010037e3dc emacs`set_point_both(charpos=8, bytepos=8) at intervals.c:1873:3 frame #2: 0x00000001003baf1c emacs`SET_PT_BOTH(position=8, byte=8) at buffer.h:182:3 * frame #3: 0x00000001003bbadc emacs`Fjson_parse_buffer(nargs=6, args=(struct Lisp_Symbol *) $4 = 0x0000000270365e38) at json.c:1779:3 frame #4: 0x00000001002bcd64 emacs`eval_sub(form=(struct Lisp_Cons *) $116 = 0x000000010831a440) at eval.c:2564:10 -> 1873 eassert (charpos <= ZV && charpos >= BEGV); 1874 1875 have_overlays = buffer_has_overlays (); 1876 (lldb) print charpos (ptrdiff_t) 8 (lldb) print *current_thread->m_current_buffer (buffer) { header = (size = 4611686018645684300) name_ = 0x000000011ed08374 (struct Lisp_String *) $120 = 0x000000011ed08370 last_name_ = 0x000000011ed08374 (struct Lisp_String *) $120 = 0x000000011ed08370 filename_ = NULL directory_ = 0x0000000103708724 (struct Lisp_String *) $124 = 0x0000000103708720 backed_up_ = NULL save_length_ = 0x0000000000000002 (EMACS_INT) $125 = 0 auto_save_file_name_ = NULL read_only_ = NULL mark_ = 0x000000011e0113e5 (struct Lisp_Marker *) $128 = 0x000000011e0113e0 local_var_alist_ = 0x000000012078c0b3 (struct Lisp_Cons *) $129 = 0x000000012078c0b0 major_mode_ = 0x000000001f379d28 (struct Lisp_Symbol *) $131 = 0x000000011f8e8bc0 local_minor_modes_ = 0x000000013042cff3 (struct Lisp_Cons *) $133 = 0x000000013042cff0 mode_name_ = 0x000000013ee94fe4 (struct Lisp_String *) $134 = 0x000000013ee94fe0 mode_line_format_ = 0x0000000110141db3 (struct Lisp_Cons *) $135 = 0x0000000110141db0 header_line_format_ = NULL tab_line_format_ = NULL keymap_ = 0x00000001205abc33 (struct Lisp_Cons *) $136 = 0x00000001205abc30 abbrev_table_ = 0x000000013f116a2d (struct Lisp_Obarray *) $139 = 0x000000013f116a28 syntax_table_ = 0x000000013f116805 (struct Lisp_Vector *) $140 = 0x000000013f116800 category_table_ = 0x000000010269d35d (struct Lisp_Vector *) $142 = 0x000000010269d358 tab_width_ = 0x0000000000000022 (EMACS_INT) $143 = 8 fill_column_ = 0x000000000000011a (EMACS_INT) $144 = 70 left_margin_ = 0x0000000000000002 (EMACS_INT) $125 = 0 auto_fill_function_ = NULL downcase_table_ = 0x000000010266a17d (struct Lisp_Vector *) $145 = 0x000000010266a178 upcase_table_ = 0x00000001026590dd (struct Lisp_Vector *) $146 = 0x00000001026590d8 case_canon_table_ = 0x000000010267b45d (struct Lisp_Vector *) $147 = 0x000000010267b458 case_eqv_table_ = 0x000000010266a3bd (struct Lisp_Vector *) $148 = 0x000000010266a3b8 truncate_lines_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 word_wrap_ = NULL ctl_arrow_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 bidi_display_reordering_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 bidi_paragraph_direction_ = NULL bidi_paragraph_separate_re_ = NULL bidi_paragraph_start_re_ = NULL selective_display_ = NULL selective_display_ellipses_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 overwrite_mode_ = NULL abbrev_mode_ = NULL display_table_ = NULL mark_active_ = NULL enable_multibyte_characters_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 buffer_file_coding_system_ = 0x00000000000127e0 (struct Lisp_Symbol *) $153 = 0x0000000100581678 file_format_ = NULL auto_save_file_format_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 cache_long_scans_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 width_table_ = NULL pt_marker_ = NULL begv_marker_ = NULL zv_marker_ = NULL point_before_scroll_ = NULL file_truename_ = NULL invisibility_spec_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 last_selected_window_ = 0x0000000148490c15 (struct window *) $157 = 0x0000000148490c10 display_count_ = 0x0000000000000006 (EMACS_INT) $159 = 1 left_margin_cols_ = 0x0000000000000002 (EMACS_INT) $125 = 0 right_margin_cols_ = 0x0000000000000002 (EMACS_INT) $125 = 0 left_fringe_width_ = NULL right_fringe_width_ = NULL fringes_outside_margins_ = NULL scroll_bar_width_ = NULL scroll_bar_height_ = NULL vertical_scroll_bar_type_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 horizontal_scroll_bar_type_ = 0x0000000000000030 (struct Lisp_Symbol *) $150 = 0x000000010056eec8 indicate_empty_lines_ = NULL indicate_buffer_boundaries_ = NULL fringe_indicator_alist_ = 0x0000000102656e8b (struct Lisp_Cons *) $160 = 0x0000000102656e88 fringe_cursor_alist_ = 0x00000001026560db (struct Lisp_Cons *) $161 = 0x00000001026560d8 display_time_ = 0x0000000110c15e73 (struct Lisp_Cons *) $162 = 0x0000000110c15e70 scroll_up_aggressively_ = NULL scroll_down_aggressively_ = NULL cursor_type_ = NULL extra_line_spacing_ = NULL ts_parser_list_ = NULL text_conversion_style_ = NULL cursor_in_non_selected_windows_ = 0x0000000000009ba0 (struct Lisp_Symbol *) $164 = 0x0000000100578a38 own_text = { beg = 0x0000000130088000 "" gpt = 1 z = 74465 gpt_byte = 1 z_byte = 76476 gap_size = 60247 modiff = 15338 chars_modiff = 15338 save_modiff = 1 overlay_modiff = 757 compact = 1 beg_unchanged = 0 end_unchanged = 0 unchanged_modified = 4374 overlay_unchanged_modified = 755 intervals = 0x000000011f38caa8 markers = 0x000000011ffa4288 inhibit_shrinking = false redisplay = true } text = 0x000000011e011268 pt = 1 pt_byte = 1 begv = 1 begv_byte = 1 zv = 1 zv_byte = 1 base_buffer = NULL indirections = 0 window_count = 1 local_flags = "\0\0\0\0\0\0\0\0\U00000001\0\0\0\0\U00000001\0\U00000001\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\U00000001" modtime = (tv_sec = 0, tv_nsec = -2) modtime_size = -1 auto_save_modified = 0 display_error_modiff = 0 auto_save_failure_time = 0 last_window_start = 1 newline_cache = 0x0000600000666300 width_run_cache = NULL bidi_paragraph_cache = 0x000060000067afc0 prevent_redisplay_optimizations_p = true clip_changed = true inhibit_buffer_hooks = false long_line_optimizations_p = false overlays = 0x00006000013c2a40 undo_list_ = 0x0000000108985be3 (struct Lisp_Cons *) $166 = 0x0000000108985be0 (lldb) print p.point_of_current_line (ptrdiff_t) 1 (lldb) print p.current_column (ptrdiff_t) 6 input_begin = 0x0000000130096b57 "\n 6 pass\n 620 skip\n [...] input_current = 0x0000000130096b5e " pass\n 620 skip\n The actual JSON we're parsing appears to be mangled somehow --- raw newlines embedded in the output instead of being encapsulated inside a string --- but that's a separate bug.
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 02:44:02 GMT) Full text and rfc822 format available.Message #8 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Daniel Colascione <dancol <at> dancol.org> Cc: 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 02:43:36 +0000
"Daniel Colascione" <dancol <at> dancol.org> writes: > Somehow, the buffer changes underneath json_parse. We pass an Do we know that the buffer changed after we entered json-parse-buffer? It looks to me like the buffer was narrowed to nothing before we called json-parse-buffer, like this: (with-temp-buffer (insert "3") (narrow-to-region (point-min) (point-min)) (message "%S" (json-parse-buffer))) json.c proceeds to read past ZV, all the way to Z, then hits the assertion just as it did for you, so this code currently causes a crash. Do you still see the crash if you change json-parse-buffer to honor buffer narrowing, like this? From 073c00135e6f0e213fc8671fc0a52a67ee5b56ce Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH] Respect narrowed buffers when parsing JSON (bug#77325) * src/json.c (Fjson_parse_buffer): Only read to ZV, not all the way to Z. --- src/json.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/json.c b/src/json.c index f438d191bde..a0480718ca8 100644 --- a/src/json.c +++ b/src/json.c @@ -1757,12 +1757,16 @@ DEFUN ("json-parse-buffer", Fjson_parse_buffer, Sjson_parse_buffer, unsigned char *end = GPT_ADDR; unsigned char *secondary_begin = NULL; unsigned char *secondary_end = NULL; - if (GPT_ADDR < Z_ADDR) + if (GPT_ADDR < ZV_ADDR) { secondary_begin = GAP_END_ADDR; if (secondary_begin < PT_ADDR) secondary_begin = PT_ADDR; - secondary_end = Z_ADDR; + secondary_end = ZV_ADDR; + } + else if (ZV_ADDR < GPT_ADDR) + { + end = ZV_ADDR; } json_parser_init (&p, conf, begin, end, secondary_begin, -- 2.48.1 > input_begin = 0x0000000130096b57 "\n 6 pass\n 620 skip\n [...] > input_current = 0x0000000130096b5e " pass\n 620 skip\n > > The actual JSON we're parsing appears to be mangled somehow --- raw > newlines embedded in the output instead of being encapsulated inside > a string --- but that's a separate bug. Certainly doesn't look like JSON, but maybe that's why it's outside the accessible region? Pip
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 11:06:02 GMT) Full text and rfc822 format available.Message #11 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Daniel Colascione <dancol <at> dancol.org> Cc: 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 14:05:29 +0300
> From: Daniel Colascione <dancol <at> dancol.org> > Date: Thu, 27 Mar 2025 21:07:02 -0400 > > > Somehow, the buffer changes underneath json_parse. We pass an > out-of-bounds position to SET_PT_BOTH (position, byte), which either > asserts or crashes. Not sure how the buffer could have changed --- > maybe a handler-bind? The JSON parser doesn't seem to do anything > except allocate and signal. Can you post a recipe for reproducing this? > own_text = { > beg = 0x0000000130088000 "" > gpt = 1 > z = 74465 > gpt_byte = 1 > z_byte = 76476 > gap_size = 60247 > modiff = 15338 > chars_modiff = 15338 > save_modiff = 1 > overlay_modiff = 757 > compact = 1 > beg_unchanged = 0 > end_unchanged = 0 > unchanged_modified = 4374 > overlay_unchanged_modified = 755 > intervals = 0x000000011f38caa8 > markers = 0x000000011ffa4288 > inhibit_shrinking = false > redisplay = true > } > text = 0x000000011e011268 > pt = 1 > pt_byte = 1 > begv = 1 > begv_byte = 1 > zv = 1 > zv_byte = 1 This seems to tell that the buffer is narrowed to an empty region. Does that make sense in the scenario where you had this problem? > (lldb) print p.point_of_current_line > (ptrdiff_t) 1 > (lldb) print p.current_column > (ptrdiff_t) 6 > > input_begin = 0x0000000130096b57 "\n 6 pass\n 620 skip\n [...] > input_current = 0x0000000130096b5e " pass\n 620 skip\n Give BEGV and ZV, this seems to mean we are accessing beyond the restriction, which should never happen.
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 14:13:02 GMT) Full text and rfc822 format available.Message #14 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Daniel Colascione <dancol <at> dancol.org> To: Eli Zaretskii <eliz <at> gnu.org> Cc: Pip Cet <pipcet <at> protonmail.com>, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 10:11:58 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Daniel Colascione <dancol <at> dancol.org> >> Date: Thu, 27 Mar 2025 21:07:02 -0400 >> >> >> Somehow, the buffer changes underneath json_parse. We pass an >> out-of-bounds position to SET_PT_BOTH (position, byte), which either >> asserts or crashes. Not sure how the buffer could have changed --- >> maybe a handler-bind? The JSON parser doesn't seem to do anything >> except allocate and signal. > > Can you post a recipe for reproducing this? Didn't have a good repro. Pip's fix works though. I was barking up the wrong tree: I'm parsing JSON out of a process buffer in a loop and dispatching commands as they come in. One of these commands switched the buffer, so in the next iteration of the loop, I started parsing JSON out of some other random buffer. It just so happened that other buffer was narrowed, so we crashed. I'll let Pip do the honors of checking in the fix if he wants. I initially thought a GC finalizer might have been switching the buffer, but turns out GC doesn't actually run for me while parsing. IGC does GC all the time --- but it's not observable because we pump messages from the GC only at dedicated points and run GC hooks only in response to these messages. however, notice that on the IGC branch that we pump GC messages, including finalizer callbacks, on the allocation path for, e.g. various pseudovectors. That'll cause Lisp to run where it wouldn't have before. Is that going to be a problem? ISTM we can either pump messages in maybe_quit() or just rely on igc_on_idle().
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 15:06:01 GMT) Full text and rfc822 format available.Message #17 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Daniel Colascione <dancol <at> dancol.org> Cc: Eli Zaretskii <eliz <at> gnu.org>, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 15:05:22 +0000
"Daniel Colascione" <dancol <at> dancol.org> writes: > Eli Zaretskii <eliz <at> gnu.org> writes: > >>> From: Daniel Colascione <dancol <at> dancol.org> >>> Date: Thu, 27 Mar 2025 21:07:02 -0400 >>> >>> >>> Somehow, the buffer changes underneath json_parse. We pass an >>> out-of-bounds position to SET_PT_BOTH (position, byte), which either >>> asserts or crashes. Not sure how the buffer could have changed --- >>> maybe a handler-bind? The JSON parser doesn't seem to do anything >>> except allocate and signal. >> >> Can you post a recipe for reproducing this? > > Didn't have a good repro. Pip's fix works though. I was barking up > the wrong tree: I'm parsing JSON out of a process buffer in a loop and > dispatching commands as they come in. One of these commands switched the > buffer, so in the next iteration of the loop, I started parsing JSON out > of some other random buffer. It just so happened that other buffer was > narrowed, so we crashed. I'll let Pip do the honors of checking in the > fix if he wants. Eli, is that okay? I'll simplify the else branch, which has an unnecessary "else if" in the original patch. > I initially thought a GC finalizer might have been switching the buffer, > but turns out GC doesn't actually run for me while parsing. json.c assumes no GC on the master branch, because it doesn't protect its object workspace (and possibly for other reasons). > IGC does GC all the time --- but it's not observable because we pump > messages from the GC only at dedicated points and run GC hooks only in > response to these messages. however, notice that on the IGC branch that > we pump GC messages, including finalizer callbacks, on the allocation > path for, e.g. various pseudovectors. That'll cause Lisp to run where > it wouldn't have before. Is that going to be a problem? ISTM we can > either pump messages in maybe_quit() or just rely on igc_on_idle(). Oh, I forgot about that one! I'll open a new bug so this gets a number: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=77338 Pip
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 16:23:02 GMT) Full text and rfc822 format available.Message #20 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Daniel Colascione <dancol <at> dancol.org> Cc: pipcet <at> protonmail.com, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 19:21:57 +0300
> From: Daniel Colascione <dancol <at> dancol.org> > Cc: 77325 <at> debbugs.gnu.org, Pip Cet <pipcet <at> protonmail.com> > Date: Fri, 28 Mar 2025 10:11:58 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > Didn't have a good repro. Pip's fix works though. I was barking up > the wrong tree: I'm parsing JSON out of a process buffer in a loop and > dispatching commands as they come in. One of these commands switched the > buffer, so in the next iteration of the loop, I started parsing JSON out > of some other random buffer. It just so happened that other buffer was > narrowed, so we crashed. I'll let Pip do the honors of checking in the > fix if he wants. I think we should simply replace each BEG with BEGV and each Z with ZV. Emacs should never look outside of the current restriction. > IGC does GC all the time --- but it's not observable because we pump > messages from the GC only at dedicated points and run GC hooks only in > response to these messages. however, notice that on the IGC branch that > we pump GC messages, including finalizer callbacks, on the allocation > path for, e.g. various pseudovectors. You mean, we'll run Lisp as part of displaying messages? Or what do you mean by "pump GC messages"?
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 16:31:01 GMT) Full text and rfc822 format available.Message #23 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 19:29:14 +0300
> Date: Fri, 28 Mar 2025 15:05:22 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: Eli Zaretskii <eliz <at> gnu.org>, 77325 <at> debbugs.gnu.org > > "Daniel Colascione" <dancol <at> dancol.org> writes: > > > Didn't have a good repro. Pip's fix works though. I was barking up > > the wrong tree: I'm parsing JSON out of a process buffer in a loop and > > dispatching commands as they come in. One of these commands switched the > > buffer, so in the next iteration of the loop, I started parsing JSON out > > of some other random buffer. It just so happened that other buffer was > > narrowed, so we crashed. I'll let Pip do the honors of checking in the > > fix if he wants. > > Eli, is that okay? I'll simplify the else branch, which has an > unnecessary "else if" in the original patch. Can we discuss why you don't simply replace Z with ZV and BEG with BEGV? I'm not sure I understand some parts of the change you proposed.
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Fri, 28 Mar 2025 17:01:02 GMT) Full text and rfc822 format available.Message #26 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Fri, 28 Mar 2025 17:00:24 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Fri, 28 Mar 2025 15:05:22 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: Eli Zaretskii <eliz <at> gnu.org>, 77325 <at> debbugs.gnu.org >> >> "Daniel Colascione" <dancol <at> dancol.org> writes: >> >> > Didn't have a good repro. Pip's fix works though. I was barking up >> > the wrong tree: I'm parsing JSON out of a process buffer in a loop and >> > dispatching commands as they come in. One of these commands switched the >> > buffer, so in the next iteration of the loop, I started parsing JSON out >> > of some other random buffer. It just so happened that other buffer was >> > narrowed, so we crashed. I'll let Pip do the honors of checking in the >> > fix if he wants. >> >> Eli, is that okay? I'll simplify the else branch, which has an >> unnecessary "else if" in the original patch. > > Can we discuss why you don't simply replace Z with ZV and BEG with > BEGV? I'm not sure I understand some parts of the change you > proposed. Because the code assumes GPT <= Z, and GPT <= ZV isn't always true. Pip
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sat, 29 Mar 2025 07:16:01 GMT) Full text and rfc822 format available.Message #29 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sat, 29 Mar 2025 10:14:51 +0300
> Date: Fri, 28 Mar 2025 17:00:24 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > >> Date: Fri, 28 Mar 2025 15:05:22 +0000 > >> From: Pip Cet <pipcet <at> protonmail.com> > >> Cc: Eli Zaretskii <eliz <at> gnu.org>, 77325 <at> debbugs.gnu.org > >> > >> "Daniel Colascione" <dancol <at> dancol.org> writes: > >> > >> > Didn't have a good repro. Pip's fix works though. I was barking up > >> > the wrong tree: I'm parsing JSON out of a process buffer in a loop and > >> > dispatching commands as they come in. One of these commands switched the > >> > buffer, so in the next iteration of the loop, I started parsing JSON out > >> > of some other random buffer. It just so happened that other buffer was > >> > narrowed, so we crashed. I'll let Pip do the honors of checking in the > >> > fix if he wants. > >> > >> Eli, is that okay? I'll simplify the else branch, which has an > >> unnecessary "else if" in the original patch. > > > > Can we discuss why you don't simply replace Z with ZV and BEG with > > BEGV? I'm not sure I understand some parts of the change you > > proposed. > > Because the code assumes GPT <= Z, and GPT <= ZV isn't always true. Sorry, I don't understand: if the gap is beyond ZV, then there's no "secondary" region for json.c's purposes, which AFAIU is the only thing json-parse-buffer needs to know. In addition, the value of 'end' should be limited to not exceed ZV_ADDR. Or what am I missing? IOW, why does json-parse-buffer ignore the restriction? No other primitive does, with rare exceptions that are explicitly documented.
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sat, 29 Mar 2025 11:55:02 GMT) Full text and rfc822 format available.Message #32 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sat, 29 Mar 2025 11:53:49 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Fri, 28 Mar 2025 17:00:24 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org >> >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> >> Date: Fri, 28 Mar 2025 15:05:22 +0000 >> >> From: Pip Cet <pipcet <at> protonmail.com> >> >> Cc: Eli Zaretskii <eliz <at> gnu.org>, 77325 <at> debbugs.gnu.org >> >> >> >> "Daniel Colascione" <dancol <at> dancol.org> writes: >> >> >> >> > Didn't have a good repro. Pip's fix works though. I was barking up >> >> > the wrong tree: I'm parsing JSON out of a process buffer in a loop and >> >> > dispatching commands as they come in. One of these commands switched the >> >> > buffer, so in the next iteration of the loop, I started parsing JSON out >> >> > of some other random buffer. It just so happened that other buffer was >> >> > narrowed, so we crashed. I'll let Pip do the honors of checking in the >> >> > fix if he wants. >> >> >> >> Eli, is that okay? I'll simplify the else branch, which has an >> >> unnecessary "else if" in the original patch. >> > >> > Can we discuss why you don't simply replace Z with ZV and BEG with >> > BEGV? I'm not sure I understand some parts of the change you >> > proposed. >> >> Because the code assumes GPT <= Z, and GPT <= ZV isn't always true. > > Sorry, I don't understand: if the gap is beyond ZV, then there's no > "secondary" region for json.c's purposes, which AFAIU is the only > thing json-parse-buffer needs to know. It's about the primary selection, not the secondary one. The code currently reads: unsigned char *begin = PT_ADDR; unsigned char *end = GPT_ADDR; unsigned char *secondary_begin = NULL; unsigned char *secondary_end = NULL; if (GPT_ADDR < Z_ADDR) { secondary_begin = GAP_END_ADDR; if (secondary_begin < PT_ADDR) secondary_begin = PT_ADDR; secondary_end = Z_ADDR; } Simply replacing Z_ADDR by ZV_ADDR would still set up the primary region to be [PT, GPT]. If GPT > ZV, that would mean that the primary region extends beyond ZV, which would mean we parse buffer text that should be inaccessible. So, in this case, we need to limit the primary region to end at ZV_ADDR. That's what my patch does. The code for the secondary region is correct, if unnecessary because sending up a paradoxical [GPT, ZV] range if ZV < GPT wouldn't hurt. > In addition, the value of 'end' should be limited to not exceed > ZV_ADDR. Or what am I missing? That's what my patch does, yes. > IOW, why does json-parse-buffer ignore the restriction? No other > primitive does, with rare exceptions that are explicitly documented. I assumed it was an accident, and that's why my patch changes it to respect the restriction. Pip
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sat, 29 Mar 2025 12:39:02 GMT) Full text and rfc822 format available.Message #35 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sat, 29 Mar 2025 15:38:42 +0300
> Date: Sat, 29 Mar 2025 11:53:49 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > Simply replacing Z_ADDR by ZV_ADDR would still set up the primary region > to be [PT, GPT]. If GPT > ZV, that would mean that the primary region > extends beyond ZV, which would mean we parse buffer text that should be > inaccessible. > > So, in this case, we need to limit the primary region to end at ZV_ADDR. > That's what my patch does. > > The code for the secondary region is correct, if unnecessary because > sending up a paradoxical [GPT, ZV] range if ZV < GPT wouldn't hurt. > > > In addition, the value of 'end' should be limited to not exceed > > ZV_ADDR. Or what am I missing? > > That's what my patch does, yes. > > > IOW, why does json-parse-buffer ignore the restriction? No other > > primitive does, with rare exceptions that are explicitly documented. > > I assumed it was an accident, and that's why my patch changes it to > respect the restriction. Sorry, I've misread your patch. It's fine (but please don't use braces for a 1-line block).
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sat, 29 Mar 2025 15:38:01 GMT) Full text and rfc822 format available.Message #38 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sat, 29 Mar 2025 15:37:17 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Sat, 29 Mar 2025 11:53:49 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org >> >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> Simply replacing Z_ADDR by ZV_ADDR would still set up the primary region >> to be [PT, GPT]. If GPT > ZV, that would mean that the primary region >> extends beyond ZV, which would mean we parse buffer text that should be >> inaccessible. >> >> So, in this case, we need to limit the primary region to end at ZV_ADDR. >> That's what my patch does. >> >> The code for the secondary region is correct, if unnecessary because >> sending up a paradoxical [GPT, ZV] range if ZV < GPT wouldn't hurt. >> >> > In addition, the value of 'end' should be limited to not exceed >> > ZV_ADDR. Or what am I missing? >> >> That's what my patch does, yes. >> >> > IOW, why does json-parse-buffer ignore the restriction? No other >> > primitive does, with rare exceptions that are explicitly documented. >> >> I assumed it was an accident, and that's why my patch changes it to >> respect the restriction. > > Sorry, I've misread your patch. No problem at all. The code does seem a bit complicated for what it's trying to achieve, to be honest. I think it'd be clearer just to write: unsigned char *begin = PT_ADDR; unsigned char *end = min (GPT_ADDR, ZV_ADDR); unsigned char *secondary_begin = min (GAP_END_ADDR, ZV_ADDR)); unsigned char *secondary_end = ZV_ADDR; json_parser_init (&p, conf, begin, end, secondary_begin, secondary_end); json_parser_init fixes up secondary_begin and secondary_end to be NULL pointers in this case. ("min (GAP_END_ADDR, ZV_ADDR)" could also be replaced by just "GAP_END_ADDR", since json_parser_init fixes paradoxical ranges itself, but I think we should avoid ever creating those where possible.) But if that's not okay, let me know and I'll push the other patch. Pip
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sat, 29 Mar 2025 15:48:02 GMT) Full text and rfc822 format available.Message #41 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sat, 29 Mar 2025 18:47:07 +0300
> Date: Sat, 29 Mar 2025 15:37:17 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org > > The code does seem a bit complicated for what it's trying to achieve, to > be honest. I think it'd be clearer just to write: > > unsigned char *begin = PT_ADDR; > unsigned char *end = min (GPT_ADDR, ZV_ADDR); > unsigned char *secondary_begin = min (GAP_END_ADDR, ZV_ADDR)); > unsigned char *secondary_end = ZV_ADDR; > > json_parser_init (&p, conf, begin, end, secondary_begin, > secondary_end); > > json_parser_init fixes up secondary_begin and secondary_end to be NULL > pointers in this case. I'd prefer the secondary_* values to be set NULL explicitly, otherwise it is hard to understand what the code does without looking deep into the subroutines of json_parse.
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sun, 30 Mar 2025 10:41:02 GMT) Full text and rfc822 format available.Message #44 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sun, 30 Mar 2025 10:40:42 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Sat, 29 Mar 2025 15:37:17 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org >> >> The code does seem a bit complicated for what it's trying to achieve, to >> be honest. I think it'd be clearer just to write: >> >> unsigned char *begin = PT_ADDR; >> unsigned char *end = min (GPT_ADDR, ZV_ADDR); >> unsigned char *secondary_begin = min (GAP_END_ADDR, ZV_ADDR)); >> unsigned char *secondary_end = ZV_ADDR; >> >> json_parser_init (&p, conf, begin, end, secondary_begin, >> secondary_end); >> >> json_parser_init fixes up secondary_begin and secondary_end to be NULL >> pointers in this case. > > I'd prefer the secondary_* values to be set NULL explicitly, otherwise > it is hard to understand what the code does without looking deep into > the subroutines of json_parse. Sorry, it took me a while. I was surprised by how tricky this is, and I admit I was confused briefly because ZV == GPT doesn't make ZV_ADDR == GPT_ADDR. I can see that most places in the code manipulate byte positions, not pointers, and I think I understand the reason now :-) From dd30d60a7992da0cef4e09a8a7e8989249dab4f1 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH] Respect narrowed buffers when parsing JSON (bug#77325) * src/json.c (Fjson_insert): Simplify 'memcpy' argument. (Fjson_parse_buffer): Only read to ZV, not all the way to Z. * test/src/json-tests.el (with-all-gap-positions-in-temp-buffer): New macro. (json-parse-buffer/restricted): New test. --- src/json.c | 13 +++++++------ test/src/json-tests.el | 26 ++++++++++++++++++++++++++ 2 files changed, 33 insertions(+), 6 deletions(-) diff --git a/src/json.c b/src/json.c index f438d191bde..5795c582ce0 100644 --- a/src/json.c +++ b/src/json.c @@ -641,7 +641,7 @@ DEFUN ("json-insert", Fjson_insert, Sjson_insert, 1, MANY, move_gap_both (PT, PT_BYTE); if (GAP_SIZE < jo.size) make_gap (jo.size - GAP_SIZE); - memcpy ((char *) BEG_ADDR + PT_BYTE - BEG_BYTE, jo.buf, jo.size); + memcpy (GPT_ADDR, jo.buf, jo.size); /* No need to keep allocation beyond this point. */ unbind_to (count, Qnil); @@ -1754,15 +1754,16 @@ DEFUN ("json-parse-buffer", Fjson_parse_buffer, Sjson_parse_buffer, struct json_parser p; unsigned char *begin = PT_ADDR; - unsigned char *end = GPT_ADDR; + unsigned char *end = (GPT == ZV) ? GPT_ADDR : ZV_ADDR; unsigned char *secondary_begin = NULL; unsigned char *secondary_end = NULL; - if (GPT_ADDR < Z_ADDR) + if (PT == ZV) + begin = end = NULL; + else if (GPT > PT && GPT < ZV && GAP_SIZE > 0) { + end = GPT_ADDR; secondary_begin = GAP_END_ADDR; - if (secondary_begin < PT_ADDR) - secondary_begin = PT_ADDR; - secondary_end = Z_ADDR; + secondary_end = ZV_ADDR; } json_parser_init (&p, conf, begin, end, secondary_begin, diff --git a/test/src/json-tests.el b/test/src/json-tests.el index 94b6cfcffca..1cb667ddeac 100644 --- a/test/src/json-tests.el +++ b/test/src/json-tests.el @@ -315,6 +315,32 @@ json-parse-buffer/trailing (should-not (bobp)) (should (looking-at-p (rx " [456]" eos))))) +(defmacro with-all-gap-positions-in-temp-buffer (string &rest body) + "Create a temporary buffer containing STRING, and evaluate BODY +with each possible gap position. +See also `with-temp-buffer'." + `(with-temp-buffer + (insert ,string) + (dotimes (i (- (point-max) (point-min))) + (goto-char (- (point-max) i)) + (insert "X") + (delete-region (1- (point)) (point)) + ,@body))) + +(ert-deftest json-parse-buffer/restricted () + (with-all-gap-positions-in-temp-buffer + "[123] [456] [789]" + (pcase-dolist (`((,beg . ,end) ,result) + '(((7 . 12) [456]) + ((1 . 6) [123]) + ((13 . 18) [789]))) + (goto-char beg) + (narrow-to-region beg end) + (should (equal (json-parse-buffer) result)) + (should (= (point) end)) + (should-error (json-parse-buffer) :type 'json-end-of-file) + (widen)))) + (ert-deftest json-parse-with-custom-null-and-false-objects () (let* ((input "{ \"abc\" : [9, false] , \"def\" : null }") -- 2.48.1
bug-gnu-emacs <at> gnu.org
:bug#77325
; Package emacs
.
(Sun, 30 Mar 2025 11:46:02 GMT) Full text and rfc822 format available.Message #47 received at 77325 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org Subject: Re: bug#77325: Crash in Fjson_parse_buffer: ZV changes underneath it? Date: Sun, 30 Mar 2025 14:45:20 +0300
> Date: Sun, 30 Mar 2025 10:40:42 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: dancol <at> dancol.org, 77325 <at> debbugs.gnu.org > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > > I'd prefer the secondary_* values to be set NULL explicitly, otherwise > > it is hard to understand what the code does without looking deep into > > the subroutines of json_parse. > > Sorry, it took me a while. I was surprised by how tricky this is, and I > admit I was confused briefly because ZV == GPT doesn't make ZV_ADDR == > GPT_ADDR. I can see that most places in the code manipulate byte > positions, not pointers, and I think I understand the reason now :-) Yes, it's a bit tricky. > >From dd30d60a7992da0cef4e09a8a7e8989249dab4f1 Mon Sep 17 00:00:00 2001 > From: Pip Cet <pipcet <at> protonmail.com> > Subject: [PATCH] Respect narrowed buffers when parsing JSON (bug#77325) > > * src/json.c (Fjson_insert): Simplify 'memcpy' argument. > (Fjson_parse_buffer): Only read to ZV, not all the way to Z. > * test/src/json-tests.el (with-all-gap-positions-in-temp-buffer): > New macro. > (json-parse-buffer/restricted): New test. Thanks, LGTM.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.