Package: emacs
Reported by: Eli Zaretskii <eliz <at> gnu.org>
Date: Fri, 25 Nov 2022 15:05:02 UTC
Severity: normal
Found in version 29.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#59574: closed (29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer)
Date: Sat, 26 Nov 2022 14:32:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Sat, 26 Nov 2022 16:31:59 +0200
with message-id <83k03hss68.fsf <at> gnu.org>
and subject line Re: bug#59574: 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
has caused the debbugs.gnu.org bug report #59574,
regarding 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)

--
59574: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=59574
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Eli Zaretskii <eliz <at> gnu.org>
To: bug-gnu-emacs <at> gnu.org
Cc: Yuan Fu <casouri <at> gmail.com>
Subject: 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
Date: Fri, 25 Nov 2022 17:04:27 +0200

To reproduce:

  emacs -Q
  C-x C-f foo.c RET
  M-x c-ts-mode RET
  Type "in"

Make sure foo.c doesn't exist, so you start from an empty buffer. As soon as you type the second character of "in", there's an assertion violation:

  treesit.c:1383: Emacs fatal error: assertion failed: end_byte <= BUF_ZV_BYTE (buffer)

  Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:427
  427           signal (sig, SIG_DFL);
  (gdb) up
  #1  0x01230802 in die (msg=0x18e6778 <DEFAULT_REHASH_SIZE+3288> "end_byte <= BUF_ZV_BYTE (buffer)", file=0x18e5fcc <DEFAULT_REHASH_SIZE+1324> "treesit.c", line=1383) at alloc.c:7697
  7697      terminate_due_to_signal (SIGABRT, INT_MAX);
  (gdb)
  #2  0x01355636 in treesit_make_ranges (ranges=0x856a778, len=1, buffer=0x7fe94b0) at treesit.c:1383
  1383          eassert (end_byte <= BUF_ZV_BYTE (buffer));
  (gdb) p end_byte
  $1 = 4
  (gdb) p BUF_ZV_BYTE(buffer)
  $2 = 3

Interestingly, this only happens once, when the buffer includes exactly 1 byte and an additional character is inserted. If you get past this assertion, further characters can be inserted without any problems, and end_byte always equals BUF_ZV_BYTE.

The backtrace is below, if it is interesting. I couldn't figure out where did tree-sitter take the range it returns to us. Yuan, can you describe how does the parser get the range it needs to consider? If I put a breakpoint in treesit-parser-set-included-ranges, the breakpoint never breaks, so this doesn't seem to be how the range is set in this scenario.

There's also something strange in treesit_record_change: when it is called for the first time in a buffer which was empty and you insert one character, we bypass the updating of visible_beg and visible_end fields of the Lisp parser object, because XTS_PARSER (lisp_parser)->tree is NULL. But it looks to me that we should still update these two fields regardless, no? Only the call to treesit_tree_edit_1 needs the tree. (I thought that maybe this lack of update explains the assertion, but even if I move the condition to guard only treesit_tree_edit_1, the assertion still happens, so I guess my hypothesis eats dust.)

Here's the backtrace I promised:

  (gdb) bt
  #0  terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:427
  #1  0x01230802 in die (msg=0x18e6778 <DEFAULT_REHASH_SIZE+3288> "end_byte <= BUF_ZV_BYTE (buffer)", file=0x18e5fcc <DEFAULT_REHASH_SIZE+1324> "treesit.c", line=1383) at alloc.c:7697
  #2  0x01355636 in treesit_make_ranges (ranges=0x856a778, len=1, buffer=0x7fe94b0) at treesit.c:1383
  #3  0x01353c7e in treesit_call_after_change_functions (old_tree=0x84d9fe0, new_tree=0x856a5d0, parser=XIL(0xa00000000853e4e8)) at treesit.c:859
  #4  0x01353fff in treesit_ensure_parsed (parser=XIL(0xa00000000853e4e8)) at treesit.c:906
  #5  0x01354ff8 in Ftreesit_parser_root_node (parser=XIL(0xa00000000853e4e8)) at treesit.c:1328
  #6  0x012773d2 in funcall_subr (subr=0x1883640 <Streesit_parser_root_node>, numargs=1, args=0x6c10470) at eval.c:3034
  #7  0x012e9b92 in exec_byte_code (fun=XIL(0xa00000000850edc8), args_template=256, nargs=1, args=0x6c10390) at bytecode.c:809
  #8  0x0127799a in fetch_and_exec_byte_code (fun=XIL(0xa0000000084b0d20), args_template=257, nargs=1, args=0x6c101c8) at eval.c:3081
  #9  0x01277ef9 in funcall_lambda (fun=XIL(0xa0000000084b0d20), nargs=1, arg_vector=0x6c101c8) at eval.c:3153
  #10 0x01276e66 in funcall_general (fun=XIL(0xa0000000084b0d20), numargs=1, args=0x6c101c8) at eval.c:2945
  #11 0x012771eb in Ffuncall (nargs=2, args=0x6c101c0) at eval.c:2995
  #12 0x012762ae in run_hook_wrapped_funcall (nargs=2, args=0x6c101c0) at eval.c:2773
  #13 0x01276765 in run_hook_with_args (nargs=2, args=0x6c101c0, funcall=0x1276266 <run_hook_wrapped_funcall>) at eval.c:2854
  #14 0x012762fd in Frun_hook_wrapped (nargs=2, args=0x6c101c0) at eval.c:2788
  #15 0x0127784b in funcall_subr (subr=0x187cf00 <Srun_hook_wrapped>, numargs=2, args=0x6c101c0) at eval.c:3059
  #16 0x012e9b92 in exec_byte_code (fun=XIL(0xa0000000061302c4), args_template=514, nargs=2, args=0x6c100f8) at bytecode.c:809
  #17 0x0127799a in fetch_and_exec_byte_code (fun=XIL(0xa00000000612fd94), args_template=257, nargs=1, args=0x82ac88) at eval.c:3081
  #18 0x01277ef9 in funcall_lambda (fun=XIL(0xa00000000612fd94), nargs=1, arg_vector=0x82ac88) at eval.c:3153
  #19 0x01276e66 in funcall_general (fun=XIL(0xa00000000612fd94), numargs=1, args=0x82ac88) at eval.c:2945
  #20 0x012771eb in Ffuncall (nargs=2, args=0x82ac80) at eval.c:2995
  #21 0x012712a1 in internal_condition_case_n (bfun=0x127709f <Ffuncall>, nargs=2, args=0x82ac80, handlers=XIL(0x30), hfun=0x104286e <safe_eval_handler>) at eval.c:1558
  #22 0x01042aa1 in safe__call (inhibit_quit=false, nargs=2, func=XIL(0x47648c4), ap=0x82ad44 "") at xdisp.c:3024
  #23 0x01042b1a in safe_call (nargs=2, func=XIL(0x47648c4)) at xdisp.c:3039
  #24 0x01042b6e in safe_call1 (fn=XIL(0x47648c4), arg=make_fixnum(1)) at xdisp.c:3050
  #25 0x010469d4 in handle_fontified_prop (it=0x82afd0) at xdisp.c:4416
  #26 0x010453c7 in handle_stop (it=0x82afd0) at xdisp.c:3951
  #27 0x01051ebf in reseat (it=0x82afd0, pos=..., force_p=true) at xdisp.c:7469
  #28 0x01044495 in init_iterator (it=0x82afd0, w=0x7958be0, charpos=1, bytepos=1, row=0x7a214a0, base_face_id=DEFAULT_FACE_ID) at xdisp.c:3488
  #29 0x010446c3 in start_display (it=0x82afd0, w=0x7958be0, pos=...) at xdisp.c:3568
  #30 0x0107c99e in try_window (window=XIL(0xa000000007958be0), pos=..., flags=1) at xdisp.c:20511
  #31 0x01079579 in redisplay_window (window=XIL(0xa000000007958be0), just_this_one_p=true) at xdisp.c:19903
  #32 0x010706c6 in redisplay_window_1 (window=XIL(0xa000000007958be0)) at xdisp.c:17405
  #33 0x0127108e in internal_condition_case_1 (bfun=0x107066e <redisplay_window_1>, arg=XIL(0xa000000007958be0), handlers=XIL(0xc000000006462abc), hfun=0x10702c6 <redisplay_window_error>) at eval.c:1498
  #34 0x0106f10a in redisplay_internal () at xdisp.c:16944
  #35 0x0106c163 in redisplay () at xdisp.c:16006
  #36 0x01174cf8 in read_char (commandflag=1, map=XIL(0xc000000008096220), prev_event=XIL(0), used_mouse_menu=0x82f41f, end_time=0x0) at keyboard.c:2623
  #37 0x0118ec5e in read_key_sequence (keybuf=0x82f6f8, prompt=XIL(0), dont_downcase_last=false, can_return_switch_frame=true, fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:10070
  #38 0x0117033d in command_loop_1 () at keyboard.c:1376
  #39 0x01270fa4 in internal_condition_case (bfun=0x116fcdc <command_loop_1>, handlers=XIL(0x90), hfun=0x116ecaa <cmd_error>) at eval.c:1474
  #40 0x0116f749 in command_loop_2 (handlers=XIL(0x90)) at keyboard.c:1125
  #41 0x0126fe2b in internal_catch (tag=XIL(0x10290), func=0x116f712 <command_loop_2>, arg=XIL(0x90)) at eval.c:1197
  #42 0x0116f6b4 in command_loop () at keyboard.c:1103
  #43 0x0116e70a in recursive_edit_1 () at keyboard.c:712
  #44 0x0116e9a8 in Frecursive_edit () at keyboard.c:795
  #45 0x0116975d in main (argc=2, argv=0xa428e0) at emacs.c:2523

  Lisp Backtrace:
  "treesit-parser-root-node" (0x6c10470)
  "treesit-buffer-root-node" (0x6c10388)
  "treesit-font-lock-fontify-region" (0x6c10300)
  "font-lock-default-fontify-region" (0x6c10298)
  "font-lock-fontify-region" (0x6c10230)
  0x84b0d20 PVEC_COMPILED
  "run-hook-wrapped" (0x6c101c0)
  "jit-lock--run-functions" (0x6c100e8)
  "jit-lock-fontify-now" (0x6c10058)
  "jit-lock-function" (0x82ac88)
  "redisplay_internal (C function)" (0x0)
  (gdb)

In GNU Emacs 29.0.50 (build 2261, i686-pc-mingw32)
 of 2022-11-25 built on HOME-C4E4A596F7
Repository revision: af545234314601ba3dcd8bf32e0d9b46e1917f79
Repository branch: master
Windowing system distributor 'Microsoft Corp.', version 5.1.2600
System Description: Microsoft Windows XP Service Pack 3 (v5.1.0.2600)

Configured using:
 'configure -C --prefix=/d/usr --with-wide-int
 --enable-checking=yes,glyphs 'CFLAGS=-O0 -gdwarf-4 -g3''

Configured features:
ACL GIF GMP GNUTLS HARFBUZZ JPEG JSON LCMS2 LIBXML2 MODULES NOTIFY
W32NOTIFY PDUMPER PNG RSVG SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS TREE_SITTER WEBP XPM ZLIB

Important settings:
  value of $LANG: ENU
  locale-coding-system: cp1255

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  line-number-mode: t
  indent-tabs-mode: t
  transient-mark-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date subr-x mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel dos-w32
ls-lisp disp-table term/w32-win w32-win w32-vars term/common-win
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq
simple cl-generic indonesian philippine cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite emoji-zwj charscript charprop case-table
epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button
loaddefs theme-loaddefs faces cus-face macroexp files window
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget keymap hashtable-print-readable backquote threads
w32notify w32 lcms2 multi-tty make-network-process emacs)

Memory information:
((conses 16 42624 11101)
 (symbols 48 6278 0)
 (strings 16 16553 2914)
 (string-bytes 1 398654)
 (vectors 16 9312)
 (vector-slots 8 146415 13640)
 (floats 8 23 27)
 (intervals 40 274 97)
 (buffers 896 10))
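
To make the failing check in the gdb session concrete, here is a minimal C sketch of the invariant that the assertion at treesit.c:1383 enforces: a range handed back for a buffer must end no later than the buffer's accessible end. The names toy_buffer and range_in_accessible_portion are hypothetical and are not Emacs code. The values 3 and 4 come from the gdb output; the start position 1 is an assumption, since the report does not print beg_byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two buffer bounds the assertion reads.
   Emacs' real struct buffer is far larger.  */
struct toy_buffer
{
  uint32_t begv_byte;  /* first accessible byte position (1-based) */
  uint32_t zv_byte;    /* byte position just past the accessible text */
};

/* Hypothetical helper: true when [beg_byte, end_byte] lies inside the
   accessible portion of BUF.  The last comparison is the one that
   failed in the report (end_byte <= BUF_ZV_BYTE (buffer)).  */
static bool
range_in_accessible_portion (const struct toy_buffer *buf,
                             uint32_t beg_byte, uint32_t end_byte)
{
  /* The buffer's own bounds must be ordered before a range is judged.  */
  assert (buf->begv_byte <= buf->zv_byte);

  return buf->begv_byte <= beg_byte
         && beg_byte <= end_byte
         && end_byte <= buf->zv_byte;
}
```

With a buffer holding the two ASCII bytes "in" (begv_byte = 1, zv_byte = 3), a range ending at byte 3 passes the check, while the reported end_byte of 4 fails it.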
[Message part 3 (message/rfc822, inline)]
From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 59574-done <at> debbugs.gnu.org
Subject: Re: bug#59574: 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
Date: Sat, 26 Nov 2022 16:31:59 +0200

> From: Yuan Fu <casouri <at> gmail.com>
> Date: Fri, 25 Nov 2022 19:18:09 -0800
> Cc: 59574 <at> debbugs.gnu.org
>
> > There's also something strange in treesit_record_change: when it is called
> > for the first time in a buffer which was empty and you insert one character,
> > we bypass the updating of visible_beg and visible_end fields of the Lisp
> > parser object, because XTS_PARSER (lisp_parser)->tree is NULL. But it looks
> > to me that we should still update these two fields regardless, no? Only the
> > call to treesit_tree_edit_1 needs the tree. (I thought that maybe this lack
> > of update explains the assertion, but even if I move the condition to guard
> > only treesit_tree_edit_1, the assertion still happens, so I guess my
> > hypothesis eats dust.)
>
> We don’t need to update visible_beg/end in treesit_record_change if tree is NULL, because visible_beg/end represents the range of buffer that the tree sees, so if there is no tree, visible_beg/end can be considered uninitialized. However you are right about needing to update visible_beg/end, but in treesit_ensure_position_synced (I renamed it to treesit_sync_visible_region): that’s where we ensure visible_beg/end equals to BUF_BEGV_BYTE/friends.
>
> The problem is we don’t update visible_beg/end for the very first parse, when tree is NULL.
>
> I also added some comments, hopefully they sufficiently explain everything.

Thanks, the problem is gone, so I'm closing the bug.
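
The fix described in the quoted reply can be illustrated with a small C sketch: the visible bounds are brought in line with the buffer's accessible bounds on every sync, including the first one when no tree exists yet, and only the tree edit is skipped while the tree is NULL. This is a reduced model under stated assumptions, not the code that was committed: toy_parser, toy_sync_visible_region and the tree_edits counter are hypothetical names, and the counter merely stands in for whatever the real code reports to tree-sitter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced model of the Lisp parser object.  */
struct toy_parser
{
  void *tree;            /* NULL until the first parse has produced a tree */
  uint32_t visible_beg;  /* start of the buffer range the tree sees */
  uint32_t visible_end;  /* end of the buffer range the tree sees */
  int tree_edits;        /* counts stand-in "tell the tree" calls */
};

/* Hypothetical sync step: make visible_beg/visible_end equal to the
   buffer's accessible bounds.  The bounds are updated even when there
   is no tree; only the edit notification needs a tree.  */
static void
toy_sync_visible_region (struct toy_parser *parser,
                         uint32_t begv_byte, uint32_t zv_byte)
{
  assert (begv_byte <= zv_byte);

  if (parser->tree != NULL
      && (parser->visible_beg != begv_byte
          || parser->visible_end != zv_byte))
    /* Stand-in for informing an existing tree that its visible
       region moved; the real bookkeeping is omitted here.  */
    parser->tree_edits++;

  parser->visible_beg = begv_byte;
  parser->visible_end = zv_byte;
}
```

Calling toy_sync_visible_region on a parser whose tree is still NULL leaves tree_edits at 0 but sets both bounds, which is the first-parse case the reply says was previously skipped.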