GNU bug report logs - #71012
30.0.50; tree-sitter crash

Package: emacs;

Reported by: Helmut Eller <eller.helmut <at> gmail.com>

Date: Fri, 17 May 2024 13:40:01 UTC

Severity: normal

Found in version 30.0.50

Done: Yuan Fu <casouri <at> gmail.com>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Yuan Fu <casouri <at> gmail.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#71012: closed (30.0.50; tree-sitter crash)
Date: Thu, 06 Jun 2024 05:33:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Wed, 5 Jun 2024 22:31:04 -0700
with message-id <792DB4FC-EB1E-4094-A4CF-14500DDA82C1 <at> gmail.com>
and subject line Re: bug#71012: 30.0.50; tree-sitter crash
has caused the debbugs.gnu.org bug report #71012,
regarding 30.0.50; tree-sitter crash
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
71012: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=71012
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Helmut Eller <eller.helmut <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; tree-sitter crash
Date: Fri, 17 May 2024 15:39:27 +0200
[Message part 3 (text/plain, inline)]
The code in the attached file tries to parse src/lisp.h but crashes
while printing the result:  emacs --batch -l ts-bug.el

[ts-bug.el (application/emacs-lisp, attachment)]
[Message part 5 (text/plain, inline)]

Program received signal SIGSEGV, Segmentation fault.
0x000055555575c33a in buf_bytepos_to_charpos (b=0x555556074c60, bytepos=1)
    at marker.c:343
343       eassert (bytepos >= BUF_Z_BYTE (b)
(gdb) ba 10
#0  0x000055555575c33a in buf_bytepos_to_charpos (b=0x555556074c60, bytepos=1)
    at marker.c:343
#1  0x0000555555853509 in Ftreesit_node_start
    (node=node <at> entry=XIL(0x55555605b225)) at treesit.c:1927
#2  0x00005555557f3f8a in print_vectorlike_unreadable
    (obj=XIL(0x55555605b225), printcharfun=XIL(0), escapeflag=<optimized out>, buf=0x7fffffff7ef0 "dd\aVUU") at print.c:2051
#3  0x00005555557f1b85 in print_object
    (obj=<optimized out>, printcharfun=<optimized out>, escapeflag=false)
    at print.c:2642
#4  0x00005555557f2cf0 in Fprin1_to_string
    (object=object <at> entry=XIL(0x55555605b225), noescape=XIL(0x30), overrides=overrides <at> entry=XIL(0)) at print.c:814
#5  0x00005555557b7c30 in styled_format
    (nargs=2, args=args <at> entry=0x7fffffffda30, message=message <at> entry=true)
    at editfns.c:3635
#6  0x00005555557b933f in Fformat_message
    (args=0x7fffffffda30, nargs=<optimized out>) at editfns.c:3388
#7  Fmessage (args=0x7fffffffda30, nargs=<optimized out>) at editfns.c:3185
#8  Fmessage (nargs=<optimized out>, args=0x7fffffffda30) at editfns.c:3154
#9  0x00005555557c6b75 in eval_sub (form=<optimized out>)
    at /scratch/emacs/emacs-git/src/lisp.h:2243
(More stack frames follow...)



In GNU Emacs 30.0.50 (build 6, x86_64-pc-linux-gnu, GTK+ Version
 3.24.38, cairo version 1.16.0) of 2024-05-17 built on caladan
Repository revision: 6ca3a60db3427bc6aef08144c1524920ff3d9c4d
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12101007
System Description: Debian GNU/Linux 12 (bookworm)

Configured using:
 'configure --enable-checking --without-native-compiler
 --with-xpm=ifavailable --with-gif=ifavailable
 --with-native-compilation=no --with-tree-sitter'

Configured features:
CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG LIBSELINUX
LIBSYSTEMD LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND
SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP X11 XDBE XIM
XINPUT2 GTK3 ZLIB

[Message part 6 (message/rfc822, inline)]
From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71012-done <at> debbugs.gnu.org, Helmut Eller <eller.helmut <at> gmail.com>
Subject: Re: bug#71012: 30.0.50; tree-sitter crash
Date: Wed, 5 Jun 2024 22:31:04 -0700

> On Jun 1, 2024, at 10:43 AM, Yuan Fu <casouri <at> gmail.com> wrote:
> 
> 
> 
>> On Jun 1, 2024, at 10:15 AM, Yuan Fu <casouri <at> gmail.com> wrote:
>> 
>> 
>> 
>>> On May 29, 2024, at 5:28 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>>> 
>>>> From: Yuan Fu <casouri <at> gmail.com>
>>>> Date: Tue, 28 May 2024 22:15:05 -0700
>>>> Cc: Helmut Eller <eller.helmut <at> gmail.com>,
>>>> 71012 <at> debbugs.gnu.org
>>>> 
>>>> From what I can gather, the crash seems to happen because the temp buffer is garbage-collected: the inserted lisp.h is a large file, so the temp buffer is probably collected immediately, before Emacs tries to print the node in the next line. I made insert-file-contents read a smaller file instead and it didn’t crash.
>>> 
>>> It is unthinkable that a buffer is GC'ed while it is being used.
>>> 
>>>> But that theory has critical flaws: a) Emacs certainly doesn’t collect the temp buffer before the with-temp-buffer form returns; b) I can’t crash Emacs in my non-debug build by inserting (garbage-collect) in front of the message line in the example; c) a debug build of Emacs still crashes even if I enlarge gc-cons-threshold.
>>>> 
>>>> Eli, is there anything different regarding temp buffers in debug builds?
>>> 
>>> No.
>>> 
>>> But note that there are _two_ temporary buffers involved here: one is
>>> created in ts-bug.el, and it remains intact and valid; the other is
>>> the temporary buffer created by treesit-parse-string.  That one is
>>> killed by the time treesit-parse-string returns, so treesit-node-start
>>> attempts to access positions of a killed buffer!
>>> 
>>> So I think this is a bug in treesit-parse-string: it cannot use
>>> with-temp-buffer; instead, it should make the buffer into which it
>>> inserts the string part of the parser, so that the buffer is killed
>>> and GC'ed only when the parser is no longer referenced.  Otherwise the
>>> syntax tree returned by treesit-parse-string is unsafe to use.
>> 
>> I see, you’re absolutely right, thanks for the analysis! On top of that, I need to make sure all the treesit functions check for buffer liveness before accessing the buffer. I was under the impression that a killed buffer would keep its content around until it’s collected. Turns out that isn’t the case.
>> 
>> Yuan
> 
> Pushed the fix to emacs-29.
> 
> Yuan
> 

The fix works for me, so I’m closing this report. Feel free to follow up if new problems occur :-)

Yuan
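
[Editorial sketch, not part of the original thread] The hazard described above can be illustrated without tree-sitter: with-temp-buffer kills its buffer on exit, yet a reference to that buffer (here a variable, in the report a node's parser) can outlive the form. The following is a minimal sketch under that assumption. `my-demo-buffer-text' is a hypothetical helper invented for this example; it is not the fix that was pushed, and the actual changes to treesit.c are not shown in this report.

```elisp
;; Hypothetical helper (not part of Emacs): read a buffer's text only
;; if the buffer is still live, mirroring the "check liveness before
;; accessing the buffer" idea from the thread.
(defun my-demo-buffer-text (buffer)
  "Return the text of BUFFER, or nil if BUFFER has been killed."
  (when (buffer-live-p buffer)
    (with-current-buffer buffer
      (buffer-string))))

(let (leaked)
  ;; `with-temp-buffer' kills its buffer when the form exits, but the
  ;; buffer object can still escape through a variable.
  (with-temp-buffer
    (insert "int x;")
    (setq leaked (current-buffer))
    (message "inside:  live=%S text=%S"
             (buffer-live-p leaked) (my-demo-buffer-text leaked)))
  ;; Here the buffer has been killed; the guard returns nil instead of
  ;; touching the dead buffer.
  (message "outside: live=%S text=%S"
           (buffer-live-p leaked) (my-demo-buffer-text leaked)))
```

Without the `buffer-live-p' guard, `with-current-buffer' on the killed buffer would signal an error at the Lisp level. The crash in this report happened in C code, which reached the killed buffer's positions without such a check.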



This bug report was last modified 323 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.