GNU bug report logs - #64017
Wrong conversion from Emacs to Tree-sitter S-expression syntax


Package: emacs

Reported by: Mattias Engdegård <mattias.engdegard <at> gmail.com>

Date: Mon, 12 Jun 2023 14:15:01 UTC

Severity: normal

Done: Mattias Engdegård <mattias.engdegard <at> gmail.com>

Bug is archived. No further changes may be made.



Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Basil Contovounesios <contovob <at> tcd.ie>,
 Bug Report Emacs <bug-gnu-emacs <at> gnu.org>
Subject: Re: Wrong conversion from Emacs to Tree-sitter S-expression syntax
Date: Thu, 15 Jun 2023 15:08:26 -0700

Thanks for catching this.

> On Jun 12, 2023, at 7:14 AM, Mattias Engdegård <mattias.engdegard <at> gmail.com> wrote:
> 
> `treesit-pattern-expand` converts a query pattern into tree-sitter S-expression syntax, as a string. The conversion mainly converts certain keywords but the main problem is that it prints strings in Emacs syntax which differs from that of tree-sitter.
> 
> As a consequence, :match regexps cannot contain newlines:
> 
> (treesit-query-capture
> 'java
> '(((identifier) @font-lock-constant-face
>    (:match "hello\n" @font-lock-constant-face))))
> 
> signals a syntax error.
> 
> As far as I can tell the tree-sitter string syntax allows for the escape sequences:
> 
> \n = LF
> \r = CR
> \t = TAB
> \0 = NUL  (only a single 0 -- no octal escapes!)
> \X = the character X itself
> 
> Unescaped newlines result in a syntax error, as seen in the example above. NULs don't seem to go well either.
> 
> At the very least, the conversion should avoid literal newlines and NULs in the result (and probably CR and TAB). This cannot be done with a straight prin1-to-string.
> 
> (By the way, why is the conversion written in C? Was Lisp too slow?)

Because I wasn't sure if it’s ok for C functions to rely on Lisp functions, plus the function is simple enough. Right now if one doesn’t load treesit.el, all the C functions work fine.

> 
> Ideally we should not need to expose the tree-sitter s-exp query syntax at all. Surely Emacs s-exps should be preferable in every case?
> 

It shouldn’t hurt to expose the tree-sitter sexp. Other editors mainly use the string syntax.

Yuan
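
The escaping requirement from the quoted report can be sketched in standalone C. This is only an illustration, assuming the escape table listed in the report (\n, \r, \t, \0, and \X for the character X itself) is accurate. The helper name `ts_quote_string` and its buffer-based interface are hypothetical; they are not part of Emacs's treesit.c or of the tree-sitter API, and this is not the fix that was actually committed.

```c
#include <stddef.h>

/* Hypothetical helper (not an Emacs or tree-sitter function).
   Write the LEN bytes at SRC into DST as a double-quoted string that
   uses only the escapes listed in the report: \n, \r, \t, \0, and a
   backslash before '"' and '\\'.  All other bytes are copied as-is.
   Return the number of bytes written, not counting the terminating
   NUL, or -1 if DST (CAP bytes) is too small.  */
static long
ts_quote_string (const char *src, size_t len, char *dst, size_t cap)
{
  size_t o = 0;

  /* Need room for both quotes and the terminator at minimum.  */
  if (cap < 3)
    return -1;
  dst[o++] = '"';

  for (size_t i = 0; i < len; i++)
    {
      char esc = 0;
      switch (src[i])
        {
        case '\n': esc = 'n'; break;   /* literal LF is a syntax error */
        case '\r': esc = 'r'; break;
        case '\t': esc = 't'; break;
        case '\0': esc = '0'; break;   /* single 0, no octal escapes */
        case '"':  esc = '"'; break;
        case '\\': esc = '\\'; break;
        }

      size_t need = esc ? 2 : 1;
      /* Keep two bytes in reserve for the closing quote and NUL.  */
      if (o + need + 2 > cap)
        return -1;

      if (esc)
        {
          dst[o++] = '\\';
          dst[o++] = esc;
        }
      else
        dst[o++] = src[i];
    }

  dst[o++] = '"';
  dst[o] = '\0';
  return (long) o;
}
```

Passing the length explicitly, rather than relying on a terminating NUL, lets the input itself contain NUL bytes, which is one of the cases the report calls out. For the six bytes of the regexp in the example above (`hello` followed by a newline), the output is the nine characters `"hello\n"` with no literal newline in it.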



This bug report was last modified 2 years and 32 days ago.


