From: bug#59637 <59637@debbugs.gnu.org>
To: bug#59637 <59637@debbugs.gnu.org>
Subject: Status: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
Date: Sun, 15 Jun 2025 20:00:02 +0000

retitle 59637 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
reassign 59637 emacs
submitter 59637 miha@kamnitnik.top
severity 59637 normal
thanks

From: miha@kamnitnik.top
To: bug-gnu-emacs@gnu.org
Subject: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
Date: Sun, 27 Nov 2022 18:12:42 +0100
Message-ID: <87v8n0b9th.fsf@miha-pc>

As far as I understand, the current behaviour of
treesit-parser-set-included-ranges is that the concatenation of text
from different regions in the same range set is considered as one
program. This means that for this html program

treesitter would consider "alert('hello');" to be inside a comment and
the second script tag would contain an error about missing comment
end.

However, testing this in Firefox, it seems that the first script tag is
the erroneous one here and the alert function call isn't inside a
comment. So I guess the correct way to parse this html document would be
to have two instances of javascript parser, one for each region. On the
other hand, we should consider if this is worth the added complexity and
performance degradation.

Thanks and best regards.
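The difference between one shared parser and one parser per region can be illustrated with a toy model. The sketch below is not tree-sitter or Emacs code: `scan_comment_spans` and `is_commented` are hypothetical helpers that only understand `/* ... */` comments (no strings, no `//` comments), and the two script bodies are an assumed stand-in for the kind of example the report describes, namely an unterminated comment in the first script and `alert('hello');` in the second.

```python
def scan_comment_spans(source: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of /* ... */ comments in `source`.

    Deliberately naive: an unterminated comment runs to the end of the
    text, which mimics how a comment opened in one region can swallow
    the text of the next region once the regions are concatenated.
    """
    spans = []
    i = 0
    while True:
        start = source.find("/*", i)
        if start == -1:
            return spans
        close = source.find("*/", start + 2)
        end = len(source) if close == -1 else close + 2
        spans.append((start, end))
        i = end


def is_commented(source: str, needle: str) -> bool:
    """True if the first occurrence of `needle` lies inside a comment."""
    pos = source.index(needle)
    return any(start <= pos < end for start, end in scan_comment_spans(source))


# Assumed contents of the two <script> regions (not taken from the report).
regions = ["/* never closed\n", "alert('hello');\n"]

# One parser for the whole range set: the regions are read as one program.
shared = is_commented("".join(regions), "alert")

# One parser per region: the second script is read on its own.
separate = is_commented(regions[1], "alert")

print(shared, separate)
```

With the shared model the call ends up inside the comment opened by the first region; with separate models it does not, which matches what the report observed in Firefox.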

From: Stefan Kangas
To: miha@kamnitnik.top, 59637@debbugs.gnu.org
Cc: Yuan Fu
Subject: Re: bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
Date: Sun, 27 Nov 2022 09:28:56 -0800
In-Reply-To: <87v8n0b9th.fsf@miha-pc>

miha--- via "Bug reports for GNU Emacs, the Swiss army knife of text
editors" writes:

> As far as I understand, the current behaviour of
> treesit-parser-set-included-ranges is that the concatenation of text
> from different regions in the same range set is considered as one
> program. This means that for this html program
>
>
> treesitter would consider "alert('hello');" to be inside a comment and
> the second script tag would contain an error about missing comment
> end.
>
> However, testing this in Firefox, it seems that the first script tag is
> the erroneous one here and the alert function call isn't inside a
> comment. So I guess the correct way to parse this html document would be
> to have two instances of javascript parser, one for each region. On the
> other hand, we should consider if this is worth the added complexity and
> performance degradation.
>
> Thanks and best regards.

Copying in Yuan Fu.

From: Yuan Fu
To: Stefan Kangas
Cc: 59637@debbugs.gnu.org, miha@kamnitnik.top
Subject: Re: bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
Date: Mon, 28 Nov 2022 14:51:30 -0800

Stefan Kangas writes:

> miha--- via "Bug reports for GNU Emacs, the Swiss army knife of text
> editors" writes:
>
>> As far as I understand, the current behaviour of
>> treesit-parser-set-included-ranges is that the concatenation of text
>> from different regions in the same range set is considered as one
>> program. This means that for this html program
>>
>>
>> treesitter would consider "alert('hello');" to be inside a comment and
>> the second script tag would contain an error about missing comment
>> end.
>>
>> However, testing this in Firefox, it seems that the first script tag is
>> the erroneous one here and the alert function call isn't inside a
>> comment. So I guess the correct way to parse this html document would be
>> to have two instances of javascript parser, one for each region. On the
>> other hand, we should consider if this is worth the added complexity and
>> performance degradation.
>>
>> Thanks and best regards.

Yeah, it makes sense, but as you say the isolation comes at a cost, and
I don’t know if it can be justified right now, because of the complexity
of assigning a different parser to each range, which can appear or
disappear as the user edits the buffer. Plus, the current framework more
or less assumes one parser per language, so we would need some
non-trivial changes to make "one parser per range" work smoothly.

For now, I think it’s best to just turn off error highlighting and rely
on tree-sitter’s error recovery. I think that’s what everybody else
does. In the future, if we make the framework more flexible and make
"one parser per range" easier to implement, we can try adding support
for it.

>
> Copying in Yuan Fu.

Thanks :-)

Yuan
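The bookkeeping concern raised in the reply above (ranges appear and disappear as the buffer is edited, so parsers have to be matched to ranges again after every change) can be sketched in a few lines. This is only an illustration under assumptions, not how Emacs or tree-sitter handles ranges: `reconcile` is a hypothetical helper, plain integers stand in for parser objects, and matching old and new ranges by overlap is just one possible heuristic.

```python
from itertools import count
from typing import Iterator

Range = tuple[int, int]  # (start, end) buffer positions of one embedded region


def reconcile(old: dict[Range, int], new_ranges: list[Range],
              fresh_ids: Iterator[int]) -> dict[Range, int]:
    """Assign a parser id to every range in `new_ranges`.

    A new range reuses the id of the first not-yet-claimed old range it
    overlaps; otherwise it gets a fresh id. Ids of old ranges that no
    new range claims are dropped (their parsers would be discarded).
    """
    unused = dict(old)
    result: dict[Range, int] = {}
    for rng in new_ranges:
        match = next(
            (o for o in list(unused) if o[0] < rng[1] and rng[0] < o[1]),
            None,
        )
        result[rng] = unused.pop(match) if match is not None else next(fresh_ids)
    return result


ids = count()  # hypothetical source of parser ids

# Two script regions to start with.
table = reconcile({}, [(10, 30), (50, 70)], ids)

# The user types inside the first region and adds a third one.
after_insert = reconcile(table, [(10, 35), (55, 75), (90, 110)], ids)

# The user deletes the first and third regions.
after_delete = reconcile(after_insert, [(55, 75)], ids)

print(table, after_insert, after_delete)
```

Even this small version has weak spots: an edit that shifts a region far enough that its old and new positions no longer overlap would lose the association with its parser. That is the kind of detail that makes "one parser per range" a non-trivial change.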