From: bug#57245 <57245@debbugs.gnu.org>
To: bug#57245 <57245@debbugs.gnu.org>
Subject: Status: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 17 Jun 2025 08:32:55 +0000

retitle 57245 29.0.50; M-> in a large XML file (without long lines) is slow
reassign 57245 emacs
submitter 57245 Dmitry Gutov
severity 57245 normal
thanks


From: Dmitry Gutov
To: bug-gnu-emacs@gnu.org
Subject: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 17:33:58 +0300
Message-ID: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru>

Branching this off from the discussion in bug#56682.

Prerequisite: Have an XML file that is 20 MB in size, and doesn't have
long lines.

Or follow steps 1-3 to create one.

1. wget -O large-file.xml
   https://updates.drupal.org/release-history/drupal/current
2. M-% /> RET ^J/> RET (to break up the long line into smaller pieces)
3. Select the contents of the file and copy them over and over for 99
   times. Alternatively, copy them 9 times, then select the result, and
   copy it 9 times as well. Save the buffer.

   (To try to keep XML valid -- not sure if necessary -- you can only
   perform the copying operation on the contents of the tag. But
   that's probably not important. I did that, though.)

4. Kill the buffer and re-visit it again. Press M->.
5. Note the delay.

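Steps 2 and 3 of the recipe can also be done outside Emacs. The following Python sketch is one possible way to do it, not part of the original report: the helper names (`break_long_lines`, `repeat_root_contents`, `longest_line`) are made up for illustration, and it assumes the first start tag in the file is the root element and the last `</` begins its end tag.

```python
import re


def break_long_lines(text: str) -> str:
    """Step 2: insert a newline before every "/>" (same as M-% /> RET ^J/> RET)."""
    return text.replace("/>", "\n/>")


def repeat_root_contents(text: str, copies: int) -> str:
    """Step 3: repeat only what sits between the root start and end tags,
    so the result stays a single document with one root element."""
    start = re.search(r"<[A-Za-z_][^>]*>", text)  # assumed: first start tag is the root
    end = text.rfind("</")                        # assumed: last end tag closes the root
    if start is None or end < start.end():
        raise ValueError("no root element with content found")
    head, body, tail = text[:start.end()], text[start.end():end], text[end:]
    return head + body * copies + tail


def longest_line(text: str) -> int:
    """Length of the longest line, to confirm the file has no long lines."""
    return max((len(line) for line in text.splitlines()), default=0)


if __name__ == "__main__":
    # Tiny stand-in for the downloaded file; the real input would be large-file.xml.
    sample = '<?xml version="1.0"?>\n<project><release a="1"/><release a="2"/></project>\n'
    big = repeat_root_contents(break_long_lines(sample), copies=100)
    print(len(big), "characters, longest line:", longest_line(big))
```

To reproduce the report, read the downloaded file instead of `sample`, write the result back to disk, and check that the output is around 20 MB; 100 copies corresponds to the original plus the 99 extra copies in step 3.
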
Here's the profiler output:

1397  95% - command-execute
1397  95%  - call-interactively
1338  91%   - funcall-interactively
1331  90%    - end-of-buffer
1327  90%     - recenter
1327  90%      - jit-lock-function
1327  90%       - jit-lock-fontify-now
1327  90%        - jit-lock--run-functions
1327  90%         - run-hook-wrapped
1327  90%          - #
1327  90%           - font-lock-fontify-region
1327  90%            - font-lock-default-fontify-region
1327  90%             - nxml-extend-region
 845  57%              - skip-syntax-forward
 845  57%               - internal--syntax-propertize
 845  57%                - syntax-propertize
 845  57%                 - nxml-syntax-propertize
 845  57%                  - sgml-syntax-propertize
 842  57%                   - #
 479  32%                      sgml--syntax-propertize-ppss
   3   0%                      syntax-ppss
 482  32%              - nxml-move-outside-backwards
 482  32%               - nxml-inside-start
 482  32%                  syntax-ppss
   7   0%    + execute-extended-command
  59   4%   + byte-code
  59   4% + ...
  10   0% + timer-event-handler

In GNU Emacs 29.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 3.24.20,
cairo version 1.16.0) of 2022-08-16 built on potemkin
Repository revision: 81ff64d3ca8d6e43e976f209399d2a0e9b4a7dd8
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Ubuntu 20.04.4 LTS


From: Eli Zaretskii
To: Dmitry Gutov, Stefan Monnier
Cc: 57245@debbugs.gnu.org
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 19:54:54 +0300
Message-Id: <83tu6cdt7l.fsf@gnu.org>
In-Reply-To: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru>

> Date: Tue, 16 Aug 2022 17:33:58 +0300
> From: Dmitry Gutov
>
> Prerequisite: Have an XML file that is 20 MB in size, and doesn't have
> long lines.
>
> Or follow steps 1-3 to create one.
>
> 1. wget -O large-file.xml
>    https://updates.drupal.org/release-history/drupal/current
> 2. M-% /> RET ^J/> RET (to break up the long line into smaller pieces)
> 3. Select the contents of the file and copy them over and over for 99
>    times. Alternatively, copy them 9 times, then select the result, and
>    copy it 9 times as well. Save the buffer.
>
>    (To try to keep XML valid -- not sure if necessary -- you can only
>    perform the copying operation on the contents of the tag. But
>    that's probably not important. I did that, though.)
>
> 4. Kill the buffer and re-visit it again. Press M->.
> 5. Note the delay.
>
> Here's the profiler output:
>
> 1397  95% - command-execute
> 1397  95%  - call-interactively
> 1338  91%   - funcall-interactively
> 1331  90%    - end-of-buffer
> 1327  90%     - recenter
> 1327  90%      - jit-lock-function
> 1327  90%       - jit-lock-fontify-now
> 1327  90%        - jit-lock--run-functions
> 1327  90%         - run-hook-wrapped
> 1327  90%          - #
> 1327  90%           - font-lock-fontify-region
> 1327  90%            - font-lock-default-fontify-region
> 1327  90%             - nxml-extend-region
>  845  57%              - skip-syntax-forward
>  845  57%               - internal--syntax-propertize
>  845  57%                - syntax-propertize
>  845  57%                 - nxml-syntax-propertize
>  845  57%                  - sgml-syntax-propertize
>  842  57%                   - #
>  479  32%                      sgml--syntax-propertize-ppss
>    3   0%                      syntax-ppss
>  482  32%              - nxml-move-outside-backwards
>  482  32%               - nxml-inside-start
>  482  32%                  syntax-ppss
>    7   0%    + execute-extended-command
>   59   4%   + byte-code
>   59   4% + ...
>   10   0% + timer-event-handler

Thanks. It looks like some problem in nXML mode or in syntax.c or in
how nXML uses the syntax stuff. Maybe the code there is simply not
scalable.

Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
here?

What the above profile doesn't show is that this code creates
tons of garbage, so GC is called a lot, and adds its share of
slowdown.

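The sample counts in the quoted profile can be cross-checked with a little arithmetic. The Python sketch below makes two assumptions that are read off the figures rather than stated in the thread: that `command-execute`, `...` and `timer-event-handler` are the three top-level entries of the report, and that the profiler truncates percentages to whole numbers.

```python
# Sample counts copied from the profiler report above.
TOP_LEVEL = {"command-execute": 1397, "...": 59, "timer-event-handler": 10}
NXML_EXTEND_REGION = 1327
CHILDREN = {"skip-syntax-forward": 845, "nxml-move-outside-backwards": 482}

# Assumption: the top-level entries together cover every sample taken.
TOTAL = sum(TOP_LEVEL.values())


def percent(count: int, total: int = TOTAL) -> int:
    """Share of the total samples, truncated to a whole percent."""
    return count * 100 // total


if __name__ == "__main__":
    print("total samples:", TOTAL)
    for name, count in {**TOP_LEVEL, **CHILDREN}.items():
        print(f"{count:5d} {percent(count):3d}% {name}")
    # The two branches below nxml-extend-region account for all of its samples.
    print("children sum:", sum(CHILDREN.values()), "of", NXML_EXTEND_REGION)
```

Under these assumptions the computed percentages match the report, the two branches under `nxml-extend-region` (845 and 482 samples) add up to its 1327 samples, and everything outside `command-execute` comes to 69 samples, a little under 5% of the total.
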

From: Stefan Monnier
To: Eli Zaretskii
Cc: 57245@debbugs.gnu.org, Dmitry Gutov
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 14:40:33 -0400
In-Reply-To: <83tu6cdt7l.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 16 Aug 2022 19:54:54 +0300")

> Thanks. It looks like some problem in nXML mode or in syntax.c or in
> how nXML uses the syntax stuff. Maybe the code there is simply not
> scalable.
>
> Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
> here?

IIRC the sgml/xml/nxml code for syntax-propertize is fairly costly,
indeed. It can probably be reimplemented in a more efficient way, but
someone will/would have to sit down and think about how to do that.

[ Some `syntax-propertize-function`s (like cperl-mode's) have been
  "hacked" in an unsatisfactory way by taking
  existing-but-not-fully-understood code and wrapping it so as to make
  it usable for `syntax-propertize-function`. The result works but it
  could suffer from inefficiencies due to the fact that it's used in
  a different context from the one for which it was designed.
  Some of nXML's code suffers from similar issues, so I thought maybe
  that would be part of the problem, but AFAICT those don't have any
  impact in this case. ]

IOW, I think the issue is that our syntax tables aren't a good fit for
SGML's syntax, so we just have to work harder than for other modes.

> What the above profile doesn't show is that this code creates
> tons of garbage, so GC is called a lot, and adds its share of
> slowdown.

Hmm... I wonder where this might come from. As you say, the profile
doesn't show it: 95% in command-execute suggests less than 6% spent in
the GC.


        Stefan


From: Eli Zaretskii
To: Stefan Monnier
Cc: 57245@debbugs.gnu.org, dgutov@yandex.ru
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 21:59:36 +0300
Message-Id: <83k078dnfr.fsf@gnu.org>
In-Reply-To: (message from Stefan Monnier on Tue, 16 Aug 2022 14:40:33 -0400)

> From: Stefan Monnier
> Cc: Dmitry Gutov, 57245@debbugs.gnu.org
> Date: Tue, 16 Aug 2022 14:40:33 -0400
>
> > What the above profile doesn't show is that this code creates
> > tons of garbage, so GC is called a lot, and adds its share of
> > slowdown.
>
> Hmm... I wonder where this might come from.

Run the recipe with a watchpoint on consing_until_gc, and you will
see.


From: Dmitry Gutov
To: Eli Zaretskii, Stefan Monnier
Cc: 57245@debbugs.gnu.org
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 22:32:23 +0300
Message-ID: <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru>
In-Reply-To: <83tu6cdt7l.fsf@gnu.org>

On 16.08.2022 19:54, Eli Zaretskii wrote:
> Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
> here?

nxml-syntax-propertize might well be heavier than average, but the delay
scales linearly with the size of the file.

Which seems to be exactly the behavior the "font-lock narrowing" was
supposed to guard from?


From: Stefan Monnier
To: Dmitry Gutov
Cc: Eli Zaretskii, 57245@debbugs.gnu.org
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 16:22:03 -0400
In-Reply-To: <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> (Dmitry Gutov's message of "Tue, 16 Aug 2022 22:32:23 +0300")

Dmitry Gutov [2022-08-16 22:32:23] wrote:
> On 16.08.2022 19:54, Eli Zaretskii wrote:
>> Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
>> here?
> nxml-syntax-propertize might well be heavier than average, but the delay
> scales linearly with the size of the file.

Indeed, it should be linear.

> Which seems to be exactly the behavior the "font-lock narrowing" was
> supposed to guard from?

Not sure which narrowing you're referring to.
The "locked narrowing" introduced by Gregory is only installed in the
presence of long lines. It's (currently) not used for large files
(unless they contain long lines, that is).


        Stefan


From: Dmitry Gutov
To: Stefan Monnier
Cc: Eli Zaretskii, 57245@debbugs.gnu.org
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 23:49:38 +0300
Message-ID: <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru>

On 16.08.2022 23:22, Stefan Monnier via Bug reports for GNU Emacs, the
Swiss army knife of text editors wrote:
> Dmitry Gutov [2022-08-16 22:32:23] wrote:
>> On 16.08.2022 19:54, Eli Zaretskii wrote:
>>> Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
>>> here?
>> nxml-syntax-propertize might well be heavier than average, but the delay
>> scales linearly with the size of the file.
> Indeed, it should be linear.
>
>> Which seems to be exactly the behavior the "font-lock narrowing" was
>> supposed to guard from?
> Not sure which narrowing you're referring to.
> The "locked narrowing" introduced by Gregory is only installed in the
> presence of long lines. It's (currently) not used for large files
> (unless they contain long lines, that is).

I guess that's the problem here.

The font-lock narrowing (if it's indeed the method we're going to use to
speed up its performance) shouldn't be conditioned on the presence of
long lines.


From: Stefan Monnier
To: Dmitry Gutov
Cc: Eli Zaretskii, 57245@debbugs.gnu.org
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
Date: Tue, 16 Aug 2022 17:45:13 -0400
In-Reply-To: <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> (Dmitry Gutov's message of "Tue, 16 Aug 2022 23:49:38 +0300")

> The font-lock narrowing (if it's indeed the method we're going to use to
> speed up its performance) shouldn't be conditioned on the presence of
> long lines.

font-lock does suffer from long lines, so the current code's handling of
font-lock makes some sense. But indeed, we all agree it's not
sufficient because it only handles the long lines problem, and we still
need to tackle the case of large buffers, which is related but different.

Stefan From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 16 18:20:48 2022 Received: (at 57245) by debbugs.gnu.org; 16 Aug 2022 22:20:48 +0000 Received: from localhost ([127.0.0.1]:48569 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oO4vY-0003JP-HL for submit@debbugs.gnu.org; Tue, 16 Aug 2022 18:20:48 -0400 Received: from mail-wm1-f47.google.com ([209.85.128.47]:35641) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oO4vX-0003JA-1G for 57245@debbugs.gnu.org; Tue, 16 Aug 2022 18:20:48 -0400 Received: by mail-wm1-f47.google.com with SMTP id m17-20020a7bce11000000b003a5bedec07bso117108wmc.0 for <57245@debbugs.gnu.org>; Tue, 16 Aug 2022 15:20:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:from:to:cc; bh=359eLHfDR/fOy1pDczg6c0O2ibZ7lQx+8FYRb6NM+bk=; b=J0jx/BBPsL3ecQ3j66HZfGV8YEF8XczfdnjXO/O4RYbfA+MDPdL5l0OwaOGNSsUt0i tfq8kpuDkkimoLa0ggbD/O15gtZyxy5M4ISMmqj+hcAZSipwmvx41oqjamsAT/aTrYIo MQTzXUle84Zob2n775ajGKcD7HnV2cxwtbimjZLOHRpMrrCgWxHpjuMzmT2C5GRNA41S Q89fpp5zuuGU61gcstX1RwpnH5icQ3+07NTM5U5MaPXCeZS9tzqZwmPw+lqu7CJSHSKN tP+Bgu3o814YlE0Klvo6yCM9II12ZxBq2g375HgGHi5gVM7RqoIgHLg9JyBJLiCJ5OH+ hh7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:x-gm-message-state:from:to:cc; bh=359eLHfDR/fOy1pDczg6c0O2ibZ7lQx+8FYRb6NM+bk=; b=C+Y++4ASYQFrSdjpr/YDymxyUP8DVQ5ZkewWus/hRvGmhISSe0maE0gvT8hBlMBk9w vvYM0QdySHfpd+zyMBtv2U3mVmIHBMXUKd17KC3Kn98BhenCZ+RPKAunEZwTKOoifTjp aiKuIlyn51oCxn6Ng/jNYFQBQaa2qfHVMhcjMsg9Jtg1X5/1YiKBOHdp7UdSYlJlUhut /3r5ZGIPKDpJigX70017526PzWVgogQyHgPJd+oZK0ti2TuD71q0tiBNs5mSVGLqpMmx UggJt8f56nRpYZZoMrBkb8WE/0J296C1ekisXvm/5p41PyylxrZ0WgcVUrqWQWOVGZUl 
9u6Q== X-Gm-Message-State: ACgBeo1YqmAKkxoKVioFG1292FeapnVaWWTByLXih1FZDhJRDDWrrgbQ pT4E4ctv1BQrFG0ar5zLD/A= X-Google-Smtp-Source: AA6agR63Q2xtWfWaES2PqkgknSxdIAsgClI6MqZcTQt6swdHJ+5knBg+VyuGIA1lTWCGfFqo4iTbfA== X-Received: by 2002:a05:600c:4e4c:b0:3a5:eb9b:b489 with SMTP id e12-20020a05600c4e4c00b003a5eb9bb489mr307986wmq.56.1660688441175; Tue, 16 Aug 2022 15:20:41 -0700 (PDT) Received: from [192.168.0.6] ([46.251.119.176]) by smtp.googlemail.com with ESMTPSA id 130-20020a1c0288000000b003a5bd9448e5sm102415wmc.28.2022.08.16.15.20.39 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 16 Aug 2022 15:20:40 -0700 (PDT) Message-ID: <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> Date: Wed, 17 Aug 2022 01:20:38 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow Content-Language: en-US To: Stefan Monnier References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> From: Dmitry Gutov In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 57245 Cc: Eli Zaretskii , 57245@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.5 (/) On 17.08.2022 00:45, Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors wrote: >> The font-lock narrowing (if it's indeed the method we're going to use to >> speed up its performance) shouldn't be conditioned on the presence of >> long lines. > font-lock does suffer from long lines Perhaps with when some specific rules are used? 
Like MATCH-ANCHORED, one instance of which I deleted from js-mode a few days ago. Otherwise, syntax-wholeline-max seems to be doing its job fine: if I comment out the narrowing code in handle_fontified_prop (or switch to the branch I posted previously), two XML files -- one with long lines and one without (the files differ only by addition of newlines) -- show approximately the same delay on M->. From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 07:24:19 2022 Received: (at 57245) by debbugs.gnu.org; 17 Aug 2022 11:24:20 +0000 Received: from localhost ([127.0.0.1]:49610 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOH9n-0004rE-KV for submit@debbugs.gnu.org; Wed, 17 Aug 2022 07:24:19 -0400 Received: from eggs.gnu.org ([209.51.188.92]:48454) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOH9l-0004r2-Ih for 57245@debbugs.gnu.org; Wed, 17 Aug 2022 07:24:19 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:54968) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oOH9f-0001bz-Sh; Wed, 17 Aug 2022 07:24:11 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=r2UkScDG6g24XgwhOBEeB71yYLDiWxazZF7jPWqsXOc=; b=OnPrqnG+N22A 1YHUH5BG0zw0kGt2h34o7n6XcjjZawBcVT5fsq9PMyTNx8NXYx3hfrqT/2guy/I4NWS5iJCnO3rmg K98TazfHB+iihyFnvsRTELm+rWMJelvB9hSiaDC790RONzvQJzNN3usXs7wBw2PEApMXCXkIb+prZ 22I97plpYMefR2SaKl0y40LpACASyxJaw5/ZnSOEALsbIHOCxvb+wYdfbDm7FCveUTIkGrfMcal22 0Y1Y92SumenOxOwHVjlECXvNgaxIgUTlxcI2aia+kURJxcXAJuR9r+TEwr5C6vjw65C7qyb7W5bs/ L/v6EO+2K69wI1FtwCNIxQ==; Received: from [87.69.77.57] (port=3329 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oOH9f-0005ps-7Q; Wed, 17 Aug 2022 07:24:11 -0400 Date: Wed, 17 Aug 2022 14:24:01 +0300 Message-Id: 
<83fshvdsfi.fsf@gnu.org> From: Eli Zaretskii To: Dmitry Gutov In-Reply-To: <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> (message from Dmitry Gutov on Tue, 16 Aug 2022 22:32:23 +0300) Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 57245 Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Date: Tue, 16 Aug 2022 22:32:23 +0300 > Cc: 57245@debbugs.gnu.org > From: Dmitry Gutov > > On 16.08.2022 19:54, Eli Zaretskii wrote: > > Stefan, can you see why syntax-related stuff in sgml-mode is so heavy > > here? > > nxml-syntax-propertize might well be heavier than average, but the delay > scales linearly with the size of the file. Which is generally not a good scaling factor, especially if the coefficient is quite large (as it seems to be in this case). > Which seems to be exactly the behavior the "font-lock narrowing" was > supposed to guard from? No. It wasn't supposed to fix modes that foolishly scan the buffer from BOB to point. It was supposed to fix modes which scan from the beginning of line, and that is (a) only a problem when lines are very long, and (b) much harder to solve in the mode itself, because font-lock very frequently uses anchored regexps and otherwise likes to start from BOL, and syntax processing also likes starting from BOL. Btw, does nXML and/or sgml-mode use libxml for their analysis? If not, why not? wouldn't that be faster (and possibly more accurate)? 
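
For illustration only (this is not part of the original message): a
rough Python sketch of the distinction drawn above between scanning
from BOB and scanning from BOL.  The function names and the two sample
strings are made up for the example, and counting characters is only a
stand-in for real parsing cost; nothing here reflects how font-lock or
syntax-propertize is actually implemented.

```python
def scan_cost_from_bob(text: str, point: int) -> int:
    """Characters visited when scanning from beginning of buffer to point."""
    return len(text[:point])


def scan_cost_from_bol(text: str, point: int) -> int:
    """Characters visited when scanning from the start of point's line."""
    # rfind returns -1 when no newline precedes point, so bol becomes 0.
    bol = text.rfind("\n", 0, point) + 1
    return point - bol


# Same content, once as a single long line and once with a newline per tag.
long_lines = "<a/>" * 1000
short_lines = "<a/>\n" * 1000

for name, text in (("long", long_lines), ("short", short_lines)):
    point = len(text.rstrip("\n"))  # position just after the last tag
    print(name, scan_cost_from_bob(text, point), scan_cost_from_bol(text, point))
```

This prints "long 4000 4000" and "short 4999 4": breaking the lines
shrinks the BOL-anchored cost to one line's length, while the
BOB-anchored cost stays proportional to the buffer size either way.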

From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 07:37:23 2022
Date: Wed, 17 Aug 2022 14:36:54 +0300
Message-Id: <83bksjdru1.fsf@gnu.org>
From: Eli Zaretskii
To: Dmitry Gutov
Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
In-Reply-To: <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> (message from Dmitry Gutov on Wed, 17 Aug 2022 01:20:38 +0300)
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru>

> Date: Wed, 17 Aug 2022 01:20:38 +0300
> Cc: Eli Zaretskii, 57245@debbugs.gnu.org
> From: Dmitry Gutov
>
> Otherwise, syntax-wholeline-max seems to be doing its job fine: if I
> comment out the narrowing code in handle_fontified_prop (or switch to
> the branch I posted previously), two XML files -- one with long lines
> and one without (the files differ only by addition of newlines) -- show
> approximately the same delay on M->.

Doesn't syntax-wholeline-max only affect long lines?  Because I don't
think I see its effect in the XML file where lines were broken by
newlines, and then the file was duplicated 100 times.
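
As a hedged aside, the kind of test pair discussed in this thread (the
same XML content with and without newlines) could be generated along
these lines.  This is a sketch, not the files actually used in the
report; the element names, record count and file names are all
assumptions.  Size is controlled by a record count rather than by
duplicating the file, so the output keeps a single root element.

```python
import os
from typing import Tuple


def make_xml(records: int, one_per_line: bool) -> str:
    """Return an XML document with `records` <item> elements.

    The two variants differ only in whether a newline follows each tag.
    """
    sep = "\n" if one_per_line else ""
    items = sep.join(f'<item id="{i}">value {i}</item>' for i in range(records))
    return f"<root>{sep}{items}{sep}</root>\n"


def write_pair(directory: str, records: int = 100_000) -> Tuple[str, str]:
    """Write both variants into `directory` and return their paths."""
    paths = []
    # Hypothetical file names, chosen only for this example.
    for name, one_per_line in (("long-lines.xml", False), ("short-lines.xml", True)):
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write(make_xml(records, one_per_line))
        paths.append(path)
    return paths[0], paths[1]
```

Calling write_pair on a scratch directory gives two files that can be
opened in Emacs to compare the delay of M-> with and without long
lines; raising the record count grows both files by the same content.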

From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 07:46:59 2022
Date: Wed, 17 Aug 2022 14:46:46 +0300
From: Dmitry Gutov
To: Eli Zaretskii
Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
In-Reply-To: <83bksjdru1.fsf@gnu.org>
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> <83bksjdru1.fsf@gnu.org>

On 17.08.2022 14:36, Eli Zaretskii wrote:
>> Date: Wed, 17 Aug 2022 01:20:38 +0300
>> Cc: Eli Zaretskii, 57245@debbugs.gnu.org
>> From: Dmitry Gutov
>>
>> Otherwise, syntax-wholeline-max seems to be doing its job fine: if I
>> comment out the narrowing code in handle_fontified_prop (or switch to
>> the branch I posted previously), two XML files -- one with long lines
>> and one without (the files differ only by addition of newlines) -- show
>> approximately the same delay on M->.
>
> Doesn't syntax-wholeline-max only affect long lines?  Because I don't
> think I see its effect in the XML file where lines were broken by
> newlines, and then the file was duplicated 100 times.

Its purpose is to handle the slowdown which occurred specifically on
long lines because of
font-lock-extend-region-functions/syntax-propertize-extend-region-functions.
Now that it works -- I don't see any particular slowdowns on long lines,
even with narrowing disabled.

And the performance of M-> depends solely on the size of a file. In my
XML test files, at least.

From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:14:18 2022
Date: Wed, 17 Aug 2022 15:14:07 +0300
From: Dmitry Gutov
To: Eli Zaretskii
Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
In-Reply-To: <83fshvdsfi.fsf@gnu.org>
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <83fshvdsfi.fsf@gnu.org>

On 17.08.2022 14:24, Eli Zaretskii wrote:
>> Date: Tue, 16 Aug 2022 22:32:23 +0300
>> Cc: 57245@debbugs.gnu.org
>> From: Dmitry Gutov
>>
>> On 16.08.2022 19:54, Eli Zaretskii wrote:
>>> Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
>>> here?
>>
>> nxml-syntax-propertize might well be heavier than average, but the delay
>> scales linearly with the size of the file.
>
> Which is generally not a good scaling factor, especially if the
> coefficient is quite large (as it seems to be in this case).

Someone can work on the coefficient, but any accurate parser has to scan
the buffer from the beginning. At least once.

Migration to tree-sitter might give us a better coefficient later, but
the principle will remain.

>> Which seems to be exactly the behavior the "font-lock narrowing" was
>> supposed to guard from?
>
> No.  It wasn't supposed to fix modes that foolishly scan the buffer
> from BOB to point.

You might want to choose words better.

> It was supposed to fix modes which scan from the
> beginning of line, and that is (a) only a problem when lines are very
> long, and (b) much harder to solve in the mode itself, because
> font-lock very frequently uses anchored regexps and otherwise likes to
> start from BOL, and syntax processing also likes starting from BOL.

syntax-wholeline-max handles that problem. Though it might depend on
what you mean by "anchored regexps".

> Btw, does nXML and/or sgml-mode use libxml for their analysis?  If
> not, why not?  Wouldn't that be faster (and possibly more accurate)?

Might be "a simple matter of coding". But we do need syntax-propertize
to run, so that the user commands can rely on proper syntax information
in the buffer.

It remains to be seen whether xml-parse-region is a good base for
nxml-syntax-propertize, and how much of a performance improvement it can
bring (with all the string marshaling around).

nxml also probably handles invalid documents better, which might or
might not be important.

From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:16:38 2022
Date: Wed, 17 Aug 2022 15:16:08 +0300
Message-Id: <83wnb7cbg7.fsf@gnu.org>
From: Eli Zaretskii
To: Dmitry Gutov
Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
In-Reply-To: (message from Dmitry Gutov on Wed, 17 Aug 2022 14:46:46 +0300)
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> <83bksjdru1.fsf@gnu.org>

> Date: Wed, 17 Aug 2022 14:46:46 +0300
> Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov
>
> On 17.08.2022 14:36, Eli Zaretskii wrote:
> >> Date: Wed, 17 Aug 2022 01:20:38 +0300
> >> Cc: Eli Zaretskii, 57245@debbugs.gnu.org
> >> From: Dmitry Gutov
> >>
> >> Otherwise, syntax-wholeline-max seems to be doing its job fine: if I
> >> comment out the narrowing code in handle_fontified_prop (or switch to
> >> the branch I posted previously), two XML files -- one with long lines
> >> and one without (the files differ only by addition of newlines) -- show
> >> approximately the same delay on M->.
> >
> > Doesn't syntax-wholeline-max only affect long lines?
>
> Its purpose is to handle the slowdown which occurred specifically on
> long lines because of
> font-lock-extend-region-functions/syntax-propertize-extend-region-functions.
> Now that it works -- I don't see any particular slowdowns on long lines,
> even with narrowing disabled.
>
> And the performance of M-> depends solely on the size of a file. In my
> XML test files, at least.

Is that a yes?  Because if it is, then what does this have to do with
the issue of nXML not being scalable enough?

From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:21:18 2022
Date: Wed, 17 Aug 2022 15:20:56 +0300
Message-Id: <83v8qrcb87.fsf@gnu.org>
From: Eli Zaretskii
To: Dmitry Gutov
Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
In-Reply-To: (message from Dmitry Gutov on Wed, 17 Aug 2022 15:14:07 +0300)
Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow
References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <83fshvdsfi.fsf@gnu.org>

> Date: Wed, 17 Aug 2022 15:14:07 +0300
> Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov
>
> >> Which seems to be exactly the behavior the "font-lock narrowing" was
> >> supposed to guard from?
> >
> > No.  It wasn't supposed to fix modes that foolishly scan the buffer
> > from BOB to point.
>
> You might want to choose words better.

I did.

> > It was supposed to fix modes which scan from the
> > beginning of line, and that is (a) only a problem when lines are very
> > long, and (b) much harder to solve in the mode itself, because
> > font-lock very frequently uses anchored regexps and otherwise likes to
> > start from BOL, and syntax processing also likes starting from BOL.
>
> syntax-wholeline-max handles that problem.

Only for syntax-related stuff.  And we have yet to see whether it does a
good enough job: that feature is too young to be sure.
From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:30:35 2022 Received: (at 57245) by debbugs.gnu.org; 17 Aug 2022 12:30:35 +0000 Received: from localhost ([127.0.0.1]:49725 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIBv-0000IX-2Z for submit@debbugs.gnu.org; Wed, 17 Aug 2022 08:30:35 -0400 Received: from mail-wm1-f44.google.com ([209.85.128.44]:43610) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIBt-0000IB-3V for 57245@debbugs.gnu.org; Wed, 17 Aug 2022 08:30:33 -0400 Received: by mail-wm1-f44.google.com with SMTP id ay39-20020a05600c1e2700b003a5503a80cfso877949wmb.2 for <57245@debbugs.gnu.org>; Wed, 17 Aug 2022 05:30:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:from:to:cc; bh=/lfv++YbCxMjkSrbSwbSgoSKXsAlOCN1yAmLuRRbkBI=; b=SozAeKL48ar+98//iDpESh8khTqQY67BaXcqjm3N4d0/o1pCJvatf+hpzpzmimm/oI GiyXACKLznfd7gc2o3yq4SHOlks9FULdpf9JRKVSjzK4Zu8smVHoS6FxulaT9b60Z20i nyJa+Z0YYJuJtkdQd/8jx7LeTciKZz50n5J7sVanKbaCC4rU7R+FRFjusw2LS+oq2mn9 Z5aJu3guxCg2MMFgdY797gjFkI7PTuef3eS/aiEU+tPuXPhblCKIrQD8WS4S5VAK71dK 8xnob3oMgKOibdKqjPmlbc9xVYaJ0EYMwUY/Q+yeQsTj6uIIyN1mGPr4b/x53F2/Ti1n r+tQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:x-gm-message-state:from:to:cc; bh=/lfv++YbCxMjkSrbSwbSgoSKXsAlOCN1yAmLuRRbkBI=; b=Qd9n3oE50SBZq/dOtTgodF1yibmpmDMldxY/fxHRR4T6dHcwRIQu8ra5Afk0wlh2q6 q3Z6VrreKBMozfD8Wz1Q27oYXKxTExGKC9IS6hBW9NQbrGJevJgc+ckc1+peFlVzKasl rcGD4iv5o2PZz4xBUS9+FQUvQAw3jcD/fG53WqJjkEgUnMrNsKRW2jHOQVQsNyEoSnda RY8yjlqgIAM+mc8yTwUcC3ifPktII84fOX3CX6kQ84xAwFToJ57CEyoL24Lc5wSM4RoP eLVlycv8/RCKK9ci76FlD93+FZsdxRDNUIbVDFsdWJL5S3QjYrAmOmoH95Kh20mr+e8b aQBQ== 
X-Gm-Message-State: ACgBeo1RE6bICPAT7l9TWp2vxcbOsuU9OIS036YrBHvi7w5FfCF8uOYm GQfd2VyIWpHUCplyZMYpXF4= X-Google-Smtp-Source: AA6agR4OVEE+CVnfggjO0RCASu5L9XXDw8pLSXAdN7qdgbKwJ+GvouAy3iKsIYvrrfh3GuFbX4iFHw== X-Received: by 2002:a1c:6a0a:0:b0:3a5:bcad:f2cc with SMTP id f10-20020a1c6a0a000000b003a5bcadf2ccmr2053047wmc.74.1660739427068; Wed, 17 Aug 2022 05:30:27 -0700 (PDT) Received: from [192.168.0.6] ([46.251.119.176]) by smtp.googlemail.com with ESMTPSA id i8-20020a05600c2d8800b003a31ca9dfb6sm2308713wmg.32.2022.08.17.05.30.24 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 17 Aug 2022 05:30:25 -0700 (PDT) Message-ID: Date: Wed, 17 Aug 2022 15:30:22 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow Content-Language: en-US To: Eli Zaretskii References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> <83bksjdru1.fsf@gnu.org> <83wnb7cbg7.fsf@gnu.org> From: Dmitry Gutov In-Reply-To: <83wnb7cbg7.fsf@gnu.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 57245 Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.5 (/) On 17.08.2022 15:16, Eli Zaretskii wrote: >> Date: Wed, 17 Aug 2022 14:46:46 +0300 >> Cc:57245@debbugs.gnu.org,monnier@iro.umontreal.ca >> From: Dmitry Gutov >> >> On 17.08.2022 14:36, Eli Zaretskii wrote: >>>> Date: Wed, 17 Aug 2022 01:20:38 +0300 >>>> Cc: Eli 
Zaretskii,57245@debbugs.gnu.org >>>> From: Dmitry Gutov >>>> >>>> Otherwise, syntax-wholeline-max seems to be doing its job fine: if I >>>> comment out the narrowing code in handle_fontified_prop (or switch to >>>> the branch I posted previously), two XML files -- one with long lines >>>> and one without (the files differ only by addition of newlines) -- show >>>> approximately the same delay on M->. >>> Doesn't syntax-wholeline-max only affect long lines? >> Its purpose is to handle the slowdown which occurred specifically on >> long lines because of >> font-lock-extend-region-functions/syntax-propertize-extend-region-functions. >> Now that it works -- I don't see any particular slowdowns on long lines, >> even with narrowing disabled. >> >> And the performance of M-> depends solely on the size of a file. In my >> XML test files, at least. > Is that a yes? Yes, it's a "yes". Stefan said: > font-lock does suffer from long lines, so the current code's handling > of font-lock makes some sense Meaning that narrowing around font-lock on long lines makes sense. And I replied that no, syntax-wholeline-max should already be dealing with long-line issues in font-lock. > Because if it is, then what does this have to do with > the issue of nXML not being scalable enough? Narrowing around font-lock shouldn't be conditioned on the presence of long lines. It should either be done unconditionally (with a larger radius, I guess), or not at all. 
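For readers following the thread, here is a rough illustration of the idea behind capping whole-line extension, which is the role the thread attributes to syntax-wholeline-max. It is a minimal Python sketch, not Emacs's actual code: the function name `extend_to_whole_lines` and the parameter `max_len` are hypothetical, and the real implementation may treat over-long lines differently.

```python
def extend_to_whole_lines(text: str, start: int, end: int, max_len: int) -> tuple[int, int]:
    """Extend [start, end) toward line boundaries, moving each bound
    by at most max_len characters (hypothetical stand-in for a cap
    such as syntax-wholeline-max)."""
    # Beginning of the line containing `start` (0 if there is no earlier newline).
    bol = text.rfind("\n", 0, start) + 1
    # Position just past the first newline at or after `end` (end of text if none).
    nl = text.find("\n", end)
    eol = len(text) if nl == -1 else nl + 1
    # On a very long line the cap wins, so the work stays bounded.
    return max(bol, start - max_len), min(eol, end + max_len)


if __name__ == "__main__":
    sample = "ab\n" + "x" * 100 + "\ncd"
    print(extend_to_whole_lines(sample, 50, 60, 1000))  # cap not reached
    print(extend_to_whole_lines(sample, 50, 60, 10))    # cap limits the extension
```

With a generous cap the region grows to the whole line; with a small cap it grows by only a bounded amount, which is why a long line no longer forces a scan of the entire line.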
From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:33:29 2022 Received: (at 57245) by debbugs.gnu.org; 17 Aug 2022 12:33:30 +0000 Received: from localhost ([127.0.0.1]:49730 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIEj-0000Mp-Ks for submit@debbugs.gnu.org; Wed, 17 Aug 2022 08:33:29 -0400 Received: from eggs.gnu.org ([209.51.188.92]:36090) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIEg-0000Ma-Pr for 57245@debbugs.gnu.org; Wed, 17 Aug 2022 08:33:27 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:52504) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oOIEb-0005yH-FW; Wed, 17 Aug 2022 08:33:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=Ic4lMZI0dr5wf8PGFi/wsd7X/F6aTvhMuaqOyMRTMdA=; b=goxgiU5St7Mc VbI8j5dlKcLkgwrt5cj5VWIsm0CcqhjF5eq3uhzcpitQMaNcR3DXuDW+AYKfiFAEVE3dKp3k2xLgY nh5pP1pK8/LJGB0QitnpQmpO++IwVloz2fPRqadR+WZVHRJdHFB3wCs37xscATcNaG0Bk//68rulk wdsa5MRC4pRN/r436basRlFpTQQGdgUUGJBUqyu8qytqzScK3Hkzuzpx5zDqOj118hcg8ym1/9ZTW 6VEiOD/b118+dePOU6IDSfxqr4qjcOASAKD9XWQ4EC3KGQ3aMExOetvIMiMo5ljqQsq175cNbNi0r qCpiF4x3P5Ee5WvXUmsL1Q==; Received: from [87.69.77.57] (port=3593 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oOIEa-000172-Fl; Wed, 17 Aug 2022 08:33:21 -0400 Date: Wed, 17 Aug 2022 15:33:11 +0300 Message-Id: <83tu6bcans.fsf@gnu.org> From: Eli Zaretskii To: Dmitry Gutov In-Reply-To: (message from Dmitry Gutov on Wed, 17 Aug 2022 15:30:22 +0300) Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> 
<4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> <83bksjdru1.fsf@gnu.org> <83wnb7cbg7.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 57245 Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Date: Wed, 17 Aug 2022 15:30:22 +0300 > Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca > From: Dmitry Gutov > > > Because if it is, then what does this have to do with > > the issue of nXML not being scalable enough? > > Narrowing around font-lock shouldn't be conditioned on the presence of > long lines. It either should be done unconditionally (with larger > radius, I guess), or not at all. Yes, you already said that, and I don't agree (and explained why). Now, can we please agree to disagree and move on? From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:40:41 2022 Received: (at 57245) by debbugs.gnu.org; 17 Aug 2022 12:40:41 +0000 Received: from localhost ([127.0.0.1]:49743 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOILh-0000XV-BD for submit@debbugs.gnu.org; Wed, 17 Aug 2022 08:40:41 -0400 Received: from mail-wr1-f54.google.com ([209.85.221.54]:37480) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOILc-0000XE-Gr for 57245@debbugs.gnu.org; Wed, 17 Aug 2022 08:40:40 -0400 Received: by mail-wr1-f54.google.com with SMTP id n7so4863724wrv.4 for <57245@debbugs.gnu.org>; Wed, 17 Aug 2022 05:40:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:from:to:cc; bh=mIM2EN5CUdlV39NT22DDaqvgzJ9VaBajo++BqhVltd0=; 
b=pTAFE99rlH8DNqawmo8YhuN8iuqbVQgxShq+vRmCpfVyfCkZia6ywjdfVHBcr0C6pv M9up9AyAZDQ5Dl+KzuJyAy792hSRqMMCCmqi25vnl245aP3btSxfcWzO7PVbqRU4rA47 ii4RQI7UNRGye3pKJD4PVKdHOXtKYYqzQP0Nr5oPamC/TV7y/I7r2iN7owLkIRSui6JN 1fm035NwYEJLVv1Et5sa5MoYtVypeUPhbyTdf+/sYjuvQ84xm9T7Qae8THBHXHVISter t+KUJ0vd6JzGLhk45+m6OImFWH/UJG6VTZFDhgaTGpDUv7NbMs7zg4+PiFWnz0Ke/gYa eLHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:x-gm-message-state:from:to:cc; bh=mIM2EN5CUdlV39NT22DDaqvgzJ9VaBajo++BqhVltd0=; b=oJwrCgX6v/N8DNn/hvYkbnCrTOTUIe/JZgkL0QuNctDGMwdN0Gu54l9zSFf1eSV2Q5 LHdAORikVhxVext6I4UPv7bi6oROIjKkM7CbYdJ3eSJO66/Ktzv1b/Oyh5/D4WQ9+EHU FJJ+4BAkSkzWUyI9J+ZtqPXiLDY9E1YuP5CBcHaA6VNtomU0NCJxIsKGsNvTyYyxrleU ORgx91vY5eH413cSObKKBz5pdDhNXAYFSFwTb40sc7JVQ1zzXqiWml1dFy0zjqFif4rF oApvjfdbOWYztSCMEzbsmHkw9OXQRETnpNTcFRRJHB13fBIqDr+Uql8doSsBKy7VSVwB 6xSA== X-Gm-Message-State: ACgBeo2RtnKb32neKi2/XjDTnda/yiLzyKSFChy51sdqSdTUuuj8jj6D XuGFmZ5RJllhol3KJ5tSZGMDZ7YN5Sw= X-Google-Smtp-Source: AA6agR4NtDeIFc7qJXQAm/uMdWmcZZW3o2bmD+2vZbrpipQnXLfCQXaH0zRkxgm1W/7ZBGxA/ycaWQ== X-Received: by 2002:adf:e983:0:b0:21e:d487:a5ba with SMTP id h3-20020adfe983000000b0021ed487a5bamr13323742wrm.202.1660740030343; Wed, 17 Aug 2022 05:40:30 -0700 (PDT) Received: from [192.168.0.6] ([46.251.119.176]) by smtp.googlemail.com with ESMTPSA id j2-20020a5d6042000000b00225232c03fdsm2594761wrt.27.2022.08.17.05.40.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 17 Aug 2022 05:40:30 -0700 (PDT) Message-ID: <7c8e1a91-6398-6733-82bd-12a6011bc954@yandex.ru> Date: Wed, 17 Aug 2022 15:40:27 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow Content-Language: en-US To: Eli Zaretskii References: 
<18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <83fshvdsfi.fsf@gnu.org> <83v8qrcb87.fsf@gnu.org> From: Dmitry Gutov In-Reply-To: <83v8qrcb87.fsf@gnu.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 57245 Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.5 (/) On 17.08.2022 15:20, Eli Zaretskii wrote: >> Date: Wed, 17 Aug 2022 15:14:07 +0300 >> Cc:57245@debbugs.gnu.org,monnier@iro.umontreal.ca >> From: Dmitry Gutov >> >>>> Which seems to be exactly the behavior the "font-lock narrowing" was >>>> supposed to guard from? >>> No. It wasn't supposed to fix modes that foolishly scan the buffer >>> from BOB to point. >> You might want to choose words better. > I did. > >>> It was supposed to fix modes which scan from the >>> beginning of line, and that is (a) only a problem when lines are very >>> long, and (b) much harder to solve in the mode itself, because >>> font-lock very frequently uses anchored regexps and otherwise likes to >>> start from BOL, and syntax processing also likes starting from BOL. >> syntax-wholeline-max handles that problem. > Only for syntax-related stuff. font-lock-extend-region-wholelines uses that variable too. > And we have yet to see whether it's a > good enough job: that feature is too young to be sure. Same goes for the long-line-narrowing business. And for us to be sure, people would need to be able to try it and report problems. But as long as handle_fontified_prop creates a narrowing with ~5000 char radius, syntax-wholeline-max isn't even given a chance to do its job. 
From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 08:46:29 2022 Received: (at 57245) by debbugs.gnu.org; 17 Aug 2022 12:46:29 +0000 Received: from localhost ([127.0.0.1]:49748 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIRJ-0000gK-3l for submit@debbugs.gnu.org; Wed, 17 Aug 2022 08:46:29 -0400 Received: from mail-wr1-f49.google.com ([209.85.221.49]:39588) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIRH-0000g6-O8 for 57245@debbugs.gnu.org; Wed, 17 Aug 2022 08:46:28 -0400 Received: by mail-wr1-f49.google.com with SMTP id r16so6865222wrm.6 for <57245@debbugs.gnu.org>; Wed, 17 Aug 2022 05:46:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:from:to:cc; bh=tXVBTT8nNha28pPmVm6YXYDFBKx8BWPg2gcDVh1fLPE=; b=RzTdzyfkxjpVrBTNCz05rLKO3lsyVlOkNvH4ucXQ4JXmN4gk8I86af+njQgeS2QNCA L9P1VYot6TAACo30e+CJvPE3SXdgv8jg54iDN3pWjF6XDRyKhlIgWdhOIGL7QHVcVkYu Xus2WYFRCqyM6/EfFjwBWLnX4yGV2M5paOQsZhoGUE1CJ9I16+jO6Gc0r7YmM89ZijDC vDY8c+qEicbNxHXRcja5gct7IyYHXNvEacNUcZhOiV/h3mXvT020Fvt154X6VEUm+B0+ ZfvbZR4ibLnWcUX6OrKNFcG7UvNE5D4LfzPyL/4aig0cC98loQHR8VpxvEepyuuPWEcL LAdA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :sender:x-gm-message-state:from:to:cc; bh=tXVBTT8nNha28pPmVm6YXYDFBKx8BWPg2gcDVh1fLPE=; b=hyprnzmyqrqBUnrikTtp8gKV4yGJGQB351lC6A2AINwQpJD468iYtmbFUlWDHIjJJd IXMKpeaC0byuOmGwL1AzPniSFbvGobnzc+wS/xAaKmbc4aux279bclWgpnZMWOAayypd TyW3ssRvBgiLEfgbdqkS4JlGM3Ig09h+2Ril+CGUNU2CwwiRiKEzakHlc/oa+W59F+5V js8UpkADcbPkGPzFNskgqriIw3ekXDpHcJv0LPiEAmsVVcepQBBBDKGUBkzI0bXeaRra fZb6MSsRaXhT7nzG+Hng6z2U2atC9Vp9Kd8Btkc29qOiXmSEHrP4RKpPtiX4Fgr+42hM oevA== X-Gm-Message-State: 
ACgBeo27eTxFf8aLiX3lABFHfs84fibozt8eHY9gsIma5/9C/gCRQCpQ mRmj2pWBheVuFHEX0qXHXs4= X-Google-Smtp-Source: AA6agR7ofXNIxu4+UyR4/BvtRLaGrNe98Zpp1OJkSNAn0g9avb/aS73sU9HocPDvwNu+vO/hLYBtdw== X-Received: by 2002:a05:6000:3c1:b0:225:27bc:3dc8 with SMTP id b1-20020a05600003c100b0022527bc3dc8mr1180429wrg.207.1660740381787; Wed, 17 Aug 2022 05:46:21 -0700 (PDT) Received: from [192.168.0.6] ([46.251.119.176]) by smtp.googlemail.com with ESMTPSA id j20-20020a05600c191400b003a5c1e916c8sm8274746wmq.1.2022.08.17.05.46.19 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 17 Aug 2022 05:46:20 -0700 (PDT) Message-ID: <80291f13-2044-b8be-620d-8dee95a35e4a@yandex.ru> Date: Wed, 17 Aug 2022 15:46:18 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow Content-Language: en-US To: Eli Zaretskii References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <6688b0ad-54e1-4a59-e9b6-4cdc803a8359@yandex.ru> <4e2838b5-109e-7a27-0230-29dc6624b751@yandex.ru> <83bksjdru1.fsf@gnu.org> <83wnb7cbg7.fsf@gnu.org> <83tu6bcans.fsf@gnu.org> From: Dmitry Gutov In-Reply-To: <83tu6bcans.fsf@gnu.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 57245 Cc: 57245@debbugs.gnu.org, monnier@iro.umontreal.ca X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.5 (/) On 17.08.2022 15:33, Eli Zaretskii wrote: >> Date: Wed, 17 Aug 2022 15:30:22 +0300 >> Cc:57245@debbugs.gnu.org,monnier@iro.umontreal.ca >> From: Dmitry Gutov >> >> > Because if it is, then what does this have to do with >> > the issue 
of nXML not being scalable enough? >> >> Narrowing around font-lock shouldn't be conditioned on the presence of >> long lines. It either should be done unconditionally (with larger >> radius, I guess), or not at all. > Yes, you already said that, and I don't agree (and explained why). > Now, can we please agree to disagree and move on? I don't think you explained that, no. If you're referring to the previous discussions, this [bug report] is the first time I have put forward this particular suggestion. So you couldn't have addressed it before that. From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 17 09:20:42 2022 Received: (at 57245) by debbugs.gnu.org; 17 Aug 2022 13:20:42 +0000 Received: from localhost ([127.0.0.1]:49793 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIyQ-0001aD-3a for submit@debbugs.gnu.org; Wed, 17 Aug 2022 09:20:42 -0400 Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:63373) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1oOIyL-0001Zx-GL for 57245@debbugs.gnu.org; Wed, 17 Aug 2022 09:20:41 -0400 Received: from pmg1.iro.umontreal.ca (localhost.localdomain [127.0.0.1]) by pmg1.iro.umontreal.ca (Proxmox) with ESMTP id CBD21100138; Wed, 17 Aug 2022 09:20:31 -0400 (EDT) Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg1.iro.umontreal.ca (Proxmox) with ESMTP id 38E5A100091; Wed, 17 Aug 2022 09:20:30 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1660742430; bh=rBaXRSIs5Iki9erD04P6Edv5dXhMXyCDrCPr5j2eLcU=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=U6ei1AXt6Hh74rSFqPBPmG8RSPopwQoR4q5MgVKLv09ON2/imPqvpfnwB/Vd3l5iW 9o3H90K0lky07qOCnhucJoZ4LySdR7bddxHGSKFs4Nh/p8sxcpXKZWUK8A7RSD/wgy OZnOxthtkDqQQB0Bd3iRjyFl++7I0WJ034gWrineoiBkFJcnwodRHlbRzRRDA4KQfC M93NIGdIvi427O4xSvwHWBPkhPVGSRhJNjQ2x8bxCinfxydzXpCjHTGRtfJhZ63Uoa iws8IE33VoC3BGbgqaK8Z6mGQ5TXCyQng25hb9PlColWyoOCZb/YrwZ905JOnZ7Ikw +Ged7hvQXvaJQ== 
Received: from pastel (unknown [45.72.195.111]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id 0AA2D120250; Wed, 17 Aug 2022 09:20:30 -0400 (EDT) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#57245: 29.0.50; M-> in a large XML file (without long lines) is slow In-Reply-To: <83fshvdsfi.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 17 Aug 2022 14:24:01 +0300") Message-ID: References: <18035574-1b50-62f4-7605-8cdb33204535@yandex.ru> <83tu6cdt7l.fsf@gnu.org> <913e0b46-7145-d39d-1fcd-bc17094e28f2@yandex.ru> <83fshvdsfi.fsf@gnu.org> Date: Wed, 17 Aug 2022 09:20:28 -0400 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-SPAM-INFO: Spam detection results: 0 ALL_TRUSTED -1 Passed through trusted hosts only via SMTP AWL -0.049 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DKIM_SIGNED 0.1 Message has a DKIM or DK signature, not necessarily valid DKIM_VALID -0.1 Message has at least one valid DKIM or DK signature DKIM_VALID_AU -0.1 Message has a valid DKIM or DK signature from author's domain T_SCC_BODY_TEXT_LINE -0.01 - X-SPAM-LEVEL: X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 57245 Cc: 57245@debbugs.gnu.org, Dmitry Gutov X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) >> nxml-syntax-propertize might well be heavier than average, but the delay >> scales linearly with the size of the file. > Which is generally not a good scaling factor, especially if the > coefficient is quite large (as it seems to be in this case). For most languages, this is the minimum scaling factor that allows the result to be correct in all cases. 
So, as a general rule, it should be considered a good scaling factor, I think (when seen as a judgment on the implementation quality of a major mode). Obviously, that won't work well in really large buffers, but to a large extent that should be blamed on the language rather than its major mode. For this reason, we need to add hacks/heuristics (e.g. not highlighting, accepting occasional broken highlighting, delaying highlighting, you name it) if we want to be able to handle such large buffers in a timely fashion. Stefan
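The trade-off described above can be illustrated with a small Python sketch. It is a toy model, not nXML's or font-lock's actual code: `in_comment`, `in_comment_narrowed`, and `radius` are hypothetical names, and the only syntax considered is the XML comment. Deciding whether a position lies inside a comment is correct in all cases only when scanning from the beginning of the text, which is linear in the position. A bounded look-back costs at most `radius` steps but can give the wrong answer when the comment opener lies before the window.

```python
def in_comment(text: str, pos: int, start: int = 0) -> bool:
    """Return True if `pos` is inside an XML comment, assuming the
    scan begins outside any comment at index `start`."""
    inside = False
    i = start
    while i < pos:
        if not inside and text.startswith("<!--", i):
            inside = True
            i += 4
        elif inside and text.startswith("-->", i):
            inside = False
            i += 3
        else:
            i += 1
    return inside


def in_comment_narrowed(text: str, pos: int, radius: int) -> bool:
    """Heuristic: look back at most `radius` characters before `pos`.
    Bounded cost, but wrong when the comment opener is farther back."""
    return in_comment(text, pos, start=max(0, pos - radius))


if __name__ == "__main__":
    doc = "<!--" + "x" * 10000 + "-->" + "<a/>"
    print(in_comment(doc, 5000))                # full scan from the start
    print(in_comment_narrowed(doc, 5000, 100))  # bounded window misses the opener
```

In this toy case the full scan reports that position 5000 is inside the comment, while the narrowed scan does not, which is the kind of occasional broken highlighting a heuristic accepts in exchange for bounded work.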