From debbugs-submit-bounces@debbugs.gnu.org Sat May 20 23:14:38 2023
From: Tom Gillespie
Date: Sat, 20 May 2023 20:14:19 -0700
Subject: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
To: Emacs Bug Report
Content-Type: text/plain; charset="UTF-8"

The changes in 4915ca5dd4245a909c046e6691e8d4a1919890c8 have
introduced a significant performance regression when editing python
code that contains dictionary structures across multiple lines.

The current behavior makes Emacs unusable when editing python
dictionaries with more than 20 or so lines. I would suggest reverting
the commit until the performance issue can be addressed.

If I had to guess, this is probably being caused by a double
zero-or-more pattern (possibly implicit) in the new regexps that were
added/changed.

The literal dictionary below is sufficient to demonstrate the issue
and if you bisect and compare the behavior to the immediately prior
commit 31e32212670f5774a6dbc0debac8854fa01d8f92 the difference is
clear. Open the file and hit enter a couple of times and the lag is
obvious (if you can't detect the issue try doubling the number of
lines at the deepest nesting level from 25 to 50).

By profiling and varying the number of repeated lines (e.g. by
doubling them) it appears that the issue is some lurking quadratic
behavior in syntax-ppss as a result of the changes in 4915ca. In my
testing 25, 50, and 100 lines take 100ms, 800ms, and 5 seconds
respectively to insert a new line while the cursor is inside the outer
most paren.

Collapsing all the structures into one line hides the issue. The
longer each individual line the more rapid the slowdown.

The example below is not syntactically correct python, however it
highlights the issue in a way that is clearer than it would be
otherwise.

Examples that trigger the issue (repeat the 2nd line 50 or 100 times
to see the effect). Any combination of paren types will cause the
issue. The closing paren does not have to be present and does not
prevent the issue.

#+begin_src python ['' '' [] #+end_src
#+begin_src python ['' [] '' #+end_src
#+begin_src python ['' '' {} #+end_src
#+begin_src python {'' '' () #+end_src
#+begin_src python ["" '' [] #+end_src

Examples that do not cause the issue.

#+begin_src python [a '' [] #+end_src #+begin_src python ['' '' a #+end_src #+begin_src python ['' '' '' #+end_src #+begin_src python [[] [] [] #+end_src #+begin_src python [[] [] '' #+end_src Example of syntactically correct python that causes the issue. #+begin_src python {'': { '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, '': {'': ''}, }, '': ['']} #+end_src From debbugs-submit-bounces@debbugs.gnu.org Sun May 21 00:54:08 2023 Received: (at 63622) by debbugs.gnu.org; 21 May 2023 04:54:08 +0000 Received: from localhost ([127.0.0.1]:59826 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q0b56-0004OS-GU for submit@debbugs.gnu.org; Sun, 21 May 2023 00:54:08 -0400 Received: from mail-yw1-f175.google.com ([209.85.128.175]:56750) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q0b55-0004Nz-1s for 63622@debbugs.gnu.org; Sun, 21 May 2023 00:54:07 -0400 Received: by mail-yw1-f175.google.com with SMTP id 00721157ae682-561b65b34c4so66189467b3.1 for <63622@debbugs.gnu.org>; Sat, 20 May 2023 21:54:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684644841; x=1687236841; h=to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=UpMY4CpMNDGND77PlaKfmEx0TXaOBW2wxOnSxlYv8SY=; b=iZmyW4WPK6/9RoVT+xWJHZBrJL2tdcWWhH/tlubFK7sfWtk/GIQvaRFj5sPLuyh4WA nh+QYbmc46w0AytqzGUuApl5v3tT1mol9k962JwRPFOnBCqF/HKNpHTeMklowRZ+Ph5i KhwxWMKf3tzlG4RYSn8bnjvRLfSfHfnfNco008tAcxE9LGx5X+DgXBcwIDZUYrC/gV3O knN1xSeAncra/lFp4URZeICHAVRp5tiNhsy+u5inp7oW0cekIIk3/NFruJNnBlCFoD97 Lakevh/tXBU323GItr4ftRIdKSK2OqqPtZNQSrfZtL58AMkWTcTT9cAVdwQR+/CmE+cQ OSCg== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684644841; x=1687236841; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=UpMY4CpMNDGND77PlaKfmEx0TXaOBW2wxOnSxlYv8SY=; b=INMRGDxSPytdHMfXvFBG1wV9GxUoojIazYFvvN6i3zwGF+bRSDEmgkaKsSjMd+cQRZ kmNQFzLWw0jwgGuY6GEONkgtwD1kHFB3drI3PZtLiSmQnadaDIrSwopL2rUbFaNChbWD bVJ9fSDGPFQh++Px5DjrNJwkHBupoSQ1KXM+rbPuNPjV09tGpAveCoEkShcZZBlktn1h 9/AGtuE5NIrxfLCE49yzj41jf1vOWfL+6+TodxR8Y6h7Cw9PktuCdQ/a5+XRz+XGqSaL iO/G6rByrByH8GVjob5tGglCt8b+If7XSQT78x2hrUVakRSfmAL7oMYB6Ht4YrAIPPtS itng== X-Gm-Message-State: AC+VfDw9z39qT5vQ7QYirajAfnKBRWRjhMaaCMr4yYvm9BKrlC0DqvYy exCPrtDmXxl5syIh5YllyEE8jyvxy1yF+ZyjZe9tHLOP X-Google-Smtp-Source: ACHHUZ5DYKCXY4l5ZEffLVqcbBvMGWgzw9nOnoA0lxgFmXqGLIbr8l4+MUTYPpFDmKlSj8bSyzS5CLAzmYTJ9ePzjCA= X-Received: by 2002:a81:7405:0:b0:564:c4db:6329 with SMTP id p5-20020a817405000000b00564c4db6329mr4383787ywc.11.1684644841059; Sat, 20 May 2023 21:54:01 -0700 (PDT) MIME-Version: 1.0 From: Tom Gillespie Date: Sat, 20 May 2023 21:53:49 -0700 Message-ID: Subject: source of problem identified to be python-font-lock-extend-region To: 63622@debbugs.gnu.org Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 63622 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Despite my previous speculation about the regexp, the issue is in python-font-lock-extend-region. When replaced with a no-op the performance returns to normal. 
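For anyone reproducing the report, the nested dictionary literal can be generated rather than typed by hand. The following is a minimal sketch, not part of the original messages: the function names and the output file name are hypothetical, and the one-entry-per-line, unindented layout is an assumption about how the original file was laid out.

```python
def make_reproducer(n_lines: int = 25) -> str:
    """Return a nested dict literal with n_lines entries at the deepest level.

    The layout (one entry per line, no indentation) is an assumption;
    the report only states that the entries are on separate lines.
    """
    inner = "'': {'': ''},\n" * n_lines
    return "{'': {\n" + inner + "},\n'': ['']}\n"


def write_reproducer(path: str, n_lines: int = 25) -> None:
    """Write the generated literal to path (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(make_reproducer(n_lines))
```

Calling `write_reproducer("repro_50.py", 50)` and opening the result in python-mode should give a file comparable to the one described in the report, with the number of repeated lines doubled.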
From debbugs-submit-bounces@debbugs.gnu.org Sun May 21 02:08:40 2023
Date: Sun, 21 May 2023 09:08:50 +0300
Message-Id: <83zg5yqkr1.fsf@gnu.org>
From: Eli Zaretskii
To: Tom Gillespie, kobarity, Stefan Monnier
Cc: 63622@debbugs.gnu.org
In-Reply-To: (message from Tom Gillespie on Sat, 20 May 2023 20:14:19 -0700)
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock

> From: Tom Gillespie
> Date: Sat, 20 May 2023 20:14:19 -0700
>
> The changes in 4915ca5dd4245a909c046e6691e8d4a1919890c8 have
> introduced a significant performance regression when editing python
> code that contains dictionary structures across multiple lines.
>
> The current behavior makes Emacs unusable when editing python
> dictionaries with more than 20 or so lines. I would suggest reverting
> the commit until the performance issue can be addressed.

If the problem is so severe, I wonder how come this comes up only now,
9 months after those changes were installed. It probably means these
cases are quite rare in practice. Nevertheless, it would be good to
solve them, of course.

FWIW, python-ts-mode doesn't show performance issues in the examples
you posted.

> The literal dictionary below is sufficient to demonstrate the issue
> and if you bisect and compare the behavior to the immediately prior
> commit 31e32212670f5774a6dbc0debac8854fa01d8f92 the difference is
> clear. Open the file and hit enter a couple of times and the lag is
> obvious (if you can't detect the issue try doubling the number of
> lines at the deepest nesting level from 25 to 50).
>
> By profiling and varying the number of repeated lines (e.g. by
> doubling them) it appears that the issue is some lurking quadratic
> behavior in syntax-ppss as a result of the changes in 4915ca. In my
> testing 25, 50, and 100 lines take 100ms, 800ms, and 5 seconds
> respectively to insert a new line while the cursor is inside the outer
> most paren.
>
> Collapsing all the structures into one line hides the issue. The
> longer each individual line the more rapid the slowdown.
>
> The example below is not syntactically correct python, however it
> highlights the issue in a way that is clearer than it would be
> otherwise.

Any chance of your posting some real-life Python code where the issue
rears its head? I mean, real-life code that makes sense, not just
syntactically correct code invented to make a point?

> Despite my previous speculation about the regexp,
> the issue is in python-font-lock-extend-region. When
> replaced with a no-op the performance returns to normal.

kobarity, could you please look into this ASAP? Emacs 29.1 is in late
stages of pretest, and I'd like to have this resolved, one way or
another, soon enough. TIA.

P.S. Tom, please don't change the Subject when posting followups,
please use the same Subject for all your messages that discuss this
bug.

Also, for the record, please state which Emacs version you are using.
You didn't use "M-x report-emacs-bug" to submit the bug report, so
this and other important information is missing from your OP.

From debbugs-submit-bounces@debbugs.gnu.org Sun May 21 03:13:55 2023
References: <83zg5yqkr1.fsf@gnu.org>
In-Reply-To: <83zg5yqkr1.fsf@gnu.org>
From: Tom Gillespie
Date: Sun, 21 May 2023 00:13:36 -0700
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
To: Eli Zaretskii
Cc: kobarity, Stefan Monnier, 63622@debbugs.gnu.org

> If the problem is so severe, I wonder how come this comes up only now,
> 9 months after those changes were installed. It probably means these
> cases are quite rare in practice. Nevertheless, it would be good to
> solve them, of course.

I suspect it is because there are 3 factors that have to be just
right to notice.

1. The opening paren and the quote must be immediately adjacent to
   get exceptionally bad behavior. There is still some performance
   degradation when there is separation, but it would be harder to
   notice.
2. A user would have to directly edit a dictionary literal with
   enough lines to notice the slowdown.
3. Assigning the dictionary to a variable mitigates the issue, so
   only a dict that is not assigned results in the full slowdown.

> FWIW, python-ts-mode doesn't show performance issues in the examples
> you posted.

I would imagine so.

I've continued trying to hunt down the source of the issue, and it is
triggered by setting python-font-lock-extend-region as the
font-lock-extend-after-change-region-function for python. This is
true in old versions of Emacs (e.g. 28.2) as well.

As far as I can tell the existing implementation for python font locking
has some quadratic behavior that is revealed when a region is extended
inside a nested dictionary with multiple lines.

> Any chance of your posting some real-life Python code where the issue
> rears its head? I mean, real-life code that makes sense, not just
> syntactically correct code invented to make a point?

Yep, basically any nested dictionary literal with more than 15 lines
is affected, with the caveat that the issue is masked if there is an
equal sign (=) before the opening paren, which is the common case.

An example of the particular file that caused me to spot the issue:
https://github.com/tgbugs/pyontutils/blob/master/pyontutils/auth-config.py

> P.S. Tom, please don't change the Subject when posting followups,
> please use the same Subject for all your messages that discuss this
> bug.

Ack, apologies. I will keep it the same in the future.

> Also, for the record, please state which Emacs version you are using.
> You didn't use "M-x report-emacs-bug" to submit the bug report, so
> this and other important information is missing from your OP.

Ok, I wasn't sure how to handle it in this case since I was able to
reproduce the issue in multiple versions.

For the record: The version I spotted it on was the emacs-29 branch
at 3bc5efb87e5ac9b7068e71307466b2d0220e92fb but everything on emacs-29
after 4915ca5dd4245a909c046e6691e8d4a1919890c8 is affected (according
to git bisect results). So 29.0.90 and 29.0.91 should be affected as
well.

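The timings given in the original report (25, 50, and 100 lines taking roughly 100 ms, 800 ms, and 5 s per inserted newline) allow a rough check of the "quadratic" characterization. The sketch below estimates the growth exponent between successive measurements, assuming time scales as a power of the line count. With only three approximate data points this is indicative at best, and the helper name is hypothetical.

```python
import math

# Timings reported in the thread: number of repeated lines -> seconds
# needed to insert a new line.
timings = {25: 0.1, 50: 0.8, 100: 5.0}


def growth_exponent(n1: float, t1: float, n2: float, t2: float) -> float:
    """Return k such that t2 / t1 == (n2 / n1) ** k (power-law assumption)."""
    return math.log(t2 / t1) / math.log(n2 / n1)


sizes = sorted(timings)
exponents = [
    growth_exponent(a, timings[a], b, timings[b])
    for a, b in zip(sizes, sizes[1:])
]

for (a, b), k in zip(zip(sizes, sizes[1:]), exponents):
    print(f"{a} -> {b} lines: exponent ~ {k:.2f}")
```

Under that power-law assumption the reported numbers point to growth somewhat steeper than quadratic, which is at least consistent with the observation that the slowdown becomes unusable quickly as lines are added.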
From debbugs-submit-bounces@debbugs.gnu.org Sun May 21 05:32:19 2023
Date: Sun, 21 May 2023 18:31:50 +0900
From: kobarity
To: Tom Gillespie
Cc: Eli Zaretskii, Stefan Monnier, 63622@debbugs.gnu.org
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
References: <83zg5yqkr1.fsf@gnu.org>
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: multipart/mixed; boundary="Multipart_Sun_May_21_18:31:47_2023-1"

--Multipart_Sun_May_21_18:31:47_2023-1
Content-Type: text/plain; charset=US-ASCII

Tom Gillespie wrote:
> The changes in 4915ca5dd4245a909c046e6691e8d4a1919890c8 have
> introduced a significant performance regression when editing python
> code that contains dictionary structures across multiple lines.

Hi Tom and Eli,

Thanks for bringing this issue to my attention.

> As far as I can tell the existing implementation for python font locking
> has some quadratic behavior that is revealed when a region is extended
> inside a nested dictionary with multiple lines.

I agree. It seems to me that it is not python-font-lock-extend-region
itself that is slow, but rather font-lock's processing of the area
extended by it. So one workaround would be to limit the number of
lines to be extended, as in the attached patch. If this limit is
exceeded, the area or the entire buffer must be font-locked manually
later. What do you think of this idea?

Even if we adopt this idea, there remain several points to consider:

- How many lines are appropriate for the limit?
- Is it better to make the limit customizable?
- Should python-ts-mode be excluded from this limit?

--Multipart_Sun_May_21_18:31:47_2023-1
Content-Type: application/octet-stream; type=patch; name="0001-Workaround-performance-degradation-when-editing-mult.patch"
Content-Disposition: attachment; filename="0001-Workaround-performance-degradation-when-editing-mult.patch"
Content-Transfer-Encoding: 7bit

>From 3224ef9d2718c85281f1fa789708efb0b5aa5fff Mon Sep 17 00:00:00 2001
From: kobarity
Date: Sun, 21 May 2023 17:50:09 +0900
Subject: [PATCH] Workaround performance degradation when editing multiline
 Python expression

* lisp/progmodes/python.el (python-font-lock-extend-max-lines): New
variable.
(python-font-lock-extend-region): Limit extending the region to
python-font-lock-extend-max-lines.
(Bug#63622)
---
 lisp/progmodes/python.el | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index 6fc05b246a6..82d15536de0 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -869,16 +869,25 @@ python-font-lock-keywords
 Which one will be chosen depends on the value of
 `font-lock-maximum-decoration'.")

+(defvar python-font-lock-extend-max-lines 10
+  "Maximum number of lines to extend the font-lock region.
+This is a workaround to avoid performance degradation when
+editing expressions that span many lines.  See Emacs Bug#63622.")
+
 (defun python-font-lock-extend-region (beg end _old-len)
   "Extend font-lock region given by BEG and END to statement boundaries."
   (save-excursion
     (save-match-data
       (goto-char beg)
       (python-nav-beginning-of-statement)
-      (setq beg (point))
+      (when (<= (- (line-number-at-pos beg) (line-number-at-pos))
+                python-font-lock-extend-max-lines)
+        (setq beg (point)))
       (goto-char end)
       (python-nav-end-of-statement)
-      (setq end (point))
+      (when (<= (- (line-number-at-pos) (line-number-at-pos end))
+                python-font-lock-extend-max-lines)
+        (setq end (point)))
       (cons beg end))))

--
2.34.1

--Multipart_Sun_May_21_18:31:47_2023-1--

From debbugs-submit-bounces@debbugs.gnu.org Sun May 21 11:17:30 2023
From: Stefan Monnier
To: kobarity
Cc: Tom Gillespie, Eli Zaretskii, 63622@debbugs.gnu.org
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
In-Reply-To: (kobarity@gmail.com's message of "Sun, 21 May 2023 18:31:50 +0900")
References: <83zg5yqkr1.fsf@gnu.org>
Date: Sun, 21 May 2023 11:16:13 -0400

> I agree. It seems to me that it is not python-font-lock-extend-region
> itself that is slow, but rather font-lock's processing of the area
> extended by it. So one workaround would be to limit the number of
> lines to be extended, as in the attached patch. If this limit is
> exceeded, the area or the entire buffer must be font-locked manually
> later. What do you think of this idea?

FWIW, I recommend against using
`font-lock-extend-after-change-region-function`.

E.g. in a case like `python-font-lock-assignment-statement-multiline-1`,
the current code may misfontify code because jit-lock may decide to
first call font-lock on a chunk that goes until:

    [
        a,
        b

and will call again font-lock later to fontify the rest:

    ] # (
        1,
        2
    )

and this can happen with no buffer modification at all (e.g. on the
initial fontification of a buffer).

You can use `font-lock-extend-region-functions` instead (which performs
the region-extension right before fontifying a chunk) to avoid
this problem.
[ It won't help with the current performance issue, tho. ]

`font-lock-extend-after-change-region-function` can also be costly when
a command makes many changes (since
`font-lock-extend-after-change-region-function` is called for every such
change rather than once at the end). `font-lock-extend-region-functions`
tends to be better behaved in this respect (it's called once per chunk,
and there's usually only a single chunk (re)fontified per command, even
after a command that makes many changes).

One more thing: Tom mentioned a suspicion that the performance issue may
have to do with interaction with `syntax-ppss`. This is odd, because
`syntax-ppss` and `syntax-propertize` should not be affected by
font-lock.

        Stefan

From debbugs-submit-bounces@debbugs.gnu.org Sun May 21 11:44:16 2023
Date: Mon, 22 May 2023 00:44:03 +0900
From: kobarity
To: Tom Gillespie
Cc: Eli Zaretskii, Stefan Monnier, 63622@debbugs.gnu.org
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
References: <83zg5yqkr1.fsf@gnu.org>
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: multipart/mixed; boundary="Multipart_Mon_May_22_00:44:02_2023-1"

--Multipart_Mon_May_22_00:44:02_2023-1
Content-Type: text/plain; charset=US-ASCII

I wrote:
> I agree. It seems to me that it is not python-font-lock-extend-region
> itself that is slow, but rather font-lock's processing of the area
> extended by it. So one workaround would be to limit the number of
> lines to be extended, as in the attached patch. If this limit is
> exceeded, the area or the entire buffer must be font-locked manually
> later. What do you think of this idea?

The cause of the slowdown seems to be in python-info-docstring-p. So
another option would be to improve it. Attached is a patch that
treats point as not being in a docstring when it is inside a paren
other than "(". I can't think of a case where a docstring is inside a
paren other than "(". However, more tests should be done.

--Multipart_Mon_May_22_00:44:02_2023-1
Content-Type: application/octet-stream; type=patch; name="0001-Optimize-python-info-docstring-p.patch"
Content-Disposition: attachment; filename="0001-Optimize-python-info-docstring-p.patch"
Content-Transfer-Encoding: 7bit

>From 749afdc3cfe781f785c95501aa58472c78b71bf7 Mon Sep 17 00:00:00 2001
From: kobarity
Date: Mon, 22 May 2023 00:01:16 +0900
Subject: [PATCH] Optimize python-info-docstring-p

* lisp/progmodes/python.el (python-info-docstring-p): Add condition
that is not inside paren except for "(". (Bug#63622)
---
 lisp/progmodes/python.el | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index 6fc05b246a6..af48d852182 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -6021,6 +6021,9 @@ python-info-docstring-p
           (re (concat "[uU]?[rR]?"
                       (python-rx string-delimiter))))
       (when (and
+             (not (and syntax-ppss
+                       (when-let ((pos (nth 1 syntax-ppss)))
+                         (/= (char-after pos) ?\())))
              (not (python-info-assignment-statement-p))
              (looking-at-p re)
              ;; Allow up to two consecutive docstrings only.
-- 2.34.1 --Multipart_Mon_May_22_00:44:02_2023-1-- From debbugs-submit-bounces@debbugs.gnu.org Mon May 22 10:58:52 2023 Received: (at 63622) by debbugs.gnu.org; 22 May 2023 14:58:52 +0000 Received: from localhost ([127.0.0.1]:35739 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q16zs-0003wn-IC for submit@debbugs.gnu.org; Mon, 22 May 2023 10:58:52 -0400 Received: from mail-pf1-f179.google.com ([209.85.210.179]:42440) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q16zq-0003wW-3b for 63622@debbugs.gnu.org; Mon, 22 May 2023 10:58:50 -0400 Received: by mail-pf1-f179.google.com with SMTP id d2e1a72fcca58-64d24136685so2573690b3a.1 for <63622@debbugs.gnu.org>; Mon, 22 May 2023 07:58:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684767524; x=1687359524; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:from:to:cc:subject:date:message-id:reply-to; bh=DYlybTlfKlIDPe14xLE6Nq4QZCM7g0qSn8N58iAGpMg=; b=IDEhxXZ6vfJXGlLcTanclxVqJIuhzv+GmjeC6sALpHuIzXbO3wuB83eZ3MPlpGKS+l wQRJeMw64ub5qbCnoD6f2a9/j7WT8o84wkS1e+a4eqvjrYWTFM44loPGK9/+s9v9jKxd NASi63p1GJgs1WWEYVaO5/4g4+kSJP9bN1i3coF8Q/jvVTVcUVsXTT56Eg9LDLCEAY4p a+VoNDFaZhEx4PTjpPoMje7cpU9dE9xUmvUkpxBwZeP5kSUeA8MXB0VLaCK20KK+kjZy GId3Yy4FCJ2/R4tQiXnPbxe/mlJnLDOaN+xewNsLZxpLoD3RACgjjSsLrp5T3cQpQIRc +4Zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684767524; x=1687359524; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=DYlybTlfKlIDPe14xLE6Nq4QZCM7g0qSn8N58iAGpMg=; b=SPoivCcIZLh82TkN8cDkhshqshjaahkqfODb4eRrar7nPhRetRe7adhjcPORiblXd2 qsWdI+DyC/txsOtMTSvyZEcH6K8c0vvHW54oMDlDEi/6gtK51NkCcu/j1UjqZFfNz2XU iQeyA6hDBG4MOYrjKeB+RVnC/6eVnKi1go5WVh0LKs1Z7Y8ULX/R23NR+Rtjbn57Ai+K 8IzdRh07oDzyPKnFjc4cm5qFet2J6iMRud+lDPstV35/xLSVMtZ8EkKOZBFRCBLGPxRB 
Io21Yjxn1+tPXaqf3xuH9a4G0NxwlmhlDy6X2obgbgGbl4Zu4e5I+sj7kk24Ii3W7vaY /R2Q== X-Gm-Message-State: AC+VfDxIa/l6ME6vKqzUaSKL+pICq+I4QCu0JruqIN8giBSAqCm35WVt ZQuwzAUeXNp3s3KE15qW+uo= X-Google-Smtp-Source: ACHHUZ6fwQwUrkgZHXderd2SUBCeS0xQHKNSWDzHTFFZLl+LCzqr95OQG40NJEywUkLIiUuGowlw7w== X-Received: by 2002:a05:6a20:7d8d:b0:105:3e47:7504 with SMTP id v13-20020a056a207d8d00b001053e477504mr11184275pzj.11.1684767523803; Mon, 22 May 2023 07:58:43 -0700 (PDT) Received: from localhost (58x12x133x161.ap58.ftth.ucom.ne.jp. [58.12.133.161]) by smtp.gmail.com with ESMTPSA id z8-20020aa791c8000000b0063b8ce0e860sm4276231pfa.21.2023.05.22.07.58.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 May 2023 07:58:43 -0700 (PDT) Date: Mon, 22 May 2023 23:58:37 +0900 Message-ID: From: kobarity To: Stefan Monnier Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock In-Reply-To: References: <83zg5yqkr1.fsf@gnu.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/30.0.50 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/mixed; boundary="Multipart_Mon_May_22_23:58:36_2023-1" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 63622 Cc: Tom Gillespie , Eli Zaretskii , 63622@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --Multipart_Mon_May_22_23:58:36_2023-1 Content-Type: text/plain; charset=US-ASCII Stefan Monnier wrote: > FWIW, I recommend against using > `font-lock-extend-after-change-region-function`. 
> > You can use `font-lock-extend-region-functions` instead (which performs > the region-extension right before fontifying a chunk) to avoid this problem. > [ It won't help with the current performance issue, tho. ] Thank you for your advice. Does the attached patch seem reasonable? --Multipart_Mon_May_22_23:58:36_2023-1 Content-Type: application/octet-stream; type=patch; name="0001-Use-font-lock-extend-region-functions-in-python-mode.patch" Content-Disposition: attachment; filename="0001-Use-font-lock-extend-region-functions-in-python-mode.patch" Content-Transfer-Encoding: 7bit >From fb899c0d9596c5912db1dc2f518ff1bab00c9b0d Mon Sep 17 00:00:00 2001 From: kobarity Date: Mon, 22 May 2023 23:42:28 +0900 Subject: [PATCH] Use font-lock-extend-region-functions in python-mode * lisp/progmodes/python.el (python-font-lock-extend-region): Change arguments and return value for python-font-lock-extend-region. (python-mode): Change from font-lock-extend-after-change-region-function to python-font-lock-extend-region. (Bug#63622) --- lisp/progmodes/python.el | 30 ++++++++++++++++-------------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el index 6fc05b246a6..03d57d0b378 100644 --- a/lisp/progmodes/python.el +++ b/lisp/progmodes/python.el @@ -869,18 +869,20 @@ python-font-lock-keywords Which one will be chosen depends on the value of `font-lock-maximum-decoration'.") -(defun python-font-lock-extend-region (beg end _old-len) - "Extend font-lock region given by BEG and END to statement boundaries." - (save-excursion - (save-match-data - (goto-char beg) - (python-nav-beginning-of-statement) - (setq beg (point)) - (goto-char end) - (python-nav-end-of-statement) - (setq end (point)) - (cons beg end)))) - +(defvar font-lock-beg) +(defvar font-lock-end) +(defun python-font-lock-extend-region () + "Extend font-lock region to statement boundaries." 
+ (let ((beg font-lock-beg) + (end font-lock-end)) + (goto-char beg) + (python-nav-beginning-of-statement) + (setq font-lock-beg (point)) + (goto-char end) + (python-nav-end-of-statement) + (when (not (eobp)) (forward-char)) + (setq font-lock-end (point)) + (or (/= beg font-lock-beg) (/= end font-lock-end)))) (defconst python-syntax-propertize-function (syntax-propertize-rules @@ -6708,8 +6710,8 @@ python-mode nil nil nil nil (font-lock-syntactic-face-function . python-font-lock-syntactic-face-function) - (font-lock-extend-after-change-region-function - . python-font-lock-extend-region))) + (font-lock-extend-region-functions + . (python-font-lock-extend-region)))) (setq-local syntax-propertize-function python-syntax-propertize-function) (setq-local imenu-create-index-function -- 2.34.1 --Multipart_Mon_May_22_23:58:36_2023-1-- From debbugs-submit-bounces@debbugs.gnu.org Mon May 22 11:08:49 2023 Received: (at 63622) by debbugs.gnu.org; 22 May 2023 15:08:49 +0000 Received: from localhost ([127.0.0.1]:35826 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q179U-0004FM-Nh for submit@debbugs.gnu.org; Mon, 22 May 2023 11:08:48 -0400 Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:18549) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q179O-0004F4-Qi for 63622@debbugs.gnu.org; Mon, 22 May 2023 11:08:47 -0400 Received: from pmg1.iro.umontreal.ca (localhost.localdomain [127.0.0.1]) by pmg1.iro.umontreal.ca (Proxmox) with ESMTP id 9604F100117; Mon, 22 May 2023 11:08:37 -0400 (EDT) Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg1.iro.umontreal.ca (Proxmox) with ESMTP id 8EE451000BE; Mon, 22 May 2023 11:08:36 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1684768116; bh=YOjFiDW5YebMANbf5RN67kvX0BrTWCJUbGGyrLDEw+Y=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=nFeih6/D40QvP6/5y3Zwg5vQEf9kQPa6OVgafm0UD/pz+ssxzxXBg+tpfoUQyrHXn 
z+DvtWduIZxRYbYa3zX64mOmrGp5Ngkw7b8bXTreVrDyp8NcPUK4uXYasBfRiYxofA 5jEBU/KfDHVm/SZyUcPO/6Qis59I1AhApMycc0tFj/KYTdMuBFjg39XEgttjbVhkZ2 R/JTbR+nFOykaAsze5hnSiwAByfTcRQ48dKYzuPBBg9tbigUlVHxksNDwHh1qKsxy2 RUTI2oxONXCPYIe3sWQTlesQmuxWrDf7gLvwls9MFG0BAof5ha0SaJ0rRGQ1M1vtFE F6j6UnRHiolgA== Received: from pastel (unknown [45.72.217.176]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id 4EBD41203DF; Mon, 22 May 2023 11:08:36 -0400 (EDT) From: Stefan Monnier To: kobarity Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock In-Reply-To: (kobarity@gmail.com's message of "Mon, 22 May 2023 23:58:37 +0900") Message-ID: References: <83zg5yqkr1.fsf@gnu.org> Date: Mon, 22 May 2023 11:08:35 -0400 User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain X-SPAM-INFO: Spam detection results: 0 ALL_TRUSTED -1 Passed through trusted hosts only via SMTP AWL -0.179 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DKIM_SIGNED 0.1 Message has a DKIM or DK signature, not necessarily valid DKIM_VALID -0.1 Message has at least one valid DKIM or DK signature DKIM_VALID_AU -0.1 Message has a valid DKIM or DK signature from author's domain DKIM_VALID_EF -0.1 Message has a valid DKIM or DK signature from envelope-from domain T_SCC_BODY_TEXT_LINE -0.01 - X-SPAM-LEVEL: X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 63622 Cc: Tom Gillespie , Eli Zaretskii , 63622@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > @@ -869,18 +869,20 @@ python-font-lock-keywords > Which one will be chosen depends on the value of > `font-lock-maximum-decoration'.") > > -(defun python-font-lock-extend-region (beg end _old-len) > - "Extend font-lock region given 
by BEG and END to statement boundaries." > - (save-excursion > - (save-match-data > - (goto-char beg) > - (python-nav-beginning-of-statement) > - (setq beg (point)) > - (goto-char end) > - (python-nav-end-of-statement) > - (setq end (point)) > - (cons beg end)))) > - > +(defvar font-lock-beg) > +(defvar font-lock-end) > +(defun python-font-lock-extend-region () > + "Extend font-lock region to statement boundaries." > + (let ((beg font-lock-beg) > + (end font-lock-end)) > + (goto-char beg) > + (python-nav-beginning-of-statement) > + (setq font-lock-beg (point)) > + (goto-char end) > + (python-nav-end-of-statement) > + (when (not (eobp)) (forward-char)) > + (setq font-lock-end (point)) > + (or (/= beg font-lock-beg) (/= end font-lock-end)))) > > (defconst python-syntax-propertize-function > (syntax-propertize-rules Looks fine to me (I assume you've checked that it behaves about as well as the previous code). > @@ -6708,8 +6710,8 @@ python-mode > nil nil nil nil > (font-lock-syntactic-face-function > . python-font-lock-syntactic-face-function) > - (font-lock-extend-after-change-region-function > - . python-font-lock-extend-region))) > + (font-lock-extend-region-functions > + . (python-font-lock-extend-region)))) > (setq-local syntax-propertize-function > python-syntax-propertize-function) > (setq-local imenu-create-index-function This is bound to break in some cases. Please use `add-hook` instead. 
Stefan From debbugs-submit-bounces@debbugs.gnu.org Tue May 23 11:45:38 2023 Received: (at 63622) by debbugs.gnu.org; 23 May 2023 15:45:38 +0000 Received: from localhost ([127.0.0.1]:40193 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q1UCg-0007kr-8R for submit@debbugs.gnu.org; Tue, 23 May 2023 11:45:38 -0400 Received: from mail-pf1-f180.google.com ([209.85.210.180]:51481) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q1UCe-0007kd-R1 for 63622@debbugs.gnu.org; Tue, 23 May 2023 11:45:37 -0400 Received: by mail-pf1-f180.google.com with SMTP id d2e1a72fcca58-64d3491609fso3391330b3a.3 for <63622@debbugs.gnu.org>; Tue, 23 May 2023 08:45:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684856731; x=1687448731; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:from:to:cc:subject:date:message-id:reply-to; bh=CdCzWRU9zoqvRe1lvPn3XEQkhdkeYPzYaQ8go1fn7d0=; b=Xj6WBXtIMQE+jJvLXErwHAk67kCOA0/fLvF6DewDGi5Zc7apgzUoMfNhrEIKAhvUYD jvsoPNQPRX2xEWR9/coT5QreNap/hcKDWKxKcYAUvBg5HRvuaA3JxMEho8vpTXSdkyuh 5ntbKi5fOBOkLxXcVf3cOuM97hn9im3o7Ftv19hFRL2dHB79BO4jfwgibCYN3wLLT21V nkqpeZWeMMPUmzwjxlPVYbyhXaVrUyUbOCwSE9sPZayaOGPfuRXj62WseIuYKD5MrAo8 RE6qrV+nliWP/aQI4Ya3joE+N+AUM6Owmakrn8kKMDjKCQFIHTs+OVWVLIbGwgljuJQZ 1RuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684856731; x=1687448731; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=CdCzWRU9zoqvRe1lvPn3XEQkhdkeYPzYaQ8go1fn7d0=; b=lFniQsImalaappoLprXzbD0htwkA5cavSgqwe8CNLdiwxJhB4eT84LVarxD8Kafj8v VQ+3HAt6GBUsFDyTj59KMLlZOl9BECbLf6JgGRN05EyEExBvSaXHYCM+fRsuTS+3pp9n +M6kKTnylFfatvkvYdT2ynXoRu6KQnW2ph69trHA6Itb3yL5ymtzT7UoGZU6GtS9lr4U pcHDBcgI5zDtE5IRVaKMgidRtHWa4tmattG12bE/qgDL2kymrk5qYuNWzBgBthm4bZWw 
6/0oJvCUB/q65cOANii/OvNhN3C+fJEQvKkHha1shiLGqtLjg8y3+MwI4QmQ7iDqz502 HXzA== X-Gm-Message-State: AC+VfDzjW01JiW8A4KwTrtJllCOILkndVqWJ6hBJ4ra633V6b/2tffDp 3NrXo13n4CcHa4xdV55z/ZA= X-Google-Smtp-Source: ACHHUZ77ebIzeCRli82bvmPq3FmxMHcI0m00cYQy0yyVmEaRUNVOq/BtSPikkIJ+cQdfPa9PY0fK0w== X-Received: by 2002:a05:6a20:7d81:b0:10c:8e4e:f303 with SMTP id v1-20020a056a207d8100b0010c8e4ef303mr2514746pzj.44.1684856730778; Tue, 23 May 2023 08:45:30 -0700 (PDT) Received: from localhost (58x12x133x161.ap58.ftth.ucom.ne.jp. [58.12.133.161]) by smtp.gmail.com with ESMTPSA id h13-20020aa786cd000000b0064d413ca7desm6157058pfo.171.2023.05.23.08.45.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 May 2023 08:45:29 -0700 (PDT) Date: Wed, 24 May 2023 00:45:27 +0900 Message-ID: From: kobarity To: Stefan Monnier Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock In-Reply-To: References: <83zg5yqkr1.fsf@gnu.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/30.0.50 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/mixed; boundary="Multipart_Wed_May_24_00:45:26_2023-1" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 63622 Cc: Tom Gillespie , Eli Zaretskii , 63622@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --Multipart_Wed_May_24_00:45:26_2023-1 Content-Type: text/plain; charset=US-ASCII Stefan Monnier wrote: > > @@ -6708,8 +6710,8 @@ python-mode > > nil nil nil nil > > (font-lock-syntactic-face-function > > . python-font-lock-syntactic-face-function) > > - (font-lock-extend-after-change-region-function > > - . 
python-font-lock-extend-region))) > > + (font-lock-extend-region-functions > > + . (python-font-lock-extend-region)))) > > (setq-local syntax-propertize-function > > python-syntax-propertize-function) > > (setq-local imenu-create-index-function > > This is bound to break in some cases. Please use `add-hook` instead. Thanks. I also revised python-font-lock-extend-region. The attached 0001-Use-font-lock-extend-region-functions-in-python-mode.patch is the revised patch. As for the performance degradation, I am almost certain that it is a bug in python-info-docstring-p and/or python-rx string-delimiter. python-info-docstring-p uses the following regex as one of the conditions to determine if it is a docstring. (re (concat "[uU]?[rR]?" (python-rx string-delimiter)))) One of the problems is that string-delimiter matches even if there is a single character preceding the string. For example, string-delimiter matches both x" and 1". It might be reasonable to match r" or b", for example, because they are valid string prefixes in Python. However, there is no reason to match x" or 1". Besides, python-info-docstring-p explicitly states "[uU]?[rR]?" as above. So (python-rx string-delimiter) in the above code should only match string delimiters without prefixes. The current python-info-docstring-p incorrectly considers code like [""] or {""} to be a docstring. The performance problem in the example shown by Tom can be resolved by modifying the above code as follows: (re (concat "[uU]?[rR]?" (rx (or "\"\"\"" "\"" "'''" "'"))))) It breaks some ERTs, but I think we should fix the ERTs. However, there seems to be another problem in python-info-docstring-p. It intentionally considers contiguous strings as docstrings, as in the ERT python-info-docstring-p-1: ''' Module Docstring Django style. ''' u'''Additional module docstring.''' '''Not a module docstring.''' However, as far as I have tried with Python 3 and Python 2.7, this is not correct.
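One way to check this on Python 3, as a minimal sketch: it uses only the standard `ast` module, and the helper name `module_docstring` is made up for illustration. It parses the same three strings as the ERT and asks Python which text it treats as the module docstring.

```python
import ast

# The three adjacent strings from the ERT python-info-docstring-p-1.
SOURCE = """'''
Module Docstring Django style.
'''
u'''Additional module docstring.'''
'''Not a module docstring.'''
"""


def module_docstring(source):
    """Return the cleaned docstring Python assigns to a module, or None."""
    return ast.get_docstring(ast.parse(source))


# Only the first string literal should be reported.
print(repr(module_docstring(SOURCE)))
```

If only the first string comes back, the adjacent `u'''...'''` literal is an ordinary expression statement rather than a second docstring.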
Therefore, I feel it is better to mark the ERTs as expected failures than to modify it at this stage. The attached 0001-Fix-python-info-docstring-p.patch is the patch to do this. --Multipart_Wed_May_24_00:45:26_2023-1 Content-Type: application/octet-stream; type=patch; name="0001-Use-font-lock-extend-region-functions-in-python-mode.patch" Content-Disposition: attachment; filename="0001-Use-font-lock-extend-region-functions-in-python-mode.patch" Content-Transfer-Encoding: 7bit >From 7ea71e0c07a10959a93e4a1e9a219ad8f6810d22 Mon Sep 17 00:00:00 2001 From: kobarity Date: Tue, 23 May 2023 21:59:18 +0900 Subject: [PATCH] Use font-lock-extend-region-functions in python-mode * lisp/progmodes/python.el (python-font-lock-extend-region): Change arguments and return value for font-lock-extend-region-functions. (python-mode): Change from font-lock-extend-after-change-region-function to font-lock-extend-region-functions. (Bug#63622) --- lisp/progmodes/python.el | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el index 6fc05b246a6..b363ef874be 100644 --- a/lisp/progmodes/python.el +++ b/lisp/progmodes/python.el @@ -869,18 +869,22 @@ python-font-lock-keywords Which one will be chosen depends on the value of `font-lock-maximum-decoration'.") -(defun python-font-lock-extend-region (beg end _old-len) - "Extend font-lock region given by BEG and END to statement boundaries." - (save-excursion - (save-match-data - (goto-char beg) - (python-nav-beginning-of-statement) - (setq beg (point)) - (goto-char end) - (python-nav-end-of-statement) - (setq end (point)) - (cons beg end)))) - +(defvar font-lock-beg) +(defvar font-lock-end) +(defun python-font-lock-extend-region () + "Extend font-lock region to statement boundaries." 
+ (let ((beg font-lock-beg) + (end font-lock-end)) + (goto-char beg) + (python-nav-beginning-of-statement) + (beginning-of-line) + (when (< (point) beg) + (setq font-lock-beg (point))) + (goto-char end) + (python-nav-end-of-statement) + (when (< end (point)) + (setq font-lock-end (point))) + (or (/= beg font-lock-beg) (/= end font-lock-end)))) (defconst python-syntax-propertize-function (syntax-propertize-rules @@ -6707,9 +6711,9 @@ python-mode `(,python-font-lock-keywords nil nil nil nil (font-lock-syntactic-face-function - . python-font-lock-syntactic-face-function) - (font-lock-extend-after-change-region-function - . python-font-lock-extend-region))) + . python-font-lock-syntactic-face-function))) + (add-hook 'font-lock-extend-region-functions + #'python-font-lock-extend-region nil t) (setq-local syntax-propertize-function python-syntax-propertize-function) (setq-local imenu-create-index-function -- 2.34.1 --Multipart_Wed_May_24_00:45:26_2023-1 Content-Type: application/octet-stream; type=patch; name="0001-Fix-python-info-docstring-p.patch" Content-Disposition: attachment; filename="0001-Fix-python-info-docstring-p.patch" Content-Transfer-Encoding: 7bit >From 42dbb7ddc177e98a5b88d51cb51ce993888e5f20 Mon Sep 17 00:00:00 2001 From: kobarity Date: Wed, 24 May 2023 00:16:50 +0900 Subject: [PATCH] Fix python-info-docstring-p * lisp/progmodes/python.el (python-info-docstring-p): Stop using python-rx string-delimiter. * test/lisp/progmodes/python-tests.el (python-font-lock-escape-sequence-bytes-newline), (python-font-lock-escape-sequence-hex-octal), (python-font-lock-escape-sequence-unicode), (python-font-lock-raw-escape-sequence): Mark as expected failures until another bug in python-info-docstring-p is corrected. 
(Bug#63622) --- lisp/progmodes/python.el | 2 +- test/lisp/progmodes/python-tests.el | 4 ++++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el index 6fc05b246a6..2f65f389949 100644 --- a/lisp/progmodes/python.el +++ b/lisp/progmodes/python.el @@ -6019,7 +6019,7 @@ python-info-docstring-p (indentation (current-indentation)) (backward-sexp-point) (re (concat "[uU]?[rR]?" - (python-rx string-delimiter)))) + (rx (or "\"\"\"" "\"" "'''" "'"))))) (when (and (not (python-info-assignment-statement-p)) (looking-at-p re) diff --git a/test/lisp/progmodes/python-tests.el b/test/lisp/progmodes/python-tests.el index 50153e66da5..bc3b574e81f 100644 --- a/test/lisp/progmodes/python-tests.el +++ b/test/lisp/progmodes/python-tests.el @@ -729,6 +729,7 @@ python-font-lock-escape-sequence-multiline-string (845 . font-lock-string-face) (886)))) (ert-deftest python-font-lock-escape-sequence-bytes-newline () + :expected-result :failed (python-tests-assert-faces "b'\\n' b\"\\n\"" @@ -741,6 +742,7 @@ python-font-lock-escape-sequence-bytes-newline (11 . font-lock-doc-face)))) (ert-deftest python-font-lock-escape-sequence-hex-octal () + :expected-result :failed (python-tests-assert-faces "b'\\x12 \\777 \\1\\23' '\\x12 \\777 \\1\\23'" @@ -761,6 +763,7 @@ python-font-lock-escape-sequence-hex-octal (36 . font-lock-doc-face)))) (ert-deftest python-font-lock-escape-sequence-unicode () + :expected-result :failed (python-tests-assert-faces "b'\\u1234 \\U00010348 \\N{Plus-Minus Sign}' '\\u1234 \\U00010348 \\N{Plus-Minus Sign}'" @@ -775,6 +778,7 @@ python-font-lock-escape-sequence-unicode (80 . 
font-lock-doc-face)))) (ert-deftest python-font-lock-raw-escape-sequence () + :expected-result :failed (python-tests-assert-faces "rb'\\x12 \123 \\n' r'\\x12 \123 \\n \\u1234 \\U00010348 \\N{Plus-Minus Sign}'" -- 2.34.1 --Multipart_Wed_May_24_00:45:26_2023-1-- From debbugs-submit-bounces@debbugs.gnu.org Tue May 23 13:08:41 2023 Received: (at 63622) by debbugs.gnu.org; 23 May 2023 17:08:41 +0000 Received: from localhost ([127.0.0.1]:40368 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q1VV2-0001xa-Pc for submit@debbugs.gnu.org; Tue, 23 May 2023 13:08:40 -0400 Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:58416) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q1VV1-0001xM-2C for 63622@debbugs.gnu.org; Tue, 23 May 2023 13:08:39 -0400 Received: from pmg3.iro.umontreal.ca (localhost [127.0.0.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 87DF7443807; Tue, 23 May 2023 13:08:33 -0400 (EDT) Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 5F35D443804; Tue, 23 May 2023 13:08:32 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1684861712; bh=JYeETzYyg4dCE+TQebpYTFJTlEj4Hwq3aZwwp2qBMLE=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=iDGnotbN/dRnDr67SZwmZscuI0trFX7ASOXt+2uWZPQSXY5Y8zlHTfsSHgl5BVEPh IIQq1iXNTGspM9nJeNsFHKpqpobcsE4yIwbiMOrSHvPtaYI9OCFlkzTuHrnKJxM/PF zEx7USjxTkzQkcb+0ZThxCZCkE5vpdu/VbpLnWfiBeussbWVZ9nFSq4THtbGhTNu65 ZsEKprruks4evFnA0S87Qz85/YroBfJjyVcdsf7oUI0qNA22ZqpcXAcDhgL7PVI3fF ib/7qmP7XeJhrrc3LDcgaouD++y3QgT87r4T4Y7TT38rQslWMn0q+EvcJpnzxI/Ndp 58JY3dHC1Wd7g== Received: from lechazo (lechon.iro.umontreal.ca [132.204.27.242]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id 393AB120434; Tue, 23 May 2023 13:08:32 -0400 (EDT) From: Stefan Monnier To: kobarity Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock 
In-Reply-To: (kobarity@gmail.com's message of "Wed, 24 May 2023 00:45:27 +0900") Message-ID: References: <83zg5yqkr1.fsf@gnu.org> Date: Tue, 23 May 2023 13:08:25 -0400 User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain X-SPAM-INFO: Spam detection results: 0 ALL_TRUSTED -1 Passed through trusted hosts only via SMTP AWL 0.114 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DKIM_SIGNED 0.1 Message has a DKIM or DK signature, not necessarily valid DKIM_VALID -0.1 Message has at least one valid DKIM or DK signature DKIM_VALID_AU -0.1 Message has a valid DKIM or DK signature from author's domain DKIM_VALID_EF -0.1 Message has a valid DKIM or DK signature from envelope-from domain T_SCC_BODY_TEXT_LINE -0.01 - X-SPAM-LEVEL: X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 63622 Cc: Tom Gillespie , Eli Zaretskii , 63622@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Thanks. I also revised python-font-lock-extend-region. The attached > 0001-Use-font-lock-extend-region-functions-in-python-mode.patch is the > revised patch. LGTM, thank you. I'm not sufficiently familiar with Python to have an opinion on the rest, tho your analysis sounds convincing. 
Stefan From debbugs-submit-bounces@debbugs.gnu.org Tue May 23 15:04:35 2023 Received: (at 63622) by debbugs.gnu.org; 23 May 2023 19:04:35 +0000 Received: from localhost ([127.0.0.1]:40429 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q1XJC-0007bG-R8 for submit@debbugs.gnu.org; Tue, 23 May 2023 15:04:35 -0400 Received: from mail-yb1-f178.google.com ([209.85.219.178]:60706) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1q1XJB-0007b3-6D for 63622@debbugs.gnu.org; Tue, 23 May 2023 15:04:33 -0400 Received: by mail-yb1-f178.google.com with SMTP id 3f1490d57ef6-b9daef8681fso62885276.1 for <63622@debbugs.gnu.org>; Tue, 23 May 2023 12:04:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684868667; x=1687460667; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=oEsi5K/6dzDR5DZn8gHg6hwKecUE2AImvJiHj6aEMiY=; b=ACL0vgmgN09q71OttJ1QC+b5Gr51clHg6eY5YL2I6S/u2qaNCA/QIBF/wCKC7HYzQf HgUGtnAioIFRtzb3evwvruNsl8MN05YrrsxuydUwCd6dNWzyanresJBPWzprzdQUlyqm D4EIx3yCHt+dWXirddv6BxX7T++SVlDmopJe0P2ch7D22a3HZ5LOV4VRiwtvFR088A5k t0zAdC1LWhDs8TFz9/h19+bUJLmnrsC9oQgfCKQEqlHtVY7pdGbjie3nvxZDmFXrJ1yr BG+Zx7cbdr3vn6+KWcHfxhkQgNLCsvOQKkEhWAIgMHVep44EEsyj5WG+33TrK1L5rH/b 9oGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684868667; x=1687460667; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=oEsi5K/6dzDR5DZn8gHg6hwKecUE2AImvJiHj6aEMiY=; b=jUO7/DeJrDvZgqdC0ptUfTOpEi2Dzo6Og3I1Cail1vUPHFeoKOQvt33c8ZSakKVXj5 /h7aPvlfxHey3YRpgVbFxbhi89uV44nLu+uSrDRKi3B4RLi8sjIkiqOkXnbDgVO/3Qgf jGC0OZ7cxg8nheee0Ob0I5ruppCaLVsPJFP2L2tGyu8prntRsiGWtXALueCaqvuyTrJa /hObb+CSAVgtjMtFiw5qauWgMnY4poFQoozwp3oQDYBWaY4Bpy2PNdy5ZyV/S4tatS8P yGFiYgKK/E02zRJHawLU51ocJpbGDEII1SRb7JClzuKuffCr9hBqLmr71PCa5w2L2xjq s3gg== 
X-Gm-Message-State: AC+VfDwHeHeEVUZ0Smq6J/lIVt7LBSZQC9qxAR+XLLoZNxvHsfn9YH59 1FVA+2cqUqFCm9wXx12Kl9k1W/mvxevAXBPNGUQ= X-Google-Smtp-Source: ACHHUZ5wTQkOxEdjCkqK7c5vmPXc7MKgfsZgeyfhYGO8OM+2hM2tiEDfVz+wDQWdbN6qPmgMHWDc3zDYy33TBAI+pP4= X-Received: by 2002:a25:ea44:0:b0:ba2:6aea:2ba with SMTP id o4-20020a25ea44000000b00ba26aea02bamr13353230ybe.23.1684868667349; Tue, 23 May 2023 12:04:27 -0700 (PDT) MIME-Version: 1.0 References: <83zg5yqkr1.fsf@gnu.org> In-Reply-To: From: Tom Gillespie Date: Tue, 23 May 2023 12:04:16 -0700 Message-ID: Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock To: kobarity Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 63622 Cc: Eli Zaretskii , Stefan Monnier , 63622@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) I have tested the two patches and the performance is dramatically improved. > One of the problems is that string-delimiter matches even if there is > a single character preceding the string. I agree with the assessment about docstring syntax. The fix in Fix-python- is ok and fixes the font-locking performance in general. The general issue with the definition of string-delimiter is something that might need to be considered in the future if another perf issue crops up, but for now I think it is safe to leave it alone since docstring detection was the only place where it seems to be causing issues. While we're here I think [fF]? should probably be included in the prefix since format strings are also valid docstrings. Thank you very much for digging into this! 
Tom

From debbugs-submit-bounces@debbugs.gnu.org Tue May 23 19:21:22 2023
Date: Wed, 24 May 2023 08:21:11 +0900
From: kobarity
To: Tom Gillespie
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
References: <83zg5yqkr1.fsf@gnu.org>
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Cc: Eli Zaretskii, Stefan Monnier, 63622@debbugs.gnu.org

Hi Stefan and Tom,

Thank you very much for your confirmation.

Tom Gillespie wrote:
> While we're here I think [fF]? should probably be included in the prefix since
> format strings are also valid docstrings.

Is that so? As far as I have tried, f-string does not seem to be a
docstring.

Python 3.11.3 (main, May 2 2023, 21:12:31) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def f1():
...     "docstring"
...
>>> f1.__doc__
'docstring'
>>> def f2():
...     f"not docstring"
...
>>> f2.__doc__
>>> a = 1
>>> def f3():
...     f"not docstring {a}"
...
>>> f3.__doc__
>>>

From debbugs-submit-bounces@debbugs.gnu.org Tue May 23 19:42:16 2023
Content-Type: text/plain; charset=utf-8
From: Ruijie Yu
Mime-Version: 1.0 (1.0)
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
Date: Wed, 24 May 2023 07:41:52 +0800
Message-Id: <35B358E3-13AD-41F7-8060-2B1B1968DAE2@netyu.xyz>
To: kobarity
Cc: Tom Gillespie, Eli Zaretskii, Stefan Monnier, 63622@debbugs.gnu.org

On May 23, 2023, at 23:46, kobarity wrote:
>
> The performance problem in the example shown by Tom can be resolved by
> modifying the above code as follows:
>
>      (re (concat "[uU]?[rR]?"
>                  (rx (or "\"\"\"" "\"" "'''" "'")))))

I didn’t read the context for this snippet, but isn’t it sufficient to
match for only one single-quote and double-quote, instead of also
matching for the triple (multiline) counterparts?

From debbugs-submit-bounces@debbugs.gnu.org Wed May 24 11:05:22 2023
Date: Thu, 25 May 2023 00:05:09 +0900
From: kobarity
To: Ruijie Yu
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
In-Reply-To: <35B358E3-13AD-41F7-8060-2B1B1968DAE2@netyu.xyz>
References: <35B358E3-13AD-41F7-8060-2B1B1968DAE2@netyu.xyz>
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: multipart/mixed; boundary="Multipart_Thu_May_25_00:05:06_2023-1"
Cc: Tom Gillespie, Eli Zaretskii, Stefan Monnier, 63622@debbugs.gnu.org

--Multipart_Thu_May_25_00:05:06_2023-1
Content-Type: text/plain; charset=ISO-2022-JP

Ruijie Yu wrote:
> On May 23, 2023, at 23:46, kobarity wrote:
> > The performance problem in the example shown by Tom can be resolved by
> > modifying the above code as follows:
> >
> >      (re (concat "[uU]?[rR]?"
> >                  (rx (or "\"\"\"" "\"" "'''" "'")))))
>
> I didn’t read the context for this snippet, but isn’t it sufficient to
> match for only one single-quote and double-quote, instead of also
> matching for the triple (multiline) counterparts?

You are right. I copied the above code from the definition of
python-rx string-delimiter. However, it was inside of the group
construct. As group capturing is not needed in
python-info-docstring-p, the regex can be simplified to:

     (re "[uU]?[rR]?[\"']"))

The same regex was used in another place in python-info-docstring-p.
So I fixed it too. Also I added a simple ERT to identify this fix.

I wrote:
> It breaks some ERTs, but I think we should fix the ERTs. However,
> there seems to be another problem in python-info-docstring-p. It
> intentionally considers contiguous strings as docstring as in the ERT
> python-info-docstring-p-1:
>
> '''
> Module Docstring Django style.
> '''
> u'''Additional module docstring.'''
> '''Not a module docstring.'''
>
> However, as far as I have tried with Python 3 and Python 2.7, this is
> not correct.

This was my misunderstanding. PEP-257
(https://www.python.org/dev/peps/pep-0257/#what-is-a-docstring)
clearly states:

#+begin_quote
String literals occurring elsewhere in Python code may also act as
documentation. They are not recognized by the Python bytecode compiler
and are not accessible as runtime object attributes (i.e. not assigned
to __doc__), but two types of extra docstrings may be extracted by
software tools:

1. String literals occurring immediately after a simple assignment at
   the top level of a module, class, or __init__ method are called
   “attribute docstrings”.
2. String literals occurring immediately after another docstring are
   called “additional docstrings”.
#+end_quote

However, there still seems to be a bug in python-info-docstring-p.
Therefore, I would like to keep failing ERTs as expected fail at this
time. Maybe f-string can also be a docstring.

Attached is a series of patches that replace the previous patches.

--Multipart_Thu_May_25_00:05:06_2023-1
Content-Type: application/octet-stream; type=patch; name="0001-Fix-python-info-docstring-p.patch"
Content-Disposition: attachment; filename="0001-Fix-python-info-docstring-p.patch"
Content-Transfer-Encoding: 7bit

>From ce25adbbae75426ad568a7cd12e02cddc4e9e23e Mon Sep 17 00:00:00 2001
From: kobarity
Date: Wed, 24 May 2023 22:01:12 +0900
Subject: [PATCH 1/2] Fix python-info-docstring-p

* lisp/progmodes/python.el (python-info-docstring-p): Stop using
python-rx string-delimiter.

* test/lisp/progmodes/python-tests.el
(python-font-lock-escape-sequence-bytes-newline),
(python-font-lock-escape-sequence-hex-octal),
(python-font-lock-escape-sequence-unicode),
(python-font-lock-raw-escape-sequence): Mark as expected failures
until another bug in python-info-docstring-p is corrected.
(python-info-docstring-p-7): New test. (Bug#63622)
---
 lisp/progmodes/python.el            |  7 ++-----
 test/lisp/progmodes/python-tests.el | 16 ++++++++++++++++
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index 6fc05b246a6..032a17c52ff 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -6018,8 +6018,7 @@ python-info-docstring-p
     (let ((counter 1)
           (indentation (current-indentation))
           (backward-sexp-point)
-          (re (concat "[uU]?[rR]?"
-                      (python-rx string-delimiter))))
+          (re "[uU]?[rR]?[\"']"))
       (when (and
              (not (python-info-assignment-statement-p))
              (looking-at-p re)
@@ -6040,9 +6039,7 @@ python-info-docstring-p
                                                backward-sexp-point))
                                 (setq last-backward-sexp-point
                                       backward-sexp-point))
-                              (looking-at-p
-                               (concat "[uU]?[rR]?"
-                                       (python-rx string-delimiter))))
+                              (looking-at-p re))))
                   ;; Previous sexp was a string, restore point.
                   (goto-char backward-sexp-point)
                   (cl-incf counter))
diff --git a/test/lisp/progmodes/python-tests.el b/test/lisp/progmodes/python-tests.el
index 50153e66da5..cbaf5b698bd 100644
--- a/test/lisp/progmodes/python-tests.el
+++ b/test/lisp/progmodes/python-tests.el
@@ -729,6 +729,7 @@ python-font-lock-escape-sequence-multiline-string
      (845 . font-lock-string-face) (886))))
 
 (ert-deftest python-font-lock-escape-sequence-bytes-newline ()
+  :expected-result :failed
   (python-tests-assert-faces
    "b'\\n'
 b\"\\n\""
@@ -741,6 +742,7 @@ python-font-lock-escape-sequence-bytes-newline
      (11 . font-lock-doc-face))))
 
 (ert-deftest python-font-lock-escape-sequence-hex-octal ()
+  :expected-result :failed
   (python-tests-assert-faces
    "b'\\x12 \\777 \\1\\23'
 '\\x12 \\777 \\1\\23'"
@@ -761,6 +763,7 @@ python-font-lock-escape-sequence-hex-octal
      (36 . font-lock-doc-face))))
 
 (ert-deftest python-font-lock-escape-sequence-unicode ()
+  :expected-result :failed
   (python-tests-assert-faces
    "b'\\u1234 \\U00010348 \\N{Plus-Minus Sign}'
 '\\u1234 \\U00010348 \\N{Plus-Minus Sign}'"
@@ -775,6 +778,7 @@ python-font-lock-escape-sequence-unicode
      (80 . font-lock-doc-face))))
 
 (ert-deftest python-font-lock-raw-escape-sequence ()
+  :expected-result :failed
   (python-tests-assert-faces
    "rb'\\x12 \123 \\n'
 r'\\x12 \123 \\n \\u1234 \\U00010348 \\N{Plus-Minus Sign}'"
@@ -6598,6 +6602,18 @@ python-info-docstring-p-6
     (python-tests-look-at "'''Not a method docstring.'''")
     (should (not (python-info-docstring-p)))))
 
+(ert-deftest python-info-docstring-p-7 ()
+  "Test string in a dictionary."
+  (python-tests-with-temp-buffer
+      "
+{'Not a docstring': 1}
+'Also not a docstring'
+"
+    (python-tests-look-at "Not a docstring")
+    (should-not (python-info-docstring-p))
+    (python-tests-look-at "Also not a docstring")
+    (should-not (python-info-docstring-p))))
+
 (ert-deftest python-info-triple-quoted-string-p-1 ()
   "Test triple quoted string."
   (python-tests-with-temp-buffer
-- 
2.34.1

--Multipart_Thu_May_25_00:05:06_2023-1
Content-Type: application/octet-stream; type=patch; name="0002-Use-font-lock-extend-region-functions-in-python-mode.patch"
Content-Disposition: attachment; filename="0002-Use-font-lock-extend-region-functions-in-python-mode.patch"
Content-Transfer-Encoding: 7bit

>From f166993f4874d28d2f533bfbc3df1c7ca8d3aa2e Mon Sep 17 00:00:00 2001
From: kobarity
Date: Wed, 24 May 2023 22:06:51 +0900
Subject: [PATCH 2/2] Use font-lock-extend-region-functions in python-mode

* lisp/progmodes/python.el (python-font-lock-extend-region): Change
arguments and return value for font-lock-extend-region-functions.
(python-mode): Change from
font-lock-extend-after-change-region-function to
font-lock-extend-region-functions. (Bug#63622)
---
 lisp/progmodes/python.el | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index 032a17c52ff..adaeacc2ec1 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -869,18 +869,22 @@ python-font-lock-keywords
 Which one will be chosen depends on the value of
 `font-lock-maximum-decoration'.")
 
-(defun python-font-lock-extend-region (beg end _old-len)
-  "Extend font-lock region given by BEG and END to statement boundaries."
-  (save-excursion
-    (save-match-data
-      (goto-char beg)
-      (python-nav-beginning-of-statement)
-      (setq beg (point))
-      (goto-char end)
-      (python-nav-end-of-statement)
-      (setq end (point))
-      (cons beg end))))
-
+(defvar font-lock-beg)
+(defvar font-lock-end)
+(defun python-font-lock-extend-region ()
+  "Extend font-lock region to statement boundaries."
+  (let ((beg font-lock-beg)
+        (end font-lock-end))
+    (goto-char beg)
+    (python-nav-beginning-of-statement)
+    (beginning-of-line)
+    (when (< (point) beg)
+      (setq font-lock-beg (point)))
+    (goto-char end)
+    (python-nav-end-of-statement)
+    (when (< end (point))
+      (setq font-lock-end (point)))
+    (or (/= beg font-lock-beg) (/= end font-lock-end))))
 
 (defconst python-syntax-propertize-function
   (syntax-propertize-rules
@@ -6704,9 +6708,9 @@ python-mode
               `(,python-font-lock-keywords
                 nil nil nil nil
                 (font-lock-syntactic-face-function
-                 . python-font-lock-syntactic-face-function)
-                (font-lock-extend-after-change-region-function
-                 . python-font-lock-extend-region)))
+                 . python-font-lock-syntactic-face-function)))
+  (add-hook 'font-lock-extend-region-functions
+            #'python-font-lock-extend-region nil t)
   (setq-local syntax-propertize-function
               python-syntax-propertize-function)
   (setq-local imenu-create-index-function
-- 
2.34.1

--Multipart_Thu_May_25_00:05:06_2023-1--

From debbugs-submit-bounces@debbugs.gnu.org Wed May 24 14:53:21 2023
MIME-Version: 1.0
References: <83zg5yqkr1.fsf@gnu.org>
From: Tom Gillespie
Date: Wed, 24 May 2023 11:53:00 -0700
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
To: kobarity
Content-Type: text/plain; charset="UTF-8"
Cc: Eli Zaretskii, Stefan Monnier, 63622@debbugs.gnu.org

> Is that so? As far as I have tried, f-string does not seem to be a
> docstring.

Oops. Indeed you are correct. I remember this being an issue in one
of my files as I had to assign to __doc__ manually since of course
docstrings do not require the file to be run for the purposes of
extracting documentation.

Best!
Tom

From debbugs-submit-bounces@debbugs.gnu.org Fri May 26 06:01:11 2023
Date: Fri, 26 May 2023 13:01:33 +0300
Message-Id: <838rdbl8ci.fsf@gnu.org>
From: Eli Zaretskii
To: kobarity
In-Reply-To: (message from kobarity on Thu, 25 May 2023 00:05:09 +0900)
Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock
References: <35B358E3-13AD-41F7-8060-2B1B1968DAE2@netyu.xyz>
X-Debbugs-Envelope-To: 63622-done
Cc: ruijie@netyu.xyz, tgbugs@gmail.com, 63622-done@debbugs.gnu.org, monnier@iro.umontreal.ca

> Date: Thu, 25 May 2023 00:05:09 +0900
> From: kobarity
> Cc: Stefan Monnier,
>  Tom Gillespie,
>  Eli Zaretskii,
>  63622@debbugs.gnu.org
>
> Attached is a series of patches that replace the previous patches.

Thanks, installed on the emacs-29 branch, and closing the bug.

From unknown Thu Jun 19 14:02:31 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 23 Jun 2023 11:24:05 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
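
Two points from the thread can be checked outside Emacs: what the
simplified prefix regexp "[uU]?[rR]?[\"']" accepts, and that an f-string
is not stored in __doc__. The following is only a rough Python sketch for
illustration, not code from python.el. DOCSTRING_START and
starts_like_docstring are hypothetical names, and Python's re.match is
assumed to be a fair stand-in for looking-at-p at point.

```python
import re

# Rough Python port of the simplified Emacs regexp "[uU]?[rR]?[\"']":
# optional u/U, optional r/R, then one single or double quote.
# DOCSTRING_START and starts_like_docstring are hypothetical names,
# used here for illustration only.
DOCSTRING_START = re.compile(r"""[uU]?[rR]?["']""")


def starts_like_docstring(text):
    """Return True if text begins with an optionally prefixed quote."""
    # re.match anchors at the start of the string, much like
    # looking-at-p anchors at point.
    return DOCSTRING_START.match(text) is not None


def f1():
    "docstring"


a = 1


def f3():
    f"not docstring {a}"


print(starts_like_docstring("'''doc'''"))  # triple quote: first quote matches
print(starts_like_docstring('u"doc"'))     # u prefix is accepted
print(starts_like_docstring('f"doc"'))     # f prefix is not part of the regexp
print(f1.__doc__)                          # plain string literal is the docstring
print(f3.__doc__)                          # f-string is not stored in __doc__
```

Matching a single quote character is enough to recognize triple-quoted
strings as well, which is the simplification Ruijie Yu suggested and
kobarity adopted.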