#63622 - lisp/progmodes/python.el: performance regression introduced by multiline font-lock

GNU bug report logs - #63622
lisp/progmodes/python.el: performance regression introduced by multiline font-lock

Package: emacs;

Reported by: Tom Gillespie <tgbugs <at> gmail.com>

Date: Sun, 21 May 2023 03:15:02 UTC

Severity: normal

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

Message #32 received at 63622 <at> debbugs.gnu.org (full text, mbox):

From: kobarity <kobarity <at> gmail.com> To: Stefan Monnier <monnier <at> iro.umontreal.ca> Cc: Tom Gillespie <tgbugs <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>, 63622 <at> debbugs.gnu.org Subject: Re: bug#63622: lisp/progmodes/python.el: performance regression introduced by multiline font-lock Date: Wed, 24 May 2023 00:45:27 +0900

[Message part 1 (text/plain, inline)]

Stefan Monnier wrote: > > @@ -6708,8 +6710,8 @@ python-mode > > nil nil nil nil > > (font-lock-syntactic-face-function > > . python-font-lock-syntactic-face-function) > > - (font-lock-extend-after-change-region-function > > - . python-font-lock-extend-region))) > > + (font-lock-extend-region-functions > > + . (python-font-lock-extend-region)))) > > (setq-local syntax-propertize-function > > python-syntax-propertize-function) > > (setq-local imenu-create-index-function > > This is bound to break in some cases. Please use `add-hook` instead. Thanks. I also revised python-font-lock-extend-region. The attached 0001-Use-font-lock-extend-region-functions-in-python-mode.patch is the revised patch. As for the performance degradation, I am almost certain that it is a bug in python-info-docstring-p and/or python-rx string-delimiter. python-info-docstring-p uses the following regex as one of the conditions to determine if it is a docstring. (re (concat "[uU]?[rR]?" (python-rx string-delimiter)))) One of the problems is that string-delimiter matches even if there is a single character preceding the string. For example, string-delimiter matches both x" and 1". It might be reasonable to match r" or b", for example. Because they are correct string starter in Python. However, there is no reason to match x" or 1". Besides, python-info-docstring-p explicitly states "[uU]?[rR]?" as above. So (python-rx string-delimiter) in the above code should only match string delimiter without prefixes. Current python-info-docstring-p incorrectly considers a code like [""] or {""} to be a docstring. The performance problem in the example shown by Tom can be resolved by modifying the above code as follows: (re (concat "[uU]?[rR]?" (rx (or "\"\"\"" "\"" "'''" "'"))))) It breaks some ERTs, but I think we should fix the ERTs. However, there seems to be another problem in python-info-docstring-p. It intentionally considers contiguous strings as docstring as in the ERT python-info-docstring-p-1: ''' Module Docstring Django style. ''' u'''Additional module docstring.''' '''Not a module docstring.''' However, as far as I have tried with Python 3 and Python 2.7, this is not correct. Therefore, I feel it is better to mark the ERTs as expected failures than to modify it at this stage. The attached 0001-Fix-python-info-docstring-p.patch is the patch to do this.

This bug report was last modified 2 years and 49 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #63622 lisp/progmodes/python.el: performance regression introduced by multiline font-lock

GNU bug report logs - #63622
lisp/progmodes/python.el: performance regression introduced by multiline font-lock