GNU bug report logs -
#66363
gdb-control-commands-regexp issues
Previous Next
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 66363 in the body.
You can then email your comments to 66363 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Thu, 05 Oct 2023 15:08:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Mattias Engdegård <mattias.engdegard <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Thu, 05 Oct 2023 15:08:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
The variable `gdb-control-commands-regexp` in gdb-mi.el cannot work as currently defined and used. Only group 3 is of interest, but that group hasn't referred to anything useful for several years.
Group 3 probably refers to a part of the regex's tail where the command argument is matched:
"\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$"
However, this seems to be broken as well, because all groups here are inside repetitions. This part of the regexp is also exponential in form if not in practice but we'd better simplify it anyway.
Attached is a suggested patch which makes explicit the command abbreviations matched, and leaves only a single submatch. It also changes the tail to assuming that the command argument doesn't contain non-newlines (or the final eol anchor wouldn't make sense) but that it can contain spaces (which seems reasonable). However, right now the argument is only checked for being non-empty or not.
I don't have a working gdb setup at the moment so if someone would be kind to test it, I would be very grateful for it.
[gdb-control-commands-regexp.diff (application/octet-stream, attachment)]
Added tag(s) patch.
Request was from
Stefan Kangas <stefankangas <at> gmail.com>
to
control <at> debbugs.gnu.org
.
(Thu, 05 Oct 2023 15:14:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Thu, 05 Oct 2023 16:30:02 GMT)
Full text and
rfc822 format available.
Message #10 received at 66363 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
> Date: Thu, 5 Oct 2023 17:07:22 +0200
>
> The variable `gdb-control-commands-regexp` in gdb-mi.el cannot work as currently defined and used. Only group 3 is of interest, but that group hasn't referred to anything useful for several years.
>
> Group 3 probably refers to a part of the regex's tail where the command argument is matched:
>
> "\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$"
>
> However, this seems to be broken as well, because all groups here are inside repetitions. This part of the regexp is also exponential in form if not in practice but we'd better simplify it anyway.
>
> Attached is a suggested patch which makes explicit the command abbreviations matched, and leaves only a single submatch. It also changes the tail to assuming that the command argument doesn't contain non-newlines (or the final eol anchor wouldn't make sense) but that it can contain spaces (which seems reasonable). However, right now the argument is only checked for being non-empty or not.
>
> I don't have a working gdb setup at the moment so if someone would be kind to test it, I would be very grateful for it.
I'd appreciate a few examples of using this that don't work correctly
(or not at all), as it is otherwise not easy to understand the
problem, and much less the proposed solution and how to test it.
TIA
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Thu, 05 Oct 2023 17:17:02 GMT)
Full text and
rfc822 format available.
Message #13 received at 66363 <at> debbugs.gnu.org (full text, mbox):
> I'd appreciate a few examples of using this that don't work correctly
> (or not at all), as it is otherwise not easy to understand the
> problem, and much less the proposed solution and how to test it.
I don't have any examples but the problems are clear from reading the code.
The regexp is defined by:
> (defvar gdb-python-guile-commands-regexp
> "python\\|python-interactive\\|pi\\|guile\\|guile-repl\\|gr"
> "Regexp that matches Python and Guile commands supported by GDB.")
>
> (defvar gdb-control-commands-regexp
> (concat
> "^\\("
> "comm\\(a\\(n\\(ds?\\)?\\)?\\)?\\|if\\|while"
> "\\|def\\(i\\(ne?\\)?\\)?\\|doc\\(u\\(m\\(e\\(nt?\\)?\\)?\\)?\\)?\\|"
> gdb-python-guile-commands-regexp
> "\\|while-stepping\\|stepp\\(i\\(ng?\\)?\\)?\\|ws\\|actions"
> "\\|expl\\(o\\(re?\\)?\\)?"
> "\\)\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$")
> "Regexp matching GDB commands that enter a recursive reading loop.
> As long as GDB is in the recursive reading loop, it does not expect
> commands to be prefixed by \"-interpreter-exec console\".")
which results in the string regexp
> "^\\(comm\\(a\\(n\\(ds?\\)?\\)?\\)?\\|if\\|while\\|def\\(i\\(ne?\\)?\\)?\\|doc\\(u\\(m\\(e\\(nt?\\)?\\)?\\)?\\)?\\|python\\|python-interactive\\|pi\\|guile\\|guile-repl\\|gr\\|while-stepping\\|stepp\\(i\\(ng?\\)?\\)?\\|ws\\|actions\\|expl\\(o\\(re?\\)?\\)?\\)\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$"
It is used in one place only:
> (let* ((control-command-p (string-match gdb-control-commands-regexp string))
> (command-arg (and control-command-p (match-string 3 string)))
As you can see it refers to group 3, which is matched by "\\(n\\(ds?\\)?\\)" -- clearly not what anybody intended.
As for the tail, even if the match group is corrected it's a rather roundabout way of checking whether there is a non-blank character after the command (and is written using nested repetitions which is how this came to my attention).
For the examples you asked about, perhaps commit 30c0f81f9f which introduced the now broken mechanism can be of help.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Thu, 05 Oct 2023 17:44:01 GMT)
Full text and
rfc822 format available.
Message #16 received at 66363 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
> Date: Thu, 5 Oct 2023 19:16:32 +0200
> Cc: 66363 <at> debbugs.gnu.org
>
> > I'd appreciate a few examples of using this that don't work correctly
> > (or not at all), as it is otherwise not easy to understand the
> > problem, and much less the proposed solution and how to test it.
>
> I don't have any examples but the problems are clear from reading the code.
> The regexp is defined by:
>
> > (defvar gdb-python-guile-commands-regexp
> > "python\\|python-interactive\\|pi\\|guile\\|guile-repl\\|gr"
> > "Regexp that matches Python and Guile commands supported by GDB.")
> >
> > (defvar gdb-control-commands-regexp
> > (concat
> > "^\\("
> > "comm\\(a\\(n\\(ds?\\)?\\)?\\)?\\|if\\|while"
> > "\\|def\\(i\\(ne?\\)?\\)?\\|doc\\(u\\(m\\(e\\(nt?\\)?\\)?\\)?\\)?\\|"
> > gdb-python-guile-commands-regexp
> > "\\|while-stepping\\|stepp\\(i\\(ng?\\)?\\)?\\|ws\\|actions"
> > "\\|expl\\(o\\(re?\\)?\\)?"
> > "\\)\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$")
> > "Regexp matching GDB commands that enter a recursive reading loop.
> > As long as GDB is in the recursive reading loop, it does not expect
> > commands to be prefixed by \"-interpreter-exec console\".")
>
> which results in the string regexp
>
> > "^\\(comm\\(a\\(n\\(ds?\\)?\\)?\\)?\\|if\\|while\\|def\\(i\\(ne?\\)?\\)?\\|doc\\(u\\(m\\(e\\(nt?\\)?\\)?\\)?\\)?\\|python\\|python-interactive\\|pi\\|guile\\|guile-repl\\|gr\\|while-stepping\\|stepp\\(i\\(ng?\\)?\\)?\\|ws\\|actions\\|expl\\(o\\(re?\\)?\\)?\\)\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$"
>
> It is used in one place only:
>
> > (let* ((control-command-p (string-match gdb-control-commands-regexp string))
> > (command-arg (and control-command-p (match-string 3 string)))
>
> As you can see it refers to group 3, which is matched by "\\(n\\(ds?\\)?\\)" -- clearly not what anybody intended.
So the problem is only that 3 should be changed to the correct group
number?
> As for the tail, even if the match group is corrected it's a rather roundabout way of checking whether there is a non-blank character after the command (and is written using nested repetitions which is how this came to my attention).
Here you are talking about some optimization of the regexp, or is it
another bug? If the latter, what is the bug here?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Thu, 05 Oct 2023 18:12:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 66363 <at> debbugs.gnu.org (full text, mbox):
5 okt. 2023 kl. 19.43 skrev Eli Zaretskii <eliz <at> gnu.org>:
> So the problem is only that 3 should be changed to the correct group
> number?
That would perhaps work, but it wouldn't be a very robust solution. There are many groups preceding that we don't care about, and the bug arose precisely because they were added without considering that a group was being used.
Better then to get rid of all groups save the one in use. (The proposed patch does that.)
That would make it much more difficult for the same bug to arise again.
> Here you are talking about some optimization of the regexp, or is it
> another bug? If the latter, what is the bug here?
It's related. When the reference to subgroup 3 was added (30c0f81f9f), the tail looked like this:
"\\([[:blank:]]+\\([^[:blank:]]*\\)\\)?$"
^^^^^^^^^^^^^^^^^^^
and subgroup 3 was the second group in this substring (underlined above). Later (f71afd600a) the last `?` was changed into a `*`, but that made the contents of that group somewhat hazy because of the repetition:
"\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$"
^
which doesn't leave us with a useful group for the purpose of detecting command arguments at all.
So the tail of the regexp has to be rewritten anyway, and we might as well do it in a more straightforward way. (The proposed patch does that as well.)
The reason I found this is that it contains a sub-pattern on the form (A+B*)* which is super-linear in general but usually easy to fix so that the result is actually more readable, yet faster.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Thu, 05 Oct 2023 18:53:02 GMT)
Full text and
rfc822 format available.
Message #22 received at 66363 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
> Date: Thu, 5 Oct 2023 20:11:24 +0200
> Cc: 66363 <at> debbugs.gnu.org
>
> 5 okt. 2023 kl. 19.43 skrev Eli Zaretskii <eliz <at> gnu.org>:
>
> > So the problem is only that 3 should be changed to the correct group
> > number?
>
> That would perhaps work, but it wouldn't be a very robust solution. There are many groups preceding that we don't care about, and the bug arose precisely because they were added without considering that a group was being used.
>
> Better then to get rid of all groups save the one in use. (The proposed patch does that.)
> That would make it much more difficult for the same bug to arise again.
>
> > Here you are talking about some optimization of the regexp, or is it
> > another bug? If the latter, what is the bug here?
>
> It's related. When the reference to subgroup 3 was added (30c0f81f9f), the tail looked like this:
>
> "\\([[:blank:]]+\\([^[:blank:]]*\\)\\)?$"
> ^^^^^^^^^^^^^^^^^^^
> and subgroup 3 was the second group in this substring (underlined above). Later (f71afd600a) the last `?` was changed into a `*`, but that made the contents of that group somewhat hazy because of the repetition:
>
> "\\([[:blank:]]+\\([^[:blank:]]*\\)\\)*$"
> ^
> which doesn't leave us with a useful group for the purpose of detecting command arguments at all.
> So the tail of the regexp has to be rewritten anyway, and we might as well do it in a more straightforward way. (The proposed patch does that as well.)
>
> The reason I found this is that it contains a sub-pattern on the form (A+B*)* which is super-linear in general but usually easy to fix so that the result is actually more readable, yet faster.
Thanks. I'm okay with the changes in principle, but someone will have
to test them by running all of the control commands and verifying they
work after the fix, before this can be installed. I myself won't have
the time for doing that any time soon, sorry.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#66363
; Package
emacs
.
(Fri, 06 Oct 2023 12:10:02 GMT)
Full text and
rfc822 format available.
Message #25 received at 66363 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
5 okt. 2023 kl. 20.52 skrev Eli Zaretskii <eliz <at> gnu.org>:
> I'm okay with the changes in principle, but someone will have
> to test them by running all of the control commands and verifying they
> work after the fix, before this can be installed. I myself won't have
> the time for doing that any time soon, sorry.
That's fine, I can wait. The bug only affects GDB users so I'm not personally inconvenienced.
Here's an improved version of the patch that also fixes another bug in the original code: the extra match against gdb-python-guile-commands-regexp is both incorrect (not anchored) and superfluous as that information is available in the match just made.
For instance, consider what happens if the command string is "stepping ...". It will match gdb-control-command-regexp and also gdb-python-guile-commands-regexp and set python-or-guile-p, despite not being a Python or Guile command, and prevent gdb-control-level from being incremented.
[gdb-control-commands-regexp-2.diff (application/octet-stream, attachment)]
Reply sent
to
Mattias Engdegård <mattias.engdegard <at> gmail.com>
:
You have taken responsibility.
(Sun, 29 Oct 2023 16:36:01 GMT)
Full text and
rfc822 format available.
Notification sent
to
Mattias Engdegård <mattias.engdegard <at> gmail.com>
:
bug acknowledged by developer.
(Sun, 29 Oct 2023 16:36:01 GMT)
Full text and
rfc822 format available.
Message #30 received at 66363-done <at> debbugs.gnu.org (full text, mbox):
Pushed to master after finally finding a machine with gdb on it.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Mon, 27 Nov 2023 12:24:05 GMT)
Full text and
rfc822 format available.
This bug report was last modified 1 year and 263 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.