Package: emacs;
Reported by: Tim Ruffing <crypto <at> timruffing.de>
Date: Fri, 5 Jan 2024 21:20:01 UTC
Severity: normal
Tags: patch
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: Tim Ruffing <crypto <at> timruffing.de> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 68272 <at> debbugs.gnu.org Subject: bug#68272: [PATCH] Fix -1 leaking from C to lisp in 'read-event' etc. Date: Sat, 06 Jan 2024 15:32:23 +0100
Hey, thank you very much for the long email. I was somewhat prepared for hesitant reply, and I totally understand that "there be dragons". And I'm also aware that it doesn't spark confidence when someone with (almost) zero code contributions submits a patch of this kind. I feel the same when new people send complex patches to the C library I maintain... And sure, it took a me quite a while to navigate this maze of keyboard.c and to come up with this patch and make the tests pass, but I feel rather confident now that this is the right approach. I agree, we can never be fully sure that this introduces regressions, but let me still try to convince you that this approach and these patches are carefully crafted and thought through. Here's my analysis of the situation: First of all, I think the bug is real. read-event (and read-char and read-char-exclusive, which I won't mention from now on for brevity) can return -1, but the docs don't mention -1, and it's arguably a strange "magic" value for a lisp function. If anything, one would expect nil, or some special symbol. Of course, we could simply document the -1 return value. But the problem is not just that this is not elegant. It's worse because it's also pretty unclear what the caller should do in this case. In particular, the caller can't simply skip the -1 and try again: The caller will see an infinite stream of -1 values, until the keyboard macro has been resolved properly, i.e., as long as executing_kbd_macro is non-nil. One thing the caller can simply do is to error out. But if the caller wants an event, the only thing it can do is to touch executing_kbd_macro and set it nil explicitly (this is what subr.el does currently). We could also document this, but I feel something like "this function can return -1 and if it does, set executing_kbd_macro to nil and try again" is really a bit too unelegant (and it has more downsides, see next paragraph). However, this now hints at a different approach: Could we handle this in read-event locally and just perform the retrying for the caller? First of all, we'd probably still do it one level below in the call stack in order not to mess up with the timeouts (if we simply try again, we could double the timeout), so we'd want to do it inread_filtered_event. But I think that approach will then be *not* localized and thus dangerous: it's not clear if setting executing_kbd_macro to nil has unexpected side effects have. Resolving the macro properly is ultimately the responsibility of Fexecute_kbd_macro and pop_kbd_macro in macros.c, and we shouldn't just declare at some other point in the code that we wish the macro handling to be done. ------------- Let me try to convince you that these commits are more harmless than they look at the first glance. I'll go through them one by one. I'm also happy to add a revised version of this to the commit messages. 1. "Extract check for end of macro to function": This is pure refactoring. A brief code review shows that this does not change semantics at all. 2. "src/keyboard.c (requeued_events_pending_p): Revive" 2a. On Sat, 2024-01-06 at 09:42 +0200, Eli Zaretskii wrote: > For example, your patches modify > requeued_events_pending_p, but that function is called in several > places, including outside of keyboard.c -- how can we be sure we are > not introducing regressions this way? Oh, good catch. In fact, it's (without the patches here) only called outside of keyboard.c. I was misled by the comment that said "This isn't used yet. The hope is to make wait_reading_process_output call it". Indeed, this comment is outdated and the (only) caller is wait_reading_process_output. wait_reading_process_output really wants only keyboard output, so it should really just check Vunread_command_events and not unread_post_input_method_events and Vunread_input_method_events; the latter two are for non-keyboard input. I can rework this. I think the way to go is to split this into two functions requeued_events_pending_p and requeued_command_events_pending_p or similar, and fix the outdated comment. 2b. This changes two !NILP to CONSP to be consistent with the checks within read_char (see 3. below). This is fine under the assumption that the variables are always lists (as they should be). I was about to say that I can take this back if you feel it's too dangerous, but if this function needs to be split into two anyway, we won't need the change from !NILP to CONSP. 3. "Fix -1 leaking from C to lisp in 'read-event' etc." 3a. On Sat, 2024-01-06 at 09:42 +0200, Eli Zaretskii wrote: > And read_char is called in > several places, including by lread.c and read_char itself -- did you > figure out how will this change affect those? > Yes, I carefully checked all the callers. (I know, it's not trivial to review this commit, and I think any careful reviewer would need to do the same in the end...) The only caller which acts specially on -1 is read_key_sequence, so it makes sense to handle this case in read_key_sequence. This is done in this commit. The other callers are not prepared to get -1 (which is the root cause of the bug!), and exactly what we would like to avoid. 3b. On Sat, 2024-01-06 at 09:42 +0200, Eli Zaretskii wrote: > Moreover, the changes in read_char modify code more than is > necessary to handle the "at end of macro" situation, so we are > risking > changes that will break something else. Here's an example: > > > --- a/src/keyboard.c > > +++ b/src/keyboard.c > > @@ -2610,7 +2610,8 @@ read_char (int commandflag, Lisp_Object map, > > goto reread_for_input_method; > > } > > > > - if (!NILP (Vexecuting_kbd_macro)) > > + /* If we're executing a macro, process it unless we are at its > > end. */ > > + if (!NILP (Vexecuting_kbd_macro) && !at_end_of_macro_p ()) > > { > > /* We set this to Qmacro; since that's not a frame, nobody > > will > > try to switch frames on us, and the selected window will > > @@ -2624,15 +2625,6 @@ read_char (int commandflag, Lisp_Object map, > > selected. */ > > Vlast_event_frame = internal_last_event_frame = Qmacro; > > > > - /* Exit the macro if we are at the end. > > - Also, some things replace the macro with t > > - to force an early exit. */ > > - if (at_end_of_macro_p ()) > > - { > > - XSETINT (c, -1); > > - goto exit; > > - } > > - > > This hunk moves the at_end_of_macro_p test to a higher level, which > has a side effect of not executing the code before the original > at_end_of_macro_p test -- how do we know this won't break something > that happens to depend on that code which will now not execute in the > at-end-of-macro case? The commit takes care of this, and it's not too hard to verify: If you look at read_char, except for setting local variables, this is what it does above the original at_end_of_macro_p test: * if (CONSP (Vunread_post_input_method_events)) ... * Vlast_event_device = Qnil; * if (CONSP (Vunread_command_events)) ... * if (CONSP (Vunread_input_method_events)) * Vlast_event_frame = internal_last_event_frame = Qmacro; (if in a kbd macro) The "if" blocks don't matter. That's exactly why the caller checks for patch check for !requeued_events_pending_p () in the patch: if those blocks were to become active, we'd *not* skip read_char. Consider "Vlast_event_frame = internal_last_event_frame = Qmacro". The purpose this is to indicate that the last event came from a kbd macro. Since we now don't return the "event" -1 any longer, it also doesn't make sense to set this in this case. (And when we read a new event, this will be correctly updated.) What remains is setting "Vlast_event_device = Qnil". The same logic applies here. I don't think we should set to this nil at the end of a keyboard macro, because no new event was processed, i.e., the "last event" didn't change. That's why I choose to skip this line. But it's certainly more conservative to keep the two assignments to avoid any unexpected consequences, I'm happy to add this to the caller with an appropriate comment, if this is desired 4. "Remove workarounds for solved 'read-event' bug" On Sat, 2024-01-06 at 09:42 +0200, Eli Zaretskii wrote: > Also note that at least in subr.el we don't only test the -1 value, > we > also test that a keyboard macro is being executed, and in this and > other callers that handle -1, each application behaves in > application-specific way when it receives -1. It is not clear to me > how your changes modify the behavior in those cases, and your > explanations and the log message doesn't seem to answer this > question. > For example, this code in calc-prog: > > > --- a/lisp/calc/calc-prog.el > > +++ b/lisp/calc/calc-prog.el > > @@ -1230,8 +1230,6 @@ calc-kbd-skip-to-else-if > > ch) > > (while (>= count 0) > > (setq ch (read-char)) > > - (if (= ch -1) > > - (error "Unterminated Z[ in keyboard macro")) > > (if (= ch ?Z) > > (progn > > (setq ch (read-char)) > > now signals a specific error in the case of an unterminated keyboard > macro: what will the behavior in that case be after the changes? The behavior is that it will wait for interactive user input. Say you define a keyboard macro that performs only part of a calculation. Then instead of signaling an error, it will simply wait for the user the provide the remaining key strokes. On Sat, 2024-01-06 at 10:15 +0100, Andreas Schwab wrote: > > Seems like calc actually wants to know when the kbd macro ends > prematurely, and removing the option to detect it is a regression To be honest, I really don't think that signaling an error here was a crucial feature. I'm pretty sure what happened is that -1 appeared as an unexpected return case, and then the error was added. Yes, this is a semantic change, but I think it's an acceptable one. This doesn't remove any real functionality, it simply avoids an error condition. But one thing I wasn't sure about is whether the change that -1 can't occur anymore would warrant a NEWS item or similar. If yes, I'll add one. ------------- On Sat, 2024-01-06 at 09:42 +0200, Eli Zaretskii wrote: > In this case, given > that we want a more elegant solution for a situation that we already > handle, I tend to avoid any such changes to begin with, for the > reasons I explained above, and instead perhaps document the special > meaning of -1 in these cases. And if we want to consider such > changes > these dangers notwithstanding, I would like to see them affecting as > little other code as possible. Can you suggest a safer changeset? As I explained above, documenting this is not trivial either. While we can, of course, simply document that -1 can be returned, it's hard to give meaningful advice except "this function can return -1 and if it does and you really want to wait for input, set executing_kbd_macro to nil and try again", which is not only unelegant, but could have further consequences. I'm happy to make the changes based on your review and on other feedback, and expand the commit message a lot, and I think this will make the patches more conservative and safer. It's still not trivial to review and verify my analysis. But this is just how it is. The code change wasn't easy to come up with, so it's necessarily not very easy to review. With the modifications I hinted at based on your review, I think what we end up with is: 1. commit: no semantic change 2. commit: will be reworked to have no semantic change at all 3. commit: only change is that -1 is not propagated to lisp, other callers of read_char are unaffected as I explained above. (modulo Vlast_event_device and Vlast_event_frame -- which I can also still set if you want me to to). 4. commit: this only removes code paths which are unreachable now, so again no semantic chance. But then it will be good upfront if it's at least realistic that these patches get in in some form or another. If I can get conceptual approval or dismissal of this approach, I know what to do next. Best, Tim
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.