Finally nailed it, after finding out about ert-simulate-command. The following patch adds a test that passes on master and emacs-27, and "successfully fails" when Stefan's fix (commit c1ce9fa7f2) is reverted.