On 6/16/20 6:05 AM, Jacob Bachmeyer wrote:
> Tom de Vries wrote:
>> Hi,
>>
>> the gdb testsuite uses the dejagnu framework.
>>   
> 
> Apologies for the delay in responding; I took a few days to think about
> this issue because it requires a careful balance, as silently continuing
> after aborting a testcase could cause errors to go unnoticed in large
> testsuites.
> 
>> If I add a call to foobar at the end of a test-case in that testsuite,
>> and I do a run of the testsuite, then the test-case aborts, and
>> subsequently the testsuite run aborts, that is, no further tests are run.
>>
>> While it is as expected that the test-case aborts, it's not desirable
>> that the testsuite run aborts.
>>   
> 
> The main problem I have with this is that calling an unknown procedure
> is presumably a very serious error.  Aborting the entire run is a good
> way to ensure that the problem is noticed, but perhaps with a testsuite
> as large (and long-running) as the GDB testsuite has become, continuing
> may also be reasonable to salvage at least some results.
> 
> Where in the GDB testsuite is this a problem?  When is it appropriate to
> continue after such a serious error?

It's not a problem in a specific test-case, but a generic problem. If
somebody makes a typo in a test-case and checks it in, there's no need
to abort the test run.

> How to ensure that the error is
> not buried in the logs amidst the normal output from other tests and is
> actually noticed?
> 

I monitor ERROR and WARNING in my test runs, so I didn't think of this,
but indeed, those are not tracked in the summary, so that could be a
problem, agreed.

>> We've installed a hack in ${tool}_init/${tool}_finish to workaround this
>> in the gdb testsuite (
>> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=26783bce15adc0316f9167a216519cebcf1ccac3
>>
>> ).
>>
>> It would preferable though if this problem was addressed in dejagnu,
>> such that we can eventually remove the hack.
>>   
> 
> That hack (if not reverted entirely) needs to be made conditional on the
> actual existence of ::tcl_unknown as soon as possible -- the existence
> of ::tcl_unknown (kept to allow Tcl autoloading to work) is very much an
> internal implementation detail, and it *will* be moved out of the global
> namespace and eventually *will* cease to exist entirely in the
> interpreters that run testcases.  Really, tcl_unknown is not supposed to
> be there and relying on it introduces a latent bug into the GDB testsuite.
> 
> I am sorry, and I will seek to work with you to fix these problems with
> a minimum of breakage, but I have to put my foot down on this:  I
> *cannot* treat every internal symbol as long-term stable API.  That hack
> will *not* be supported long-term.  Any GDB releases that include it
> *will* break with some as-yet-undetermined future DejaGnu release.  I
> have to draw a line somewhere or I may as well declare DejaGnu
> unmaintained and frozen at 1.6.2 forever.  I have no inclination to do
> the latter.
> 

Ack, makes sense.

>> A possible solution to this problem could be to make the exiting on
>> error (which is done in dejagnu's version of proc unknown in
>> framework.exp) optional.
>>   
> 
> This is obviously a good solution, but leads to the obvious question of
> how (command-line option?  (Make normally aborts on error, but has a
> --keep-going option.)  configuration variable?) that should be
> determined, or if exit-on-unknown-procedure should ever be a default. 
> Could simply reporting an UNRESOLVED test (indicating that the testcase
> script aborted due to calling an undefined command) be a better option? 
> It would show up in the summary report at the end.
> 
> At first, I was planning to schedule this for the 1.7 series, but I
> checked and runtest.exp:runtest already catches Tcl errors, so fixing
> this does not require a significant change to the normal control flow. 
> Would simply reporting an "UNRESOLVED:  testcase $file aborted on
> unknown command '$what'" result, then propagating the Tcl error in the
> ::unknown proc be a workable solution?

I think so, yes.  I've tried that out in attached gdb patch, which also
solves the reliance on ::tcl_unknown.

Then when running into the unknown command in the test-case, we have in
gdb.sum:
...
Running
/home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp
...
PASS: gdb.ada/O2_float_param.exp: compilation foo.adb
PASS: gdb.ada/O2_float_param.exp: frame
UNRESOLVED: gdb.ada/O2_float_param.exp: testcase aborted on unknown
command 'foo'
ERROR: tcl error sourcing
/home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp.
ERROR: invalid command name "foo"
    while executing
"::gdb_unknown foo"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::gdb_unknown $args"
    (procedure "::unknown" line 4)
    invoked from within
"foo"
    (file
"/home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp"
line 33)
    invoked from within
"source
/home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 source
/home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.ada/O2_float_param.exp"
    invoked from within
"catch "uplevel #0 source $test_file_name""

                === gdb Summary ===

# of expected passes            2
# of unresolved testcases       1
...

I get similar results if I disable the hack in the gdb testsuite, and
use attached runtest.exp demonstrator patch.

Note btw that in both cases I didn't put $file in the unresolved
message, because that's already present in pf_prefix, but that might not
always be true.

Thanks,
- Tom