GNU bug report logs - #75964
Switching the Emacs build to -Wswitch-enum in src/

Previous Next

Package: emacs;

Reported by: Pip Cet <pipcet <at> protonmail.com>

Date: Fri, 31 Jan 2025 09:41:02 UTC

Severity: wishlist

Full log


View this message in rfc822 format

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: pipcet <at> protonmail.com, 75964 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: bug#75964: Switching the Emacs build to -Wswitch-enum in src/
Date: Tue, 4 Feb 2025 13:23:25 -0800
[Message part 1 (text/plain, inline)]
On 2/3/25 04:08, Eli Zaretskii wrote:
>> From: Paul Eggert <eggert <at> cs.ucla.edu>
>> The main point is to catch more potential inadvertent omissions of "case
>> X:" in a "switch (E)" where E's type is an enum containing X. etags.c,
>> like the rest of Emacs, currently uses one style to do this; the
>> proposed style would be better at catching these mistakes.
> 
> And how do the proposed changes achieve that goal?

With an enum expression E and code like this:

   switch (E) { case A: ...; case B: ...; }

there is currently no compile-time diagnostic for mismatches between the 
type and the code when the enum type also has a value C, either because 
the programmer forgot to handle case C when the switch was written, or 
because the enum type was changed and the programmer forgot to check all 
uses of the enum.

There *is* a compile-time diagnostic for similar mismatches between type 
and code, e.g., if the enum type lacks a B (perhaps because B was 
formerly present but has since been removed). But there is no 
compile-time diagnostic for the mismatch in question.

Mismatches like these are all too common and can lead to user-visible 
bugs. It's a win to detect these mismatches statically.


>> For these switch statements, use "switch (+E)" instead of "switch (E)".
>> This pacifies GCC and clearly signals to the reader that the switch's
>> cases are not intended to exhaust the enum. A "switch (E)" must list all
>> the enum values; a "switch (+E)" need not do so. A reasonable guideline
> 
> The above seems to say that "+E" is different from "E" in ways that
> the compiler knows about.  So I'm asking where is this special meaning
> documented

It is more than merely documented; it's been common practice for years. 
For example, given enum W {X,Y,Z} the type of an expression like X+1 has 
been int (not enum W) ever since enums were standardized in the C89 
standard.

For documentation chapter and verse, please see C23 §6.5.4.3 ¶2, which 
says that the type of +E is like that of -E: the type is E's type after 
integer promotions. Also please see C23 §6.3.1.1 ¶1–2, which says that 
int is the type of these Emacs enums after integer promotion. So, when E 
is one of these enums, the type of +E (and of -E) is int.


> If the special meaning of "+E" is not well understood

I hope the above suffices to explain why it is well understood.


> I don't think I see how these changes improve checking and make the
> code easier to read.  For me, the "+E" thing is an obstacle to
> negotiate; I'm sure others will also stumble on that.

If necessary we can package "+E" inside an inline identity function. 
That might help prevent stumbling by people who don't know the C 
language well, and it would not have any runtime cost in the usual -O2 
case. I'd rather not use such a function (as +E is clear once you get 
used to it) but using one would be better than doing nothing, as it 
would improve static checking.


> apart of that change, the code basically remained the same.

Yes, and that's a plus for +E: you don't have to change the code much. 
In that sense it's better than Pip Cet's earlier proposal.


>> +      case NEUTRAL_ON: case NEUTRAL_S: case NEUTRAL_WS:
>> +      case STRONG_AL: case STRONG_L: case STRONG_R:
>> +      case WEAK_AN: case WEAK_CS: case WEAK_EN: case WEAK_ES: case WEAK_ET: case WEAK_NSM:
>>  	if (override == L2R)
>>  	  return STRONG_L;
>>  	else if (override == R2L)
> 
> If you really consider this long list of values more readable than the
> original code, then I guess we disagree on what is and isn't readable.

Oh, that long list was in some sense a misfire, as the patch did two 
things where I suppose it should have done one. First, the patch 
improved the runtime checkcing for invalid values, something that can 
and should be done independently of what we do about -Wswitch-enum. 
Second, the patch pacified -Wswitch-enum.

I fixed this misfire by splitting the patch into two parts, which I'm 
attaching. The first attached patch improves the runtime checking 
without changing the -Wswitch-enum style, and I installed that on 
master. The second attached patch, which I have not installed, is the 
-Wswitch-enum change proper.

In this example, pacifying -Wswitch-enum helped find code where Emacs's 
internal runtime checking was missing some invalid values. So this is an 
example of why this style change is helpful in practice.
[0001-Improve-bidi_get_time-runtime-checking.patch (text/x-patch, attachment)]
[0002-Pacify-gcc-Wswitch-enum-in-bidi.c.patch (text/x-patch, attachment)]

This bug report was last modified 127 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.