do not return or else
Document #: | D2121R0 |
Date: | 2020-08-02 |
Project: | Programming Language C++ |
Audience: | EWG |
Reply-to: | Barry Revzin <barry.revzin@gmail.com> |
This is somewhat of a novel proposal in that it is not motivated by any problem which currently exists. Instead, it is motivated by the problems arising out of the Pattern Matching [P1371R2] proposal and its excursion into more complex expressions.
The expression form of inspect needs its return type to be compatible with all of the cases. But not every case actually contributes to the type:
Here, we have two cases: one has type int, and the other technically has type void. But because the throw actually escapes the scope anyway, we don’t need to consider that case when resolving the type. As a result, the above can be a perfectly valid use of inspect - the type of the expression can be said to be int.
This, in and of itself, isn’t novel. We already carve out an exception for throw in the conditional operator, and the following rewrite of the above example has been valid for a long time:
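As a minimal sketch of such a conditional-operator form (the function name maybe_throw and the exception type are assumptions chosen for illustration, not taken from the proposal):

```cpp
#include <stdexcept>
#include <type_traits>

// Hypothetical example: the second operand is a throw-expression, so it does
// not participate in determining the type of the conditional expression.
constexpr int maybe_throw(int arg) {
    return arg == 0 ? 42 : throw std::runtime_error("unexpected argument");
}

// The conditional expression has type int, even though `throw` has type void.
static_assert(std::is_same<decltype(true ? 42 : throw 0), int>::value,
              "the throw operand is ignored for the result type");
```

The static_assert only inspects the type; nothing is thrown, because decltype does not evaluate its operand.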
But while throw is the only exception (not sorry) for the conditional operator, there are other statements in C++ which escape their scope and thus could potentially be excluded from consideration when it comes to types, and could be used as scope-escaping expressions. The full set of such keyword-driven statements is:
break
continue
co_return
goto
return
throw
goto is a little special, but the others behave a lot like throw: they necessarily escape scope and have no possible value, so they can’t meaningfully affect the type of an expression. The following could be a perfectly reasonable function (though this paper is not proposing it):
But in addition to those keywords, there’s one more thing in C++ that is guaranteed to escape scope: invoking a function marked [[noreturn]] (such as std::abort() or std::terminate()). And here, we run into a problem. While it’s straightforward to extend the rules to treat all escaping statements as valueless, we cannot do the same for [[noreturn]] functions:
The guidance we adopted in Albuquerque [Attributes] during the discussion of [P0840R0] was:
Compiling a valid program with all instances of a particular attribute ignored must result in a correct implementation of the original program.
The above inspect, if we added semantic meaning to [[noreturn]], could work - it would ignore the second pattern and simply deduce the type of the inspect-expression as int. But if we ignored [[noreturn]], then we have two patterns of differing type (int and void) and the type of the inspect-expression would have to be either void or ill-formed, either way definitely not int. This violates the attribute rule.
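The same tension can be observed with today's conditional operator, sketched here on the assumption that it is a fair stand-in for an inspect-expression. The trait name mixes_with_int is hypothetical. A call to a [[noreturn]] function is still an expression of type void, so it cannot be paired with an int operand the way a throw-expression can:

```cpp
#include <exception>
#include <type_traits>
#include <utility>

// The attribute does not change the type of the call expression.
static_assert(std::is_void<decltype(std::terminate())>::value,
              "a call to std::terminate() has type void");

// Hypothetical detection trait: is `cond ? 42 : f()` well-formed when f
// returns R? Relies on expression SFINAE over the conditional operator.
template <typename R, typename = void>
struct mixes_with_int : std::false_type {};

template <typename R>
struct mixes_with_int<
    R, decltype(void(true ? 42 : std::declval<R (*)()>()()))>
    : std::true_type {};

static_assert(mixes_with_int<int>::value, "int and int: fine");
static_assert(!mixes_with_int<void>::value, "int and void: ill-formed today");
```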
However, being able to std::terminate() or std::abort() or std::unreachable() in a particular case is an important feature in an inspect-expression, and so the problem of how to make it work must be resolved… somehow.
This paper goes through four mechanisms for how we could get this behavior to work. It will first introduce the four mechanisms, and then compare and contrast them.
At a session in Prague, the paper authors proposed new syntax for introducing a block which marked the block as noreturn:
The above could then be allowed: the second case would be considered an escaping statement by virtue of the !{ ... } and would thus not participate in determining the type. This leaves a single case having type int.
Such an annotation could be enforced by the language to not escape (e.g. by inserting a call to std::terminate on exiting the scope), so there is no undefined behavior concern here.
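One way to picture that enforcement in today's C++, purely as a sketch: a hypothetical helper escaping_block (not part of the proposal or of any library) runs a callable and terminates if the callable ever returns normally. The functions reject_input and parse_digit are likewise made up for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for `!{ ... }`: runs `body`, and if control ever
// comes back normally, terminates instead of returning to the caller.
template <typename F>
[[noreturn]] void escaping_block(F&& body) {
    std::forward<F>(body)();
    std::terminate();  // only reached if `body` failed to escape
}

// Escapes by throwing, which is still a permitted way to leave the block.
[[noreturn]] inline void reject_input() {
    throw std::invalid_argument("not a digit");
}

constexpr int parse_digit(char c) {
    if (c < '0' || c > '9') {
        escaping_block(reject_input);  // never comes back to this function
    }
    return c - '0';
}
```

Unlike the proposed syntax, this helper does nothing for type deduction; it only models the "terminate on fall-through" guarantee.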
There may be other mechanisms to annotate escaping blocks besides !{ ... } but this paper considers any others to be just differences in spelling, so only this one is considered (and the spelling isn’t really important anyway).
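The other three suggested mechanisms are all based on applying some kind of annotation to the functions that escape rather than to the blocks that invoke these functions. These are: a new type with no values (std::never), a new function specifier (_Noreturn), and the existing [[noreturn]] attribute.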
C and C++ have the type void, but despite the name, it’s not an entirely uninhabited type. You can’t have an object of type void (yet?), but functions which return void do, in fact, return. We could introduce a new type that actually has zero possible values, which would indicate that a function can never return. For the sake of discussion, let’s call this type true void. Actually, that’s a bit much. Let’s call it std::never.
We could then change the declaration of functions like std::abort():
The advantage here is that once the noreturn functions return std::never, the language can understand that a block ending with one of these functions can never return, so we don’t need the !{ ... } syntax. The motivating example just works:
int maybe_terminate(int arg) {
    return inspect (arg) {
        0: 42;
        _: std::terminate(); // ok, returns std::never, so never returns
    };
}
We could take the [[noreturn]] function attribute and elevate it into a first class language feature, so that future language evolution (e.g. pattern matching) may then take this into account in determining semantics.
The syntax this paper proposes is:
Just kidding. The syntax this paper is actually proposing (for reasons that will become clearer shortly) is the keyword spelled _Noreturn (this is already a keyword in C so seems straightforwardly available):
This means that libraries straddling multiple language versions may end up having to write:
but because there are a fairly small number of such functions, I don’t think it’s a huge problem.
[[noreturn]] attribute
Instead of introducing a new language feature to mark a block as escaping (as proposed in Prague), or introducing a new language feature to mark a function as escaping (as in the previous two sections), let’s actually just take advantage of the fact that we already have a language feature to mark a function as escaping: the [[noreturn]] attribute.
That is: don’t introduce anything new at all. Just allow the [[noreturn]] attribute to have semantic meaning. Say that a function so annotated counts as an escaping function, as a language rule, and allow that to work.
The set of [[noreturn]]-annotated functions is very small (the standard library has 9: std::abort, std::exit, std::_Exit, std::quick_exit, std::terminate, std::rethrow_exception, std::throw_with_nested, std::nested_exception::rethrow_nested, and std::longjmp - with std::unreachable on the way), and we already have to annotate these functions. It’s hard to count how many uses of this attribute exist in the wild since it so frequently shows up behind a macro, but I think it’s safe to say that the number of invocations of escaping functions far exceeds the number of declared escaping functions. By many orders of magnitude.
Given that, and the fact that we need to make some kind of language change to make this work anyway, it seems like we should change the language to recognize the escaping functions themselves rather than recognize uses of them. The suggested path of !{ std::abort(); } isn’t exactly enormous syntactic overhead over std::abort(), it’s probably about as minimal an annotation as you can really get, but it just seems like the wrong direction to take - and it seems better for the annotation to be localized to the functions rather than the invocations of them. We should instead elevate noreturn to be a first-class language feature so that we can treat std::terminate() the same as a return or a throw without requiring further annotation on all uses of it.
Let’s go through the suggested options for annotating the function itself.
The problem with introducing a new type like std::never is the enormous amount of work necessary to really work through what std::never means in the type system. Can you have a…
never&? Much like void&, there cannot be such an object, so forming a valid reference is impossible. Would that mean that the type itself is ill-formed or would that mean that a function returning a never& never returns?
never*? Since there can never be a never object to point to, this seems like the same case as never& - but there is one exception. A never* could still have a null pointer value. That’s not pointing to an object, right? Does this mean that a never* f() necessarily returns a null pointer?
pair<never, int>? As a proxy for having a class type with a never member - this would also be a type that’s impossible to form. So this one, like never&, would either also be an “escaping type” or ill-formed.
optional<never>? This is a lot like never* - it can never hold a value but it’s perfectly fine for it to be empty? But how would you construct the language rules such that it’s implementable properly? If you could, then this would be a conditionally escaping type? What would that mean? Another isomorphic type would be variant<T, never>, which would necessarily hold a T.
All of these questions seem quite interesting to think about, but ultimately the benefit doesn’t seem to be there at all. It’s nice to only have to annotate the escaping functions - rather than all escaping uses of those functions - but this direction just has too many other questions.
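A rough approximation in today's C++ can make a couple of these questions concrete. The class never below is a hypothetical stand-in: it is not the std::never discussed above and carries none of its language-level meaning; it simply cannot be constructed by anyone. The names stop and no_such_object are also made up for illustration.

```cpp
#include <exception>
#include <type_traits>

// Hypothetical stand-in for an uninhabited type: the default constructor is
// private and never defined, so no object can be created from scratch.
class never {
    never();
};

// A function "returning" never has no way to produce a value, so every path
// through it must escape (here by terminating).
inline never stop() { std::terminate(); }

// never* is still inhabited: the null pointer value is a perfectly good value.
constexpr never* no_such_object = nullptr;

static_assert(!std::is_default_constructible<never>::value,
              "no never object can be created from scratch");
static_assert(no_such_object == nullptr, "but a never* can still be null");
```

Nothing here tells the compiler that stop() escapes; that knowledge would have to come from new language rules, which is exactly the work described above.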
That reduces us to the last two choices: _Noreturn as a function-specifier, or [[noreturn]].
The advantage of the former is that it allows us to preserve the adopted guidance on the meaning of attributes from Albuquerque.
The advantage of the latter is: we already have an existing solution to exactly this problem, and having to introduce a new language feature, to solve exactly the same problem, seems like artificial and pointless language churn. The issue is that we are not allowing ourselves to use the existing solution to this problem. Maybe we should?
Sure, such a direction would open the door to wanting to introduce other attributes that may want to have normative semantic impact, and we’d lose the ability to just reject all of those uniformly. But I think we should seriously consider this direction. It would mean that we would not have to make any changes to the standard library at all. Any user-defined [[noreturn]] functions that already exist would just seamlessly work without them having to make any changes.
Note that [[no_unique_address]], the attribute during whose discussion we adopted this guidance, already is somewhat fuzzy with this rule. The correctness of a program may well depend on annotated members taking no space (e.g. if a type so annotated needs to be constructed in a fixed-length buffer). We more or less say this doesn’t count, and there is certainly no such fuzziness with the other attributes like [[likely]], [[fallthrough]], or [[deprecated]].
But there’s one other important thing to consider…
An important thing to consider is C compatibility. C also has functions that do not return, and we should figure out how to treat those as escaping functions as well. C has a different function annotation to indicate an escaping function, introduced in C11 by [C.N1478]:
The C functions longjmp(), abort(), exit(), _Exit(), and quick_exit() are so annotated. C also provides a header, <stdnoreturn.h>, which #defines noreturn to _Noreturn.
On top of this, WG14 is pursuing the C++ [[noreturn]] attribute itself, via [C.N2410].
This suggests that pursuing a different keyword (possibly a context-sensitive one) for noreturn would just introduce a new incompatibility with C, that C is currently working to remedy. Unless the keyword we picked was, specifically, _Noreturn.
But the C compatibility issue is actually even stronger than this. While in C++, we just have guidance that attributes should be ignorable, this is actually normative in C. From the latest C working draft, 6.7.11.1p3 [C.N2478]:
A strictly conforming program using a standard attribute remains strictly conforming in the absence of that attribute.
with corresponding footnote:
Standard attributes specified by this document can be parsed but ignored by an implementation without changing the semantics of a correct program; the same is not true for attributes not specified by this document.
That’s pretty clear. If C++ adopts semantics for [[noreturn]], that kills any attempt at C compatibility going forward.
Given WG21’s guidance that attributes should be ignorable, and WG14’s normative rule of the same, it seems like the best course of action is to introduce a new keyword to indicate that a function will not return.
For compatibility with C, which already has exactly this feature, we should just adopt the C feature.
This would be a novel direction in C++, since we typically don’t use these kinds of names, but as mentioned before, the number of noreturn functions is small so it seems far more important to get a consistent feature than it is to have that feature have nice spelling.
We would then go through the library and swap out the [[noreturn]] attribute for the _Noreturn specifier:
I want to be very clear that regardless of the direction taken for this paper, given:
the type of f is still void(), the same as the type of g. Though it turns out that clang already models __attribute__((noreturn)) (but not [[noreturn]]) in the type system:
template <typename T>
constexpr bool is_noreturn(T ()) { return false; }
template <typename T>
constexpr bool is_noreturn(__attribute__((noreturn)) T ()) { return true; }
int x();
__attribute__((noreturn)) float y();
[[noreturn]] double z();
static_assert(not is_noreturn(x));
static_assert(is_noreturn(y));
static_assert(not is_noreturn(z));
But, again, not something I’m interested in.
In 5.11 [lex.key], add _Noreturn as a keyword.
Change 7.5.5 [expr.prim.lambda]/3:
3 In the decl-specifier-seq of the lambda-declarator, each decl-specifier shall be one of mutable, constexpr, consteval, or _Noreturn. [Note: The trailing requires-clause is described in [dcl.decl]. — end note]
Add somewhere in 7.5.5.1 [expr.prim.lambda.closure]:
* If the lambda-expression’s decl-specifier-seq contains _Noreturn and if the function call operator or any given operator template specialization is called and eventually returns, the behavior is undefined.
In 9.2.2 [dcl.fct.spec], change the grammar to add _Noreturn as a function-specifier:
Add a new paragraph to the end of 9.2.2 [dcl.fct.spec] (this is the same wording as in 9.12.9 [dcl.attr.noreturn]):
5 If a function f is called where f was previously declared with the _Noreturn specifier and f eventually returns, the behavior is undefined. [Note: The function may terminate by throwing an exception. — end note]
Change all uses of [[noreturn]] as an attribute in 17 [support] to use the _Noreturn specifier instead. Those uses are:
abort, exit, _Exit, and quick_exit in 17.2.2 [cstdlib.syn]
abort, exit, _Exit, and quick_exit in 17.5 [support.start.term]
terminate, rethrow_exception, and throw_with_nested in 17.9.1 [exception.syn]
terminate in 17.9.4.4 [terminate]
rethrow_exception in 17.9.6 [propagation]
nested_exception::rethrow_nested and throw_with_nested in 17.9.7 [except.nested]
longjmp in 17.13.2 [csetjmp.syn]
Add the feature-test macro __cpp_noreturn. This will let users properly add non-returning semantics to their functions:
#if __cpp_noreturn
#  define NORETURN _Noreturn
#elif __has_cpp_attribute(noreturn)
#  define NORETURN [[noreturn]]
#else
#  define NORETURN
#endif
Thanks to Aaron Ballman for pointing me to the relevant C rules and discussing the issues with me.
[Attributes] EWG. 2017. EWG discussion of P0840R0.
http://wiki.edg.com/bin/view/Wg21albuquerque/P0840R0
[C.N1478] David Svoboda. 2010. Supporting the “noreturn” property in C1x.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1478.htm
[C.N2410] Aaron Ballman. 2019. The noreturn attribute.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2410.pdf
[C.N2478] WG14. 2020. C Working Draft.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2478.pdf
[P0840R0] Richard Smith. 2017. Language support for empty objects.
https://wg21.link/p0840r0
[P1371R2] Sergei Murzin, Michael Park, David Sankel, Dan Sarginson. 2020. Pattern Matching.
https://wg21.link/p1371r2