Assign-Through vs. Rebinding: The 3rd option nobody talks about
I’ve been reading a bunch of articles about std::optional<T&> recently, and it’s a very contentious topic. In C++17, attempting to create a std::optional<T&> will result in a compile error. Some people are disappointed with this state of affairs, and with good reason: if you are writing generic code and want to use a std::optional<T> inside your implementation, then you must jump through a lot of hoops to support T being a reference type. Yikes.
This hasn’t been added to C++ yet, because people can’t agree on the semantics of operator= for std::optional<T&>. There are basically two camps. The first is the rebinding camp, who claim that the assignment operator should bind the optional to a new reference. In practice that looks like this:
int i = 3;
std::optional<int&> optref = i; // optref bound to i
optref.value() = 8; // The value of i is now 8
int j = 7;
optref = j; // optref now bound to j
optref.value() = 12; // the value of j is now 12
The other camp is the assign-through camp. They want the assignment operator to modify the bound object. That one would look like this:
int i = 3;
std::optional<int&> optref = i; // optref refers to i
optref.value() = 8; // The value of i is now 8 (same as before)
int j = 7;
optref = j; // The value of i is now 7 (different from before)
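Neither snippet compiles in C++17, but both behaviours can be approximated today with a raw pointer standing in for the optional reference. The following is only a sketch: the function names are made up for illustration, and modelling the optional reference as a pointer is an assumption, not something either camp specifies.

```cpp
#include <utility>

// Rebinding: assignment changes WHICH object is referred to.
// Returns the final values of {i, j}.
std::pair<int, int> demo_rebinding() {
    int i = 3;
    int j = 7;
    int* optref = &i;  // "bound" to i
    *optref = 8;       // i is now 8
    optref = &j;       // rebind: now refers to j, i is untouched
    *optref = 12;      // j is now 12
    return {i, j};
}

// Assign-through: assignment changes the VALUE of the bound object.
// Returns the final values of {i, j}.
std::pair<int, int> demo_assign_through() {
    int i = 3;
    int j = 7;
    int* optref = &i;  // "bound" to i
    *optref = 8;       // i is now 8
    *optref = j;       // assign through: i is now 7, j is untouched
    return {i, j};
}
```

In pointer terms, the whole disagreement is whether `optref = j` should mean `optref = &j` or `*optref = j`.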
I’m in the assign-through camp, for one simple reason: I believe that a std::optional<T> should behave like a T with exactly one difference: it can be std::nullopt. No other differences. Not for any type T, even if T is a reference type. A regular T& can’t be rebound, so in keeping with that principle, a std::optional<T&> shouldn’t be able to be rebound either.
The Snag: Empty Optional References
Let’s take a look at the following code:
std::optional<int&> optref = std::nullopt;
int j = 7;
optref = j; // ?????
What does this do? The rebinding camp has an easy answer: it rebinds optref to refer to j, just like it always does. The assign-through camp has a more difficult time answering this question. They might say “OK, we’ll allow rebinding, but only when the optional is empty”. You can see this sentiment in this article by Jonathan Boccara:
When orx is empty, as in this piece of code, it doesn’t make sense to forward the assignment to the underlying reference, since there is no underlying reference – it’s an empty optional. The only thing to do with this empty orx is to bind its underlying reference to y.
And again in this article by JeanHeyd Meneide:
The empty case is the same for assign-through (it rebinds, because there is no other option).
JeanHeyd refers to another article by Jonathan Müller who says the same thing:
if opt was empty before, it will now refer to obj
if opt wasn’t empty before, it will now refer to an object with the same value as obj
Seems tricky right? If you assign to an empty optional reference, you have to do something, right? And there is no bound object to modify, so the only remaining option is to rebind, right? Doing nothing clearly isn’t an option, right?
Doing Nothing: Always an Option
Wrong! Doing nothing is an option! In fact it’s always an option! I will call this camp (my camp) the always-assign-through camp, because it means never rebinding, even when the assignment destination is an empty optional. The camp previously known as the assign-through camp shall now be known as the rebind-if-empty camp, and rebind shall be known as always-rebind.
std::optional<int&> optref; // create an unbound optional reference
int i = 3;
optref = i; // This doesn't do anything; optref remains unbound
This approach solves one of the problems with optional references, kinda solves another, and introduces a couple of new problems (which I believe are minor).
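To make the always-assign-through behaviour concrete, here is a minimal sketch of a type that behaves this way. The class name AssignThroughRef, its members, and the pointer-based representation are all assumptions made for illustration; nothing like this exists in the standard library.

```cpp
#include <optional>

// Hypothetical stand-in for std::optional<T&> with always-assign-through
// semantics. Name and layout are illustrative assumptions.
template <typename T>
class AssignThroughRef {
public:
    AssignThroughRef() = default;                   // starts unbound
    AssignThroughRef(std::nullopt_t) {}             // also unbound
    AssignThroughRef(T& target) : ptr_(&target) {}  // bound for life
    AssignThroughRef(const AssignThroughRef&) = default;

    // Assigning a value never rebinds: it writes to the bound object if
    // there is one, and does nothing otherwise.
    AssignThroughRef& operator=(const T& value) {
        if (ptr_ != nullptr) {
            *ptr_ = value;
        }
        return *this;
    }

    // Disabled so the implicit copy assignment cannot rebind by accident.
    AssignThroughRef& operator=(const AssignThroughRef&) = delete;

    bool has_value() const { return ptr_ != nullptr; }

private:
    T* ptr_ = nullptr;
};

// Returns true if assigning to an unbound reference left it unbound.
bool demo_empty_stays_empty() {
    AssignThroughRef<int> optref;  // unbound
    int i = 3;
    optref = i;                    // no effect
    return !optref.has_value();
}

// Returns the value of the bound object after assigning through.
int demo_bound_assigns_through() {
    int i = 3;
    AssignThroughRef<int> optref = i;  // bound to i
    optref = 7;                        // writes 7 into i
    return i;
}
```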
Problem #1 Solved: Assignment from Temporaries
I believe that there is a major problem with how both the always-rebind and the rebind-if-empty camps want optional references to work, one that never seems to get brought up: assignment from temporaries. If you have a std::optional<int>, you can assign to it from a literal, or from the result of a computation:
std::optional<int> opt; // opt starts out empty
opt = 3; // now opt contains 3
The same applies to int&:
int i = 0;
int& ref = i;
ref = 3; // now i contains 3
But what about std::optional<int&>?
std::optional<int&> optref; // optref starts out empty
optref = 3; // what now?
If you are in the always-rebind camp, you don’t have many choices. The authors of boost::optional are in that camp and made its assignment operator rebind. They decided to disallow assignment from rvalues at compile time. This is probably the only reasonable choice if you are in the always-rebind camp. It is also a reasonable choice if you are in the rebind-if-empty camp, though that camp may also prefer throwing an exception. If you are in that camp, it probably does seem like assigning a temporary to a bound optional reference should assign through to the bound object, so that code should at least compile, right?
I dislike those options because they widen the gap between std::optional<T&> and T&. Now T& can’t be rebound and can be assigned to from a temporary, while std::optional<T&> can be rebound and can’t be assigned to from a temporary. My primary design goal for std::optional<T&> is that it should differ from T& in exactly one way. If you agree, then this should cement you in the always-assign-through camp, because assigning from a temporary is fine in the always-assign-through camp: if the optional reference is bound, it modifies the bound object, and if it’s not bound, it does nothing. There is no possibility of illegally binding to a temporary.
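For contrast, here is a sketch of how a rebinding type could reject temporaries at compile time by deleting the rvalue overload. This is not Boost’s actual implementation; RebindingRef and its members are hypothetical names chosen for illustration.

```cpp
#include <type_traits>

// Hypothetical rebinding optional reference (illustrative only).
template <typename T>
class RebindingRef {
public:
    RebindingRef() = default;                   // starts unbound
    RebindingRef(T& target) : ptr_(&target) {}  // bound to target

    // Assigning an lvalue rebinds to that object.
    RebindingRef& operator=(T& target) {
        ptr_ = &target;
        return *this;
    }

    // Rebinding to a temporary would dangle immediately, so reject it.
    RebindingRef& operator=(T&&) = delete;

    bool has_value() const { return ptr_ != nullptr; }
    T& value() const { return *ptr_; }  // precondition: has_value()

private:
    T* ptr_ = nullptr;
};

static_assert(std::is_assignable_v<RebindingRef<int>&, int&>,
              "assigning an lvalue compiles (and rebinds)");
static_assert(!std::is_assignable_v<RebindingRef<int>&, int>,
              "assigning a temporary does not compile");

// Returns true if rebinding left i alone and redirected writes to j.
bool demo_rebind_to_lvalue() {
    int i = 3;
    int j = 7;
    RebindingRef<int> optref = i;  // bound to i
    optref = j;                    // rebinds to j
    optref.value() = 12;           // writes to j
    return i == 3 && j == 12;
}
```

With this design, `optref = 3;` is a compile error, which is exactly the gap between the optional reference and a plain `int&` described above.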
Problem #2 Kinda Solved: Lifetime Extension of Const References
A somewhat obscure fact about C++ is that you actually are allowed to bind a reference to a temporary… if it is a reference to const. This extends the lifetime of the temporary to match the lifetime of the reference:
int a = 1;
int b = 2;
const int& i = a + b; // This is actually OK!
foo(i); // Using i is fine, even though it binds to a temporary
Let’s think about how this applies to optional references. But first, we have to introduce a concept we haven’t talked about before: the optional const reference, std::optional<const T&>. What does that even do? I’ll start with the always-assign-through camp. If you’ve been paying attention, you might remember that I believe that a std::optional<T> should have exactly one difference from T, for all types T including const and reference types. So how does that apply when T is a const reference? Well, it still can’t be rebound, since regular const references can’t be rebound. It also can’t even be assigned to, since a regular const reference can’t be assigned to. So there is my answer: std::optional<const T&> should not be able to be assigned to. If you attempt to assign to it, you get a compile error. If it is initially bound to a temporary, it should extend the lifetime of the temporary to be the same as the optional, just like const T&. That would be the ideal semantics, in my opinion.
Unfortunately, there is a problem, which the always-rebind camp has also noticed: this lifetime extension is a facet of C++, built directly into the language itself, but std::optional is a mere library type. It can’t modify the lifetime rules of the language, regardless of the semantics of the assignment operator. This is discussed in the boost documentation and mentioned briefly in Jonathan Müller’s article from before. But wait! I said earlier that this problem was kinda solved by always-assign-through! How? Well, think about how you might end up with a std::optional<const T&> that isn’t extending a temporary’s lifetime when you wish it were. You would have to write code like this:
int a = 1;
int b = 2;
std::optional<const int&> optref = a + b;
But if you can never rebind the reference, then there is no reason to use std::optional here at all. It can never contain std::nullopt, so there is no reason not to just use const int&, and then you get that lifetime extension you wanted. If you did construct a std::optional<const int&> that contained std::nullopt, then there would be no temporary to extend the lifetime of, and that std::optional could never contain anything other than std::nullopt. No problems!
(In case you were wondering, boost::optional does allow rebinding of boost::optional<const T&>.)
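The claim that a library type cannot extend a temporary’s lifetime can be checked with a small experiment. The sketch below counts live objects; Tracer, ConstRefHolder, and the counter are hypothetical names introduced only for this demonstration, and the wrapper is a stand-in for any library-level optional const reference.

```cpp
// Number of Tracer objects currently alive.
int g_alive = 0;

struct Tracer {
    Tracer() { ++g_alive; }
    Tracer(const Tracer&) { ++g_alive; }
    ~Tracer() { --g_alive; }
};

// Hypothetical library-level wrapper around a const reference.
class ConstRefHolder {
public:
    ConstRefHolder(const Tracer& target) : ptr_(&target) {}
    const Tracer* address() const { return ptr_; }

private:
    const Tracer* ptr_;
};

// A plain const reference keeps the temporary alive until end of scope.
int alive_with_plain_reference() {
    const Tracer& ref = Tracer{};
    (void)ref;
    return g_alive;  // the temporary still exists at this point
}

// The wrapper only sees a constructor parameter, so the temporary is
// destroyed at the end of the full expression that created it.
int alive_with_wrapper() {
    ConstRefHolder holder = Tracer{};
    (void)holder;    // holder now dangles and must not be dereferenced
    return g_alive;  // the temporary is already gone
}
```

Under these assumptions the first function observes one live Tracer and the second observes none, which is the difference the language rule makes.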
Problem #3 Introduced: Assignment from Another Optional
Generally, when types have an assignment operator, assigning from that same type is allowed. In the context of optional references that means you should be able to write code that looks like this, and have it compile:
std::optional<int&> optref_1 = /* ... */;
std::optional<int&> optref_2 = /* ... */;
optref_1 = optref_2;
But what are the semantics of this code? The always-rebind camp has an easy answer: optref_1 becomes bound to the same thing optref_2 was bound to. This might be nothing, or it might be something. The always-assign-through and rebind-if-empty camps have a tougher time. There are 4 possible scenarios:
- Empty source, empty destination. This is always easy, nothing should happen. Everyone should be able to agree about this one.
- Empty source, non-empty destination. This means that there is a bound object that could be assigned to, but nothing to assign to it. Doing nothing is always an option, but that would be a significant difference between optionals of reference types and non-reference types. Rebinding the destination to nothing seems like it wouldn’t be acceptable to the rebind-if-empty camp, since the destination isn’t empty and that contradicts the camp’s name. Then again, I only gave them that name earlier in this article, so maybe they wouldn’t like the name and would be fine with rebinding the destination to nothing. Maybe we need a 4th camp. But I digress: the always-assign-through camp is definitely not OK with rebinding the destination to nothing.
- Non-empty source, empty destination. The rebind-if-empty camp has an easy answer to this one: rebind the destination to the thing the source is bound to, just like you always do for empty destinations. always-assign-through also has an easy answer: do nothing, just like you always do for empty destinations.
- Non-empty source, non-empty destination. Modifying the bound object of the destination is possible here, and seems unobjectionable to any but the always-rebind camp.
Scenario #2 is the tough one, for both the rebind-if-empty and the always-assign-through camps. I’m not going to make any claims on behalf of the rebind-if-empty camp, but as I am in the always-assign-through camp, I do feel obligated to choose some desired semantics, so here is my choice: attempting to assign to a std::optional<T&> from a std::optional<T&> should be a compile error. That might seem strange and drastic given that only 1 of the 4 scenarios is problematic, and it’s pretty uncommon to have types with assignment operators that don’t allow assignment from themselves. This is the part of the proposal that I’m least sure of, but I do believe that it is the least confusing option for both this problem and problem #4 below.
Problem #4 Introduced: Broken Assignment Identities
C++ makes no claims to be a mathematically inspired language. This is in contrast to languages like Haskell, where certain interfaces have mathematically inspired laws that all implementations of that interface must obey in order to be considered correct. If these laws are broken, the API just stops making sense. If you find out one of these laws doesn’t hold for all types, you probably feel betrayed. For example, you may have felt betrayed when you discovered that (a * b) * c is not equal to a * (b * c) for all standard types in C++, specifically floating-point numbers. The always-assign-through camp might make you feel betrayed in this way: you would probably hope that for all types T, if you have variables a and b of that type, and you do a = b; and then immediately evaluate a == b, it should return true. But that identity is broken if assigning to an empty optional reference doesn’t do anything. Ew.
My solution to this problem is the same as my solution to problem #3: don’t allow assignment to std::optional<T&> from std::optional<T&>. Note that the identity with a and b is only expected to hold when a and b have the same type, but with this proposal, the a = b; part won’t compile when a and b are both std::optional<T&>. It would still compile if a is std::optional<int&> and b is int or int&, and to further hammer home that this identity doesn’t hold for optional references, I would even remove the bool operator==(const std::optional<T>&, const U&); family of functions when T is a reference type. That way using the contained value in a comparison can only be done by explicitly calling .value(), and the opportunities for confusion are further reduced.
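The rules proposed for problems #3 and #4 can be expressed as compile-time checks. The sketch below is one possible reading of them, using a hypothetical AssignThroughRef class (an assumption, not a real library type): value assignment is kept, optional-to-optional assignment is deleted, and no comparison operators are provided, so comparisons must go through .value().

```cpp
#include <type_traits>

// Hypothetical always-assign-through optional reference (illustrative).
template <typename T>
class AssignThroughRef {
public:
    AssignThroughRef() = default;                   // starts unbound
    AssignThroughRef(T& target) : ptr_(&target) {}  // bound for life
    AssignThroughRef(const AssignThroughRef&) = default;

    // Value assignment: assign through when bound, otherwise do nothing.
    AssignThroughRef& operator=(const T& value) {
        if (ptr_ != nullptr) {
            *ptr_ = value;
        }
        return *this;
    }

    // Optional-to-optional assignment is rejected at compile time.
    AssignThroughRef& operator=(const AssignThroughRef&) = delete;

    bool has_value() const { return ptr_ != nullptr; }
    T& value() const { return *ptr_; }  // precondition: has_value()

private:
    T* ptr_ = nullptr;
};

using IntRef = AssignThroughRef<int>;

static_assert(!std::is_copy_assignable_v<IntRef>,
              "a = b does not compile when both are optional references");
static_assert(std::is_assignable_v<IntRef&, int>,
              "assigning a temporary int still compiles");
static_assert(std::is_assignable_v<IntRef&, int&>,
              "assigning an int lvalue still compiles");

// With no operator== provided, comparisons have to spell out .value().
bool demo_compare_via_value() {
    int i = 3;
    IntRef optref = i;
    optref = 7;  // assigns through to i
    return optref.has_value() && optref.value() == 7 && i == 7;
}
```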
Hold On, is This Even Useful?
Philosophical APIs and mathematical purity are nice and all, but if you end up with an API that isn’t useful for anything, then why bother? Wouldn’t you rather have an actually useful API that isn’t quite as pure and idealistic? I would, and it’s on me to provide a defense that the API preferred by the always-assign-through camp is still useful. Here is that defense.
The number one thing I want to use optional references for is optional function parameters of large types. Look at this code:
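#include <optional>

struct LargeType { /* many expensive-to-copy members */ };

void foo(
    int a,
    const std::optional<LargeType>& large = std::nullopt
) { /* ... */ }

int main() {
    LargeType huge;
    foo(1, huge); // copies huge into a temporary std::optional<LargeType>
}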
The goal of the foo API is to have an optional function parameter of type LargeType, but of course we don’t want to accept it by value, because that would require a copy, and copying this LargeType is expensive. Fortunately C++ has an answer: accept the parameter by const&. Then no copy will be made, right? But in this case, a copy actually is made. std::optional<LargeType> has an implicit constructor from LargeType, and that constructor is the one that gets called to create a temporary optional, and it performs a copy of the LargeType. So much for zero-cost abstractions. This problem could be solved more efficiently by having foo accept a std::optional<const LargeType&> as its parameter. The caller code would look exactly the same, but would run faster because it isn’t copying the LargeType.
You might recommend overloading foo instead, but that quickly gets unwieldy if there are more than two or three optional parameters, or if you want to allow the caller to explicitly pass std::nullopt.
(Update) You also might recommend accepting a pointer instead, but then you can’t pass temporaries. Accepting temporaries is critical if you want your API to compose nicely.
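The hidden copy is easy to observe by counting copy-constructor calls. In the sketch below, OptionalConstRef is a hypothetical stand-in for std::optional<const T&>, and the counter and function names are assumptions made for the demonstration.

```cpp
#include <optional>

int g_copies = 0;  // number of LargeType copies made so far

struct LargeType {
    LargeType() = default;
    LargeType(const LargeType&) { ++g_copies; }
};

// Today's option: the optional owns a LargeType, so passing an existing
// object has to copy it into a temporary optional.
void foo_owning(int, const std::optional<LargeType>& large = std::nullopt) {
    (void)large;
}

// Hypothetical stand-in for std::optional<const T&> (illustrative only).
template <typename T>
class OptionalConstRef {
public:
    OptionalConstRef() = default;
    OptionalConstRef(std::nullopt_t) {}
    OptionalConstRef(const T& target) : ptr_(&target) {}

    bool has_value() const { return ptr_ != nullptr; }
    const T& value() const { return *ptr_; }  // precondition: has_value()

private:
    const T* ptr_ = nullptr;
};

void foo_referring(int, OptionalConstRef<LargeType> large = std::nullopt) {
    (void)large;
}

// Counts the copies caused by the owning signature.
int copies_made_by_owning() {
    LargeType huge;
    g_copies = 0;
    foo_owning(1, huge);
    return g_copies;
}

// Counts the copies caused by the referring signature.
int copies_made_by_referring() {
    LargeType huge;
    g_copies = 0;
    foo_referring(1, huge);         // existing object, no copy
    foo_referring(1, LargeType{});  // a temporary works too
    foo_referring(1);               // and so does omitting the argument
    return g_copies;
}
```

The temporary passed to foo_referring lives until the end of the call expression, so using it inside the function is safe; storing the reference beyond the call would not be.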
Personally I think that this use case alone is sufficient to prioritize adding optional references to C++, but if you have more use cases that are or aren’t satisfied by always-assign-through, add them in the comments!
Update: Reddit Discussion
There has been some good discussion of this post on Reddit. Some additional concerns and use cases were brought up that I want to address.
New Problem: Construction Identities
There was a claim that T a = b; should behave the same as T a; a = b; for all types T including optional references. This identity can never hold for types without default constructors, of which regular references are an example. My goal is for optional references to behave like regular references with exactly one difference, so what would be the analogy to regular references? Probably something like this:
int i;
int j;

// version 1
int& ref = i;
ref = j;

// version 2
int& ref = j;
If you want assignment-after-default-construction to be identical to assignment-during-construction, then it seems to follow that you should want assignment-after-non-default-construction to also be identical to assignment-during-construction. But this is not the case for regular references, so I argue that it should not be the case for optional references. I think that the identity is fundamentally only valid for value types, but references are not value types, and I don’t think optional references should be either. Trying to treat optional references as value types would be a major difference from non-optional references, and I want there to be exactly one difference between references and optional references.
New Use Case: Return Type Of Container Access
A solid use case was brought up on Reddit and I want to analyze it in the context of each camp: a hypothetical std2::vector<T> whose .front(), .back(), and operator[] methods return std::optional<T&>, which is empty when the access is out of bounds. For reference, the existing std::vector returns T& and exhibits undefined behaviour when the access is out of bounds.
Regardless of which camp you are in, I urge you to look at this code and think about what you want the semantics to be:
std2::vector<int> vec = MakeAVector();
std::optional<int&> frontItem = vec.front();

// Does this change the front item to be 1?
frontItem = 1;

// Does this change the front item to equal the back item, or
// rebind the reference to be bound to the back item?
frontItem = vec.back();
I actually don’t think there are any good answers here. I definitely still think it should be allowed to assign to a std::optional<T&> from a temporary. Both std::optional<T> and T& allow assigning from a temporary, so disallowing assignment from temporaries to std::optional<T&> means the people trying to write generic code still can’t do so. The code doing nothing at all in the case of vec being empty doesn’t “feel right”. And I think I’ve made myself clear on why I don’t like rebinding. Throwing an exception such as std::bad_optional_access when assigning to an empty optional reference was brought up, and calling it Undefined Behaviour is also an option. I personally dislike both exceptions and undefined behaviour, but that would be most similar to the existing state of affairs, where you have to explicitly check the size of a vector before accessing it or risk undefined behaviour… but if that is your API, why bother returning a std::optional at all? Makes me think that the standard actually got it right by disallowing optional references.
Conclusion
This is the part where I make a final recommendation. Here is what I originally thought should be added to std::optional (a recommendation I have since withdrawn): Optional references should be allowed. The assignment operator for std::optional<T&> from T or T& should assign through to the bound object when the optional is non-empty, and should do nothing when the optional is empty. The assignment operator for std::optional<T&> from std::optional<T&> should be deleted. The family of comparison operators between std::optional<T&> and T should be deleted.
Based on discussion on Reddit and with friends, I now believe that std::optional<T&> should exist, but it should support no assignment operators. Modifying the referred-to value should require calling .value() and assigning to the result. Rebinding could be allowed through an explicit .rebind() method that only exists for optional reference types.
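As a rough illustration of that final recommendation, here is a sketch of an optional reference with no assignment operators at all. OptRef is a hypothetical name, and the exact shape of .value() and .rebind() is my assumption about how such an interface might look.

```cpp
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical optional reference with no assignment operators.
template <typename T>
class OptRef {
public:
    OptRef() = default;                   // starts unbound
    OptRef(std::nullopt_t) {}             // also unbound
    OptRef(T& target) : ptr_(&target) {}  // bound to target
    OptRef(const OptRef&) = default;

    // No assignment of any kind: neither from values nor from OptRef.
    OptRef& operator=(const OptRef&) = delete;
    template <typename U>
    OptRef& operator=(U&&) = delete;

    // Rebinding is explicit and only accepts lvalues.
    void rebind(T& target) { ptr_ = &target; }

    bool has_value() const { return ptr_ != nullptr; }
    T& value() const { return *ptr_; }  // precondition: has_value()

private:
    T* ptr_ = nullptr;
};

static_assert(!std::is_assignable_v<OptRef<int>&, int>,
              "no assignment from temporaries");
static_assert(!std::is_assignable_v<OptRef<int>&, int&>,
              "no assignment from lvalues");
static_assert(!std::is_copy_assignable_v<OptRef<int>>,
              "no assignment from another optional reference");

// Returns the final values of {i, j}.
std::pair<int, int> demo_value_and_rebind() {
    int i = 3;
    int j = 7;
    OptRef<int> optref = i;
    optref.value() = 8;   // modifies i, spelled out explicitly
    optref.rebind(j);     // rebinding is also spelled out explicitly
    optref.value() = 12;  // modifies j
    return {i, j};
}
```

Because every operation is named, the reader of the calling code never has to guess which camp the type belongs to.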
I hope you’ve enjoyed this article! I would love to hear your thoughts, especially if you started out in the always-rebind camp, and are still there after reading. Thanks for staying this long.