- Okay, hello, everyone.
My name is Mateusz Pusz.
I'm a C++ trainer and the
Chief Software Architect
at Epam Systems.
I work there mostly on the projects
from the finance domain
with high performance
and low latency requirements.
Today we'll talk about std::variant
and how to use it in order
to replace polymorphism
which is a problem in
the projects I work on.
First, why I care about it.
As I said, I am working
in a low latency domain.
Latency, basically, is the time required
to perform some action
from the start to the end of it.
And in low latency, basically,
we try to be faster than the competition.
This is what we do,
this is how, basically,
this market makes money.
In order to verify our software,
in most cases we are not interested
in the average performance of our product,
but we are most interested
in the worst case
execution time of our project.
That's why we operate on percentiles,
that you will see at the end of the talk.
When I speak about my domain,
I mostly talk about things
that should not be done
in low latency software,
things that you should avoid
in order to not make mistakes.
So, these things are,
for example, to not use tools
that trade performance
for usability, like
shared_ptr, std::function,
to not throw exceptions
on the likely code path.
Exceptions are fine, as long
as they are not raised
on the likely code path.
Dynamic polymorphism
is wrong, multiple inheritance, RTTI,
dynamic memory allocations.
Today, we'll focus mostly
on two of those subjects.
So, how to do it?
This is a typical case of
dynamic polymorphism.
We have a base class,
and classes x and y
that inherit from the base class
and implement its interface.
Then, in most cases, we have
to dynamically allocate
the derived class,
and we refer to it through
the base class interface
like this.
And the solution I wanted to provide here
for you today:
We will talk about the variant.
In this case, we have
two structures, x and y,
as you can see they're
not related to each other.
They have similar or the same interface.
And with variant you can put them
to one object, b, and
call a visit on them.
This code is shorter, it's faster,
it has value semantics.
As I said, it works on unrelated classes.
And it's more flexible, because
you may have different signatures here,
as long as they obey
the rules of the visitor.
We don't have virtual dispatch here,
and no dynamic allocation.
And, basically,
that's all.
This is the talk, this is the feature
I wanted to share with you today.
If you have any
questions, then please ask,
but I also have quite a few
bonus slides if you want.
There will be a lot of them,
and we have only 24 minutes
left, as we started late.
So, I hope to go through as many of them
as we can, yeah.
So, bonus slides.
For an example of the technique
we'll talk about the
finite state machines.
Probably more than half of you
implemented at least one of those in your life,
because this is what we
do over and over again.
This is the example for today.
We have three states: idle;
connecting; connected.
States are changed in
the finite state machine
based on the inputs it receives.
Those inputs are called events.
We have events like connect, connected,
disconnect, and timeout.
Here you can see the
timeout for the state;
it depends on some
protocol here that
determines if we are still
retrying the connection or going back to idle.
The change from one state to
another is called a transition.
A finite state machine is defined with a list of states,
the initial state, marked
with this circle here,
and conditions for each transition.
So, case number one:
Single dynamic dispatch.
We'll implement such a class diagram
that has some base implementation
for a finite state machine,
that stores a state and has a virtual
on_event function;
states will inherit from it,
and we have an enum as an event.
So, it can be implemented like this one.
It is a noncopyable structure
with a pure virtual function.
Our finite state machine will store it
as a unique_ptr here,
and we'll call this dispatch
here; that's why it's
called single dispatch,
because it dispatches only once.
We have a state, and we call on_event on it
with the event we received as
an argument of the function.
If we get something
different than nullptr,
we have a new state, so we
have to switch the state
in the machine.
It's called dynamic because we
have virtual functions here.
Virtual, in most cases,
nearly always means slow.
And this is what we try
to not use in our domain.
So, this is our finite state machine.
And for the sake of the next slides,
I will use such a using.
Of course I will not use
it in the production,
but it makes my slide easier to read.
I have to implement all of the states.
All of them have the on_event function.
And this is how it's implemented.
For the first one we implement moving
to the connecting state.
On another one we have two
possible events to come.
So, we either move to the
connected or idle state.
And, the last one,
we'll move to idle if
event is disconnected.
This is how you implement, traditionally,
finite state machines.
This is the slow part here.
We have to make dynamic allocations.
And for testing those transitions
we can use fold expressions.
From C++17 we can
pass a number of events
to our finite state
machine, and we are done
with testing of those.
Single dynamic dispatch
has such characteristics:
It's open to new alternatives.
You can always add a
new state or a new event
to this and it will work fine.
It's closed to new operations.
You have one on_event operation and
you cannot add more
without changing the whole
design of the polymorphic tree.
It's multi-level; you
can inherit from states
that already are a child of the base.
It's object oriented.
And yeah, that's all in this case.
Another way to implement
a finite state machine
is using so-called double dispatch.
So, let's assume that we want our events
to pass some data, for
example, event_connect,
to provide us information about
which address is being
used for the connection.
In such a case we could do this, yes?
But it turns out to be a really bad idea.
So, we are using double dispatch,
also known as a visitor pattern.
And it involves two objects,
and it dispatches between them.
So, this is our event.
Right now, event is not an enum anymore.
It will be another
polymorphic tree of classes.
It will have a pure virtual function, dispatch,
that takes the state.
Our finite state machine right now
would not call an event on state.
It will call this dispatch function.
So this is the first dispatch
in the double dispatch pattern.
We have to implement those in every event.
And then, as you can see,
we call on_event on the state,
the second dispatch,
providing the current object,
already being
a specific type, not the base type,
to the state.
With that, our state has to
have a different interface.
It has to be overloaded on all the events
we have in our system.
And all of them are virtual functions.
By default I assume that we
are not changing the state for them.
It's nullptr, so the finite
state machine will stay
in the same state for
each of those events.
And this is how you implement,
for example, state_idle,
one of the derived classes.
This is the class diagram of this solution.
So you can see this is a
pretty complex tree right now,
with a lot of different instances.
And it is fixed by the
implementation details, in fact.
We cannot change it easily.
One thing that bothers me here,
is that if I had,
like, 100 events,
I would have to rewrite that exact thing
each time.
CRTP becomes handy here.
We provide as a template
parameter the child type
of the event, and then we static_cast this
to that child and call on_event on it.
So, thanks to that, we
can write it like this,
and even have all the
implementation right now
on the slide, and everything fits fine.
Transitions:
In this case we don't use if or switches
like in previous case.
We have already dispatched
everything fine.
We know that in state idle,
we'll have event_connect.
And for all the other states we also have
all the dispatch done for us.
So now we only implement the behavior.
So, moving to the
state_connecting, state_connected,
idle, or staying in the same state,
and to idle.
This is the slow part, yes?
Once again, dynamic memory allocations.
For testing of transitions, we
use similar dispatch function
and this time it's important
to notice that once again,
we have to allocate memory for events.
In most cases, in most
of the implementations.
And once again, this is another
slow part in our design.
So, to summarize, double dispatch
is pretty similar to single dispatch,
with the difference that it's not open
to new alternatives.
As you remember, right now,
the state has virtual functions
on every event you have.
So, in order to add a new event
to the finite state machine
you would need to modify the base class
in the interface in the library.
So, going to the fun stuff,
yes, going to the variant.
Because this talk is about variant,
not about the old approach.
First of all, a short
introduction to variants;
I tried to do it
on three slides
within three minutes,
in case you are not familiar
with variants right now.
Std::variant was introduced in C++17,
and is different from the first
version we had earlier.
It represents a type-safe union
that in most of the cases holds a value,
but it can also be in a special state
called valueless_by_exception,
which is basically
a compromise, a compromise in order
to not have the dynamic allocations
that were done in boost.
This is the conceptual,
simplified interface.
And the implementation of it.
So we can think that variant
is something like a union
of different types; this is
not valid C++, of course.
And it also stores an index
of the current active alternative
in our variant class.
In the C++ standard,
the variant is not allowed
to allocate dynamic memory.
That's a difference from the boost one.
It doesn't hold references,
arrays, or void.
Empty variants are ill-formed,
but you can use something
called monostate to
simulate the empty variants.
It's permitted to hold the
same type more than once,
and when it's default-initialized,
it initializes the first
alternative of the variant.
This is how you do dispatch on variant.
So you can either do it dynamically,
with the index() member function,
or, in most cases, it's
preferred to use a visitor.
A visitor is a simple functor
with an overloaded call operator
for each of the types of the variant.
Then you call std::visit.
Provide the visitor as the first argument,
and the list of variants
that you would like
to pass to this visitor.
You can provide more than one here.
So, let's come back to
our finite state machine.
Right now we can define events
as unrelated classes.
One of them will store the address,
another one would be empty.
And we put everything
to a variant of events.
We do the same with states.
To implement our transitions,
we will create a struct
called transitions that will
have overloaded call operators
for all of the possible
transitions we define
in our finite state machine.
So, we have a state_idle
with event_connect.
State_connecting with event_connected.
State_connecting with event_timeout.
And state_connected with event_disconnect.
So, all of the transitions here.
And we define what is being
done in each of those.
And what are we missing here?
The default, yes; what will
be the default action?
For that we'll take a template call operator
that just says, for all other cases
we are staying in the same state.
Notice that I'm returning
here, optional of state.
Another C++17 feature
that makes it really nice here.
So, our finite state
machine engine right now
is a class template
that has three arguments: it has the states,
which are passed as the first variant
to the visit function;
it has the events; and it has the transitions,
which in fact is our visitor class.
Now we can implement our
state diagram example
like this.
So, we don't have any inheritance,
we have some classes that are unrelated,
put into the variant.
The design
uses them in our engine.
To test the transitions,
once again we have a similar
fold expression, and we are back
to passing objects by value,
like in the single dispatch; we don't have
to allocate memory anymore.
Another use case
that may be interesting here is
such an approach.
We have, once again, a finite state machine
that has similar template arguments,
but a different one: it
has Derived right now.
So, when you see Derived
as a first argument,
it means that CRTP
is probably used here.
And we have visitor,
similar to the previous one.
So, as a first operation in the function
we will do the static_cast
to derived type.
And with visit, we'll
provide a lambda function
for every state and every event,
whose result is an optional of state.
And we pass it to the on_event member function
of the child class of the finite state machine.
We are doing perfect forwarding here.
Oh, okay.
So, we can also lose the
variant of events, in this case.
So right now we only have a variant
of states; events are passed
as a template argument.
And we don't have to care about
another variant in our design.
Right now, with this design,
we have the finite state
machine defining all
of the transitions, so in this case,
let's say that our
child class is a visitor
with all of the transitions defined.
And once again, we have all
of the transitions defined
explicitly and the default at the end.
This is similar to the previous case.
And this time, our class diagram
shows that events are not
related to each other.
They are loosely coupled
with the rest of the design,
and you can basically reuse them,
and add an event to the
design if needed.
There is one more approach
that you can use here.
It will have the most
complicated implementation,
and basically, I'm in
favor of the previous one,
not this one, but that's
interesting because
that makes it similar
to the double dispatch,
where all the logic is put into the states
not into the finite state machine itself.
So, in this case we have, once again,
the dispatch function, and we
have the so-called overloaded,
which is a pattern
proposed for C++20.
So, beware, lambdas are coming.
This is the first one,
that takes any state,
and in case our state has an on_event
for the specific event provided
by our template argument,
it will pass the SFINAE check,
return an optional StateVariant,
and call on_event on this state.
In order to handle all
the defaults in this case,
I have another lambda,
that is a worse overload
because of the template
argument parameter pack.
That calls the policy
provided as a template parameter
to the finite state machine.
And again, providing
transitions is really simple.
Right now we are doing this in state.
It looks really similar to
the double dispatch approach
but we don't have any virtual dispatching
or dynamic allocations.
And this is our default, yeah,
this is the policy we use
for all of the cases.
We can do nullopt, we
can throw an exception,
or you can do something else here.
So, why are we talking about this?
We are talking about performance, yes?
I did set up a micro benchmark of this,
and it provides some numbers,
but you will find the real
numbers in your project.
You always have to
measure, like Fedor said
in the previous presentation.
Micro benchmarks are synthetic.
They have, in most cases, hot caches.
They don't represent real life.
I must say that when
I used those techniques
in my finance project, we really saw
those changes visible at the large
scale of the project; in all
of the timings we've seen improvement,
not only in some specific small part of it.
So basically it was visible
at the product scope.
So it's enough to care about it.
So, for micro benchmarks, I used a
bigger finite state machine.
I implemented the TCP
state diagram, like this one,
in order to make sure that
the branch predictors and
caches will have a bit harder work here.
And these are the results I got.
These are the single dispatch.
You can see that on the
X-axis we have percentile.
It's not linear scale.
And these are nanoseconds.
For double dispatch
it's, of course, slower,
because we have a lot
more dynamic allocations and
two virtual dispatches in this approach.
With variant, any of
those that I presented,
it's much faster.
And it's much more deterministic,
even in the worst 99.9 percentile here.
And this was the slowest one I measured.
For low latency stuff, for
things that need high performance,
for things where you must be sure
about the worst-case execution time,
I would recommend using such techniques.
So, to summarize:
Variant versus inheritance.
In the case of inheritance, we are open
to new alternatives with single dispatch,
and closed with double dispatch.
In case of variant, we
cannot add new states
to the variant easily,
but we could also add
events if needed.
But basically, it's a finite state machine
so it will not have more
states, so we are fine.
It's closed to new operations.
In case of inheritance we
cannot easily add new operations.
In case of variant we
can add more visitors,
doing different stuff, as
you've seen, for example,
in the second variant case,
our states and our events
were dumb classes that didn't have anything,
or had some string_view inside.
They didn't have any operations.
All operations were
done externally to them
so you can always add more
operations doing the same.
Inheritance is multi-level;
you can inherit a whole tree hierarchy
from the base class. In case of variant,
you have a single level of
the, let's say, inheritance.
One is object oriented, and
the other one is functional.
One has pointer semantics,
and the other one
has value semantics, which
is very important here,
which makes all of this
stuff easier to handle.
And in case of inheritance,
we have one fixed design
that was enforced
by the implementation details,
because we decided to use polymorphism.
We couldn't really choose
what we would like.
As you've seen, I've shown you at least,
I've shown you three, but
probably there are even more
possible designs for
doing this with variant.
And you, as an architect, as
the developer of the code,
are responsible
for choosing what you like,
not what you have to
do because the code
has to be written like this.
In case of inheritance, you
are forced, in most cases,
to use dynamic memory allocations.
With variant, you don't have those.
Strict interfaces,
in case of inheritance.
So, each state, each event
has to have exactly the same
signature as the base class.
In case of variant you have duck typing;
you may return void or int,
your function may have, for example,
a default argument.
As long as values are
provided for the arguments,
it will pass the duck
typing signature check.
I would say that
inheritance is more complex,
at least in the number of
lines you have to write,
in case of reasoning about the design.
Variant is simple.
And yeah, we've seen it in the graph.
Polymorphism is slow, variant is fast.
And also, in terms of 99.9 percentile,
it means that most of the cases
are at the same performance, nearly.
There are no unexpected
spikes at the end, which you
do not want in low latency software.
So, with that,
that's all.
I'm done, and do you have any questions?
(audience applauding)
Hi.
- [Man] Let's say that I
present the opposite point.
At the beginning, in his opening keynote,
Bjarne told us that duck typing is bad,
and that strictly typed stuff is good.
Personally I would like to
have both in my toolkit.
But anyway.
As you explain, the
polymorphism base version
is open to new alternatives,
but not to new operations.
You explained to us that
the variant-based solution
is closed to new alternatives
but open to new functions.
Right?
- Yes.
- So, I can see how it
makes sense in the context
of one application, and
indeed, using a switch
on a closed set of types
can beat polymorphism.
I disagree with you on two points:
If you want to write libraries,
or even in the same company,
share code between several applications,
what you present is just not extensible.
And my second objection is that
the benchmarks you present,
I find them unfair.
Because (sighs), to me, too strong.
And you're comparing performance of things
which try to achieve different goals.
- Yeah, I'm not saying
that you should replace
all of the polymorphism with variants,
because in many cases
it will not be possible.
Here you have options.
You may choose inheritance,
you may choose variant;
the benefits
and problems related to each
technique are on this slide.
It's up to you to decide which one you
can choose or must choose,
in case the other one will
not work for you at all.
- [Man] Okay, I'm more
sympathetic to this,
because a little earlier we had a talk
in this room which discussed OOP,
and in the end the conclusion
was, these are just tools
in our toolbox.
- Yes.
- [Man] And now you're saying the same.
- Yeah.
- [Man] In that, we are on the same page.
- Yeah, this session's over,
so we have to finish this.
But just to summarize, as I said,
it's a technique you can choose to use.
It's not that you can
replace every polymorphism
in your design with that solution.
It's up to you to decide.
- [Man] Okay.
- Thank you.
