- So let me tell you a story why I started to care about state machines and try to fix some misconceptions
about them and that's where we'll start here today. So, about few years, I was
an intern in a company. It was a Christmas period time. I was ready to go back home and I noticed that a
lot of developers were really hard working and I asked them, it's
like Christmas period, why do you work so hard right now? And they replied that
they have a contract going the country which doesn't have holidays in the Christmas
so they had to be on call and they had to begin into the problems, possible problems which
they may encounter. So, I came back from
Christmas, I was an intern. I was working, right? So, when I go back, I ask
them, it's like, how did it go? They said, well, Kris,
we had a lot of issues and specifically one which
it took us all the Christmas to sort out. So they didn't go back home,
they were really annoyed and it was bad and the problem was that, just let me just explain the
system they were working on. The system was very simple, it
was just having some inputs, manipulates them and had some outputs. So not the rocket science
but the problem was that they haven't had proper practices and they found that they didn't reset the Boolean variable in some switch, in some loop, in some form and that was introduced
by the bad merge request and the conflict. So, I learned quite a
few things out of that. The first one, well, I
didn't work for that company. They worked for during the Christmas time. The second one, I learned that, well, they didn't have a good practices because they didn't test it. The way they found the bug,
it was by reviewing the code so they didn't have any good practices. So, again I didn't want
to work for that company but the main problem I learned is that they didn't have the
declarative way of expressing the application flow and
before, they had to compare different solutions using the code review. The diff between the working and staging instead of being able
to figure out upfront what was going on. So, I asked them, why you didn't use the state machines, for example. Well, I decided the performance
issues and other issues which we didn't want to deal with. So, this talk is about how we can actually clarify those misconceptions and hopefully spread the word about the state machines. So, let's begin. But first, let's introduce a problem. So, if you have a healthy
company, not the one, as the one I was working on, we will get the requirements
from our product owner, or for example, client. Let's do that in a TDD, BDD style. So we have a feature connection. We have a scenario, establish connection, very simple, TCP approach. So given I don't have a connection when I receive a request to connect then I should try to
establish the connection when I got an acknowledgement. I should be connected. Very simple and easy way of
defining the requirements. We have obviously more the
requirements than that. How can we actually implement that guy? Well, let's take a look
into state machines. Obviously, there are
different ways of doing that but we don't consider them in that. We will talk about
Unified Modeling Language. Is everyone familiar with that guy? Yeah, more or less everyone. So, it's a standard, it's
specified in OMG.org. You can read it. I think it's even worse than
the C++ standard to read so I don't advise you to do that but you can. There a lot of features specified
it in it so that's useful. So our features, scenarios, we can actually
translate into the state machine and how does it look like? So, I guess everyone have
seen some pictures like that. We have an initial state, disconnected, we have an event connect and we and we call the establish action and we transit to the connection state, connecting state and if you
have, for example, in connected, we have the ping event, if the guard which is in
the square bracket is valid, we call the reset timeout
and we stayed in that state. Otherwise, we don't
call there is a timeout and we don't do the transition. Makes sense? Easy to follow. Everyone knows that. So, today, we'll try to implement that guy many different ways. So, we'll take a look
into naive solutions. By naive, I mean C++98 features. If, else, switch, enum,
inheritance, state pattern. We'll take a look into
STL and std::variant, might be used to implement state machines. Coroutines might be used
to implement state machines and we'll also take a look
into the Boost libraries. Oops. Let's first introduce some
common implementation. The common across all the solutions so that we don't have to
repeat ourselves all the time. So, the event will be just simple structs. We can start from that although
they may have data as well but we don't really care
about it in our case. We have a guard, one guard is valid. It returns true by default. It takes an event. Nothing special about it. We have an action which will
just print to the output. We'll use put just in
order to see in the Godbolt whether it's inline or not, otherwise, we would have a lot of code from SCDL derivative streams which we don't really care about. Sorry about that. So, let's take a look
into a naive solutions. So, the first one will be there if, else. We'll start doing it from
implanted a connection class. We'll have obviously free
Booleans to represent our state. The state will be implicit
because they'll be implemented in the code. By default, the
disconnected state is true, other ones are false. And when in process event
which is the connect event, we verify with the UI. In disconnected state if we are, we call establish which is the action and after that, we have
to reset all the states because that would you learn
from the story in the beginning that you always have
to reset all the states just in case because if you have really nested version, you might actually forget
about it and the transition is just set in the
connecting state to true. What about processing the ping? We get the connected. We verify whether we're
in the connected state. We check whether the guard
is valid and we set the time, timeout and we stay in the current state. Makes sense, right? So what about the Godbolt? As you notice on the
left side on the bottom, there are 64 lines of code. Not bad, not a lot of boilerplate and everything is very well inline and optimized with flux. You can see on the top. So, that's good. And why is that optimized? Well, if anything else, it's
really easy for compilers to see it through. So that's a great thing
about the if, else, right? So it's inlined in GCC and in Clang. I just want Clang but GCC
produced the same output. There's no heap usage. That's great, we don't have to care about shared point as unique
points or anything like that. There's a smallish memory footprint, however, it's not ideal
because we have three Booleans. As a state machine, we
can only be in one state at the given time so that's a waste. It's hard to reuse and to illustrate that. Let's see, it's so hard to reuse that I guess everyone have seen
the code when you have multiple lifts, switches
and everything is nested, it's like how much can you go? I don't know. It depends on your, I guess, it depends how
many spaces use you've used for your taps, I don't know. We can go quite far. But, yeah, obviously, it's hard to reuse because it's hard to implement
those guys and maintain. So doesn't the best solution. Let's take a look into a different one which is still very easy and
everyone probably have seen it. Switch and enum. Everyone's love it, I guess
because it's so popular. You can see it in any legacy code base. So, it's good for the state machines from the perspective that
only one state might be active at the given time so we use enum. So, it's pretty much the same
as the previous solutions. Instead of Boolean variables,
we use the enum state with the three disconnected
connected and connected states. Disconnected is the default one. When we process the event,
instead of if, we do the switch to handle current state. By default, we break and if you don't now
default main is just enough. Label so it might be put
anywhere in the switch. In case of disconnected,
we call establish. We change the state, we break. Nothing really spectacular here. And the same case for the ping. If you are in connected, we verify their, the even is valid, we set
the timeout and we break. So far so good. What about the performance of that guy? Again, compile is already good
in optimizing simple things and switches, one of the
main constructions of C++ in which compile is very good at. So we have 65 lines of code
pretty similar to if, else and everything is optimized very well. So that's great. But what about the switch,
what's bad about it? Well, it's inlined, that's good. We fix the issue of the memory footprint. It's one byte, that's fantastic. There's no heap usage again. It's still hard to reuse. We still have a lot of nested
switches and ifs and force and everything like that. It will be hard to add a new
state and to maintain that. So, in a sense like,
you code for six minutes and debug for six hours. Even with the good test coverage, it might be difficult to find those bugs. So, yeah, maybe that's not
the best solution either. Let's try a different one. We all know that object-oriented design is great, right? We can use short pointers
everywhere, throw them away. So yeah, let's do it because we care about the performance so, so yeah, we'll see. So we have our state which is an interface and with a lot of process event for all the events we can handle. And after that, we implement it per state, on per state basis. That's the thing which
actually gives us a lot of, a lot of goodness because
we can implement a state transition, we can add a
state and transitions to it without changing the connection itself. So that's valuable. It means that a lot of teams can work on different transitions and
combine them together. So that's positive. So how do we do that? We have a disconnected state
we inherit from the state, we have the constructor, well, we have the connections
we have to pass through and in case of connect,
notice that I use final here because that'll be important
for the devirtualizations, possible devirtualizations. And we change the state to connecting. It's pretty much the same implementation as we've seen before,
just a bit different way. The same about the ping. We have to implement here the ping and probably also other events to satisfy our requirements
from the UML state machine which we described before but all in all, it's very simple. Sorry. It's playing with me. What about the performance? Well, we have a bit more lines of code because, well, it's
object-oriented design. We write a lot of code, boilerplate code just
to satisfy the design. And unfortunately, we have the vtable. It wasn't devirtualized
although we had final. We have for Clang and GCC, we have around 220 lines. Not ideal. Not perfect. And I'm using the newest
Clang and the newest GCC here and all the examples
are all in the Godbolt, you can check them out yourself
and experiment if you want. So, what's good about the state pattern? As I said, it's easy to reuse an extend because we can add the states quite easily by just adding a state
inherit from the interface and just add your transitions you want. That's cool. It's object-oriented. That's what we want from it. Does highish memory footprint because we have to allocate
everything on the heap which we have to use and
we have dynamic allocations all over the place, probably shared pointers
as unique pointers, all the goodness. But because of that, it's not inlined. It's not even devirtualized in our case. so that's bad because we
care about the performance. So although this guy is pretty good from the developer perspective, it's not really good from
the production perspective because it doesn't give us what we really need. So okay, so we've come
through the naive solution quite quickly. Nothing really new about it. Let's take a look into something else. It's like if I have to, if I could pick one thing
which C++ is really good for, I'd say it's the writing libraries. And one of the best
libraries available is STL. I hope you agree with that. And since C++17 and C++20, we have at least two new ways of implementing state machines. one of them is std::variant. Everyone is using std::variant? Who likes it? Yeah, it seems like
half half of the crowd. I think it's very powerful idea. Maybe we'll see how
powerful it might be used, how powerful it might be
used for the state machines. So here, let's take a
look into connection. We go back to the one connection class instead of having it per state. And right now, the states are implemented using
just types and struts, which is very powerful for
us because they can have data which we couldn't have before
when we used enum, for example or a Boolean variable. So keep in mind that although
we won't have an example here, we can have the data for the states which may have a different stuff in it. So disconnected may not have anything but when we connected, we
can have an IP, specific IP which will be just active
and available for us to use when we are in that state because variant might be in one given state as well. So that might be very handy for us. And we declared the variant as the state with the disconnected by default. How do we actually handle the transitions? Well, we have to do the visit on it. Overload is just an idea
and it's proposed for C++20 to choose which lambda to call so we don't have to implement
a lot of boilerplate to satisfy that. We could use the generic
one and use if constexpr, for example, if you want but overload seems to
be cleaner in that case. So if you get the disconnected, it means that we ended disconnected state and we got the connect event, we do the same. I guess all of you can see the similarity to previous solutions. So we call establish and we
change the state to connecting. The next case is just
to satisfy the compiler because it's generated compile time so we have to call it for all the events, all the states possible. So in case of connecting and connected, we just discard those guys because we don't care about
them in that transition. Makes sense? So, what about the ping? In the ping, we have actually very similar but even an easier way of doing it. We can just verify whether in
connected states using get_if which is the part of the
interface of std::variant. Verify whether the guard
is valid and satisfied and there is a timeout
being called afterwards. We stayed in the same state
so there is no issue here. What about the performance? From the Godbolt and the
newest and greatest STD leap and Clang 7, we actually
get it be inlined. So that's fantastic. It's not as good for GCC. GCC doesn't inline it at all. It's 140 lines of generated assembly with mysterious vtable being there. We have 70 lines of code to implement that so all of the implementation
is very similar to the previous revisions. As you could probably
see, it's very similar to switch and enum. Instead of enums, with just use types which gives us RAII and a lot of benefits. So, if you sum up the variant, it's quite small memory footprint because we use the struct, the struct is one byte. So if you start the variant, we serve the max size of all the states which by default would be
let's say one byte in our case because we just have them to struct plus we have to store something to, to know which one is actually active so that depends on the
compile implementation. That might be one byte,
eights bytes, whatever. But the footprint will be quite small and it'll be really
efficient because we can use those structs to add the data to it. So, for example, we can
have an IDE or last ping or whatever we need for those guys. So that's handy. Also, variant integrates
very well with the expected and static exceptions, ideas because we can actually return an error which is just part of the
variant or expected type with some error case which
we couldn't do before. I will be talking here is
like we have exceptions in disabled in our cases just
for the performance reasons but error handling is quite important and handling the errors in
enum or switch, it's not ideal. We probably would end
up from an exceptions or something like that
or using the return code or global variables or
something like that. With variant, we actually
don't have to do that, we can return something unexpected which is nice, which is good. And it's inlined on Clang. That's good. It's not inlined on
GCC, I guess that's bad but yeah, there is still
room for improvement and it's quite a decent solution for the simple state machines. It doesn't offer a lot of capabilities besides simple transitions
but it's still very useful and having the efficient
memory footprint is quite neat. It's quite hard to reuse though because it's very similar to
the switch and enum approach so we have to, you know, if you have nested state machines like calculus state machines
or composite state machine, we'll have to have a
visit inside the visit inside the visit. You see the pattern. It's not much better than if
else, if else, if else, right? Okay. So let's get to the good part. And the newest part. We can actually use in C++20. So, coroutines. Coroutines are a new way of implementing state machines as well in C++20. Has anyone experienced
the coroutines so far? Tried them out? Yeah, quite a few hand. Not a lot of hands, actually. So yeah, coroutines. We'll try to dig it a bit
more into them in a second. So, notice that in our case, the connection won't
be as tracked anymore. It's not that type, it's a function. So that's the first
difference between coroutines, when using coroutines as a state machine because before, because coroutines are functions. They're called resumable
functions and we can use them basically as a functions. So notice on the line, the second line, we have the for, infinite for loop. That's odd, right? Why we would have infinite for loop? Well, we need that guy
because we can actually use the coroutines to resume and suspend and that's where the co_await is for and if you don't get an event which we do want, we want to come back, do the loop and suspend
again in the co_await. It's a different way of
thinking but, you know, you can think of it like the state is the position in the functions when we, in the function code
in a sense where we at. So here, on the third line, we have the co_await,
we wait on the event. If we get it and the event is the connect we do establish and we go further which means that we'll be in
the further in the function. If we don't get event which we satisfied, we do the loop and
co_await will suspend again and wait for it to be resumed by, by a process event call. Notice that I'm using here a
single further application. These coroutines are often associated with asynchronous ideas
but you can use them in the synchronous way
very efficiently as well. So there's nothing about being here like multi-threaded environment
or something like that. It could be, could be the
case but it's not in our case. So that's very important. So you can use coroutines
in a single threaded way. So right now, we do a second loop because we want to be a bit
further in the function. We never come back to the first loop, again, in our case, for example, because we transit a bit further
and right now, we stay here because when we are resumed, we are resumed from the co_await which is on the second
line on the second part. We won't be resumed before that guy. So I hope, I'm trying to make it clear
but it's like it's really hard to think of it if you don't
know what the coroutines are. We are going to be resumed
from the co_await call in that place. We won't be resumed anywhere else. So we do the loop
because we have to verify whether we only established, whether we got an
established event or not. I hope that makes sense. There's a question? The question is what is the end label for? And that what you asked,
we figure out in a sec because we need it for that guy. So, as you see, we have a lot of for loops and the best way to get out
of the nested for loop is to, is goto, right? So I hope that answering a question that to get to the end. It's a different way of thinking but you have to think of it
it's like we have dysfunctions, we are in the position at some point in that function, we are resumed from the point,
the co_await was being called and after that, we react on the event and all this hassle with
the continue break and goto is just to go to the different
for loop part of the function because we always stayed in that function. So yeah, it is possible
with the coroutines. That specific condition is quite difficult but as was mentioned,
we use goto so maybe. States are really
implicit here, by the way. As you notice, I pointed them with arrows because we don't really
need to know the state because the state is represented by where we added the function which is like intriguing, right? It's not really what we actually
think of states but maybe, maybe we would like to take a
look into that a bit further. So what about the performance? We have more code. There's more boilerplate
code to implement by default because we don't have
all these facilities. Coroutines are implemented
in the core language but we will have quite a bit of hassle to satisfy the interfaces. You can check it at Godbolt
and should try it out. The interface is quite
difficult to deal with. And therefore, the different
proposals have to deal with it. However, it wasn't optimized very well. Although coroutines can be optimized especially yield might
be optimized very well than the generators. The co_await seems not and we play for the heap allocations. Not great. So to sum up the coroutines. What is it good about it? The structured code is actually
really, really neat, right? We can use ifs, else,
force and before loop not especially in the way we would like to but we can still use them. So, everyone who is new to
the language doesn't have to learn libraries or anything like that. They may just use the language and just have to understand the co_await will resume you at at the
point you were suspended. Pretty cool. As I pointed out, you can use it in a
synchronous environment but you can also use it in
a synchronous environment very easily which is not really a case with an older solution which
I will be presented here. There's is a learning
curve to it, definitely. It's different way of thinking so yeah, you have to get used to it. Might be worth it. Requires a heap. That might be not the case in the future. I don't know, I just checked
out the Clang implementation because GCC doesn't support coroutines. I'm sorry, I didn't
bother with Visual Studio but I wouldn't expect it to be faster than Clang implementation. Even Gary said that as we
see that most experimental in Clank is doing it right. So, it wasn't allowed it, unfortunate. We have these implicit states which is good and bad, it depends how you look at it. Depending on where we are in the function, we know whether we end
the given state or not, that's good or bad. I don't know, it depends. And the co_await retains a type so it has to be a common
type, sort of, right? Because we cannot return different types unless we use variant something like that so that's not ideal either. And I have to admit that these
for loops are quite weird. Okay, so let's just do the
goto just for the fun of it. Because we can so it's
pretty much the same example instead of having the for loops, we need the first for loop
because the state machine has to go forever. But besides that, we
don't actually need them. Instead, we can have labels if you really want. Disconnected label means
disconnected state. Connecting label is a connecting state. Gonna connect, yeah. And basically, after that we
go to the connected state, everything is just described by the label. And after that, we can do goto, which seamlessly quite even more need when, you
know, that's the way we exit the loops if you
have really nested ones and we don't wanna do the
checking of the Booleans. Well, that's a good solution, I don't know but it's definitely valuable
solutions with coroutines. It seems like a function. Changing the position in the function might be used with the goto. Please don't quote me on that that I'm not saying they
goto is the solution here. I'm just saying there's an option. You can explore it yourself and yeah. So if it comes to performance, still the same amount of code basically. We still have allocations
didn't help with that. We didn't expect the goto to
help with allocations though. I hope. So what about the summary? We don't have these
infinite loops anymore. Pretty cool, we didn't like them. We have explicit states which we can treat as a positive or negative depends whether we liked it before or not. And we have the goto and they story about the goto, it's you never know where you end up. Maybe goto it's not the best idea. One more different variant we can use. We can use coroutines with
functions and variant. So if you think about the disconnected state as a function, we can actually come back
to our for infinite loops, sorry about that. And just have the state being
represented as a function, separate functions and use the co_return to go to the different state which is quite handy because
we can separate the functions. It's pretty much the
same for any other event, state transition we want. So you have a connected state which uses the for loop for
the same reasons as before and we just, There's question? Right, so the question is where is the in? So, those guys are part
of the connected class which has an in. Sorry about it. So yeah, we actually in
the a bit higher scope. We could possibly pass the in into the, our functions but yeah, it's basically we have a connection class which has an in as a variable and we can use those guys like that. So yeah, that's very neat. Performance, didn't help,
actually made it worse because the variant, we use the variant to
have the different types of the return of co_await
because we couldn't deal with their common
type so we wanted to have something more flexible but it didn't help. So the functions, we can
use them quite easily. We can add a new state behavior. We can we have types of events and it's fancy. It's a bit like Apple new headphones. They can easily get lost and you can easily get lost
with coroutines, I guess. So what about the Boost? So, we talked about libraries. STL is great, Boost is
maybe even better sometimes. Sometimes, it's not. So let's compare a few other solutions. We'll take a look into Statechart, MSM which are available in Boost. You can download them, use
them right now and SML. SML is the library I'm the author of and we'll just take a look. It's not a Boost library. It just called Boost to confuse everyone. No, the reason is it is
aimed to be a Boost library. Okay, sorry about that. So, let's take a look into Statechart. Has anyone used Statechart? Not a lot of you guys. So let's take a look how
we can actually implement a state machine in Statechart. So we have to change our events to inherit from the Statechart event which is a CRTP thing. Well, not the best start. We have to do some more boilerplate code but you can deal with that. Actions, unfortunately Statechart is more like the state pattern in the object-oriented kind of design so we have to inherit
from the state machine. The first argument parameter
here is the CRTP thing and the other one is the initial state so you have to have a bit of
knowledge how to use that guy and all the actions have to
be put into their connection implementations just in order to to be usable by the Statechart. So as in the state
pattern, we implement that using the per state basis. So we have a simple state
which we have to inherit from, we passed the type you have, the connection we use
and we have the reactions and the reactions are basically
the list of transitions. So you have to be able to read it. So is a transition from the disconnected state on a connect event to the connecting state? Connection is the class and
we have the establish event so it's basically like before. We've see then other cases,
it's just more less robust, less easy to follow, I'd say. We can also have a custom reaction. So on the bottom on the connected, we have the ping which
is a custom reaction which allows us to do the
guard in and call in the, it allows us to do anything
we want with the Statechart and the transitions. And we have to discard this
event because we don't transit, otherwise, we will just
call the transit on it. Performance. 60 lines of code and 2,000, almost 3,000 line of generated code. Well, not great. So to sum up, it has a
lot of features though UML related which we don't
even cover here so that's good. Is there a learning curve? That's not great. It has dynamic allocations,
dispatch, high memory footprint. So they introduced MSM into Boost which is much faster. However, it's really based on macros. There's a different
prominence for MSM which is, I'm using this one because I find it the best to be more expressive and that's what I really care about, the expressiveness and the performance. So you have the event
which you have to define with using macro. There's another macro to define the state. There's another macro to define actions. We have to pass all these
parameter around for guards. Guard is an action, just
returns Boolean variable. It's quite a lot of
boilerplate but we do that, all of that boilerplate
is just for this guy. And this guy is awesome. It's a transition table in
which we can actually read and reason about the flow of our program. So if you compare all the solutions, this one, in my opinion,
is so much better. You can easily read it. You read it the way that you say I'm in disonnected state when I get a connect event, I call establish and I'm
going to connect him. That's really powerful to
reason about and look at. So all those macros are worth it just in order to have
that guy, in my opinion, because it gives us really easy way to reason
about the application flow. Obviously, we have more macros and one another macro thing. What about the performance of that guys? Well, it's not a lot of code, 80 line. And the performance, you can
see that there's quite a bit of assembly generated but it's not a bad assembly. It's not always about the
size if it comes to assembly. You can see we have a jump
table generated compile time so all the transitions are
just jumps which is not but if it comes to performance and we'll take a look into that as well. So just keep in mind that having a longer assembly is not always bad unless it's inline when we
want the inline version. The summary, so it's
declarative and expressive. As you notice, I hope,
in the state machine transitions table, I love it. I think that's the way to go. There's dispatch which is O(1). Jump table during that compile time. It has UML features. Small memory footprint. There's a learning curve. It's DSL based. Not everyone likes DSLs. Is macro-based. That's terrible, we don't want that. It compiles very slowly so
if you run it on Godbolt, you'll get a timeout. So in order to get the results of Godbolt to run it locally but you can still click it. I don't know, maybe Matt
will extend the time, I don't know. It's very slow to compile. And the error messages, well, you don't wanna look at them. it's NPO 98. So yeah. So, that's the reason. That was the main reason why
I actually implemented the a slightly different version of MSM using modern C++ features. At first, I started with extending MSM but that was just quite
painful because it's C++98. And in SML, we can do that. I think that's even more
powerful than the MSM because, well, we don't have macros. Yes, we don't have
gotos, that's eve better. Everything is easy to follow. It's C++14. We have the transition table, really easy to read and follow. Everyone agrees? Do you find that easy to follow? Easy to follow than if else nested switch? Yeah, I see the thumbs up, I like it. So yeah, that's great. But, you know, we don't want it just for the sake of being able to write expressive way of code in a C++ because C++ is all about don't pay for what you don't use, right? We care about the performance. We want to have the best
performance possible. We want to be able to pick how
we want to optimize that guy. But the interface is pretty neat so let's think of the interface and let's give us the capability to switch how to this, how to change the strategy
of how we actually dispatch events. So SML has a different
policies at compile time. So we don't play at
runtime for anything in SML but at compile time, we
can choose which version of the dispatching strategy we want. We can have a jump table, that will be generated
compile time jump table like with MSM. We can have a nested switch. We'll take a look into those guys. If else, fold expressions. We don't have coroutines yet so we can use coroutines at the back end. Well, that's as an option
as well if you really want. So I think that's very actually neat idea to be a to
switch the backend policy and have really nice common interface because if it comes to performance, you always have to measure, right? So it's not always that the
jump table will be the best and from my experience,
it's not always the best. So let's take a look into
the dispatching policies. If, else. So assuming that we have
some mappings and states and we have just a dispatch call which takes the current
state and the event, so that'll be the backend interface for dispatching the policy. What we can do with that? Well, we can actually call it recursively for as long as we find whether
we are in the current state. If we are in the current state, we will call the execute
which is just the transition. If we not, we'll call ourselves again with next iterations and the next state so that we'll be able to call it at the end possibly. If we run out of states, well, we don't call any transition and probably it's just unexpected. And that would be very well optimized in, This is in Clang. It gives us exactly the same
code in the MSM as before so that's fantastic. Switch, we have the same case and we do yet another trick of the
switch, nested switch. So in default, we call a dispatch again which will call with the next ID which we'll call the switch again so we have this nested switch
which is generated that compiled time for us
and if you have the case in which we care about,
we call the execute. So is that optimized? Well, yes it is. So, it's optimizing Clang and GCC and it gives us the same
result as the switch did. Notice that on the left side, I highlighted how to switch the policy. By default we don't
have to pass it through and the policy will be the best one we use for the compiler in a way that compiler could
actually optimize that guy. We go to jump table. Jump table is not, not straightforward but
it's not difficult either. We have the states, we
have the dispatch table kind of type for their function pointers and we just generate an
RAII for all the events and after that, we just jump
according to the current state with the event. So, that will give us more code, less than MSM actually but it won't be fully optimized neither in Clang nor in GCC but from the performance perspective, sometimes, it's much better
than the inline versions. It depends. We have to measure those guys. And yet another solution
is the fault expressions which is like addition for C++17 and what we can do here, we can just fall through all the available events
with the available IDs and we just execute when in matches. It's pretty much the same
as recursive version, just not recursive. So that's pretty handy and full expressions are actually very well optimized as well. So, it gives us the same output
as if else or a switch enum. So, yeah, that's handy. Nice to have. So the summary. The main benefit of
using libraries like SML is to be being able to
build an expressive. We have this UML transitions been visible quite well for us, the declarative flow. Being declarative is
always better than not. We can customize it at compile time. Awesome, that what all
C++ is about, right? Customizes compile time. It's inlined. This part is the one, fantastic. Has fast compilation times. You'll see that in the results
in the benchmarks in a sec. It has additional features
which we don't talk about but it's obviously libraries so it's much more powerful
than just the transition. We can't compare that to the variant which is just a way of
using for the transitions but doesn't have anything else. There's a minimal memory footprint. So the size of connection
is just one byte. There's a learning curve. Obviously, it's a library and it's DSL-based
which is a good and bad. Depends on how you look at it. So, if we sum up the solutions, we can quickly get through them as depending on how the
state is represented, how the transition table is represented and how the transition is represented. So, state might be represented
by Boolean variable in if else, enum, maybe
a class, maybe an union, maybe a function or maybe
even a type like in SML. And transition table might be per state which means that object-oriented
design using per state or you can have it global
where you see it all of them. And the transition is
implicit or kind of explicit. Implicit means that you have
to look through it somewhere in the code to find where you
actually do the change state and explicit which is
usually better than implicit means that we can actually take a look into one simple source code, one file, a few lines of code and we see everything is happening. There's no surprises afterwards. Okay, so, sorry. Let's take a look into benchmarks then. How does it compare? Yeah, I was talking about
for almost 50 minutes about those solutions, we compared the Godbolt
results but we have to measure to be sure which one is the best if it comes to the measurements
we'll going to take. So in order to do that,
we change our actions and guards to have the, to verify, to access a memory so that it won't be totally inlined and the idea is that we
randomized the events and we run the process event on those guys with GCC and Clang. The result would be for Clang
but with GCC, it's the same. Basically, besides the variant and we have to use Clang because we have codes in Clang, we don't have them anywhere else for now. So lines of code. The less, the better. SML is really good here
because it's like optimized for being expressive. Design pattern, state
pattern is not the best one. We have to write a lot of code as usual with the object oriented
design, it's a boilerplate. MSM, although, it's really expressive, it has a lot of boilerplate code to write with those markers. But besides that, it's quite even. Assembly lines. So, that's what's we
compared before in a Godbolt. We see that there's like naive switch and if else and SML since it's using switch or if underneath in those cases, a very
well optimized an inlined, Statechart. If you care about
performance, don't use a guy. It's not the best one. Coroutines and variant. Yeah, they're not terrible. Not ideal either. It depends obviously on your case. If you care about the performance, probably, you want to use them but if you're not that
bothered, then you can. So that's the main kind of benchmark which shows the run-time performance. As I said have been longer assembly doesn't mean that it
will be faster or slower. So you can see MSM had
really long assembly lines because it generated
jump table compile time but actually it's quite fast. Variant is quite fast as well on Clang, not necessarily in GCC. Statechart is slow but we
didn't expect it to be fast since it generated so much code. SML, switch, if else,
they're very well optimized because they are really easy to follow for the compiler to optimize. State pattern, well, if
you care about performance, you don't really want to
use virtual dispatch, right? Although, it might be
devirtualized sometimes, most likely, it won't,
you will have the heap and shared pointers and all
that badness has to deal with. So, you don't wanna do that if you care about the performance. Coroutines, coroutines are quite decent. Probably, they might be better and it's just first stages
of them being introduced so maybe at some point, we
can actually improve them. Instructions per cycle. Well, the more instructions
we can handle per cycle, the better. It just shows how good we
actually utilize the CPU. If you have less
instructions, we may not use, we might have not as good results here but it shows that all the solutions are
more or less the same. Coroutines, a bit smaller
than the other ones but having a value around
two is really good to have or keep in mind. Branches. So, it's interesting how many
branches actually MSM has because of the jump table. Besides that, inline versions
have more branches as well in comparison to all the others but it doesn't mean that will be slower because it's all about
how many misses you got. And as you see with the state pattern
and we throw dispatch, well, we don't really hit it very well. Compilers are not really, CPUs are not really good in
predicting where we are going. In other cases, it's much better. Statechart, since it's using
the virtual dispatch as well, it does have quite a few misses but it still not as bad as the state pattern. Compilation time, that's what
I was referring to earlier. MSM compiles quite slowly in comparison to any other solution. So that's not good. If you really have to be productive, you can be productive with MSM by having declarative way of using it but you pay for the
compilation times a lot. Really, a lot. SML compiles quite fast. And the release, it's pretty much the same as MSM is very slow to compile. And the size, because we
care about the size as well. Size is really important for us to, if you care about performance, you probably care about the size. We have some restrictions on that. MSM produces quite a huge
binaries in the debug mode because of the debug symbols. As I told you before, it's all MPL based so we produce a lot of code in comparison to other solutions. But in executable, in the release mode, they're really small in comparison in MSM, in comparison to the debug mode. So that's really good which means that everything
is just thrown away. So, yeah, we have like 10 seconds left so the mission is to embrace zero-cost state machine libraries and with that, we have actually
zero seconds for questions so thank you. (attendees applauding)
That library looks promising, but I am a little worried about the maturity and support provided. Looking at the issue list it seems that the author more or less ignores any issue raised.