Kevlin Henney - Functional C++

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] okay right let's talk functional C++ I'm sticking with the oldest new theme for this conference and there has been a kind of a reawakening interest in native compiled languages in recent years both the old ones as a continued interest in see it forms the underpinning of most open-source lower level detailed operating system level stuff but also there's a lot of code out there that already exists that is like this but again people also find the need to squeeze that a little bit more out of their platforms so sometimes when they want their control they feel like taking control in their sense directly at the platform level unfortunately most people actually haven't got a clue how to take control so they're probably safer in higher-level languages but when it comes to C++ there is this interesting question of if you're going to work with C++ then are there benefits to using a functional style and I hope to put forward the idea that yes there are and there are some very specific things that we can do to simplify things but also improve performance as well as well as improve as it were I guess there's two aspects of performance developer performance the ability to reason through a piece of code but also runtime performance so things that are relevant probably possibly yes I'll refer to this a couple of times the 97 things bug but let's just start off with a quote from Michael Feathers who was on in the session before quite a useful way of looking at a distinction between something like object-oriented programming and functional programming the if you like most paradigms have the idea of trying to organize structure in gramming and there are paradigms that don't organize structure we don't really worry about those of us called you know legacy code that is a paradigm in its own right but the idea of imposing structure on a system and there are different philosophies of how to impose structure and modularity is a very common feature of all of these but the nature of the modules their philosophy their intent is different in each case so what I quite like about this one is it says right io makes code understandable by encapsulating moving moving parts the idea the assumption there is that we're going to change state although I'm going to say actually that's a different story there is it's not a guaranteed assumption that we're going to change state the let us assume that we look at it from this modular point of view containment I want to contain something so hides the complexity of change I acknowledge there will be change but I don't want to have that spreading through the code then functional programming takes a different philosophy it says actually its primary philosophy is to reduce rather than contain the movement is to reduce the movement and these two often give very different styles now obviously we can sort of see that in a language that is supposedly multi-paradigm why C++ we should be able to step from one space into another or all the way in perhaps but to sort of take a little bit of comfort in a different space and this is quite nice a sort of testimony from the Facebook moments team working in C++ but then saying okay to keep our C++ API boundary simple we adopted one-way dataflow the whole idea of one-way dataflow means no re-entrance II and this simplifies simplifies reasoning but it can also simplify performance the minute you have re-entrance T you have to start worrying about well is there if I've got concurrency and re-entrance e do I get the possibility for deadlock and so on the idea here is very much a low lock approach let's just have the data flow if we need to change something we don't actually change it directly we tell somebody downstream here is the value that you want a much more isolated approach the API consists of methods to perform fire-and-forget Mutai and methods to compute view models required by specific views the benefit of a fire forget approach conventionally if you implement something like the observer pattern and any typical MVC approach is that you end up with a call bank on an event to change some state and then somebody else wants to know about that states that they go to the same place there is this idea of fire forget mutations is like well you just say something happened you never look at the thing again okay from this point of view this is actually a kind of a neat trick in functional programming both pure and impure the idea of will pretend there's no state change if I can't see it it doesn't happen okay this is the kind of like the child's view of the world yeah if I do this and I'm not influenced by any effect of it then that's as good as it not having an effect at all yeah so you can actually see there's a modularity idea here of narrowing the view on the other hand if I can call something and then my next call is to call it and say did you know did you get that steak change let me check your state then clearly that doesn't follow this fire-and-forget approach so the idea here is very much narrower interfaces when I notify of an event I have no way of going back to that object to do anything else curiously this is almost like the idea we used to joke about of a right only interface yeah I can't read it I can only sort of say something happened and there's the basis of reactive programming in many ways so we have that view to keep the code understand when we write functional style code converting raw data objects into a mutable view models by default and importantly here the immutability provides consistency easy to reason about as we identify performance bottlenecks through profiling we added caches to avoid recomputing unchanged intermediate results now this is a really important idea and one that it's a kind of a letting go that many C++ programs have a problem with because there's the idea of like well that'll never work there will be in efficiencies and now I need to put through all the code acknowledgement to these inefficiencies and clever workarounds instead of doing that you say after more programming is the next best thing to magic what do is as we know in the real world there's no such thing as magic magic is all about illusion you support the illusion of this being immutable but it's just a view on something and what you do is you say well that view that illusion is actually quite slow to maintain we'll tell you what let's add another illusion that's out of cash let's do something else that makes the programming model relatively simple from here and then we compromise on the other side by putting the complexity there so there's a little balancing act that may we may want to undertake um so the resulting functional code is easy to maintain without sacrificing performance so this is kind of like having the best of both idea so just a quick poll so I know because I've got quite a bit of code and quite a few slides so I get a feeling for where you're coming in from how many people have at any point use C++ let's start off with that okay good how many people are currently using C++ okay how many people have any kind of familiarity with C++ 11 or 14 okay right okay so if you haven't touched C++ for a while yeah it's gonna be interesting I'll be I'll be gentle though okay so let's start off with um John Carmack who is sort of famed from the ID studio doom Wolfenstein all that kind of stuff and Hiro really nice little blog a couple of years back on functional programming in C++ and he made this observation which I think is a generally useful one no matter what language you work in programming in a functional style provides benefits you can't necessarily embrace the whole style you're not going to go full Haskell and C++ but there's the idea of if you can take some of it you may find that you may find a number of benefits you should do it whenever it is convenient you should think hard about the decision when it is inconvenient I like this kind of resistive idea of actually if you're not going to do it think really hard why not what is interesting and sometimes people assume well maybe Java is a better language for doing this in actually it isn't necessarily there are different strengths and weaknesses in both so if you're familiar with Java or familiar with c-sharp you will find that I actually the sweet spots for functional program you're going to end up in different places because these are not the same kinds of languages okay so I'm gonna start off with a very simple idea of just sort of how I'll take you from take you from kind of 1990s through to the 21st century in terms of sequence bus let's just start off with the sort of the gentle stuff basic functional abstraction it's getting cold you have a heating system you turn turn on the heating and turn it off so this is an example I used here last year and I was talking about functional programming you already know so yeah you can turn it on and turn it off now what would be really useful is if you could turn it on and off according to a timer so it comes on before you have to get up in the morning okay and then it goes off sometime sometime later now there's this interesting kind of question of how we're going to implement this and we may immediately go well you know I've got I've got a way of doing this we're going to focus on using the command pattern and it's kind of classic form and we're going to pass C in a command and we're going to run the timer we can see it so this is when we want it to go off and this is what we want to have happen and so then promptly we create a command so we have a an interface while class that is pure virtual function and the dots probably just just a virtual destructor and that's it so we've got to set up we have a ceremony of setting up a class and then we have the joy of implementing it this is how you do turn-on if you look carefully this is pretty much all housekeeping the actual activity is here so far what we've had to do is construct declare data I'll declare inheritance can't with a name override a virtual function and all for that line and that's just turning on and that's turning off we run out of screen okay let's resize so here you have this kind of interesting thing that this is boiler classic boilerplate code a lot of people get into this and this is object orientation object orientation involves me writing lots of stuff like this I've got bad news for you now it doesn't but this is very familiar and if you're being paid by the line of code this is wonderful okay now some people have got that kind of a kind of unreconstructed see background in them I think I now know how but a better way of doing this what I'm going to do is I'm going to use function pointers and void pointers void pointers basically it's a way of saying to some but it's a way of talking to the type system and saying thanks for all your help but trust me on this one I'll take it from here okay but there's also an interesting thing here and this is a kind of an interesting normalization issue if you look you've now got an argument to a function pointer that takes whatever and or whatever that you want to pass in which will be a pointer to the heating system those two are correlated those two are not coincidental you can't just pass any function in and any data they have to there's a hidden dependency that tight dependency hasn't got away it's just not checked and you better be right about it when through things are bound together in such a dependent way we have a really interesting name for it it's called an object so it wants to be an object so the command was in the right place just the wrong syntax that said let's have a quick look at how we'd set these things up I mean yeah there's a bit of casting notation and I'm always important to use the standard keyword cast notation instead of the parentheses the C style casting notation there are two reasons for this one a cast is an ugly thing it should look ugly okay don't make ugly things look beautiful in your code okay so the point there is I want you know it should not don't just I just a cast don't worry about it it should look ugly I dislike the fact that a number of languages have followed C's idea of discreet parentheses it should be anything but discreet it should shout in your face the other thing is it's easier to grab for if you try and grep matching parentheses in a code base you get most of the code base back okay so it's but it is light it's a lot lighter there's a lot less Sammy in fact you can tell I'm using a larger font than the classic command approach on the other hand we can use this this emerged through the boost libraries was effectively part standardized in the technical report C++ known as technical for Tier one and is now part of C++ the typical standard has been for five years this is a generic general-purpose function wrapper you can pass some function pointers you can pass in lamdaur objects you can pass in function objects anything that supports on a function call operator so the idea is we've now got this perfect decoupling okay so I sprinkle abstraction actually setting the thing up is really easy we can basically say I want to bind so this this is partial application in C++ I would like to bind the heating system turn on the function pointer to that member function to the heating system okay so basically it is a deferred call so the binding creates a function object and that's it we're done we don't have to define anything else we've already had everything there or we can go ahead and use lambdas which is actually even shorter and there we go we just passed in a lambda here's a piece of code I want you to do this now a lot of people think oh great I'm doing functional programming I've got some really bad news for you lambdas are really old they're also the first form of object-oriented programming that we encounter very important paper William cooks on understanding data abstraction we visited he cites lambda calculus is 1941 it's actually 1932 so well done C++ 11 for getting in on the act seventy-nine years after lambdas were invented okay programming like it's the 1930s use the idea of being able to pass a piece of code around as an object is not uniquely functional indeed a number of languages have had it notably small talk eighty and even in fact things like algal hellhole 60 one of C's precursors allowed you to pass a block of code okay so this idea is not is anything but new and is anything but unique but clearly functional programming takes it further that there is the joy of C++ and syntax that is actually a legal piece of C++ code and is a lambda that the the capture Clause captures context it captures nothing takes no arguments and does nothing and that's how you execute it so you know you can either service a certain joy to the C yeah this is legal syntax it's not just me wandering across the keyboard looking for matching current parentheses okay so we need to do something a little more serious here so the key thing here is yeah lambdas sort of formed some part of it the idea of being able to pass around higher-order functions is an important element and that unifies a number of disparate aspects that already present in C++ brings them under one hood we can do partial application this is great stuff but really when people are talking functional programming it's normally about referential transparency and pure functions and you get effective you end up with one from the other referential transparency is simply the idea that if you are calling a function with a particular set of arguments you get basically the same values back every time you get the same results more importantly that values continue to have the same value that they always have this is sort of an enduring effect so you know this is a big Assam puts this under apply functional programming principles and this whole point about mutable state you want to reduce it you want to eliminate it how to express this kind of concept in C++ cost okay so here's a so here's one of the first points is that some people have been lazy about comps but the interesting thing is I was at a very fortunate position when I learnt C++ in the last century that I had already encountered functional programming so the idea of being able to explicitly mark something as immutable or please don't change it too much and of course this is C++ if you really want to change things of course you can you know we know that you know any language that offers you memset in the library is ultimately going to say you know if you're really trying you can go for it but you have to try the idea here is that by default on the compiler on my side that said if we had our time again it would be really nice if we were able to have the opposite semantics I really think that the idea that in any programming language the idea that your variables are modifiable by default is a bad move I think you should ask for it yeah by default there should not be modify land you request it specifically but that's it we can also use const to qualify member functions on things and really say no this is going to this is going to be unchanging or at least support the illusion of not changing we also passed around things by reference to avoid any copying costs and then we also have our value references to optimize the cost of copying that's a C++ 11 feature and allow and importantly allows you to go around saying you know what I'm going to return this huge great vector as a temporary value and not pay a copying cost this is quite important it allows you a much more streamlined approach because I see a lot of people sort of when they're writing certain codes or I'm not going to return that as a value because it just makes the code it could be really inefficient well in many cases it wasn't any way but this absolutely makes sure of it so the obvious aspects of supporting the illusion that the value you've just created has been moved around and is unchangeable is quite important so let's say let's pick on a little example that sometimes use that I sometimes I use it for abstraction reasons but I want to sort of just get a feel for what this would look like differently if we have a date class that abstract yeah the standard day in the year and I've defaulted the copy constructor and the copy assignment operator it'll do the right thing but I want to make sure that they're visible so we know what's being what we're relying on and there's a bunch of other stuff I'm going to worry about there and then the ones that people normally expect ya get month get day month set year set months it's a month and this is their one of those things where people go into automatic ok but let's be very clear just because you have a getter does not mean you have to have a setter some people think you do I've actually seen a coding guideline that says for every data member that you have you must have a corresponding getter and a setter it's just like oh what happened to my encapsulation you know it's daddy's buried beneath the bits but this is also a habit that people have even when they're not doing that and we see it across many languages the idea of I've got to get I got a setter and there's a beautiful rhyming comparison there get say they're both three letters long they rhyme in English they're kind of natural opposites except they're not the opposite of set is unset or reset it is not yet okay so the first thing is do you really need do you really want to just change the day in the month that doesn't make any sense just the day in the month is that really what you want to do or do you want to just change the month or it's unlikely so the idea let's start off with the idea that you want to change everything if I'm going to reassign a variable I want to change everything and I'm going to provide a I'm gonna provide a member function for that so it sets everything all at once and so therefore to set it to today's date we'll do that now what's this function going to have to do it's going to have to validate that that is a legal date which funnily enough is what the constructor already does that has to validate it's already a legal date so we've got duplicate code well maybe we can factor it out actually we don't need this one at all it works just fine without it if we take the view that we absolutely minimize even eliminate all of the modifier functions in the public interface that leaves only one that we've defaulted assignment if we take the view that assignment is the same as rebinding then we end up with this approach the constructor does the validation if it succeeds then we get a valid date object if it does not succeed we get an exception in other words it's already done the validation all the validation is in one place and then it assigns across it assigns across a valid object there's no need for any other validation if you're ever concerned about any performance this is probably an inline inline code so therefore we've already done it we already had that capability available but not many people were aware of it C++ 11 allows me to mess about with the syntax of it which is quite nice and there's that whole idea of like well we have this is already a solved problem but one more thing and this is a not unique and distinct to C++ I actually find it in other cases people have this terrible naming habit get I actually found an article I wrote in 1995 complaining about the fact that people use get it's not you phenomenon but it is just as irritating as it was back in the mid 90s you really don't need to get stuff this is not object-oriented assembler importantly if you're trying to think functionally then don't use imperative names get is imperative I find most honest that people so I'm doing it functional program why are you getting things there is no get it just is or is not okay in English by default get does not mean I'm asking you a question it means I'm changing something okay the default meaning of get is have huge great side effect it does not mean asked question politely and make it Const that's not the meaning in English okay yeah if I get that and drop it on the floor side effect okay if I get money from a cash machine huge great side effect than disappointingly on my bank account if you get married big side effect in your life your fundamental state change okay these are not merely questions they are not constants they are side effect inducing this idea that get is a query is a sort of an assembler like anomaly that we've let creep into our naming conventions we don't need it just so what's interesting there is I've made this both simultaneously more object-oriented and more functional which is pretty good just by eliminating three letters okay now sometimes people say well you know that thing you said where you said I wouldn't want to change just the year or something like that well maybe I do okay well how do we normally do this how do we do this with fundamental types well we do this by moderate owning a modified value if I take five and I add two I do not change five I return seven so therefore what we're going to do is we're going to say this is like a projection I'm going to take the current date current date but with the Year changed then I will return a new date and we can write it like that or we can just say actually I don't need to re if it ends up being the same year then it's the identity operation so I'll skip the construction step and I'll do that the point here is that this kind of like builder style notation is both better object orientation and fits with the functional paradigm so I'm arguing for the fact that sometime people are have ended up and ended up in a very strange object-oriented territory which is why it seems such a great distance to do functional programming I'd argue this in fact I used to argue that this kind of approach was always a good idea from a narrow point of view and obviously we can extended right now this is fine this is all good but there is a big question here it's not so much question about books it's a question about containers because I mean little things like dates and other small value types it's fine but what about containers if I've got a million elements of something can I just copy that around and we might be a little reluctant to just say well yeah this whole vector with the fifth one changed huge great copying and memory and allocation cost okay so let's address that first of all let us imagine a hypothetical library the FTL rather than the STL as opposed to faster-than-light it's the functional template library what would that look like okay well it would look fairly similar fairly fairly familiar in similar we'd have iterators we'd have the ability to go from the beginning of something to the end of something but actually I'm going to say the only modifier we're going to have is the assignment the rebinding that's it everything else is pure query so how is that going to work well first of all let us have a go some of the types that are more obviously unchangeable let us imagine that we have an immutable set and if the set is immutable that basically means we can initialize it set up the set with all of these values during the constructor is the only time that anything is going to change because officially the object does not yet exist until the end of the constructor so at that point you're going to do something you're going to create the data representation because it's an immutable set there's a number of nice properties here you can sort it and you don't need a ridiculous tree structure you can have a structure that has high locality you don't need to worry about hashing you can have a structure that has high locality because you're never going to change it so therefore you can use an array this is nice this works for a lot of cases where you just find the passing around small sets or you have a large set or map or whatever of reference data we're not going to change this we're always going to look this up or we change it so infrequently that memory rebinding costs are not an issue so that's a solved problem so in other words you identify data structures based on your usage the idea is that normally people don't think of their collections this way they normally think I've got a collection and I happen not to change it I'm going to say use a different kind of collection not changing is a fundamental property you can take advantage of we can also do it for an array just again a reference based thing it's fixed size at construction time and we're able to do that and it's got all the usual operators and everything is clearly constic wala fide which is one of the reasons I wish that everything were constant I D fault it saves a bit of noise in the code but there's that idea very very simple idea right so I've sort of sort of solved the problem because if you if you think carefully if we're all sharing the representation of something then if I copy it around I don't actually need to copy it do I we can just share the same underlying representation because there's nothing you can do to your copy that makes my copy any different in other words they share the same and that's brilliant but it doesn't deal with change for that we need to do with persistent data structures and it's not persistent as in database and called persistent data structures because their previous state persists to whoever's looking at them so I'll give you a very simple example here let's actually go and look at something like a vector what would that look like the vector here has same operations as the array but it's also got pop front pop back these can be done in constant time and it's a very simple illusion and what we're going to do and I'll come to that in a moment but if you look at those functions they return a new vector you don't pop front and actually change the state of the vector you looking at you get a new vector back but one point about that I don't want to confuse the naming with the standard the STL ones which clearly have a different effect you know was they have an effect but there's another thing it goes back to this idea of not using imperative naming I need a more sort of noun phrase or adjective or phrase pop front what does this look like when the front is popped yeah show me the popped front version of this vector okay so how does that work we've got we've got this underlying representation a shared shared array and then we've got this range described here what we're going to do is just sneak in like this so let's imagine that's the representation in memory I've got one vector pointing at it I've got another vector pointing at it that's a cheap assignment it's just a pointer assignment there's no actual copying involved so already we're ahead of the standard okay this is already cheaper and then we're going to say you know what I'd like the pot front version of that book that's it no copying involved again you just move one down same for popped back so in other words as long as these are the only operations and the STL classically is defined in terms of the operations that you want this is really cheap pushing hmm not so much that how are we going to do pushing pushing different view need a different structure let's use a link structure singly-linked it's fine a B again same same issue same same simplicity and same cost in terms of copying the popped version is easy and the pushed one is also easy and this is the clever bit about persistent data structures the clever bit of a persistent data structures is not just that you can exclude stuff and it cost nothing is that you can add stuff and you might say well that that you've changed memory yes I have that functional programming always changes memory don't get the illusion that something is for free that the physics will tell you it isn't always work is done but this is the neat bit is you can't prove that piece did not already exist in memory there's nothing you can do in the code to say well you know he actually just went out and found a pre allocated one or I actually allocated one just on demand so this is the basis of this now this might look like a relatively familiar data structure if you're nerdy enough to know anything about Lisp which I happen to have a kind of an interest a long-standing interest in I'm no Lisp programmer but I did observe a number of years ago I have a deep fondness for the list model and simple elegant and something with which all developers should have an infatuation at least once in their programming life you if you haven't yet had that kind of oh my god Lisp is the answer to everything moment I urge you to go out and find that moment because you do there's a sort of an elegance to it just like yeah this is really nice and then you're back in the real world but it's it's kind of this that it gets your thinking go and I even developed a a thing called a list which I was very proud of as a clever name as a singly linked list like list that was immutable however let's just call it this let's keep it simple so here I've got a list now it's got pop front it's got pushed front so it's exactly like I showed we've got an internal link or rest of it so this is all going incredibly well however if you've watched carefully you may notice there is a small memory management issue there is always a memory management issue there has been a memory management issue in programming since the 17th century Shakespeare was in fact a programmer he cleverly masked his discussion of code because he didn't really have the language to talk about the modern jargon that we have he cleverly mastered inside place turns out that Hamlet is actually a tragedy of memory management ok and and we have these different philosophies so what we can see is the Hamlet has a particular view yeh from the table of my memory I'll wipe away all trivial fond records Hamlet's into garbage collection ok this is this is and this is for him a tragedy because we might say well yeah surely you know C++ doesn't support it well actually it does C++ 11 accommodates it it does actually have explicit accommodation for varying degrees of garbage collection but there is only one small problem it's optional if it's optional it means it's not under the control of the programmer it's under the control of the compiler vendor now I don't mind having a garbage collector where somebody says you know what I'm going to disable it and that's under program of control the problem is I don't even get that choice if you're lucky maybe you get it if you're not well you're unlucky and you have a lot of memory leaks you cannot write a program based on maybe you'll get lucky and manage memory and maybe you won't you know that it's a fundamental architectural decision and unfortunately this hedging solution is not a good one now we can find out from the iron man you know this is the hard this is hardcore this is hardcore allocation or sorry collection standpoint safety if it's strict then that will do garbage collection if it's preferred that will actually do actually enough if all your objects are immutable and don't contain resources there'll be no destruction but the problem is if it's relaxed which is the default in C++ and in other words what you're going to actually get then you have a problem in others you cannot write a piece of portable code to rely on the presence of garbage collection only on the its absence which means we have to turn to a philia who is very much the kind of classic old-school C++ programmer she's got the hardcore attitude she's definitely tis in my memory locked in you yourself shall keep the key of it so you better delete it right or else okay it's your problem not mine it's not the runtimes now in C++ there is accommodation for this we can sort of shift we can shift fro from using a roll pointer to a shared pointer we have to keep in mind that by default SharePoint err assumes it's a single object rather than an array so we can deal with that and that's fine but there is a reality here that's observed by Robert Murray a long-forgotten book but still rather a gem C++ strategies and tactics there is this idea that the code complexity increases certainly if you have to write your own use counter class standard has a shared pointer which is great but he makes this other point you know there's processing time involved there's going to be if you're dealing in a in a threaded environment then that increment and decrement is also going to take it pay a cost it turns out that it a for functional programming and multi-threading you really do need a garbage collector it is actually more efficient and it's safer but the problem is we don't have the full option so let us just say that we try and replace all of our pointers in our linked structure with shared pointers we solved the problem and introduced another one okay so let's just take a list of anything it doesn't really matter I'm going to fill up that list fill in the question is how big how big can we make em I mean how big how big is a useful list what obviously depends on your domain a couple hundred items you're good few thousand about a million mm right what actually happens here the curly brackets are quite important the curly brackets show you the scope of the object when chain goes out of scope right here it's destructor is called its destructor will trigger the destructor of the shared pointer the shared pointer will decrement the count let us assume there is no sharing going on the representation sharing is a very useful technique again it reduces copying but let's just assume we have a single chain and everybody all the elements have a reference count of one so shared pointer its destructor goes one goes down to zero so that means the next objects destructor goes and it points to the next object one goes down to zero so it calls the next you end up with a recursive call to the destructor that blows up the stack for only a couple of thousand for a list only a couple of thousand long this is bad news this is why I said if you're going to do functional programming really you're it's not that you can't solve this problem in C++ and there are lots of C++ programmers are all too happy to sort of say oh I've got a technique for this I'm sure you have that's not the challenge the challenge is can you do it without having to have a technique and make the code more complex that unfortunately is a bit of a an annoyance here is that it stands in our way we have to account for it explicitly and work around it and we can accidentally walk into it so persistent data structures are the answer but you may find you have to do a little more work in them than in other cases but one of the reasons people are also interested in functional programming in C++ is particularly the presence of threading and the management of threading and so john Cormack again makes this observation and this is just a general observation and gain on its own is enough to justify a more disciplined model of state a large fashion the flaws in software development are due to program is not fully understanding all the possible states that code may execute it so therefore if you have fewest ates and less state change is a trivial observation if most bugs come from state changes then if you have fewer state changes you will therefore have fewer bugs that's actually the past the premise of functional programming but as he says it's an amplifier you throw threads at this and you amplify any misunderstanding or uncertainty you had okay as he says almost at the point of panic if you are paying attention so how does this come about well what is the problem of concurrency as as this tweet observes take me down to concurrency city where Greene pretty is grass the girls there and are its race conditions now why do we end up with this kind of stuff and it turns out that it's not so much a case of the this will be locked in your memory it's more a case of there is something locked in your memory there is an assumption and you may not know you have it and it is partly to do with education and is partly to do as in your background and the api's and languages you've used in the past unless you have only gone through the functional path you'll have very particular assumptions about things so a few years ago a couple of times running workshops I used this kind of simple word association game and so word association as in I say word the first thing that comes into your head so snow white up down yeah so that so basically what you're trying to do is get past people's normal filters so I did this I said concurrency everybody in the room said threads even the guys in the room who got me in there to talk about non threading approaches because it's in there it's an assumption that means threads because that's that's the the raw ingredient I said threads they said locks or synchronization this is one of the things I love is that way using a way using threading yeah we want things to go faster okay so why are you using locks locks are the anti threat the whole point of a lock is to eliminate concurrency if your goal is to have concurrency then a lock is exactly the opposite and this becomes a question of a bottleneck classic bottleneck indeed there was this lovely observation by David Bruton Hoff perhaps we shouldn't have called the mutexes after all he said you know what it sounds too innocent it sounds too technical we should have called them bottlenecks because then programmers would have to talk to each other you know I'm just going to add a bottleneck in my code and then realize as they said it how stupid that sounded so there is this question that these are clearly antagonistic so how do we get here well a few years ago I found myself drawing up to clarify to people what was going on a simple quadrant diagram and your people have quadrant diagrams because you know it simplifies the universe there are only four things that you ever have to worry about as two axes two elements on each axis or two directions on each axis let's talk about the mutability of state okay the state shared between threads well the state that we actually have so mutable or immutable and then the sharing axis I got ahead of myself unshared or shared so the state that you have can be shared between threads or unshared now the interesting thing is it's not obvious when I describe this where the problem is so let me do some color coding here okay the bit that's in red is really really bad you do not want to program in here it's the synchronization quadrant okay do not try and avoid programming in there where did most programs exist in that quadrant yeah notice 3/4 of the diagram is free of most of the problems and yet we've ended up programming in there I used to wonder why this was and then it became obvious to me because for many years most people's languages and programming models and programming backgrounds were all in the unshared bit because if you only have a single thread then your program is thread safe although occasionally I do observe the 0 threads is about the right number for some code one is probably too many they would be much safer if nothing executed but their observation aside there is this idea that for a long time all of our code was over there naturally now if you're doing JavaScript it still is effectively you know so you know that there are languages that do not have any model whatsoever of concurrency at all there is no threading in there so so naturally we're over here well which half are we going to be in historically functional programming was often associated with certain costs and so there is this idea that conveniently and according to how people thought and coming up from assembler the history of programming languages all the rest we were naturally in the top left-hand quadrant so when threads arrived we naturally moved right into the next available quadrant which sadly puts us in the synchronization quadrant so this is kind of one of the things people have ended up by default assuming I have state that changes therefore I need to lock it whereas we need to actually question you have state the changes but now the world has changed do you want that so the simplest one is if stuff is immutable then it's always shareable if you cannot guarantee or sustain immutability then you want isolation don't share it in other words if I have state that a thread owns and associates with but it never passes it beyond the scope of that thread then it's perfectly safe to change it because nobody outside can see it and that's nice and easy alternatively I have a concept of strict ownership here is an object it is mutable this thread has it and it's going to pass it to that thread and release any claim to it whatsoever so therefore we have an object that will only ever be accessed by one thread at a time ok and we may go via a locking system but the idea is that when we're saying we're trying to eliminate locks is we're trying to eliminate visible locks in the code base in the application base you want to use other primitives some of which may actually be locked free but that contain these locking mechanisms to achieve isolation now what is the simplest kind of code to work with well sequential what we would like is for our code to be sequential now this sounds that our odds with the idea of yes but I want concurrency the thing is you're very good with working out bits of sequential code the problem is when you get to bits of sequential code interfering with each other synchronously that becomes the issue so what we want is code that looks sequential wherever you are in the code base it feels sequential that's the programming model you want so therefore you don't want to see lakhs you don't have to know that you are sharing something with somebody else it should feel sequential therefore the way we're going to try and achieve that is through a synchrony a perfectly asynchronous system in theory has no deadlocks and if you have a completely weight free system you can end up with no deadlock there are some issues it sometimes is I'll come to that question in a moment there is this question that sometimes you do actually have to wait sometimes the real world actually tells you we do not have data so waiting is a natural thing for a piece of code to do the problem is that way most people approach concurrency is they take the natural stuff and then add loads more noise lots of weight states that kill this stuff I'll quickly handle one question two questions even other persistent data structures in C++ copy from closure they don't technically exist and closure came after persistent data structures so by the work just to clarify closure is a lisp like JVM language and it happens to popularize some of these ideas but persistent data structures are older and there's some really good stuff out there on those I think Bartosz Milewski did some really good demonstration articles here will it be possible to find your slides somewhere yes absolutely after this talk do I think their only reason to use C++ over Russ today other than we already have code base written C++ and we already have developers who know C++ and we have an organization that is comfortable using C++ but he's still unsure about the future of rust and so the usual sociological reasons okay so in other words some of the fluffy stuff okay so what are we going to do what we're going to do is I want to introduce some basic asynchrony so imagine that we have a function that is a long-running function and we're going to get a result from it I've got a choice I can have it long-running and then do something after it or I can do something before it and have it long-running the problem is I cannot do both at the same time so basic C++ 11 stuff just basically says okay let's decoupling execute the function asynchronously pick up the result in a future so now I owe you it's basically a virtual proxy for a value and then later on you can get it you get the value at that point you wait there are a couple of issues with the C++ 14 version of async but I won't go into those but that's basically the idea that that's good I actually wrote up a library in a proposal for the C++ down many many years ago there was a I would say a little more elegant and actually didn't have the problems of acing where everything was a function everything looked like a function so thread was a verb it was not a thing you didn't have a thread you threaded something and you could choose from a thread pool or a raw thread and you'd say I'm going to take a function object and I'm going to give it the breath of life and it's going to run off and you always get a future and a future is executed as a function so it yields a value consistently every single time you call it or blocks until it can and that model actually has some still has some benefits but it's not implemented so how about we take another look at the primitives okay so we've got futures I'm going to leave those to one side there's this observation from Russell winder about shared memory instead of using shared memory specifically shared memory immutable state we can use processes and message passing and process here just means protected independent state not necessarily an operating system process any highlights Erlang as which are languages embodies what's known as the actor model and akhom which is the language that embodies the communicating sequential processes model CSP now as it happens this is something that I studied many many years ago or it was part of what I've studied that and the actor model and it was the basis of something that in the arkham language i was known as a channel and i used to really value channel based computing that's become a little more popular recently a lot of people think that go invented channels it didn't it's just finally brought into the public's attention so channel based computing a channel is basically a cue but the idea is generally you don't share channels around but it is essentially just a cue and we can build up a cue quite simply i'm going to go through these slides reasonably quickly but just so you know that there's stuff there and i'm going to build it up using basic synchronization primitives not going to try and make it lock light unless you have a really strong reason to do that I'm going to leave it at this level so the basic thing I can send something in a channel and I can try and receive it that's a non-blocking receive I'm you're just going to use a regular Q type and sending is just pushing into the Q try receive if the queue is empty you return false if it's not empty then you pop a value and you did indeed return true now clearly that's not very thread thread safe it's very very sequential so let's just refine that a little bit we will add in one of those mutexes this is an example of I'm not going to add mutexes everywhere I'm just going to add it I'm going to treat the channel as my primitive so I'm going to add in a key there and so therefore we get a lock guard with automatically does scoped locking so that's safe and then we do the same thing a little bit more choreography this is why you don't really want this in your main code and therefore we've got that and I'll put a blocking receive in as well so that does all of that and we have to add in a condition variable and that means we have to do clever things with receive but the nice thing about the condition variables is that they will wait we passed in a lambda there they will wait until the actual condition of interest is true you can pass in a predicate so the idea is I don't have to keep looping just to check is that condition true is it not empty is that okay so we have that guarantee so we can go ahead and pull from that okay so that's the basic stuff so now we need to apply it to a real world problem fizzbuzz clearly an obligatory duction work if you're not familiar familiar with fizzbuzz who is familiar okay as a few of you not right it's a simple counting game it's actually a simple drinking game one two fears any number divisible by three is replaced by fears four buzz any number divisible by five is replaced by buzz fears seven eight if this were a drinking game and I said nine I would have to take a drink that's that's it but you've got to keep on processing when you hear anything divisible by three and five so 13 14 bits 16 it's only so so it's not exactly the hardest any problem you're ever going to encounter but you can do some stuff with this obviously what we're going to need to do is set up you know I've got a very functional I've got no if statements here is just one expression so fizz fizz bar fizzbuzz fizzbuzz or a the string conversion of that and then what we're going to do because now we've got a source of fizzbuzz we need a server I'm going to set up a server a whole fizzbuzz server how useful is that so here's a fizzbuzz server it receives integers on the input channel and the output channel is going to return strings and it just loops forever serving fizzbuzz requests okay so we've got a strict separation first thing here is the strict separation between your core logic and the coordination logic what is also interesting here is that you can actually test the logic of this without using threads you might want to put in the termination condition but for brevity I've not included that the idea here is that from this codes point of view the world is sequential it keeps the life nice and simple okay so we've got that and that's that's that's kind of easy I can fill up a channel and then just check what the results were afterwards I don't necessarily need threading so there's this idea it looks single threaded wherever you are now we can tidy up the syntax a bit and oh yeah obviously I need to set up mate so I offer about that kick off a thread loop through one to 100 send result receive printed out tidy at the notation we can use bit of streaming you know people are familiar enough with streaming as a C++ idiom we don't have to use the send/receive names we can add a few bits and pieces in there there's a couple of points about try receive which I'm going to skip that we can actually put in something there that automatically does that but the next thing of interest is once I've got channels there's nothing specifically functional about channels on their own they offer you isolation but there's nothing functional yeah pipes and filters however once you've got a channel you can set up pipes and filters the idea behind this is that concatenative programming is a form of functional programming and it basically ends up looking like a pipe some filters more as John Pearlie observes the basic reason UNIX pipes are so powerful they form a rudimentary string base concatenated programming language so I'm going to take a not a very big problem a very simple one in a moment and just you know a pipe pipes and filled small you have a source you have a filter you can compose another filter whether each of those arrows in between is going to be a channel and then you've got the sink at the end okay where you actually end up doing something with the result and you just progressively transforming data each one of these things will basically run until you offer some other kind of termination condition but they are all isolated in effect now I'm going to take this piece of code which is normally see I'm going to turn it into nollie C++ and what I'm doing here is I'm trying to find the next Friday the 13th okay given a date when is the next Friday the 13th so when does anybody who has a superstition need to get worried and so this gnarly code is not obviously a stream streamable example but it's small enough I can fit it on one slide so what we're going to do is we're going to generate days the first thing we're gonna do then we're going to filter out filter in select only the 13 to the month then felt filter in all of the Friday's and then we're going to display it so we can do that just by start date there's a channel I'm going to push it through that channel we're just going to generate the next day generate the next day keep on generating the next day the next thing we're going to do is we're going to only include things that are the 13th of every month we take from the input we pass on to the next one the output then we're going to take that and we're going to filter in only the things that are Friday's and then at the end we get to display it so it's a very simple compositional approach again the constraint of the pipes and filters means that every single piece of code is automatically locally sequential and easy just to plug in other bits and pieces so this observation from brandon rhodes is quite a simple filters can be arbitrarily change they are easily reused this is the composability that we kind of want and we don't have to experience the threading just as you don't experience threading when you use the command-line pipes and filters model you just aware of concurrency going on but you don't have to experience all of those locks because those are hidden from you so I'm going to close with the actor model observational grande drags and rescue multi-threading is just one damn thing after before or simultaneous with another one of my favorite observations about that actor base concurrency is rather boring it's just one damn message after another it is a function based approach it is the idea of being able to take a message take a message take a message and again everything feels sequential where you are you are just receiving and responding but you never see a lock so to contrast this let's imagine monitor objects the kind of standard approach people do let's imagine a phone book we're going to provide an interface you can update it's an up cert provide a name and a phone number you can drop an entry in the phone book and you can also ask the I'm going to use in the C++ 17 optional type there you can also ask for the name lookup this name and it either returns the corresponding number or it returns an empty optional if the number if it's not present okay we've got a key we've got entries and then we go through and we build up and we have the standard lock based approach lock and lock and yeah all that good right so that kind of works but it forces everybody to block all of the time whenever they're waiting and there are reasons that you might actually want to do this there is another approach but I'm going to skip that walk through there is another approach known as active objects but for reasons of time I'm going to leave that out okay because the code is a little bit complex and I want to keep life simple so I want to cut the actors sometimes the only reason that slide is there for active objects is that sometimes people confuse active objects with actors they are actually different programming models so the actor model what is the actor model the answer model quite simply is based the function I'm gonna have a phonebook and it's going to take a channel a channel of any which is a variant type that takes any value whatsoever I'm going to be able to pass it messages an entry that has a name and a number no entry and has a name and a find that takes a name and also takes a response channel in other words the value comes back you're basically saying I'm going to send you this request I'd like something back please the code itself structured is simply there's our entries notice that we don't have private data this is only a function there is no side effect in an object that is visible to anybody else this is all contained within a running function so it's local state we take a request then we check it is this an update if this is an entry then we update it if it's a no entry then we do a drop otherwise it's a lookup it's a find and depending on what we find if it's there then we return an entry if it's not there we return no entry yes I've recycled the messages but what we're doing is we're sending back the results so there is this very simple idea that we now have a channel of communication and this I tested this code single-threaded then I tested it multi-threaded and the allows you to do that very very easily so what we've got here is a very different way of looking at something that may already be kind of familiar and I'll just show you that in in action before we close I'm going to set up a directory a channel for that I'm going to run the thread phone book that's the function I'm going to run that then I've got another thread that is going to send in requests and we're going to send that's what it looks like in call I'm going to send to the directory please find me Thomas Anderson and we're going to we're going to declare a variable unfound I want to show you that when we're not there's no it's a new directory or it's a new phone book so that it's not going to be there we respond unfound that'll be no entry there is no entry for Thomas Anderson then indirect we've got another thread we add in update Thomas Anderson phone number one now we go and find Thomas Anderson and saw the thread we will end up with the result being found that's great so then we're going to add in Trinity we're going to update with Trinity then we're going to update with Morpheus who does obviously offer us the answer to everything offers as the pills 42 then we're going to withdraw Thomas Anderson because now we have neo and now when we go and find neo we do actually we find that he is the one and so basically we've got three threads going here and we have a very disciplined model of requests and response and this kind of structure is relatively easy to create just using the light light amount of code that I've explored here okay so it's time for me to wrap up as opposed to just over run managed to get through most of the things and hopefully there is a kind of a sense here of what is known as Austan any in the arts community but also it's obviously quite obviously a russian work and it's used specifically to refer to the idea of defamiliarization the artistic technique presenting to audiences common things in an unfamiliar or strange way in order to enhance perception of the familiar the idea here is that c++ has this kind of quality when you look at it from a very different point of view there is a particularly imperative classic view of c++ that people have taken without necessarily questioning what the language can actually offer and that there is a greater simplicity simplicity to be found particularly in the modern c++ if if you just look at c++ 98 you can't do some of the conveniences that i've described here but a c++ 11 onwards really allows you to sort of more fully embrace techniques that we've been able to see in other languages i hope that has been of interest and of use and i'm going to be around for a coffee to answer questions thank you very much you [Applause] [Music]
Info
Channel: Build Stuff
Views: 32,007
Rating: 4.8496242 out of 5
Keywords: Kevlin Henney, Functional C++, Pattern Language, Software, conference, Build Stuff LT, vilnius, lithuania, developer, programmer, TheOldNewThings
Id: CIg6eyJv4dk
Channel Id: undefined
Length: 61min 48sec (3708 seconds)
Published: Tue Jan 03 2017
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.