ElixirConf 2021 - Adam Lancaster - Pattern Matching; Good In the Abstract.

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] [Music] um as you've guessed from my accent i'm italian no i'm joking um i'm from london i work at a travel startup there called duffel if you like holidays and really clever and kind people you can also work there too because we're hiring so um you can follow me on twitter if you want to see me rant at businesses for doing things wrong or you can follow my blog that i barely update as well so um today we're going to be talking about pattern matching and i'm asking this question is it good in the abstract so when you first start elixir you see something like this and you're like cool i know what that does right then you see this and you're like hang on that's backwards right uh and the reason is it wasn't variable assignment that you saw it was pattern matching right and it's a brave new world pattern matching is an abstraction around two things right it's like variable binding and control flow and the idea here is that if a pattern uh you sorry you define you have a pattern that you match against some data if it matches you can bind zero or more variables so that means that those variables become available in in the scope that you're in um but crucially if the pattern doesn't match then you affect control flow in some way right so that maybe you raise an exception or maybe you fall through to another case it kind of depends on the context of where the pattern matching happens so you've probably all got some great examples in mind of where you've used pattern matching and it kind of like helps simplify things so this is like a simple fizzbuzz implementation and what i find kind of interesting about this is if we kind of compare it to an implementation of the similar idea which doesn't use pattern matching and again it doesn't really matter about the details of it you can just get a picture from the shape of the code like that we when used when we use pattern matching we had like a much flatter kind of structure than what we do when we don't have it right and here's another example from in elixir itself this is how reduce is implemented and again it's not important that you kind of read and understand all the code that's on there but let's look at a version that doesn't use any pattern matching and once again you can see the shape is like a lot more nested so there's this property of pattern matching that kind of like flattens out conditionals which i'm claiming at least helps like us understand the code a little bit better which is cool here's another example um this is getting the best friend's best friend's name but only if the user is an admin so what i like about this is it's creating a picture of the data that we're matching against right and more than that that picture is drawn with syntax it's like the same syntax that you use to construct the data right so construction and destruction use a very similar syntax and what i mean by that is if you know how to draw like a literal map in elixir there's a good chance you're going to understand how like what a pattern looks like as well because they're very similar right so a pattern is this picture it encodes a lot of information and kind of construction and destruction are the same syntax um and that there is also like an another kind of related idea here which is that if we try and imagine what how we would get the same kind of data without using pattern matching we'd probably use functions right and you have two options there one is to define like quite a specific function something like my best friend's best friend's name right and that that feels like that's going to bloat the api surface area of user a little bit because it's like overly specific in some way the other option is that you define more functions that you kind of compose together and so all of those can have like trade-offs whereas like you can see with the pattern matching you can get quite succinct and like informationally like dense i guess a picture of what's happening yeah and the other thing is that when you look at this you understand what's actually happening right there's no abstractions here you know that user has an admin has an admin key on it that friends is a list you know and that they have friends and that's a list and you get a real good sense of what user actually looks like at least in some of the cases right cool so we love pattern matching i'm sure you do too you've probably used it everywhere but there is a problem with it we don't talk about it too much in the elixir community so this is the pipeline operator if you've used elixir for more than about four seconds you probably know what this is um and it's kind of a right of passage in elixir i think to take it upon yourself to improve this right so i'm no exception i did the same i've written a pipeline library and here it is the idea here is that pipelines have some state you thread that state through steps in the pipeline okay so here you know we start with one we can add two we can times three it's all looking good now it's probably a separate talk as to whether or why this might be a good idea but um oh yeah just behind the scenes this is kind of what's being created so we create a struct that has the steps in them the steps have actions right um now so what's kind of cool about this is you can define then executors for that function right and so one such executor we might want to define is something like with trace now there's a lot of code on here so don't try and read it all but if we break it down it's quite simple right we get the steps out of the pipeline and we get the state out of it if there are no steps just return the state we're finished if there are some steps get the current step execute it and sit and then if we error bail out now don't continue with the rest of the pipe and if there's not an error continue with the rest rest of the steps updating the state as you go right but then what's cool is you can now add logging in between like before and after that step runs right so you might add some logging before and logging after that times how long it took to execute that step which means you get in this situation where you can really easily add tracing to your elixir functions by just like making a one-line change which is kind of cool right so great you deploy that you get it going everyone loves it and you're a hero it's fantastic then what happens features with a capital f right the next thing people want is undo and rollback so if you've ever used sage or heard of the saga pattern this is that right so the idea here is that one of your steps in the pipeline can fail when it fails you want to run a compensating action for that step to kind of like undo it conceptually and then you want to step back through your pipeline running an action like an undo action for each of those steps that has already been run okay so to get this going we have to do two things first of all we add an on error call back that's like oh your pipeline's failed let's undo so we minus two instead of adding it and then the next thing we take our pipeline we take what was the list of steps from that to this so this is a zipper data structure okay and the idea here is that that left hand list you can see are all the steps that you haven't executed yet and the list on the right are all the ones that you have and so once you've executed a step what you do is take the head of that left list and prepend it onto the right hand side there and you move it across and what that means then is if you do need to step backwards you haven't lost any information like we had before and you can just get the head of that right hand list and move it back again and now you're back at step one right so what effectively ends up being true is that like that head of the left left list is always the current step you're on and the head of the right hand list is always the previous step cool so feeling pretty clever you know you deploy that everything's looking good but what's happened we've broken our with trace function because now this pattern match doesn't hold true anymore right and you might be thinking like oh just like chucking the guard claws and it doesn't really solve the problem right it just gives you a different error because still our function doesn't do what we want it to and the problem here is that pattern matching doesn't get on with abstract data okay so what is an abstract data type just to be clear so abstract data is a data type where you can think of multiple implementations of like this abstract idea so for example let's think of list list is like an ordered collection of stuff yeah there's many ways we can think of implementing that for example arrays where all of the stuff's in like a contiguous block of memory or like link lists fixed size vectors you know all sorts of stuff so you can imagine more than one implementation of the abstract idea like another example would be shapes if you think of shapes as being things that have perimeter and area functions you could have square circle triangle right and finally our pipeline is an example of abstract data because we can imagine having pipelines and reversible pipelines yeah um and it's kind of like a key thing about these abstract data types is that you can imagine having concretions that vary over time or vary at the same time so what i mean by that is you can imagine a programming language that lets you have arrays and lists or arrays and link lists at the same time or like our pipeline maybe just evolves its implementation over time right and you have to be able to handle just one thing but you have to be able to handle it changing over time cool so why does pattern matching not like abstract data well abstract data is trying to by definition hide details from you right whereas pattern matching is trying to expose them so i know what you're thinking um no um i'm i'm proud of that reaction if i'm honest i know you're thinking and it wasn't yeah uh it's you just add more functions right so you add a next step a previous step a current step into the pipeline and then you change your with trace function and all we do now is i call the pipeline module and hide loads of implementation details in there um cool so then that means if you know the pipeline implementation changes you can change it in one place and nothing else breaks in your system right and that's cool that's probably what i would do to be honest but it just leaves me thinking like what are we saying here are we saying like never use pattern matching then like i don't think that goal is really something that elixir is going to help you with and i don't think that's what we are saying when we do that so it's like uh is pattern matching like only good in this like abstract sense right the question i want to ask is can we bring them together can we get the best of both can pattern matching be good in abstract data times so there's two ways we can do this right one is we can make abstract data more direct in a sense okay the other is we can make pattern matching more indirect so what do i mean by the first one well the claim i was making that one of the benefits of pattern matching is that it shows you like actually what's happening right like peers behind that abstraction so it's like can we bring that to abstract data without sort of losing the abstraction yeah and kind of the only way i could really think about this was more in like the developer tool space like and i think it's an important thing to talk about often we you know we think about design patterns and way to structure code and architecture and they're all really important but i think we often forget that like we can write things to help us write programs right and so i was just like imagining what if we like okay when you are writing code it stands to reason that you want to do different things maybe from when you're reading code what's good about this abstraction is when you edit the code you only edit one function and it applies everywhere right but that's not as helpful when you're reading it where you want to maybe want to see what's actually happening so it was kind of like what would happen if you could hit a hotkey and just like flash up the pattern match that's being used kind of thing and see oh like a zipper's being used in that case or whatever um i don't know maybe your editors already do this but like mine definitely doesn't i just thought it was kind of interesting to think about cool so that's sort of like indirect uh sorry more direct abstract data what about the other one so can we bring in direction to pattern matching so the key to this is we want one pattern to work on multiple different data types okay and again there's two ways that we can kind of get that first we can already do an elixir okay because all structs are maps okay any valid map pattern is a valid struct pattern which you can see here we've got on the left one pattern that's being used for two different data types a bear map and a user right so in this sense you can abstract over these like data types we're matching on already with one bit of um with one pattern but there's kind of a big limitation here which is that it forces your abstraction to be expressed like structurally which means like via the keys that are being matched on right and you can kind of see this like imagine we define an area function um that wants to take in any sort of generic shape and cool area on it we'd have to put like area as a key in that um and there's like a place for this kind of abstraction but it can also be limiting like consider this we can have two different kinds of colors right rgb and cmyk and the problem here is like there are no key overlaps here like none of the keys are the same between them which makes it a lot harder to abstract over them and say like i don't know is the color red right because there isn't a thing you can look at that's common to both of those data representations now you would probably do this right you just get the struck name out of it and kind of call a function on there and that's fine but that is like a structural abstraction you're saying what's common about those two data types is that they have a struct name right so you're kind of it can be limiting is the point okay so then what's the proper way to get one pattern to work with many data types well re-implement pattern matching of course now don't get excited this isn't like production ready or anything but i'm certainly not brave or clever enough to attempt to re-implement erlang's pattern match and syntax but i am foolhardy enough to try it in elixir so i wrote a little library um let's try and step through and see if we can get an understanding of what happens with it so the first thing we need is like a pattern syntax yeah so the top line is the valid elixir that we all know and love yeah the line below is what we're going to use as our patterns right now so it just says var and then current step which is the name of the var and then rest is stored in a variable called rest okay now because we're writing this in elixir like we can't overload this equals operator unfortunately so we get kind of close by defining our own infix operator here okay and again because we're implementing it in elixir we kind of have one arm tied behind our back here elixir can just pat and match and makes variables appear in scope like magically we don't have that luxury here so we have to return an explicit map of variables that came out of the match that we're doing right so a bit more verbose but it's tolerable cool so what does that infix operator look like it just looks like this so we call this like magic function here which i guess we'll get to in a second and if there isn't a match we raise an error and if there is we return the bindings that were made out of it right cool so what's that that is a protocol so if you don't know protocols in elixir they're a way of defining one or more functions that take one or more arguments right and they let you define new implementations for those functions for each different data type that you have which is the exact problem we were trying to solve right so we define a protocol and then we can separately define an implementation for it so in our case we could say look if the first argument that we're like pattern matching on if the data we're pattern matching on is a list this is what you should do okay um so to actually do like a simple implementation then this var function that we saw earlier in our pattern syntax returns something that looks like this okay and that means then in our pattern match like implementation of the protocol we can this is a case where the data you're matching on is a list of one item okay and the pattern is a list of one variable okay so what this does is it just gets the variable name out puts that into the map as a key and then puts the value of the you know the item that's in the list as the value so you end up with a variable of whatever the name was which is the item in the list right and conceptually you can imagine doing all the rest of the implementation you know drawing the rest of the owl if anyone's seen that meme but um cool and well what this means then is that if we go back to our pipeline operator remember we had the steps and we turned it into a zipper well we could like wrap this into an actual struct right and then that that's going to allow us to implement match for zipper differently okay and there's lots of ways we could do this but like maybe we're like look if you've completed or not it doesn't matter if it's got one element in it and your pattern has only got one var in it then it's a successful match like an away you go and we can imagine adding more cases for the other you know edge cases below right cool so if we go back and edit update with trace this is what that looks like and it looks pretty close to what we had with the pan matching right it's not a million miles away and especially if we improve the syntax bit you'd probably get a good gist of what's happening there um but yeah we we just access the bindings that get returned and kind of like that's that's the result of it so what's really nice about that then is we've got this function now that can handle both pipelines and reversible pipelines in the same way um and what's even cooler right we implemented pattern matching with pattern matching which is just a bit fantastic if you ask me but but it actually led us into a problem and to see the problem let's consider what we might try next which is allowing extensible patterns so we want to allow users to be able to define their own patterns okay now the first step to doing that is using more protocols which we're going to get to in a moment but the first change that we have to make to allow this feature okay is we need to re change what this returns okay from being a tuple to being a struct okay and the clever people among you probably have already realized what's happened this pattern match now fails so conceptually what we've done right we've changed an implementation detail of like some abstract idea which is like a pattern doing that has invalidated a pattern match which means we run into the problem that we're trying to write the library to solve which means we need the library to be able to write the library it's like you have to go and have an ice cream and sit down and think about your life and probably throw the computer out the window so what's the solution well it's not pretty but let's step through it so the first thing we need to do is understand the problem okay so we have pattern syntax like variable okay and that's like a top level pattern you could consider it because you can just assign you know a value to a variable that's a pattern but variables can also you can have like higher order patterns in a sense right patterns that have patterns inside of them so the list pattern below is an example of that because it has variables inside of them right and you can have lists that have lists inside of them right so the problem is variable outside of a list is actually a different pattern from variable inside a list and the reason is the index right the position that the pattern is in the list is important and to see that you can look at this example right the top line matches and the second line doesn't because there's no bit of data where b is in the pattern right so when you're outside you're when you're a variable at the top level like you can just match in anything but when you're inside a list it's important to know what index you are to know oh wait in that data is there something at this index that i can is there a value there that i can save right and furthermore across different kind of higher level patterns if you will um different things matter right so in the bits below here on the maps you can see it's not an index that makes that's important when you're a variable it's the key of the map right so the bottom the bottom line doesn't match because there isn't a key a in the map that we're matching again so for list patterns index is important for map patterns keys are important you can imagine other patterns where other things are important okay so to be able to enable these extensible patterns what we need to do is three things we need to be able to add new top level patterns one we need to be able to define how those top level patterns behave inside other patterns okay and finally we need to be able to implement both kinds of those patterns differently for different data types okay so the first step is to make the pattern strucks as we pointed out what that essentially means is where we used to have that top line it becomes something like the bottom line and it returns something that looks like this so this is the list pattern which has patterns and then you can see what the bands are right cool the next thing we're going to do is actually swap the argument that the argument order of the match protocol so this is the protocol that we had before and we were dispatching on the data type now we're going to dispatch on the pattern what this has done is give it given us a seam right because now we can add new top level patterns by implementing this for them okay but the tricky thing is we still need to be able to add implementations for this data as well right what we kind of need is like double dispatch we need to go and find the implementation based on the type of both of those arguments not just one of them so to get that we have to like start calling protocols within protocols and it gets a bit mental so again lots of code here let's try and step through it a little bit this is going to be our list pattern implementation of the match protocol okay so the first thing we do is we get those patterns out that's the list of patterns that you have okay and we can reduce while over it just so that we can bail out if we have a match error earlier and then this is our accumulator the accumulator is a little weird the left there is the index of the pattern and the reason we have that is because like we said before the index is important when you're matching lists because you need to be able to say in the data was there a value at the index where the pattern is right um and then the right hand side of that tupol is just the bindings that we're accumulating up so they're the variables that we get add to into that map as we find as we find matches so yeah that's just showing you that cool um okay so we call this function listpattern.match which i'm going to come back to in a second but like conceptually it's like if we get a match we just increment the index and like continue on and if we don't we bail out and then this bit just like unwraps and or raises an error if there's a match error okay okay so what's this function then well this is another protocol so this let's look at the function signature here so the first argument is pattern what that allows you to do then is implement list pattern differently for different patterns okay so effectively what that is is allowing you to implement how a pattern should behave when it's inside a list okay which means you've got flexibility to not only add new patterns but add behavior for how they act when they're inside of a list pattern cool you've got the data which you're obviously matching on you need that the bindings where like if you're if your pattern is a variable for example you'll want to add to those bindings so you'll need to you'll need them and the index for the reasons we've already mentioned right cool so you make this which is cool but there's a problem because we've already defined this pattern to be a struct right but what's kind of cool is you can just do that right like def protocol returns a module so you can just make it a struct if you need to and kind of everything works out which is cool so let's implement this then so this is implementing list pattern for variables so what this is saying is imagine you have a list as your pattern and the first thing in it is a variable this is what's going to run in that situation okay so you might be noticing a pattern variable in a list is a protocol so um again we just if there is a match we update um what that is going to return right is the data that we accessed so it's going to try and access some data in the data that we're matching against okay and if it finds something where it should like at that index it's going to return it to us which is what that value is so effectively effectively if we get return the value we can update our bindings with the variable name that we're you know using okay and if there isn't there's no match um okay so why variable in the list and like kind of what what does that mean really so um yeah so i guess to think about why it's called like variable in a list is because by the time we get here that's what we know we are it's a bit strange but um yeah i guess i'm just going to skip that bit for now but here's that protocol there that gives us a way to define how a given bit of data should behave when you pattern match it against it with a list which has a variable in it i think cool so if we actually implement this for a list all we'll do is like fetch from the list the thing at that index and if it is there um if it isn't there return it and if there's nothing there then it's not a match okay cool so a bit mental but it does do what we kind of wanted and to kind of demonstrate that like let's try and implement a new pattern let's allow matching on zippers okay so to do that all we do is implement variable in a list for zipper so recall variable in a list is saying like this is what should happen if i'm matched against by a variable that's in a list pattern okay and all we do here is like a similar thing that we did with the list we just combine those two lists together and then we fetch and if there is something in that index then it was a good match and if there isn't then you know it was a bad match cool so all of that kind of buys you this here so we could define the list and our zipper we can have a pattern and this pattern is like data now so it can be stored it can be saved as a variable and passed around which is kind of interesting and then we can finally do the matches on them and both of those matches will return the same thing which is a map of a to one and b to two okay which is kind of cool um and then finally just to kind of demonstrate what you can do like if we want to add another top level pattern maybe like empty we can do a similar thing we define it as a protocol we make it struct so that we can implement it for other protocols and we fall back to any which we'll get to but essentially we define our match implementation for empty here and all that's going to do is allow a data type to implement empty for itself okay which is like this so in a list we'd say if i'm an empty list then i've matched an empty pattern okay and if i haven't then it's no match kind of thing um and the fallback to anything is we could implement it a fallback which just raises and says like invalid pattern syntax and that that means that you know it's not valid to use empty as a pattern on any other data type so you might rely on this to like you wouldn't put empty against an integer right because it kind of doesn't make sense so if you fall back to any any uh if you match against data type that didn't implement the empty pattern then you are gonna fail and you're gonna get like a decent error um and then you can also you know implement map for mt as well and you kind of end up the same thing cool so that might have been a bit confusing if it was and you don't get it straight away don't worry um watch it again on youtube if you're watching on youtube now hello um the question i have to ask though is like is this good oh so you were thinking the same all right okay yeah um i mean it was fun to write so the original things we said we liked about pattern matching were patterns are a picture construction and destruction had the same syntax and you understand what's actually happening well our syntax for the patterns has changed right so that's not true anymore um we've added indirection which means from looking at a pattern you can't tell exactly what's happening and if you use what i just wrote you probably couldn't tell what was happening even if you read the code but um so you kind of lost a little bit there right so the question is like is the patterns as a picture enough of a benefit to make it worth having like pattern matching on abstract data um i don't know maybe right like i could imagine a world where if we could tidy up the syntax seeing something like this you know um in fact let's step through an example so imagine this is what we had but the list implementation changes now and let's imagine we change it to like i don't know map list right and so instead of an actual list it's a map with numbers as the keys and say that gives you better random access or something right the point here is like if you change that to be like an abstract match i still feel that this picture is telling you something about the data that you're looking at even if you don't know the specifics of it's not a list it's a map list or whatever you still get a very good gist of what's happening so i think if you could fix up the syntax maybe there's room for this and also you could separate pattern matching on abstract data from like being able to define your own patterns i don't think they have to be together so you could simplify some things cool so is this mental i just made all this up well closure does a similar thing okay so closure has core match and enclosure you can pat pattern match on like sequence abstractions so you might have like vectors or lists and you can use a similar pattern to match on the same pattern to match on both of those which is kind of cool um yeah they have a really cool wiki actually where they talk about the implementation that they've used and the paper that they use to speed everything up to implement it like performantly f sharp also has these things called active patterns active patterns are like an attempt to allow pattern matching on abstract data it's like destructuring this abstract data in a way that kind of makes sense and again there's a paper that you could go read if you were very very bored and skylar has extractors which are kind of a similar idea right so is there a conclusion well let's try and make like a pithy kind of golden rule i made like a nice acronym here so say it with me um so it stands for if you are using pattern matching then think about if you are crossing an abstraction barrier and then maybe don't um yeah uh if if i was to summer out i would say like if you're pattern matching or non-abstract data go ham like have a great time uh if you're crossing a barrier probably don't try and do that pattern matching inside the module the abstraction itself right and then maybe like a question for the community do we think abstract patterns are a good idea question for you so i leave on a quote i've seen this guy on the forum a few times it's kind of he must be active on it um and he says for his worth this is a question i frankly frequently ask myself is it worth adding abstractions on top of pattern matching and i share this just to kind of um express like it's nice to know that i'm at the bleeding edge of elixir thought leadership so that's all i have [Applause] uh i don't know how we're doing for time because i didn't even start my timer but if there's questions why not like cool that means it was a perfect presentation yep um no it didn't make it through code review no uh honestly i haven't tried so i think but that's a good question because it's you know it's a litmus test do i think it's a good idea no i think if we really wanted it we'd have to implement it at the language level right but i'm not qualified to attempt that so all right thanks then everyone cheers
Info
Channel: ElixirConf
Views: 1,458
Rating: undefined out of 5
Keywords:
Id: O4v1nGBmPkQ
Channel Id: undefined
Length: 36min 42sec (2202 seconds)
Published: Tue Oct 26 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.