ElixirConf 2018 - Architecting Flow in Elixir From Leveraging Pipes... - René Föhring

Captions
Hi everybody, my name is René. I work at 5Minds, an IT consulting and software development shop in Germany specializing in process-driven architecture. I'm only highlighting my employer here because they support me in going to events like this, and I think we need this support from our companies in order to fulfil our mission to see more Elixir software in the world. Speaking of software: I'm René on Twitter and GitHub, and some of the software I created includes Credo, which is a code analysis tool, and Inch, which is a documentation analysis tool. Funnily enough, I won't be talking about either of those, but rather about something more fundamental, which is flow and process.

Just so I don't disappoint anybody during my talk: this is not Flow, the Elixir library. What "flow" should mean in the context of this talk is the way data flows through our programs, guided by business logic and control flow, representing a business process which we are tasked to implement. There are many interesting topics around this, like validating your inputs at boundaries or which data to persist along the way, and while we will touch on these subjects slightly, I would really like to focus on the flow part.

So we're looking at business processes like this. We will use this very straightforward example of a little image conversion task throughout my presentation. It's a very straightforward process consisting of just five steps, but the techniques we use to implement it will get more complex over time. I also brought this real-world example from one of our projects, where you can see how it's all about moving and transforming data: moving data over networks, into databases and onto hard drives, and transforming data, which usually means expanding and reducing it. It often also means changing semantics along the way, and hopefully adding some value in the process. There are lots of possibilities to move and transform data in Elixir because, as somebody once said, it's just modules and functions. On a personal note, this whole business of moving and transforming data is something that I find much easier in a functional environment than in an object-oriented one.

But even in Elixir it's not all rainbows and sunshine, because the human problems remain: different developers use very different approaches to naming, nesting and formatting parameters, and we have to take care that the general interfaces of our flow don't become too diverse. Parameters are suddenly named differently across the codebase, signatures get clunky, documentation gets outdated or is neglected from the start. If it's just one function that is documented or named or structured badly, that's not a problem — but when all the functions are, it is.

So how can we avoid these problems? Well, we should always aim for clarity by reducing uncertainty for the programmers. We should communicate intent both with our code and with our data structures, as flowing data structures oftentimes represent intent. We want to create a structure that's flexible enough for changes but still rigid enough to work as a guideline, because we want the code to be accessible, especially to new team members. Creating maintainable software should always be the highest goal. Fortunately, this is where Elixir can help us, because there are language constructs and best practices which mitigate the mentioned problems, and these are pipes, the with macro, and the token approach.

First, let's talk about pipes.
You probably already know this one: this is a pipe, and it is an answer to a problem. It's an instrument for building pipelines, which are a series of pipes where one function's return value is the next function's first input. This is not a new concept; you're probably familiar with it from shell programming, from Bash or PowerShell. Piping something into the next process is pretty much old-school.

So back to our example process, the little image conversion tool we are going to write, with its five steps: parsing the command-line arguments, validating these arguments, preparing the conversion, actually doing the conversion, and reporting the results. We can see how this is probably a pipeline, so let's implement the mix task for it. This is what the run function of our mix task looks like, and if we look closely we can find all five steps in there; let me highlight those. Now, this is all very good, but maybe we could make this flow even more obvious using a pipeline. When we put these steps into a pipeline, we get one of those pieces of code where I regularly think: oh geez, I could show this to my manager and he or she would probably get it. Because if we highlight the steps like before, it becomes obvious that the five steps from the diagram are right there.

So are we done? Is this it? Let's do Q&A? What's the downside of this? Well, each function's return value has to be accepted by the next function. If we look at the functions, we can see that the signatures have to match the return values. This becomes more obvious when we highlight them in different colors: now you can see how the yellow return value has to match the yellow function parameter. The yellow return value is the result of our "validate the inputs" step, and if we wanted to return something else, like a validation error, then we would have to add a function signature matching that return. So the advantage of pipelines is that the flow becomes very obvious at the top level, and the disadvantage is that all the signatures have to match all the possible return values from the step before. This can get fuzzy very quickly, especially when we talk about error handling, and this is one of the reasons why the with macro exists.

The with macro is an answer as well. It is made for those complex scenarios, similar to the ones we can deal with using pipes, but where the APIs of the different functions do not deliver exactly what we need for the next function. So this is the tool of choice when constructing flows that involve third-party functions outside of our control. This is how our original example would look when we refactor it using the with macro, and once again let me highlight the steps from our flow: they're all there. One thing that is nice about the with macro is that we can write an else clause to catch any unexpected return values outside the expected patterns. For example, suppose our validate_options function might not return an :ok atom but in some cases might return an error tuple; we can add an else clause to match all the unexpected return values. This is something we could also have done with pipes, but it would have been much more cumbersome and, more importantly, less visible at the top level, because now we can not only see our five steps, we also see all the things that are anticipated to go wrong in our flow. The one obvious drawback here is that if two functions return the same kind of two-element error tuple, then we won't necessarily be able to differentiate between those two within a single else clause.
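The slide code isn't captured in the transcript, so here is a minimal sketch of what such a with-based run function could look like. The module name, switch names and step functions are invented for illustration, and the steps are stubbed out; this is not René's actual code.

```elixir
defmodule Converter.CLI do
  @allowed_formats ~w(jpg png gif)

  # The five steps from the diagram, wired together with `with`; each step
  # returns {:ok, value} or {:error, reason}, and the `else` clause makes
  # the anticipated failure cases visible at the top level.
  def run(argv) do
    with {:ok, options} <- parse_options(argv),
         {:ok, options} <- validate_options(options),
         {:ok, conversion} <- prepare_conversion(options),
         {:ok, results} <- convert_images(conversion) do
      report_results(results)
    else
      {:error, reason} -> IO.puts("conversion failed: #{inspect(reason)}")
    end
  end

  # Stubbed steps so the sketch compiles; real implementations would touch
  # the file system and shell out to an image conversion tool.
  defp parse_options(argv) do
    {opts, filenames, _invalid} =
      OptionParser.parse(argv, switches: [format: :string, out: :string])

    {:ok, opts |> Map.new() |> Map.put(:filenames, filenames)}
  end

  defp validate_options(%{format: format} = options) when format in @allowed_formats,
    do: {:ok, options}

  defp validate_options(options) when not is_map_key(options, :format),
    do: {:ok, Map.put(options, :format, "jpg")}

  defp validate_options(%{format: format}),
    do: {:error, {:unknown_format, format}}

  defp prepare_conversion(options), do: {:ok, Map.put(options, :target_dir, options[:out] || ".")}
  defp convert_images(conversion), do: {:ok, Map.put(conversion, :results, %{})}
  defp report_results(results), do: IO.inspect(results, label: "results")
end
```

The pipeline variant would look the same at the top level, just with `|>` between the steps, which is why every step's signature then has to accept the previous step's return value directly.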
So which one is better? Well, pipes enable pipelines, and they are very useful for high-level flow. They can be ideal in situations where you control all the interfaces, where you can dictate the rules and just let it crash and burn when the rules aren't met. The with macro, on the other hand, is a Swiss Army knife: great for the nitty-gritty, low-level flow interactions where you call third-party code, or code by other teams, or libraries that don't quite fit into a coherent interface. So in the end this is not a question of one being better, but rather of a better fit for any given situation.

And there's a third option, which I'd like to call the token approach. This concept is very loosely related to the Context design pattern and the Command pattern in OOP, and even more loosely related to the Elm Architecture and the Flux pattern in JavaScript. I think the best way to describe what the token approach is, is to talk about popular examples from the Elixir open source community, which are Ecto changesets, Plug connections and Wallaby sessions.

With Ecto changesets, we create a changeset by casting — in this case — a user struct, parameters to change the user, and a list of attributes that are allowed to change. This changeset is our token, and that token goes down a pipeline; we're working with this token, which represents our intent to change data. A Plug connection, on the other hand, represents a request to our web server waiting for a response. We can modify the answer while the token goes down a pipeline: we can put the response content type, add response headers, and send the response with an HTTP status code, or we can even halt the pipeline to prevent the execution of further plugs on an individual connection. Here the token represents the user's intent to get the requested information. As a last example, we can create Wallaby sessions, which allow us to test our web services, so in a way these are the counterparts to the Plug connection. This time our token goes down a pipeline as well: we can visit a path on our server, we can fill in some text fields, click a button and make assertions on the response. Here our token represents our session with the server.
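For readers who don't know these libraries, here is roughly what those token pipelines look like. This is a sketch assuming an Ecto project with a User schema and a Plug-based web app; the schemas, field names and routes are made up, not code from the talk.

```elixir
defmodule Accounts do
  import Ecto.Changeset

  # The user struct plus params become an %Ecto.Changeset{} -- the token --
  # which every later step receives and returns.
  def change_user(%User{} = user, params) do
    user
    |> cast(params, [:name, :email, :age])
    |> validate_required([:name, :email])
    |> validate_number(:age, greater_than: 0)
    |> unique_constraint(:email)
  end
end

defmodule HelloPlug do
  import Plug.Conn

  # Here the %Plug.Conn{} is the token; each plug receives it, transforms it
  # and passes it on, or halts the pipeline.
  def init(opts), do: opts

  def call(conn, _opts) do
    conn
    |> put_resp_content_type("text/plain")
    |> send_resp(200, "hello")
  end
end
```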
So I think you're getting it by now: a token is basically a struct that is handed down a pipeline during the execution of a process. But why is it called a token? Well, the token approach stems from a board-game analogy, and I wanted to explain that analogy using this beautiful photograph of some Chinese gentlemen playing a board game, but then I realized that I am not really that sophisticated, because I don't even know what that game is. And since you should only talk about what you know, I will explain it using The Game of Life board game. The token in this game represents you, moving around, taking different paths along a fixed set of branches. The token also represents intent: to get an education, to get rich, to win the game. And the token also represents resources which are transformed along the way, like your family members, your bank account, your action cards, and so on. That's why it's called the token approach — and because "context" was already taken by Phoenix.

So how do we design our own tokens and token APIs? Which fields should be included? How should we name each field so the meaning becomes obvious? Let's find out by designing these things for our little image conversion tool.

So here we have it, our first token. The fields on this one are the arguments supplied via the command line (argv), the file glob, the target directory and the image format, as well as the filenames to convert and a map for the results. This token will have all the information needed to fulfil our business case.

This is token API 101: you will always want to have a build function in your token API, because this way you can ensure that the token you create is valid from the start. In our example we want to make sure that argv is always a list, which we do with a guard, and we always want to put a default image format. Next to building tokens this way, you will also want to create functions for writing to them, since this enables you to validate and normalize your inputs. The image format is a great example of this, so we will add a put_format function to our API. In this function we validate the given format against the list of allowed formats, using a guard again. But we can also normalize our inputs, because some people weirdly prefer to write "jpg" with an additional "e", and using an additional function clause like this we can accommodate those users with very little overhead. Let's see how this works: first we build the token, and the format is, as expected, the default. When we use our put_format function to change the image format to png, we achieve the desired result, and when we try to change it to jpg, but written as "jpeg" with the additional "e", that works as well.

This brings us to the last point, which is providing convenience functions driven by the business process. The file glob, the target directory and the image format are the combined result of the second step in our business process, and because these only make sense together, we want to provide a function to write, validate and normalize these options at once. In this example we put in the format first, because that operation, as we saw, can fail, and then we put in the file glob and the target directory after the fact. Now, this is not super important in this concrete example, but the order in which validation and normalization operations are applied can matter in less trivial cases. What we want to avoid in any case is that users modify the struct all by themselves: not only does the example at the top look less nice than the one at the bottom, it also denies us any possibility of validating these inputs, and it lets the user choose the order in which the values are put into the token — which again might not be a concern in this example, but in other cases the order in which you apply these operations might be an actual concern.

So what are the tips for designing your token and token API? Well, you should always design your token around its intended use, and you should design your token APIs around the actual requirements, not around how fancy the code might look. Use your token API for creating the token, so it starts in a valid state. Use your token API for writing values to the token, so it stays in a consistent state, because you're able to validate and normalize upon writing. And you should provide these convenience functions for common operations, because if at any point you need to restructure these operations, you want to do that in your API — you don't want to ask your users to update their application code.
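The token struct itself isn't reproduced in the transcript, so here is a minimal sketch of what such a Converter.Token module could look like; the module name, field names, default format and list of allowed formats are assumptions for illustration.

```elixir
defmodule Converter.Token do
  @moduledoc "The token handed down the image conversion pipeline (illustrative sketch)."

  defstruct argv: [],
            file_glob: nil,
            target_dir: nil,
            format: "jpg",
            filenames: [],
            results: %{},
            errors: [],
            # used by the optional halt mechanism discussed later
            halted: false

  @type t :: %__MODULE__{}

  @allowed_formats ~w(jpg png gif)

  # Build the token in a valid state from the start: argv must be a list
  # (enforced by a guard), and the format gets a default via defstruct.
  def build(argv) when is_list(argv) do
    %__MODULE__{argv: argv}
  end

  # Writing goes through the API so we can validate with a guard ...
  def put_format(%__MODULE__{} = token, format) when format in @allowed_formats do
    %__MODULE__{token | format: format}
  end

  # ... and normalize: accommodate people who prefer "jpeg" with the extra "e".
  # Anything else fails to match any clause and crashes, which is the validation.
  def put_format(%__MODULE__{} = token, "jpeg"), do: put_format(token, "jpg")

  # Convenience function for the whole "validate options" step; the format
  # goes in first because that operation can fail.
  def put_options(%__MODULE__{} = token, format, file_glob, target_dir) do
    token = put_format(token, format)
    %__MODULE__{token | file_glob: file_glob, target_dir: target_dir}
  end
end
```

In IEx this behaves roughly as described in the talk: `Converter.Token.build(argv)` starts out with the default format, `put_format(token, "png")` changes it, `put_format(token, "jpeg")` is normalized to "jpg", and an unknown format raises a FunctionClauseError.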
So here comes the fun part: let's talk about metaprogramming our own custom pipelines. The leading question here is: what if I wanted my own plug-like pipeline? Well, this example won't take us very far, because we can't use the plug macro in our command-line tool, since a plug takes a Plug connection as a parameter and returns a Plug connection. But what if we write our own macro, and we call our macro step, since these modules will be steps in our business process?

Speaking of modules: yes, we are now switching to using modules instead of functions, and the modules we register here have to adhere to a certain behaviour so that we can call them. Let's have a look at that behaviour. A step is just a module implementing a callback named call that takes a token and returns a token.

Now let's see what a single step looks like. This is the first step in our business process, the one that parses the options from the argv given to the mix task, and it is pretty straightforward: we're just using OptionParser to get the options from the argv list and set some default values. This is the complete module, unabridged, but we can focus on these three lines to see how we are registering the behaviour and implementing the call function, which takes a token as its only parameter and returns a token. Naturally, this is already functional: we take a couple of arguments, we build a token and then call our new step module, Converter.ParseOptions, and what we get back is what we would expect — a valid token with all the relevant fields filled in.

Now, this was the trivial part. What we want is to define pipelines using our step macro, and we do this by using a builder module. Using this builder module will give us our step macro, and the step macro will have to ensure that all the registered steps are executed one after the other and that each of them returns a token.
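The behaviour and the first step module aren't shown in the transcript, so here is a minimal sketch of both, reusing the Converter.Token struct from the sketch above; the switch names and defaults are made up for illustration.

```elixir
defmodule Converter.Step do
  @moduledoc "A step takes a token and returns a token."
  @callback call(Converter.Token.t()) :: Converter.Token.t()
end

defmodule Converter.ParseOptions do
  @behaviour Converter.Step

  @impl Converter.Step
  def call(%Converter.Token{argv: argv} = token) do
    # parse the options from the argv given to the mix task and set defaults
    {opts, filenames, _invalid} =
      OptionParser.parse(argv, switches: [glob: :string, out: :string, format: :string])

    %Converter.Token{
      token
      | file_glob: opts[:glob] || "**/*.tiff",
        target_dir: opts[:out] || ".",
        filenames: filenames
    }
  end
end
```

As in the talk, this already works without any macros: `argv |> Converter.Token.build() |> Converter.ParseOptions.call()` gives back a token with the relevant fields filled in.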
So let's see what that macro actually does. The __using__ macro inside the step builder module looks like this, and there are a couple of interesting things here. First, this import statement ensures that we can use the step macro inside our application, inside our pipeline module. And if we look at the definition of the step macro, we can see that it is really simple: it just writes the given module to a module attribute. This way, the macro calls at the top yield the module attribute assignments at the bottom. There's just one problem: if we look at the bottom, at first the steps module attribute has the value Converter.ParseOptions, and then the value gets overridden with Converter.ValidateOptions. To prevent this, we have a line in our __using__ macro where we register the module attribute with the option accumulate: true. This way the values of the module attribute are not overwritten but rather accumulated. Let's see how that works using this example: we register the module attribute to be accumulated and then subsequently assign three values to it — a, b and c — and at the bottom you can see that we get the list of attributes in reverse order, since new values are prepended rather than appended to the list.

So we have our list of steps; all we have to do now is generate the call function. As you can see here, our call function calls another function called do_call, and that function does not exist yet: it will be generated using the callback we register with the before_compile module attribute. The before_compile callback is called — you probably guessed it — before Elixir compiles the module, and what happens here is: we read the steps from the module attribute we wrote them to, in the next statement we use the steps to compile the body of our function, and then we insert the resulting do_call function into the pipeline module.

Let's look at this compile body function. Looking at it, it is important to remember that we are building an AST in quoted form in all of these functions. So how does it work? Well, our initial statement is our token — you will see why that is in a minute — and then we use Enum.reduce to iterate over the steps, which are in reverse order, and build our pipeline. So it's Enum.reduce over a reverse-order list: starting with the last step, we work our way inside out, and this is done by the compile step function.

Now, compile step takes two parameters: the current step from the reversed list of steps and, since it's used by Enum.reduce, the accumulator as a second parameter. In the first line we read the call to the current step module into a variable, and then we unquote the current call into the head of a case statement. Within that case statement we match on the result of this call: if we get back a token, we continue with the already compiled statements; if we don't get back a token, we raise an error.

All of this might be easier to comprehend using this very simple example. Our pipeline consists of three steps, shown in the top right corner and appropriately named Step1, Step2 and Step3, and we start our process with the initial statement, which is our token. Upon the first iteration of Enum.reduce we add the case statement for the last step — remember, the list is in reverse order, that's why we are starting with the last one. You can see here there is a call to the module Step3, and the whole case statement is wrapped around our initial statement. Upon the second iteration, Step2 gets wrapped around the already compiled statements, and with the third and final iteration, Step1 gets wrapped around the already compiled statements. That's all there is: it's a bit like those Russian matryoshka dolls which are nested into one another, and it doesn't do much — it calls the current step, it matches on the token, and if it doesn't find a token it yells at us. As a side note, this is very fast code: it is really what you would type out if you were crazy about performance and had a knack for typing repetitive things, but thanks to metaprogramming we don't have to type it all out ourselves.

And even better, we can now do pretty much anything — although we probably shouldn't. If we look at our compile step function again, we can generate any source code we want here by modifying the case statement inside the quote block. For example, we could add special logging of the results of each step; we can add it right here, after the token has been returned from the current step and before the already compiled statements are unquoted. We could use a similar approach to add debugging capabilities, or record metrics of the execution speed of each step in production. We could also introduce a halt mechanism similar to the one present in Plug, which would enable us to leave a pipeline early. We would do this by adding a new field to our token, which we could call halted, and then we would match on halted: true in the case statement; if that match is successful we just return the token, effectively skipping the rest of the pipeline, and we only execute the already compiled statements when we encounter a token with halted: false. And to turn it up to 11, we could even expand our step macro to allow for conditional flows, where we add another syntax to the step macro so that it not only takes a module, like in the first line here, but can also take a do block with conditions which it checks before executing the relevant path.
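The builder itself isn't reproduced in the transcript, so here is a condensed sketch in the spirit of Plug.Builder showing the parts described above — the __using__ macro, the accumulated :steps attribute, the __before_compile__ hook and the nested case statements, including the optional halted check. The names and details are assumptions, not the slide code.

```elixir
defmodule Converter.StepBuilder do
  @moduledoc "`use Converter.StepBuilder`, register steps with `step/1`, get a compiled pipeline."

  defmacro __using__(_opts) do
    quote do
      @behaviour Converter.Step

      import Converter.StepBuilder, only: [step: 1]

      # accumulate: true makes repeated `@steps Mod` assignments collect into
      # a list (in reverse order) instead of overwriting each other
      Module.register_attribute(__MODULE__, :steps, accumulate: true)
      @before_compile Converter.StepBuilder

      @impl Converter.Step
      def call(token), do: do_call(token)
      defoverridable call: 1
    end
  end

  # `step SomeModule` just records the module in the accumulated attribute.
  defmacro step(module) do
    quote do
      @steps unquote(module)
    end
  end

  defmacro __before_compile__(env) do
    steps = Module.get_attribute(env.module, :steps)
    body = compile_body(steps)

    quote do
      defp do_call(token), do: unquote(body)
    end
  end

  # The innermost expression is just the token itself; the reversed list of
  # steps is then wrapped around it, matryoshka-style, from the inside out.
  defp compile_body(steps) do
    Enum.reduce(steps, quote(do: token), &compile_step/2)
  end

  defp compile_step(step_module, acc) do
    call = quote do: unquote(step_module).call(token)
    error_prefix = "expected #{inspect(step_module)}.call/1 to return a Converter.Token, got: "

    quote do
      case unquote(call) do
        # optional halt mechanism: a halted token skips the rest of the pipeline
        %Converter.Token{halted: true} = token ->
          token

        %Converter.Token{} = token ->
          unquote(acc)

        other ->
          raise unquote(error_prefix) <> inspect(other)
      end
    end
  end
end

defmodule Converter.Pipeline do
  use Converter.StepBuilder

  step Converter.ParseOptions
  # step Converter.ValidateOptions
  # step Converter.PrepareConversion
  # ... the remaining steps would be registered the same way
end
```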
So when should we use the step macro, especially since the first rule of metaprogramming is "don't use metaprogramming"? Well, you should have a clearly stated requirement — for example, when you have to ensure that contracts are met between the business steps, or when you have to be able to add things later on, like additional steps, a universal halt mechanism, metrics or reporting. Remember: metaprogramming can help you ensure such a consistent flow, but you can have a token and a token API without metaprogramming, because as we discovered, it's just modules and functions.
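To illustrate that point, here is what the flow looks like written out by hand against the sketches above, with no macros involved; all step modules after Converter.ParseOptions are hypothetical.

```elixir
defmodule Converter do
  # No metaprogramming: build the token, then hand it from one step module
  # to the next explicitly.
  def run(argv) do
    argv
    |> Converter.Token.build()
    |> Converter.ParseOptions.call()
    |> Converter.ValidateOptions.call()
    |> Converter.PrepareConversion.call()
    |> Converter.ConvertImages.call()
    |> Converter.ReportResults.call()
  end
end
```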
So, some final thoughts on the token approach. I just told you that you don't have to use metaprogramming for everything; well, you also don't have to use the token approach for everything. There are scenarios where you probably want to avoid it altogether: for example, when there are very few stakeholders and defining the data contract seems unnecessary, or when you have very many stakeholders but the requirements are still vague — you literally shouldn't sign a contract, not even a data contract, when you don't know what you want or need. Also when the problem domain is very small: for example, imagine we have a web service and we're handling a login attempt with email and password. That's just one context in which we talk about this couple of pieces of information, and we shouldn't want to introduce a token for it, because the overhead would simply not be worth it. Remember, this is not a one-size-fits-all solution.

Now, when should you use this approach? Well, if many parts of your system have to talk about the same thing in different contexts, this is great. For example, after our login attempt is successful, in many systems what you get back is some kind of identity for this user, which probably contains an email and claims (in a claims-based security system) or roles, and you will use it in lots of parts of your system: to authorize actions, to contact the user, and so on. Sometimes the need for a contract is apparent and simply outweighs the overhead of yet another struct. You also want to use this when you have several data pipelines in your business, or when extensibility is a major concern — when you want to add steps later on, monitoring, logging, a plugin mechanism... I think you get it by now.

So these are the three key takeaways from this talk. Think about the flow of your program: what is the thing you're talking about, and what are the essential steps? Make that flow as easily comprehensible as possible, for new team members and for your future self who hasn't looked at that code in a while. And the major takeaway regarding Elixir is that there are several options to cover the most common cases, all the way up to creating your own version of a plug pipeline.

If I got you interested in this topic: I wrote a blog series on architecting flow in Elixir, where we also implement all the examples I showed you and the things we could also do, like the step macro that takes a do block with conditions. It is published on my blog, which is also linked in my Twitter profile. For some further reading, I'd like to recommend this article by Alex from DockYard on using the with macro, and this presentation by László on robust data pipelines from the last ElixirConf EU. And that's all I had.

[Applause]

Questions?

Q: Do you have the token API extracted into a library for reuse, for other folks?

A: That's typically the first question one gets when talking about this, but I don't think something like a "plug for everything" makes sense, because you always have those business-specific use cases. It wouldn't make much sense to have some kind of macro that builds up the struct for you, containing one or two common fields, and then you have to add your own business-case-specific fields later on anyway. Anyone else? You all want to get to lunch early, right?

Q: Hi, I was wondering if you had any quick thoughts about using this token-based design for a flow that can conditionally branch. That's something that I, as a new Elixir developer, have sometimes struggled with using the with construct: if you want to skip a step depending on a condition, that can make a with construct a little more verbose or less clear, and I might have to break things out into functions. So I'm wondering, if you have a pipeline that is less linear, how you would handle those branches using this construct?

A: Yes — in the blog series there's an article which shows exactly that, and it goes back to basically this: the condition token.errors == [] becomes an option to every step macro below it, like step Convert or PrepareConversion with an if: condition, and you will see in the blog post how that works.

Q: Great, thank you. About the with example you had, where you said you can't figure out which one threw the error: there's a useful trick you can do there. If, on the right side, you embed it in a tagged tuple, you can match that same tag on the left side, and then the error is also matched together with the same tag. It's a pretty common trick.

A: Okay, we should probably talk — thanks for sharing; if people don't know it, it's useful to know. I'd be glad to incorporate that.
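A minimal sketch of the trick the audience member describes, with hypothetical step functions stubbed out: tagging both sides of each clause lets the else block tell the failures apart.

```elixir
defmodule Converter.TaggedWith do
  # Hypothetical helpers stand in for the real steps.
  defp parse_options(_argv), do: {:ok, %{}}
  defp validate_options(opts), do: {:ok, opts}
  defp convert_images(_opts), do: {:error, :no_files}

  def run(argv) do
    with {:parse, {:ok, opts}} <- {:parse, parse_options(argv)},
         {:validate, {:ok, opts}} <- {:validate, validate_options(opts)},
         {:convert, {:ok, results}} <- {:convert, convert_images(opts)} do
      {:ok, results}
    else
      # the tag tells us exactly which step produced the error tuple
      {:parse, {:error, reason}} -> {:error, {:parse, reason}}
      {:validate, {:error, reason}} -> {:error, {:validate, reason}}
      {:convert, {:error, reason}} -> {:error, {:convert, reason}}
    end
  end
end
```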
Q: Hi, you mentioned ensuring consistency between each flow step, and I noticed that in the macro you were showing, you're just raising a runtime error. Do you think there's any way to introspect on the AST to provide more compile-time guarantees that each step in your flow actually conforms to the contract that the token pattern is expecting?

A: I'm not sure I'm understanding what you're asking. So basically you could say: this step needs a token that is structured like this, and, for example, a certain value must not be nil. Well, you would basically be able to code that using just functions and guards. I get that you asked for compile-time guarantees, but I'm not sure how that would work.

Q: Yeah, neither do I — I was just curious whether you could get something like that out of it.

A: I'm trying to make it up on the spot, and it doesn't work, because in the before_compile macro you just get the context of the current module. I also had a slide for that — here: this is where we define __before_compile__ and get this env parameter, and that env parameter only contains the current module. For what you're asking, we would have to look at other modules, the ones that are referenced, get their ASTs, and then have a chance to put some guarantee there — a bit like a typespec, maybe, where you could say: I only want tokens that look like this. I should take note of that; maybe we can do that. Thank you.

Q: Just piggybacking on Connor's question: I think you can implement behaviours for those modules that make up the steps, right? Each module defines a function that follows the same signature, so you could define the behaviour, and then at compile time it would raise a warning. You can actually extend that — you could fail the compilation by the way you fashion your macro. You can cause the compilation to fail, but that's pretty involved; at least you can get the warning if you don't follow the contract of the behaviour.

A: So inside the behaviour you would modify the callback directive — is that what you're proposing? Well, each of these step modules implements that call function, and if they don't implement it properly, then you're going to get a compiler warning.

Q: Right. And you can actually write a macro to fail the compilation — in addition to just the compile warning — but that's more involved and requires a lot of explanation.

A: I'll probably look into that; that is a very interesting topic.
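A small sketch of that suggestion, reusing the Converter.Step behaviour from the earlier sketch: a step module that declares the behaviour but gets the callback signature wrong triggers compiler warnings.

```elixir
defmodule Converter.BrokenStep do
  @behaviour Converter.Step

  # The compiler warns here: call/2 is not a callback of Converter.Step,
  # and the required call/1 callback is not implemented.
  @impl Converter.Step
  def call(token, _opts), do: token
end
```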
Q: If there are errors during the pipeline, do you have any tips for how you would figure out which part of the pipeline had the error, and whether you'd be able to recover from it later in the pipeline?

A: I'm not sure I'm understanding you correctly — so if we don't raise but rather recover, or what are you asking?

Q: Say there was an error when you're trying to validate the options or something, but it's a specific case that you can handle later on.

A: So you basically would want to skip some steps. One thing I experimented with was having multiple pipelines, because pipelines are basically also just steps — if we look at this backup slide, the pipeline also implements the call behaviour. What we could do, if we had something like this halt mechanism, is say that at the start of a pipeline we implement a custom call which first resets the halted flag and then calls the generated do_call function, and at that point you could insert something like error reporting — so you could have an error pipeline and a go-on pipeline at that point. But to be honest, that's one of the examples where the overhead is probably not worth it, and if your fingers are itching because you find metaprogramming so fulfilling, then you should hope that somebody at your company starts an intervention.

Q: Okay, thank you. Hi, do you have any generic advice, heuristics, or potentially good ideas for token design itself? You showed some examples of having an errors field or a halted field — are those good ideas in general? I know the answer is always "it depends", but do you have any advice anyway?

A: I would just look at the popular libraries, like Ecto changesets, for example. There's a lot of stuff in there that I still don't comprehend, but knowing José, it probably makes sense in the end, so he would probably be the person to ask. For example, a changeset has a field called action — insert, update, and so on — and when you call the update function with an insert changeset it breaks, but I'm not sure when that would happen. I'm rambling because I don't know the answer, and I don't want to say "it depends".

Q: Have you thought about this pattern enabling ways for people to inspect what the step pipeline is going to look like at any point? An example of this is Phoenix with Plug, where the plug macro is used across many files and you may want to drop in at some point and ask: okay, what does the plug pipeline look like at this point in the code, given that things can be added?

A: This is a two-part answer. For one, you want to implement the Inspect protocol, for example to hide implementation details, so you can use IO.inspect more easily and look into the token. And for the question of what the struct looks like at each point in time, I would refer you to Credo's source code, because Credo also follows this approach, and if you run mix credo --debug you get output showing what the struct looks like at any given point in time.

Q: Sorry, but say I wanted to know which modules were on deck to be executed, as added by the step macro. I can definitely look at the shape of the token at any given point, but if I want to see, okay, if I'm here, what is in that module attribute that accumulated all of these different step modules that implement call — how many of them do I have, and in what order are they going to be called?

A: So basically you mean inspecting the pipeline to see what will be executed? Well, you have all the information in the pipeline once you start building it, and then you would simply implement, I guess, something like the Inspect protocol for that module so that you can list them out.

Host: There's no more time for any more questions — we definitely have to eat.

[Applause]
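Two small sketches of the ideas mentioned in these last answers, building on the Converter.StepBuilder and Converter.Token sketches above; both are assumptions about how this could look, not code from the talk.

```elixir
# A pipeline with a custom entry point that clears `halted` before delegating
# to the generated do_call/1 (possible because the builder marks call/1 as
# overridable):
defmodule Converter.ResettingPipeline do
  use Converter.StepBuilder

  step Converter.ParseOptions

  @impl Converter.Step
  def call(token) do
    do_call(%Converter.Token{token | halted: false})
  end
end

# Implementing the Inspect protocol to hide implementation details when the
# token is printed with IO.inspect:
defimpl Inspect, for: Converter.Token do
  import Inspect.Algebra

  def inspect(token, opts) do
    concat(["#Converter.Token<", to_doc([format: token.format, results: token.results], opts), ">"])
  end
end
```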
Info
Channel: ElixirConf
Views: 3,671
Rating: 5 out of 5
Id: ycpNi701aCs
Length: 42min 5sec (2525 seconds)
Published: Thu Sep 06 2018