Elixir Umbrella - Microservices or Majestic Monolith? Georgina McFadyen - Elixir.LDN 2017

Captions
Hi everyone. As Claudia said, my name is Georgina and I'm a software developer at 8th Light London, and today I'm going to be talking to you about the Elixir umbrella. In this talk you'll learn how to create an umbrella project, how to release an umbrella, and also how to make changes to it so you can run it in a distributed fashion. Along the way we'll look at whether it's a good fit for the microservice architecture or whether it's actually just a glorified monolith.

So first of all, what is an umbrella project? Sometimes a project can get really, really big, and the Elixir Mix build tool allows you to break down your code into multiple apps, with the aim of making the project more manageable as it grows. So Elixir lets you define one root project with lots of different sub-applications underneath, which together make up the final solution to your project.

To create an umbrella app you just pass the --umbrella flag to the mix new call, and that will create you a root project which looks something like this in structure: you have your apps folder, where all of your sub-applications will go, a top-level config folder, and a top-level mix file where you can state project-wide dependencies.

It's not really any use without sub-applications, so we need to define some sub-projects. To do that we navigate to the apps folder, and then you can create your Elixir projects as you would normally. At this point it's worth noting that you can pass all the usual flags when you create your mix projects here: no Ecto, supervisor, Phoenix projects. Once we've created two sub-apps, our structure will look something like this, and if we drill into those sub-applications it'll be a very familiar structure, because they're just ordinary Elixir projects. The one difference is that mix is clever enough to know that you've generated the project in the apps folder, so it knows it's part of an umbrella, and as such, if we look at the mix files for these sub-projects,
the build path, the config path and so on point upwards to the root of the project. Similarly, if we go back to the root of the project and look at the config, it's going to aggregate all of the config for all of the sub-applications. This gives you some flexibility in terms of running your tests: for example, you can run all the tests in all the sub-projects from the root of your project, or else you could navigate to the particular sub-app you're interested in and run the tests just for that project. So maybe when you're developing a single project you can develop it and get faster feedback that way.

Sometimes we have dependencies between our sub-apps, so we want to have one sub-app use code that's in another one, and umbrella projects make it very easy to have child dependencies. Where there are dependencies within the umbrella application we can just use the in_umbrella: true flag, and Elixir knows just to look within that project for it. This is useful if you don't want to host a library on Git or Hex or something like that.

So in summary, umbrella applications help you break down a large, complex system into smaller sub-applications. Unlike traditional microservices, everything is hosted in one GitHub repository, so you eliminate the overhead of managing pull requests, the order that pull requests should be merged in, the order of deploying dependent systems. That's all taken away because everything's in one place, and it's easy to run your tests from different levels and have dependencies between the apps.

So let's see how this fared in a side project. Me and a colleague decided to explore the umbrella and create a side project, which we called Ferris. Ferris can be summarized as a learning portal where you could save blog posts and you could search for different topics; there were basically just two use cases. This is a snapshot of the landing page. If you wanted to save a blog post you would click Posts, and that would take you to a
page like this (and you can tell I'm not a front-end developer from the styling). You put in the blog post that you want to save to read later and press Save, and at that point it would save to a Postgres database, and all of your saved blog posts would be shown to the user underneath.

Searching looks similar: you put in the term you want to search for and the number of results you want back. This is where it got a bit more interesting, because when you press Search we actually call out to various different sub-applications, which in turn will go and hit external APIs. For the purposes of this talk, just imagine when you press Search it's going out to Twitter and it's going out to Wikipedia, gathering results and displaying them back. As it happened, we stored the results for the search in memory, just for fun, just to explore GenServer as a way of building an in-memory cache.

So our umbrella structure looks something like this. We used a Phoenix project for the front end, and we called that Ferris front end; we had a post service for the blog posts, a DB service for the Postgres persistence, various search sub-applications, and then an in-memory database. If you imagine, when you hit Search or Post it's going to come in through the web app in a controller, and those controllers then delegated to the relevant sub-application. So we had dependencies in our web project on the post service and the search services, and that allowed us to then just call code in those services directly; like at the bottom here, you can just use them as ordinary libraries.

So this is a relatively simple project; why did we even use an umbrella? Well, we wanted to see if it was a sleek solution for microservices, or whether having the code in one repository would actually bring in different problems. Even though it was a seemingly simple project, there were actually a lot of takeaways and lessons learned, and I'm going to share some of them with you now. So the first lesson is around breaking up your project.
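For reference, the in-umbrella dependencies described above for the web project could be declared along these lines. This is only a sketch: the file path, the app names (ferris_frontend, post_service, wikipedia_search, twitter_search) and the version strings are assumptions for illustration, not taken from the slides.

```elixir
# apps/ferris_frontend/mix.exs (hypothetical path and app names)
defmodule FerrisFrontend.Mixfile do
  use Mix.Project

  def project do
    [
      app: :ferris_frontend,
      version: "0.1.0",
      # Paths point upwards to the umbrella root, as mix generates them
      # for a project created inside the apps folder.
      build_path: "../../_build",
      config_path: "../../config/config.exs",
      deps_path: "../../deps",
      lockfile: "../../mix.lock",
      elixir: "~> 1.4",
      deps: deps()
    ]
  end

  def application do
    [extra_applications: [:logger]]
  end

  defp deps do
    [
      # Sibling apps are resolved from the umbrella's apps folder,
      # not fetched from Hex or Git.
      {:post_service, in_umbrella: true},
      {:wikipedia_search, in_umbrella: true},
      {:twitter_search, in_umbrella: true}
    ]
  end
end
```

With entries like these, code in the web app can call modules from the sibling apps as if they were ordinary libraries, which is the direct access the talk describes.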
It's actually really, really hard to know at the beginning of a project how best to split up your architecture, and it's even more difficult to get it right first time. So how do you know the boundary of each sub-application? There are different ways you can slice it. We were using a web application, so probably the most naive way we could have split the architecture would have been to have a web sub-app with just a Phoenix layer, and a core sub-app with all our business logic. This way our web layer would be very thin: it would just have our controllers, routing, and maybe authentication, manipulating headers in your requests and so on. Then all of your business logic, like your posts and searches, could be in core, and you could separate them with folders, so you've got less of a hard boundary than with different sub-applications. This could be a good starting point if you're not sure how your project's going to evolve.

On the other hand, you could decide to split up your application per responsibility; perhaps if your domain's a bit more well known you can already identify different services. And then there's the question of persistence: should you have one database service that handles all of your database interaction, or should you have a database bundled with each sub-application that needs to use some persistence?

By no means am I saying this is how you should split your architecture, but just to recap, this is how we ended up doing it for us. We went for the per-responsibility split, and we decided on having one database service, purely to try something different. In the past, when I've been on traditional microservices, we've always fought to have a database per service, but a lot of the material I was reading around Elixir was advocating the use of one, because you have one place to start Ecto, one place to manage pooling and manage migrations and so on. So I thought I'd give that a try.

One thing that was really clear is that umbrellas really lent themselves to change, so
if you found out you were writing a sub-application and you didn't need it anymore, it was really just a case of deleting that sub-app from the apps folder. We also brought in existing projects under the umbrella: the Twitter service was already written and we just pulled it under, and it's just a case of moving folders around and then changing the paths in your mix file and config to point upwards. So it really was a flexible project structure to work with.

The next lesson was around Elixir versions. In your umbrella you can define a different version of Elixir per sub-project if you want. Now, when you define a version of a dependency you're actually stating a requirement: you're saying this is the version that I'm willing to work against in this project. Under the hood, Elixir will call a function on the Version module, which will check that your requirement can actually be satisfied. Most of the time this works fine, but occasionally I found I was getting an exception around invalid requirements, and I figured out that this was because I had this kind of scenario, where I have two sub-projects and one's dependent on the other. The web app was stating it wanted to use one version of Elixir, and it had a dependency on a post service which was using a higher version, so it wasn't able to satisfy that dependency. This is maybe something that would happen if you're pulling in existing projects under your umbrella; maybe the existing project's been around a little while and it's not using the latest version, so it's something to bear in mind. My advice here is really to strive for consistency: if you can afford to upgrade your versions and have them all running more or less the same, it's probably worth doing.

The next thing you've got to be careful of is circular dependencies, where you have two sub-apps that are dependent on each other. In other languages I've used, it's not always been that obvious if I've had a cycle, and teams have often
resorted to using code analysis tools, or it's only in investigating a weird bug that people have realized, oh, actually there's a cycle here. As developers we want fast feedback, that makes us effective, and Elixir gives you really fast feedback for this. Under your umbrella, if you end up with a cycle you basically can't do anything, because mix tells you straight away that you have to break this cycle before you can go on, and you have to down tools and sort it out. If you do end up in this situation, it's probably best to check if you've split your services to too small a level; maybe the two things that are dependent on each other should actually be combined, as they may be related.

Another thing we had to battle was transitive dependency clashes. Under our umbrella we have a mix file per sub-application, where you can define the dependencies that each project needs. I was using JSON to parse my responses from the Wikipedia API call, so I wanted to use Poison, and of course I wanted to use the latest version of Poison, 3.1. Now, if you remember, it's a web application, so we have a Phoenix project as well. If we skip to our front-end dependencies, you bring in Phoenix there, and when you bring in a dependency it's going to bring in all of the dependencies it also needs, so you end up with a ton of transitive dependencies being pulled in. If we now go to the Phoenix source code and look at what Phoenix brings in, we can see that Poison is also listed there, the second to last one, but it's using a slightly different version; it's not using 3.1. So unfortunately you end up with an exception looking something like this, where it's telling you quite clearly that the versions of Poison are clashing between sub-applications. Thankfully, Elixir has a way that you can handle this: you have to explicitly override the clashing dependency, and you do this using the override: true flag, which tells the project that it should use
the version stated here and ignore any other versions. So the overhead here is that in your front-end app, where you bring Phoenix in, you have to explicitly state that you want Poison and the version that you want to use as well.

As well as dependency clashes, we ended up with namespace clashes. If we look at our search structure, under each search service we had a module called Search. They looked pretty similar, with the same interface, but they were going to different external services. When I ran my Wikipedia tests everything passed; when I ran my Twitter tests everything passed; but when I ran from the root of my project I started to get some failures. So the first thing I did was go back to Wikipedia and run them: all good. Go back to Twitter: all good. So I was like, what's happening here? When running from the root of the project, the apps seemed to be running in alphabetical order, so the Twitter tests would run first, and thus the Search module for Twitter would be loaded. Then when it went to the Wikipedia sub-app, the Wikipedia tests would run and call Search, and it would actually find the Twitter Search module, which had already been loaded. So in fact whichever module was already loaded would basically win, and I was in a situation where my Wikipedia tests were running against my Twitter client. The way we worked around that was to explicitly namespace the modules, so that we had Wikipedia Search and Twitter Search.

The next battle was around cleaning up test data in the database. We had a few services that interacted with the database, either directly or through dependencies. We've got the database service, which directly persisted to and fetched from our Postgres database; we have the post service, which had the database service as a dependency, and we had tests for the post service which ultimately would put data in the database; and we have our web front end, which has a dependency on the post service,
which has a dependency on the database service, so again we had tests that were also putting data in the database, and I wanted my database clean at the end of my test run. After a bit of research I realized I could use the Ecto sandbox to create transactional tests: each test would start a transaction, and at the end of the test the transaction would roll back, thus emptying the data that had been persisted in the database. That was fairly trivial to set up. You have to state in your test config that you want to use the sandbox pool, and you have to get an explicit connection in the tests that will interact with the database, which you can do in a setup block.

At that point it sounded like everything should be set up to run transactional tests, but unfortunately I was sometimes running into ownership problems. The connection, when you start it in your tests, is only started for your test process, so when another process comes along it doesn't have a connection to use, or it's not authorized to do so. After a bit more research I realized that the sandbox was running, but not in the mode I needed, and I needed to explicitly state that I wanted to run the sandbox in shared mode, so that more than one process can use the database. To do that you just update your test helper, where you state that you want to use shared mode, and then the tests ran and cleaned up after themselves as I wanted.

Something else to be aware of is the boundaries that your tests cross. Another example we have is almost like a full scenario test, where our controller tests invoke another service, like the Wikipedia service, and that actually goes out to an external API as well. It's easy to forget how far through your umbrella some of your tests are actually flowing, and we found this a bit difficult when tests started failing, because with this scenario, for example, if Wikipedia had a blip, which it did have a few times, the Wikipedia tests would
start failing, but also our web tests would start failing, and then you're sort of on a bit of a chase as to where the problem lies; it wasn't too focused. So we wanted to rein in that path a little bit, and we started to do that by substituting external systems.

Here's an example of one of our controller tests. We do a POST on the search endpoint and give it the term "elixir" to search for; that's then saved in memory. Then we do a GET on the endpoint, which looks up from memory the results that had been retrieved, and we then assert on what's displayed back to the user. The search that we invoke looks something like this, where you can see that the Wikipedia search is invoked, and that's the bit that actually goes to the outside world. So we decided to extract that line which goes to the outside world under the hood and wrap it in a module, and we called that module third-party search. We also wanted to provide a fake implementation, so that depending on which environment we were running in we could decide which implementation we wanted. So we created another module called fake search, which just returned a canned result, and then we set up our config such that if we're running in the test environment we use the hard-coded result, but if we're running in dev or production and so on we use the third-party search and actually go to the real client.

Elixir allows you to do that: in your config you can just state what app you're in and then a key-value pair. The key is search, so in the code where you see search, it would substitute the fake search in test and the real search in dev. Then you can just look it up using Application.get_env with that key, search, and the substitution will be done for you. That just left us to update the assertions: because we're returning a hard-coded result here, we just had to update our asserts to reflect that. But it cut out that external call, which
shortened the route and put the tests more back in our control, because obviously we're not in control of how Wikipedia will change. So the takeaway here is just to be aware of how far your tests are spanning. It might be fine that you go to the external systems or span multiple services, but just have an awareness of how far your routes are actually going.

The last lesson or tip is around documentation. Sometimes troubleshooting took a bit longer than normal, and it was a bit of a slog to get ideas on how to solve some of the problems. Many tips were found through scouring GitHub issue lists and just finding blogs that the community had written. There were some cases where tooling was kind of playing catch-up with umbrella projects: you'd have a great long description of how to deploy a vanilla Elixir project, but only a line or two for umbrella projects, and then if the example didn't work you were back to trying to find a blog or something somebody had done which explained things a bit better.

So now we have a working umbrella and we want to deploy it. At first I looked at exrm, which is the Elixir Release Manager, but it straight away points you to Distillery. It turns out exrm does support releasing umbrellas, but you can only do it in the traditional microservice structure, whereby every sub-application would be its own release, and you'd have to release them all individually and then make sure that they can orchestrate and connect together. Distillery, however, sounded like it gave a bit more flexibility, so I went with that and added it to the root-level mix file. Once you've pulled that down you have access to various release tasks, and the first thing you have to do is create your release config. By default all of your sub-applications are bundled together as one release, so here you could say it's a little bit like a monolith: you're just releasing all your functionality as one lump to one
place. If you prefer to release each sub-app individually, a little bit like exrm, you can pass the release-per-app flag and regenerate your config, and you'll actually see a lot more in that file: you end up with a separate release for every single sub-app that you have. The really nice thing is you can do various combinations in between. Here I define two releases, one that has three sub-apps and one that just has the Wikipedia sub-app. So it's really up to you as the developer, or your team, to decide what sort of topology of releases will best suit your project.

Once you've decided on what structure you want, you can actually build your release and start it up. You build it using mix release, and you pass the env flag, which determines which config is picked up; here the prod config file will be picked up. Once it's built you'll get various options on different ways to start your application. You start getting excited at this point, because the text is green as well, and green always means good. So you start it up, but of course it fails, because programming is hard. This was a web app, so then you have to start troubleshooting all of your release problems. I can see here that it was falling over because it couldn't find the code reloader for Phoenix. That's a dev tool: in development, when you're changing things in your web folder, you'll automatically see the changes in your browser, so that you don't have to keep restarting the server. We don't really want to deploy that to production; it's really just a dev dependency. So we have to do a little bit extra and set MIX_ENV to prod, which will only include the production dependencies and exclude the dev dependencies, and it will also optimize the beams, which is something we want for production. As a side note, if you've got separate releases you would do this for each release, and you
can just pass in the name of the release that you're working with. So at that point you start the app again, and of course there's another error, because programming is really hard, and it's around the assets. You may need to minify your assets, but also, as stated here, you have to create a Phoenix digest. That gives a unique name to your assets based on a hash of their contents, so that should your assets change the hash will change, and therefore your users will see the up-to-date content even if their browser is caching. At that point everything worked, so it actually wasn't too difficult to get a release out the door.

To summarize, in releasing an umbrella you as the developer, or your team, have control of how you want to subdivide your releases, and you really can choose to have it like a traditional monolith or traditional microservices and do what suits your needs best.

At that point we were like, okay, we're using Elixir, let's try and get distribution into the app; that's where you have multiple nodes connected. We had to think of a use case for this, but first of all let's just take a step back and see what a cluster is in Elixir. If you imagine you open up IEx in your terminal, it's an interactive Elixir session, and you can access any code that that IEx session has access to. If we open up a second session, it has no idea of the first one, and they can't actually communicate; they don't know of each other's existence by default. But we can start them in distributed mode, which will allow us to connect them, and you do that by starting a session with a name, which gives it an identity. Here you can see I've started this one with the name Batman, and then the prompt has the node name in it and the session knows its own identity, so if we call Node.self it will print out the name. Then if we start another session and call it Robin, Robin knows who he is as well, but they're still not connected; they're still completely separate. But you
can call Node.connect to actually make them aware of each other, and then if you call Node.list you'll see that Batman now knows about Robin. Node connection is transitive, so if you go back to Robin and list out the nodes that he's connected to, it will include Batman, even though we didn't explicitly connect them together. So basically, any time a new node joins a cluster it will have visibility of all the other nodes that are already in there. We can just define a node as a system running the Erlang VM with a given name.

So going back to Ferris, we decided that the search was maybe a good candidate for distribution, because when you press Search we could actually fire off requests to different servers and have Wikipedia, for example, searching on one server and the rest of the app on another. If you recall, we have this kind of setup, and we decided on just two releases: one with Wikipedia and one with everything else in.

Now, if they're distributed, they need to be a bit more clever about how they communicate. With traditional microservices you really define a protocol up front as to how different services connect, and often HTTP is used because it's a well-established protocol and easy to use. However, we were using Elixir and we didn't want to bring in any external dependencies, and because Elixir runs on the Erlang VM we decided to use RPC, remote procedure calls, to call methods on remote nodes and then collect the results.

So here are the changes we had to make to get Ferris distributed. First of all we had to create our Wikipedia node, so we navigate to the Wikipedia sub-project and start an IEx session with a name and a cookie. We'll come back to the cookie in a bit, but for now remember the name, wiki@localhost. You would then go back to the entry point of your system, which for us was the web app, and configure that node name, wiki@localhost, as a child node. We actually have a tuple here where we put the node name and then the module name, because
if you remember we had namespace clashes, so here we state the module that we want to call in the code, which we'll see in a minute.

Then we had to update our controller to actually create the cluster. The first thing we do is connect the nodes: you can go to your config and get the child nodes, and if you had more than one here you'd iterate through them, and you can explicitly call connect to create your cluster. Then we have to update the code such that RPC is used, so that the two nodes can actually talk to each other. Here again we go and get our child nodes, and then we invoke call, which is defined at the bottom, where you can see the RPC call. We pass in the node, which is wiki@localhost, the module, which is Wikipedia Search, the function on that module that we want to invoke, and any parameters that you want to pass to that function.

At that point you can start the rest of your applications. So we started our front end, and you have to use the same cookie. If you read the Erlang documentation, it says the cookie adds security, and it's a bit like having a passphrase or something like that. When a node is joining a cluster, having the same cookie is basically saying, I know the secret word, so let me in. At that point you have Ferris running in a distributed fashion. So there was actually very little change needed, and it was quite isolated as well, to allow the search to run in a distributed way.

In real life you'd probably be deploying to something like AWS, but the steps are exactly the same. You would build multiple releases, so here we have two releases. You would deploy the first one, so you would deploy the Wikipedia node; you would note the DNS name that AWS gives you and configure that as a node in your web app; and then you would deploy your web app configured with that node, and you're in the same position.

So, coming back to the title of the talk: is the umbrella project a good fit for microservices? These
days a lot of projects are very large and very complex, so the ability to break that problem down and actually work on different pieces independently can be really useful. On the other hand, a monolith has a really negative connotation. It's often used for legacy code, something that's been neglected and degraded over time and is now a monstrosity that can't be updated. But if we decided from the beginning to use the monolith, and we were aware of how it could degrade, and we were disciplined about keeping it well segregated so that it didn't degrade in that manner, we might earn the right to call it a majestic monolith. So given the fact you could write really, really terrible microservices or really majestic monoliths, where does the umbrella actually sit?

In terms of development, you could say the umbrella is a little bit like a monolith, because it's all in one repository. However, it does separate the applications under the apps folder, so there are somewhat harder boundaries than in a monolith. I actually found working with a single repository really beneficial. On one of the client projects I'd been on, which used not Elixir but traditional microservices from the start, we ended up with such an explosion of GitHub repositories that it was really difficult keeping on top of pull requests and merging them in the right way, and some services you never even really got visibility of. Whereas I found that with everything in one repo, if a change spanned multiple sub-projects I could make it all in one go, and I could also see what other people, like my colleague, were doing, because everything was in one repository. So you only had to look at one place, and you became quite familiar with it.

In terms of the code communicating with itself, a monolith has direct access, and with microservices you have to define the protocol from the beginning. With the umbrella, although you need a protocol if you're doing distributed stuff, if you're not, you have direct access to the code just
through the library dependencies with the in_umbrella flag.

In terms of languages, a monolith is typically one, and microservices give you the ability to use different languages, which could be useful if one of your services is doing something that a particular language is really good at solving. With the umbrella you're kind of tying yourself in to languages on the BEAM, but we all know that Elixir and Erlang are super, of course, so it's really an advantage.

In terms of releases, this is where the umbrella starts to shine. A monolith is always shipped as one, microservices are shipped as separate pieces which you then orchestrate together, and the umbrella's really as you define. We saw how we can split them in different ways, and I think it's good that that's in the control of the team to decide.

So in summary, I found the umbrella really interesting to work with, and I felt it had the advantage that everything was in one repository, like the monolith, but you still had separate boundaries within your sub-apps, so the coupling was still fairly loose between the components. The flexibility in deployment was a plus, because you could deploy everything to one server or to multiple servers, and even changing the umbrella to be distributed had very little impact. So I definitely think they do work well as a microservice architecture, but they have the advantage of being in one repository like the monolith, so for me they actually combine the best of both worlds. Perhaps I should have named this talk Majestic Microservices, but it's really up to you, and you should give them a go and see where you think they fit best. Thank you very much.

Thank you. Thank you. Any questions? Yes. Look at that, wow. Yeah, I'll start from you.

Thanks, fab talk, that was really clear. Just one thing to note, not really a question but a comment: one other gotcha with umbrellas, which you didn't mention, is that config is global. We ran into this problem because we're using a global logger
application and we wanted to configure it differently in different apps, but then one of those configurations will win and clobber all the other ones, so it took me a while to debug that. Okay, thank you. A very useful talk, I think. Just one quick question: in your examples you used two levels of nesting, so you had an app with sub-apps. I presume you can actually sub-app the sub-apps as well, and that it would work in the same fashion? I don't think you can have an umbrella in an umbrella, which is essentially what that would be. You can have multiple Phoenix apps in your umbrella, so if you've got multiple front ends, or different views, like admin, or two web entry points, you can do that, but I don't think you can nest the umbrella structure in a sub-umbrella structure. Thank you. That was a very good talk, thank you. I've not quite got a question at the end of this, but one of the things I've not really seen discussed in umbrellas, or any of these, is security. Especially if you're using the umbrella kind of approach, it looks really attractive, but you've got no security, unless you add it yourself, when calling across boundaries. So I suppose the question is, did you come across that here? My thought would be: imagine I'm designing a typical front-end system. Maybe it's got a web bit and it's also got a payment processor, and it looks really attractive to build that all as a single umbrella. But I'm left with the possibility that if somebody were to find a way of doing some sort of local code execution in my web app, they can call any module anywhere inside the umbrella, so they've got full access to call my payment processor module that might also be within it. I wonder if you've given any thought to that, or have any suggestions. Something that comes to mind, because I think you get this in any project, and even if you had everything in one Phoenix app you'd have maybe the same risk, is that if you were, say, deploying to AWS or something, you can shut
down security at that level, so you can't have external connections to the server that you're deployed on. So you could do something like that, perhaps. But I'm not sure you can, because, as I mentioned, imagine your payment processor is hosted on one AWS node and your web interface is hosted on another. Because you've set up a cookie between the two, if somebody were able to do things that you didn't want them to do with your web front end, they do have the ability to call straight in to all of the modules, including all your private functions, in the other AWS node. That's one of the things that I see: obviously you can build internal security, but it's not something that comes out of the box. That's just the way Erlang works. I mean, you share the cookie; it's more for partitioning than a security mechanism. There's nothing there. Okay, any more questions? So let's take the last one. Hi, thanks for your talk. Our team, we've been using GenStage under an umbrella for our separate applications. We've already spent about a month just working out the infrastructure to deploy it and getting around the testing part of it, and we keep cycling back to the same kind of question, a bit of self-doubt: are we building it in the right direction, as opposed to a monolith? Our applications are each so small, each doing such a granular, fine-grained function, that every few days we keep questioning whether this should all be wrapped up again into a few applications. So I'm just wondering where you kind of draw the line. I guess it's so new that there haven't been many projects actually gone into production yet, so we're just kind of on that edge there. So, just wondering where you would kind of draw that line. If they were tiny and really just delegating to another one, I don't know the answer, but I think there are some advantages of an
umbrella versus a monolith, in the sense that if you've got quite distinct areas of your application, such that you've got two different sub-apps, then in your team you can split the work quite effectively, you're not treading on each other's toes, and you can work in parallel and so on. But at the end of the day it's really just a glorified folder structure, so I think it probably is a bit of trial and error. But I do like the fact that you have sort of harder boundaries than in a monolith. Even though you can basically achieve the same by having a monolith which you break up into different folders, it's not as strict, and your abstractions could leak potentially, and you're not necessarily going to be alerted to that, whereas under an umbrella you can't really do that, because it's either in one sub-app or another. Okay, thank you. So thank you, Georgina. A round of applause for [Applause] [Music]
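The "config is global" comment from the questions can be illustrated with a minimal sketch. It assumes a single root config file for the umbrella; the app names app_one and app_two and the log_prefix key are hypothetical and do not come from the talk.

```elixir
# config/config.exs at the umbrella root (sketch; app names are hypothetical)
# `import Config` needs Elixir 1.9+; a 2017 project would have `use Mix.Config`.
import Config

# Intended for app_one: verbose logging.
config :logger, level: :debug

# Intended for app_two: quiet logging. :logger is one application shared
# by the whole umbrella, so this later value replaces :debug for every app.
config :logger, level: :error

# Settings keyed by each app's own name do not collide with each other.
config :app_one, log_prefix: "[app_one]"
config :app_two, log_prefix: "[app_two]"
```

Keeping per-app settings under each app's own key, as in the last two lines, is one way to avoid one sub-app's configuration clobbering another's. How each app then uses its own setting is left open here.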
Info
Channel: Erlang Solutions
Views: 12,047
Rating: 4.9245281 out of 5
Keywords: elixir, ruby
Id: jhZwQ1LTdUI
Length: 42min 17sec (2537 seconds)
Published: Wed Aug 30 2017