Episode 7: Effortless Scaling With Automatic Clusters

Video Statistics and Information

Reddit Comments

I've really enjoyed this series so far. Thanks for making these.

👍︎︎ 5 👤︎︎ u/meta_stable 📅︎︎ Sep 19 2017 🗫︎ replies

Writing your own protocol for establishing the connection is pretty easy too! Made one that writes into a rabbitmq exchange, because gossip didn't work easily in a docker network.

👍︎︎ 2 👤︎︎ u/narrowtux 📅︎︎ Sep 19 2017 🗫︎ replies

These videos are great. On behalf of the community, thank you :-)

👍︎︎ 2 👤︎︎ u/MartinElvar 📅︎︎ Sep 20 2017 🗫︎ replies

Thanks for putting this together

👍︎︎ 1 👤︎︎ u/firl 📅︎︎ Sep 22 2017 🗫︎ replies
Captions
Hi, I'm Aaron, and this is Exploring Elixir. In this episode we're going to be taking a look at clustering with Elixir. When I first started getting involved with BEAM languages myself, the idea of being able to just spin up a bunch of VMs, have them all connect to each other, and then just pass messages between them seemed like complete magic. And to be honest, it wasn't really clear how to set this up, how to manage it, or even how it worked. So we're going to take a look here at how this gets put together.

We're all used to running our Elixir applications like this during development: iex -S mix. This is the way to launch an Elixir app, and this of course starts the BEAM VM. Now, if we go ahead and give the VM a name, or the node a name, we can do this with the --sname command-line option, or we can do it with --name, the difference being that with --name we have to provide a fully qualified name, so that would include some sort of hostname or IP address along with it. We're going to use short names, and that's good enough when we're doing it on a local machine.

When we do this, it tells the VM to automatically start a small daemon in the background called EPMD, which is the Erlang Port Mapper Daemon. What this guy does is just sit there and wait for VMs to tell it, "Hey, I'm here, this is the name I have, and this is the port I'm listening on." So if we query EPMD right now, it's running, and it's on port 4369. We can kill it, actually, as well, and we can see that we can't connect to it anymore; it's not there. But as soon as we start one VM with a node name, it's going to automatically start again, and as we can see, it already sees our node there, named node 1, listening at port 42671. None of this is configured; this is all just zero-configuration, automagic kind of stuff.

So the different VMs use their EPMDs, locally and across the network, to find each other, and when they find each other they can then connect to each other,
and this is quite easy as well. So let's spin up a second node, and we're going to connect one to the other. In fact, what we're going to do is monitor for nodes coming and going. I've got a little monitor function here, and what it does is use the :net_kernel module from Erlang and call monitor_nodes(true). This just tells the BEAM that whenever a node event happens, so when a node joins the cluster or leaves the cluster, let us know: send us a message. And then there's a monitor cluster function, which is just a little receive loop, and it's going to receive :nodeup and :nodedown messages. So let's start that on both of these, and it says OK, we have no nodes connected to us that we can see; we are alone.

Getting the list of nodes is equally easy. You can do this using the Node module: Node.list will display all visible nodes. But there's also the idea of hidden nodes, where you can have nodes that are connected not to the whole cluster but just to one specific node in it, and they don't show up by default when you just query for all the names.

So we've got our two VMs now, we're monitoring them, and we're able to list nodes. If we look again at EPMD and ask for its names, EPMD is aware of both of our nodes, and again you can see they're listening on different ports. We can simply ping one node from the other, and that action alone will actually cause them to come together in a cluster. So we're simply going to call ping, which, as we can see here, is again part of the Node module in Elixir. We're going to call ping and give it the name of the other node, so it's node2@localhost, and we can see immediately that we got a node joined event, which is great. On the other node we also got a node joined event, and they can both see each other now. And that's it; that's all there is to clustering. And if we had a third node here, we won't bother monitoring,
I'll just ping node 1, and let's ping node 2; again we get our pong back, and again on both of the other nodes we see that there is a node that has joined. If on node 3 here we ask for visible nodes, we can again see node 1, 2, and ourselves of course, 3. So now they're connected in a three-way mesh. This is really simple and very straightforward.

Now, there are a few downsides to using EPMD in this way, which I'll actually cover in the extra episode in a couple of days; they have to do with some scalability limits and security, but we'll go into those details in another episode. For now, though, you may be looking at this and going, "Well, that's really cool, but how do I know what the other nodes are called, and how do I manage this?" Typically the way this is done is that you set up in your application configuration the fact that you have a node 1, a node 2, and a node 3, and what hosts they're on. You just put this in the configuration, and then they can connect to at least one other node, maybe you tell each node about one other one, and then you can build up your cluster this way.

But it really does require some foreknowledge of where your nodes are and where they're going to be, and this doesn't always map to how we deploy these days. We may be deploying very dynamically using Kubernetes; we might be in a cloud environment like EC2 or similar. So we may not know in advance what our nodes are called, and we may not want to know. So how can we get around that limitation, remove that last barrier, make it completely magic, and just have nodes that are available to us as soon as they appear?

We'll get out of all of our nodes here and start from scratch again. We're going to be using a library called libcluster, and this is a great little library. It simply asks something on the network about the nodes, to learn when nodes are there and when they're not, and it has different back-end strategies. We'll take a look at one here, so I've got the configuration
for it, for this application anyway, here. You set up what they call topologies, and the strategy describes how you're going to get lists of nodes, essentially. The strategy I've used is gossip, so this is a UDP-based protocol, just a basic gossip protocol, but there are also several other ones supported. You can use Kubernetes with this, there's an EC2 module available, and there is a pull request from a forked repository of libcluster that does DNS-based service discovery. So there are several different ways of finding your nodes.

Then you can optionally provide connect, disconnect, and list-nodes functions, as well as define some customization of the child spec, which we don't really need to do. For Kubernetes and EC2 you obviously will need to tell it a bit about your accounts, or where Kubernetes is and how to query it, and that's what the config is for. The UDP gossip is good enough for a local network, where the nodes are all in the same kind of network segment, and we don't have to do any configuration, so it's truly magic.

You don't have to, like I said, use the connect or disconnect here; you can just let it use the normally provided functions for this. But we've overridden these ourselves with a connect-node and a disconnect-node function, which we can see here. They just simply write out some information to standard out, and then they call :net_kernel.connect and :net_kernel.disconnect, which is what it does by default if you don't provide a connect or disconnect. But this shows you can do whatever you want. Much as we did with monitoring a bit earlier, we get information whenever libcluster detects that there is a node that should be connected or not, and you can then make some decisions: do I want to connect this node, do I not want to connect this node, and then take action from there.

So let's see this actually in action. This is all there is, just these few lines of configuration,
and then all we have to do is just make sure that libcluster has started. Now, normally this would happen for you automatically: as soon as you have it as a dependency, Elixir will also start libcluster's application. But I've purposely turned that off in my mix.exs, so I have to start it manually, so we can see it in action.

So let's bring up our nodes again, and we'll bring up our auto cluster, and then we'll start the auto cluster; let me just grab that. Great. We can see we're monitoring again, and we're seeing some heartbeats now; this is the gossip protocol in action. I'll start it on the other ones, and you can see that immediately they're finding each other and joining each other. If I kill one of those, the node's departed: node2@localhost is gone, and I no longer see it. If I bring it back up and we start the auto clustering, it joins again. So it's even more magic, in that we don't need to know the names of the nodes, and this is absolutely fantastic. libcluster is a great little tool for automating the building of clusters, especially using something like, as I said, Kubernetes or EC2 or DNS service discovery.

Now, there's one other little fly in the ointment, potentially, and that is that we may not want to be shipping EPMD or deploying EPMD with our application. We might not even be able to, depending on the platform that we're deploying to. And libcluster still by default uses EPMD; it just uses what comes with the BEAM. So there is also a way around that, and I've got an example of it here. I can't take any credit for this code; it's from a really great blog post on the Erlang Solutions site. I'll link that in the description below, but you can also see it in the comment at the top of the source code here. It simply implements a dist client module, and then it also includes a service _dist module. The "_dist" suffix is a little bit of magic that we're going to see in a moment; it's not just an implementation detail. And this _dist module actually
does the listen, the select, the accept, accept_connection, setup, and close, instead of EPMD. Then we have a client that does start_link, registers nodes, etc. We've got a little bit of a hack in here, in that it decides what port it's going to listen on based on a number that we provide as part of the node name. That's just so that we can run multiple VMs on the same host and they won't be using the same port. That's not the only way of doing it; there are other ways, but that was the way they used in the example, and it's good enough for demo purposes.

So what we need to do is tell the BEAM, when we start it, that we don't actually want to use EPMD. We'll go kill it here. We can tell it by just passing in -start_epmd false, and what we need to do is tell it about our -proto_dist, so the protocol for distribution; that's our service. And then the -epmd_module is the client side of it, and that was our dist client. Then we just start our application as you normally would. And I do need to give it a node name here, so let's do that. We'll call this one node 2, and we'll call this one node 3, so again we have our three nodes. We will start our auto clustering, and it works just as before, but now we're not using EPMD at all: it's direct VM-to-VM connections with no external process in the way.

So Erlang and the BEAM really provide a lot of flexibility in how we actually build, deploy, and work with our clusters, and you can make them truly deployable and absolutely magic. Thanks for watching this; I hope that you learned a little bit about how to create a cluster and some of the options that are available to you in doing so. Hit the subscribe button if you'd like to see more, and we'll see you in the next episode.
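
The libcluster topology configuration described in the captions is not reproduced on this page. As a rough sketch only, it might look something like the fragment below, assuming the libcluster 2.x configuration keys that were current when the video was published. The topology name `example` and the module `MyApp.AutoCluster` with its `connect_node` and `disconnect_node` functions are hypothetical stand-ins for whatever was shown on screen, not names taken from the episode.

```elixir
# config/config.exs -- an illustrative sketch, not the episode's actual file.
# `use Mix.Config` matches 2017-era Elixir; newer projects use `import Config`.
use Mix.Config

config :libcluster,
  topologies: [
    # `example` is an arbitrary topology name chosen for this sketch.
    example: [
      # UDP-based gossip discovery, as used in the episode.
      strategy: Cluster.Strategy.Gossip,
      # Optional overrides. libcluster appends the node name to the
      # argument list, so each function receives one argument.
      # MyApp.AutoCluster and both function names are hypothetical.
      connect: {MyApp.AutoCluster, :connect_node, []},
      disconnect: {MyApp.AutoCluster, :disconnect_node, []}
    ]
  ]
```

Leaving out `connect` and `disconnect` falls back to libcluster's defaults, which the captions describe as the `:net_kernel` connect and disconnect calls. Newer libcluster releases differ in both the defaults and how the library is started, so check the documentation for the version in use.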
Info
Channel: Exploring Elixir
Views: 4,356
Rating: 4.9689922 out of 5
Keywords: elixir, erlang, cluster, scaling, automation, ec2, kubernetes, clustering
Id: zQEgEnjuQsU
Length: 13min 10sec (790 seconds)
Published: Mon Sep 18 2017