SCALING GOOGLE CLOUD FUNCTIONS

Captions
Hey, welcome back. In this video we're going to dive deeper into Google Cloud Functions and look at the non-functional side of things. In particular, we're going to look underneath the covers at the environment variables, headers, and IP addresses inside the Google Cloud Functions runtime environment. We'll send a whole bunch of traffic at our function in the cloud, see how it performs and how it scales up and down on a dashboard, and then we'll start crashing the application as well, have a little bit of fun, and watch the active instance count go up and down. We'll even look at deployment: what happens to our instances while a Google Cloud Function is being deployed, and how Google keeps our site highly available. If you haven't seen the previous video, or you have no experience of Google Cloud Functions, look in the top right-hand corner (I can never remember which one it is) and you'll see a very detailed introduction video linked there; to be honest, if you already understand Node.js applications you'll follow along fine.

With that, let's get started. The first thing I'll do is open up the Express application from the previous video: I type `code .` in my application folder and VS Code shows the very simple Express web server I created last time. All it does by default is send back the word "hey"; if I pass in the query-string parameter
`message`, it also returns whatever message I pass through on the query string. That's about the simplest web server you can imagine. For this to work with Google Cloud Functions, there's one thing I want you to be aware of: there's a dev dependency called the Functions Framework, which lets you run Google Cloud Functions locally on your machine before you deploy (again, it's all covered in the other video). Because I've installed that, the start script calls `npx functions-framework --target=hey`, so when I type `npm run start`, rather than kicking off Node.js as normal, it runs my Google Cloud Function in a similar way to how it would run in the cloud. Let's do that quickly: back in the terminal, I type `npm run start`, and you can see it's now serving on localhost:8080.
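For reference, the server being described can be sketched like this. This is a reconstruction from the description, not the video's exact file, and the handler is pulled out as a named function purely so it can be exercised on its own:

```javascript
// Minimal sketch of the app described above (assumed shape, not the
// video's exact code). Default response is "hey"; ?message=... echoes
// the supplied message back.
function heyHandler(req, res) {
  res.send(req.query.message || 'hey');
}

// In the real app this is wired into Express and exported so the
// Functions Framework can find it via --target=hey:
//   const app = require('express')();
//   app.get('/', heyHandler);
//   exports.hey = app;
```

It can then be deployed with a command roughly like the one assembled later in the video: `gcloud functions deploy gcp-func-hello-2 --entry-point=hey --runtime=nodejs14 --trigger-http --project=<project-id>`.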
If I browse to localhost:8080, it comes back with "hey" as I said, and if I pass in a query parameter called `message` I can put anything I want there: put in "woof", hit return, and it returns "woof" back.

Now I'm going to deploy that to Google Cloud. I have the Google Cloud SDK pre-installed (again, I cover that in my other video), so to deploy the function I just type `gcloud functions deploy` followed by a name; since it's the same function as the last video, I'll give it the exact same name, gcp-func-hello-2. Next I specify the entry point with `--entry-point`, the name of the function that's executed whenever my web server is hit; back in VS Code you can see I'm exporting my Express application as the function `hey`, so whatever name I give there is what I use as the entry point, in this case `hey`. I set the runtime to Node.js 14, because that's the version I'm running and want to run. And because Google Cloud Functions supports multiple trigger types (HTTP servers, Cloud Pub/Sub invocations, and so on), I specify that I want my trigger to be HTTP; that's actually the default, but I like to specify it anyway. The last thing is the project: in Google Cloud every function has to be associated with a project, and in my previous video I created a project already
called chuck-hello-world-test-2. If we go into the console for a second, at console.cloud.google.com, that brings up the Google Cloud dashboard, and there you can see the chuck-hello-world-test-2 project I created in the other video. If you don't have a project, you'd have an empty dashboard after sign-up and could just click to create a new one. You can also see gcp-func-hello-2 already up on my dashboard. So I set the project to chuck-hello-world-test-2, and while I'm at it I'll change the function's name: rather than gcp-func-hello-2, let's call it gcp-func-hello-3, so it's a different entry and we get a brand new, clean function. I hit return and it takes a few minutes to deploy. It asks a question first: do I want to allow unauthenticated invocations, which basically means public access, and I say yes, and then it goes off and does the deployment.

Alright, that's now deployed. To test the function works, I grab the URL from the output, paste it into the browser, and hit return: it returns "hey" as we did before, and I can pass in whatever message I want; we put in "woof" and "woof" comes back. So we've deployed our application, and now, as I said, we're going to look at what happens underneath the covers. The first thing I want to cover is the idea of high availability and self-recovery. At the moment my function is running within a region, which in Google's
case is us-central1. Google has multiple regions throughout the world; in this case I'm keeping it in us-central1, but I could run it in the UK or any other country on Google's list. That does mean my function stays within that region and won't go outside it at all, so I don't have multi-region availability, but I do have single-region availability. What Google will also do is manage the running of the underlying containers; I don't need to worry about that. If my container fails, or I need to deploy a new version of my function, Google manages the infrastructure underneath: it handles restarting that container and getting it deployed onto another machine. There are even multiple availability zones within a region, meaning several data centres a certain distance apart, so even if you lost one of those data centres, Google would automatically pick up your function and run it in another availability zone. That gives you quite a lot of high availability, all handled for you: you don't need to pick your function up, redeploy it, or move it around; Google handles that.

It also handles things like deployments. Say I have a function and I make a change, whatever it might be. Rather than stopping the function, with my service dying and nobody able to access it, Google ramps up the number of instances, deploys my new function, ramps down the old one, and does
a cutover from the old version to the new one. We'll show that working, and you can even see it on the dashboard. Let's go into VS Code and add a new endpoint: I'll add /hello, doing a `res.send('hello')` (and we'll get rid of the first line, which we don't need), so browsing to /hello will return "hello". Back in the terminal, I deploy the function again with the exact same name, so the new version replaces the existing function deployed in Google Cloud. That takes a couple of minutes, and then we'll look at the dashboard and see how Google ramps instances up and down to manage the deployment process.

Now that it's deployed: if I hit return, it's still returning "woof", and with another value my function still works; and if I go to /hello rather than the default route, it now comes back with "hello". So the function is working. Let's open the dashboard, where we can see gcp-func-hello-3, and look at what actually happened during that deployment. The first thing to see is that it says version 2, deployed at 5:14:15, which is when we ran `gcloud functions deploy`. Now look at invocations per second on our dashboard: at 5:09 I made a call via the web browser, and you can see
it there; then at 5:15, after the deployment happened, we made another call, and (just so everybody's aware) another around 5:23, so we had three calls via the browser. If we scroll down a little further, ignoring execution time and memory utilisation for a second, what I want you to look at is active instances. At 5:09 we were running one instance; you can see one active instance of our function. Then look here: at 5:16 we were running two instances, which is roughly when I hit the browser. Why two instances at that point? Because at 5:09 we were running the first instance, the original version 1 of our function, and it's still there to satisfy any incoming traffic; then we deployed version 2. What Google does is leave version 1 running and spin up another instance running version 2, so both version 1 and version 2 of our function run at the same time. It then redirects incoming traffic to the new version 2, with version 1 still running, and later on it spins down the original container. We see that here: this was logged at 5:24, and at 5:24 we were back to running one instance. So between 5:15 and 5:24 it eventually decided "my new version 2 is running fine, I don't need the original version 1 anymore", spun down the old version, and sent all traffic to version 2. That's how it does deployments, and it means you get high availability: if you're deploying a new function, your system as a whole isn't going down; the old
function is there, running in parallel with the new function you've deployed. It moves all the traffic across to the new function, and once all traffic is handled by the new function and it's known to be working well, Google spins the old one down and the new function takes over. That's kind of cool, because it's not something I have to manage myself, whereas even in Kubernetes I'd be using rollout deployments and so on; here it's all handled under the hood, and it maintains that level of high availability. And if I go back years before, when we were deploying services manually ourselves, you'd have to time things and move things around yourself; all of that is fully automated and handled by Google.

As I said before, we want to get underneath the covers and see what's going on with Google Cloud Functions in the cloud. To do that, we want access to anything passed in the headers, the environment variables, and even the IP address; we can then use those, especially when we start crashing the application, to see what goes on underneath. The first thing I want access to is the environment variables. In the same way as we did before, if `env` is passed as a query-string parameter, we'll return whatever environment variables are set in the container at that time; to access them we just read `process.env` and display it. The next thing is whatever headers our Express application receives, whatever is passed through from Google: we check `req.query.headers`, and to return them all we need to do is `JSON.stringify(req.headers)`. And actually, before we
go any further, let me come back to my console for a second. We hit `npm start`, go to the local machine, and you can see that hitting localhost:8080 says "hey" as before; but if I now put in `?env=true`, it comes back with all the environment variables on my machine. I'm not going to go into detail about what's on my machine, because what we're interested in is what's underneath the hood in Google Cloud Functions; we'll do that when we deploy to the cloud. Just to prove headers work as well: if I type `?headers=true`, you can see the connection, the user agent, the host name, et cetera. Again, we'll look at that properly when we deploy to Google Cloud.

Now there are two other things I want to be able to do. The next is access to the public IP address: with that, when we crash the application or scale up and down, I'll be able to tell where it's running and whether I've got a different instance of my container. To do that, I kill the server for a second and add a new dependency, a package called public-ip; I `npm install` it and save it into package.json, which takes a second or two, and in VS Code you can see the public-ip package is now installed. Then I declare it at the top, `const publicIp = require('public-ip')`, and grab the IP addresses on startup in an async function, capturing both the IPv4 address with `await publicIp.v4()` and the IPv6 address with `await publicIp.v6()`. Then I just need to
declare those variables so I can access them a bit later in my app, and of course call the startup function as well. Now I'll quickly copy the earlier query check so we send the IPv4 and IPv6 addresses whenever asked for them: if `req.query.ip4` is set, we send ip4 back, and then similarly for ip6 (the same change with ip4 swapped for ip6). With those changes made, I restart the application with `npm run start`, go back to the browser, and see it's still working, which is good; and if I change the query to `?ip4=true`, it returns my IP address, so that's working fine.

The last thing I want to do before deploying to Google Cloud: in the same way we did a deploy and saw the scale up and down, it would be cool to see Google recover our functions, or the containers underneath, automatically. The nice way of doing that, rather than sitting around waiting for a crash, is to just make it crash. Back in VS Code, we add one more query-string check, this time for `crash`, and in that case, rather than sending anything, we call `process.exit()`, which just kills the process underneath. Back in the terminal again: `npm run start`, and you can see `?ip4=true` still returns the IP, but this time if I change it to `?crash=true` (if I can type), you see it has exited and killed my application. So that's our application built, and now we'll get it deployed to Google Cloud and see what's happening underneath the hood. To do that we call our `gcloud functions deploy` again; I'm going to be lazy and just do another run of the command I've
pulled from my shell history, and it takes a couple of minutes to deploy. Now that it's deployed, we grab the URL from the console, bring up the browser once again, and test it works: click the URL and gcp-func-hello-3 returns "hey"; put in a message as before and it says "woof"; and of course /hello works as well and says "hello".

Now let's have a look underneath the hood at some of those headers and environment variables. First, let's pass `?env=true` and see what environment variables are there. This is pretty cool stuff, actually; I don't know what half of these mean, by the way, but there's an environment variable telling you what the function target is, which is kind of cool. I don't know what GCF_BLOCK_RUNTIME is, but there we go. You can see where Node.js is installed, or where the NODE_PATH is: /workspace/node_modules, so interestingly the working directory we're in is /workspace. The NODE_ENV variable is set to production, the home folder is /root, and you can see DEBIAN_FRONTEND, which tells us something really interesting: underneath the hood there's a Debian operating system powering this container (probably Google's distroless image, would be my guess). Other interesting things: it's running on port 8080; the service name is func-hello-3; and since we've done a few deployments already, you can see K_REVISION is 4, so it's tracking the version of the function as well. You can see the runtime is nodejs14 and the signature type is HTTP.
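As an aside, the diagnostic dispatch assembled over the last few steps (env, headers, ip4/ip6, crash) can be sketched roughly like this. It's a reconstruction under the assumptions stated in the comments, not the video's exact code:

```javascript
// ip4/ip6 are assumed to be populated once at startup, e.g. with the
// public-ip package:  (async () => { ip4 = await publicIp.v4(); })();
let ip4 = null;
let ip6 = null;

// One handler, switching on query-string flags as described above.
function diagHandler(req, res) {
  if (req.query.env) return res.send(JSON.stringify(process.env));
  if (req.query.headers) return res.send(JSON.stringify(req.headers));
  if (req.query.ip4) return res.send(ip4);
  if (req.query.ip6) return res.send(ip6);
  if (req.query.crash) process.exit(1); // deliberately kill the container
  res.send(req.query.message || 'hey');
}
```

In the real app this handler would be registered on the Express app exported as `hey`, exactly like the plain version.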
And then there are all the PATH variables. So that's kind of cool: we've learned something about Google Cloud Functions. We know it runs in a container, we know there's a Debian operating system underneath, and it's probably distroless.

The other thing we passed through was headers, so let's have a look at `?headers=true`. Pretty cool stuff: in the request headers you get things like the host, us-central1-chuck-hello-world-test-2.cloudfunctions.net, which is interesting, and the user agent, all the stuff passed through from my browser. There's some really interesting stuff in there too: it's passing through my IP address and the latitude and longitude of where I live, and you can see my broadband is provided in Ipswich, so please don't be creepy, people; don't go trying to find me.

The next thing was `?ip4=true`, so let's call that, and there's the IP address: we know this is running on 107.178.232.173. That's going to be useful for when we crash the application, which we're going to do in a second. When we crash it, I'm hoping to see my application die, I'm hoping Google automatically recovers it, which I would expect, and I'd also expect to see a new IP address allocated the next time I hit this URL. So let's do that now: change `ip4` to `crash=true` and run it, and we get the error "could not handle the request", so it has crashed. That's cool. Now if I go back and run gcp-func-hello-3 again, it's back up and running, I'm getting the "hey", and with `?ip4=true` I should get the IP address; and lo and behold, I'm now seeing a different IP address. Google has automatically, under the covers, taken my application, even
though it crashed, dealt with it underneath the hood, spun up a new container, and served the request again, and I didn't have to do anything; you can see it already has a new IP address associated with it. So now let's have a look at our function in the dashboard and see what happened when we crashed it. We click on func-hello-3, and interestingly, as you can see, we're running version 4, deployed at 6 pm. If you remember, when we looked at the environment variables (or it could have been the headers, I can't remember), I pointed out it said revision 4, and there you go, it corresponds to what was passed through: there's a correlation between the revision deployed on the container and what's displayed in the dashboard. The invocations per second look okay at this point, but let's come down here, because this is probably the interesting part. Remember we deployed this at around six o'clock, and earlier in the video we saw that at 5:23 we were running one instance, because after the 5:16 deployment it went back down to one. Sure enough, when I deployed at 6 pm, it went back up to two instances to handle that deployment, just like before; then, if you look here, at 6:04 it went back down to one instance, so it scaled itself back down. Then we were clicking around, if you remember, and I hit my crash endpoint, and look: it went down to zero instances. I crashed it and there was nothing running at that point, but good old Google underneath the hood automatically restarted the container. I lost access to my service for a couple of seconds, but it recovered almost straight away, and the users
underneath didn't see anything, because Google just fired it back up, although there was a brief period of time when it wasn't running. I guess what you could do is increase the number of instances you've got and maintain a minimum amount, but I wouldn't bother, because Google is handling this underneath the hood. As you can see, we crashed it, it died, and it automatically recovered.

Okay, that's what happened in our dashboard, which is kind of cool. What we're going to do now is look at how Google handles things once there's a bit of load. We'll start throwing load at our service, then look at the dashboard and see how it responds, and what happens when we crash things as well; this is going to be a little bit of fun. To do that, the first thing is to go back to my machine, clear the console, and use a free performance-testing tool; if you're running a Mac it should be installed already, and if not you can go and download it. It's called Apache Bench. To run a test with Apache Bench you type `ab` (which stands for Apache Bench), then `-n 1000` to send through a thousand requests over a period of time, then the URL to send them to. I'll just pick one of the URLs we had, copy and paste it, and make it do a little bit of work by sending the requests to the IP-address endpoint. The last thing, which I forgot at first, is the number of concurrent connections: to exercise the scaling you can tell Apache Bench to run a number of connections concurrently, and we'll set it to 10, so there will be roughly ten clients pinging this one URL at the same time.
Then I hit return and it starts running the requests. Give it a second: it's done 100 requests already. If I come back to my browser, I'm still getting the IP address ending .12, but look, I've got a new IP address back ending .149; refresh again, .181; refresh again, .149, .149, .149, and then .229. So my requests are clearly being distributed across a bunch of instances; by now it's done 800 of them, and we know for a fact there are more than three or four instances running in the back end, because we've seen these different IP addresses being returned. The test has now completed, so let's go to the dashboard and see how Google automatically scaled our instances up and down to cope with the demand.

Alright, now that we've run that test, let's see how Google handled it underneath the covers. Scroll down for a second and, before we look at active instances, look at invocations per second. The first thing I want you to see is that at 6:16, when we ran that test a little while ago, invocations per second spiked: suddenly we were at around 16.77 invocations per second on average, so we really hit it very fast and hard. But that's okay; Google handled the traffic straight away, no problem at all. It was just like "you're sending me a lot of traffic; I'm not going to fail, I'm just going to start responding". But have a look at what happened here: Google spun up the number of instances super quickly. At 6:15 we were running one instance, and very quickly, at 6:16, we were running 17 instances. It automatically just went "whoa, there's a load of traffic"
coming in", and scaled up the number of instances to handle it. Then of course, once the test was done, it started scaling back down: at 6:18 you see it running three instances, and at 6:20 it scaled itself back down to zero, because it was getting no traffic. So that's kind of cool: Google automatically scales to handle the traffic, and that's great, because you don't need to worry about it. But I guess there's a risk involved here as well, which is that if it can scale instances automatically like that, then we could be making ourselves a little bit subject to denial-of-service attacks. You saw what I sent: a thousand requests with ten concurrent connections, and we scaled up to 17 instances. What if somebody just left that running against your Google Cloud Function? You could get a very expensive hosting bill quite quickly, because it's just constantly going to keep responding and scaling up instances, and 17 or 20 instances is a lot.

So what we're going to do now is redeploy the function, but this time limit the number of instances Google can scale to, which we do by setting the max instances value. We add the parameter `--max-instances` and set it to, say, 3, so it doesn't go over the top: we can still scale up to handle load and scale back down, but we're never going to get to crazy values like 17. Hit return, and the deployment will take a few minutes; once it's deployed we'll run the same test we ran two seconds ago, another thousand requests, and see what happens underneath the hood. Alright, now that's deployed, we can quickly grab the URL and test that the application is running. If we just
paste that in: yep, it returns "hey", and if we set `?ip4=true` it returns the IP address, so that's fine. Let's run that test again: the same Apache Bench run, sending through another thousand requests with ten concurrent connections, and then we'll see what happens in the dashboard. The benchmark has kicked off again and it's done 100 requests; if I hit refresh, I see it switching between addresses ending .21 and .39, and nothing else is appearing, so I think it's doing a pretty good job of limiting. Once the test is finished we'll go and look at the dashboard again, see what happened with max instances, and see whether Google actually kept to it.

Alright, the test is complete now, which is cool, but it's going to take a few minutes for the dashboard to update so we can see what happened (by my clock it's about 6:30 at the moment), so what I think would be a lot of fun is to run the same test again, but this time crash our application like we did before. Remember, we did `?crash=true`, it killed the application, and when we ran it again Google automatically recovered it, which we could see from the IP address changing. This time, though, we'll run Apache Bench and randomly just keep crashing it, and see what happens in the dashboard when we do. So let's get into it: Apache Bench is off and running, you can see the instance ending .21 serving, and let's just start killing containers. Kill a container: we've got a new IP address ending .208. I crash it again and get another new address, so I obviously crashed the other one too. Let me
I'll run crash a few times — crash, crash, crash, crash — then refresh: I'm on .199. Crash: .21. Crash: .21 again. Crash, crash, crash: .217. And our test is complete now.

Alright, so we've run those two tests. Let's go back into the dashboard and see what happened under the hood in both cases, now that we've set our max instances to three. I'll open up the function here, and I want you to take a look at this. Remember, I ran the first test around 6:30 and the second test just afterwards. If we look at invocations per second, you'll see that, as before, there were around 11 requests per second coming in — that was the first test — and the second test ran at about 14 requests per second. You can see that after the first test the invocations per second dropped, and then I ran my second test fairly quickly afterwards. And it's actually quite interesting: remember I was crashing the application while hitting it like crazy — you can see the crashes on this chart too, peaking at around 0.28 crashes per second.

Now let's look at the active instances, because that's probably the interesting bit. In my original test, if you remember, we scaled up to 17 instances. This time, as we'd expect, we never went beyond three. Google is managing the number of instances: even though the same amount of traffic was coming through — I was still sending a thousand requests in the same period of time — rather than scaling up to a crazy number like 17, it kept the count at three, so I wasn't paying the extra cost of having all those containers running in parallel. And those three containers handled the requests perfectly well — we'll talk about that in a second. Of course, once my test was over it scaled down again and we went back to one. Then I ran my second test, where I was crashing it a lot, and it was the same thing: three instances at once. You can see it hasn't scaled down yet, but it will in a couple of minutes, and then it'll be back down to one instance again.

So I think the lesson we all need to learn there is that, although you don't have to set max instances, it's highly advisable to set it to a reasonable value — something you're comfortable paying the cost of. The way to do that is to think about what your maximum requests per second would be, and do some performance testing — maybe using Apache Bench like I did, or other tools like JMeter — to see what your service can handle. Once you have an idea of that, set a limit on the max number of instances, so you don't get into the situation where Google just picks large numbers and scales up massively all at once. Of course you can do that, but I think it could be an expensive way of running your solution.

Now, coming back to Apache Bench for a second: in the first test we ran earlier, you can see we ran a thousand requests at around 24 requests per second, the time per request was about 400 milliseconds — or about 40 milliseconds across the concurrent connections — and there were 95 failed requests, which is sort of fine.
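For reference, capping the instance count is a single flag at deploy time. This is a sketch — the function name and runtime are placeholders for whatever you deployed in the previous video — but `--max-instances` is the standard `gcloud functions deploy` option being described here:

```shell
# Sketch: cap the function at three concurrent instances.
# "funko3" and the runtime version are placeholders for your own function.
gcloud functions deploy funko3 \
  --runtime=nodejs14 \
  --trigger-http \
  --allow-unauthenticated \
  --max-instances=3
```

Later in the video the same deploy is repeated with `--max-instances=1` to test the single-instance behaviour.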
Then if we look at the second test, the one where we'd limited the instances, it's the same sort of numbers — a thousand requests in about 42 seconds. And in the last one, where I was crashing things, it took about 43 seconds and there were a lot of failed requests — 646 of them — and that's because I was going crazy killing all the instances underneath while Google was trying to handle the requests. That's sort of fun, and you'd expect some failed requests in that scenario, but the thing I'm trying to get across is that Google is handling those failures under the hood for you. Of course, when I was going click, click, click, we were probably killing more than the three instances we had available, hence the number of failures, and Apache Bench retries much as real clients would. So I think the key to this is that whether I was running three instances or 17, it actually performed about the same; as you saw, Google was handling the failed requests and me killing the containers, and it would just spin up new containers. Imagine if I had to run that infrastructure myself — it gets quite complicated — so it's really nice that I can offload all of that to Google. But again, the lesson here is: set your max instances to a reasonable value.

Now, the other question you're probably asking yourself is: what if I set max instances to one? You could do that if you wanted, but I think it's fairly obvious what happens — it'll maintain one container while I'm doing my bench test, which is fine, but when I start killing it I'm going to get a loss of service, and I think I'll also get a loss of service during deployments. In fact, there's probably an easy way of testing this. Let's switch max instances to one, do that deployment, and then see what happens under the hood — and actually, what we'll do is one deployment followed by a second deployment.

Okay, I've done my first deployment, and if I quickly call my application and hit refresh you can see it's working again and returning an IP address. I'm going to do a second deployment now, and that will let us see what happens during a deployment when max instances is set to one. The key question is: is Google going to ramp up to two instances — bring one up and then ramp the other back down — or are we going to get a loss of service? I genuinely don't know the answer; we're going to find out together in about two seconds. So it's doing that deployment, and then we'll look at the dashboard and see the result.

Okay, so the function has been deployed for a second time — you can see the version is now version seven — and we'll go and check that out in a minute or two. But before we look at the dashboards, because they take a few minutes to update, I think what would be really cool is to see what happens if we run an Apache Bench test of a thousand requests at 10 concurrent connections against our service. We'll run it once as normal, and then a second time where I start crashing it like crazy. So let's run Apache Bench, and we'll hit the ip4 parameter again. If I start hitting refresh, you can see I'm on .227 now. The good news is I'm getting lots of requests through, but nothing else is happening — no other IP addresses are coming back, which is cool; it's staying on .227, which is great. No other
instances have been spun up or down, so we know Google is respecting our max instances setting. We're at 700 requests now, and once the test is finished we'll look at how long it took to execute — was it running at the same speed, or were there any performance differences? And actually there are: you can see it took 49 seconds, with a thousand completed requests and no failures, which is good, at around 20 requests per second — the time per request was 491 milliseconds, or 49 milliseconds across the concurrent connections. So what does that mean? It handled the load okay, which is fine, but it's obviously receiving requests at a higher rate than it can process, so it's queuing them up — the requests wait, the instance works through them, and then responds. You see that reflected in the numbers: 49 seconds for the full run, where, if you scroll back up, the previous tests did it in about 42 seconds. That's seven seconds of difference, and where it averaged 42 milliseconds per request before, it's now at 49. So that's just something to be aware of: you could be super restrictive and bring it down to one instance, but you're probably harming yourself — you're certainly making things slower. If a barrage of traffic hits that one instance, it's just going to get slower and slower. Alright, so now we're going to run our test again — a thousand requests over 10 concurrent connections — but this time we'll start crashing it randomly like we did before.
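As a sanity check on those ab numbers (the figures below are taken from the 49-second test above), the relationship between wall time, concurrency, and per-request latency works out like this:

```shell
# ab's summary arithmetic for the single-instance test:
# 1000 requests, concurrency 10, ~49 s wall time.
total=1000; conc=10; secs=49

echo "requests/sec:            $(( total / secs ))"                   # ~20
echo "time per request:        $(( secs * 1000 * conc / total )) ms"  # 490 ms
echo "mean across concurrency: $(( secs * 1000 / total )) ms"         # 49 ms
```

ab reported 491 ms rather than 490 — the small difference is just the fractional wall time that the integer arithmetic here drops.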
We'll hit crash a few times at random. I'll hit refresh — it's come back with .156. Crash it again: new IP address. Crash it again: another new IP. Crash, crash, crash — and let's see how it handles it. Yeah, .167; we'll crash again and refresh, and it's handling that fine. So Google is obviously handling the restarts under the hood as before, but I imagine that when we look at the results, it's probably going to take even longer than the 49 seconds it did before, because it has to keep restarting these container instances. And let's have a look — yes, it's exactly what I said: it took 60 seconds at a concurrency of 10, there were a lot of failed requests, so it had to do quite a few retries, and it took a lot longer overall. Of course there's more waiting time, because it has to wait until the containers start before it can serve those requests again, so all those failures had a real impact. We'll look at the dashboard in a second and see what happened with those failed requests, but I would say setting your max instances to one is probably not a great idea.

While we're waiting for that to update, let's look at some of the earlier tests on the dashboard. I'll come back into the dashboard, click on the function, and see if the charts have updated. There's the first test we ran, at around 6:50 — this was with our max instances set to one — and we were running at about 13 requests per second; 6:47 is roughly when we did the deployment. So there's that one test, and it was running fine. Let's look at the active instances. The good news is that between 6:48 and 6:51 we maintained one active instance. So Google did that deployment at about 6:48, but it never did the scale-up-and-scale-down thing — remember, in the previous deployments we scaled up to two and then back down again. It didn't do that here; there's just a blank. So I suspect that during that deployment you've probably lost service — and maybe there's a way of testing that. Let's do another deploy, and while it's deploying, see if we can still access our service. So let's call hello — our service is obviously still up at the moment while it's doing the deployment, which is good; it's just not doing the scale-up, scale-down behaviour it did before, but it seems to be maintaining the service. If I set message=woof — yes, it's all working while it deploys. There may be a point where it becomes unavailable for a second when it does the switch-over, so let's get the IP address back as well. It's looking okay at the moment, still deploying, same IP — and there we go, it's just switched to .209, and it's fine. So it hasn't done the scale-up, scale-down thing, but to be honest it hasn't harmed the service at all, so it's not something I'd necessarily be worrying about.
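One simple way to watch for that switch-over, rather than refreshing by hand as I'm doing, is to poll the function in a loop during the deploy — the URL below is a placeholder for your own function's trigger URL:

```shell
# Poll the function once a second during a deploy; a non-200 status or a
# change in the returned IP shows exactly when the switch-over happens.
# Placeholder URL: substitute your own function's trigger URL.
URL="https://us-central1-my-project.cloudfunctions.net/funko3?ip4=true"
for i in $(seq 1 30); do
  # curl prints the body (the instance's IP) plus the HTTP status code.
  curl -s -w ' %{http_code}\n' "$URL"
  sleep 1
done
```

This is a sketch against an assumed URL, but it's the same observation the video makes by eye: the returned IP changes at switch-over while the status codes stay at 200.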
Let's go back to the dashboards one more time, and this time look at what happened in our second test, where we were failing the instances lots of times. The good news is the second test has been recorded — you can see around 16 invocations per second — and if we look at the active instances, nothing exciting is going on: it maintained active instances at one, spinning instances up and down as the test went on, and you can see a crash rate of about 0.18 per second. So we haven't had a scary loss of service — it's not like instances just shut down out of nowhere — but what I would say is that keeping your max instances at one is still not a great idea, because you've got that slight delay whenever Google has to recover a container, whereas with max instances set to a higher value you'll have a more available service.

So there we go — I think we've finished this video. Just to summarize: Google Cloud Functions is a really powerful solution and service, and the thing I like most about it is that it handles everything underneath for you. In this video we've scaled up, we've killed the service, we've load tested it, and we've seen how Google Cloud handled all of it. The key thing is that this is stuff you would need to be doing manually yourself if it weren't a serverless solution — you'd be setting the number of instances, running clusters and so on, and you'd be paying for the cost of those clusters. You don't need to worry about any of that, because you're only paying for the instances you're using and the amount of compute time you consume, so you can get yourself a
really scalable service that can scale up to a massive load. If somebody suddenly hits you with a lot of requests at once, Google will automatically scale up for you, handle that load, and then scale back down. So if you're running a really peaky service — maybe it's a World Cup final, or a graduation or results day, or maybe you only get a lot of traffic between 5 p.m. and 6 p.m. — then you're not running a cluster to handle all of those requests, with capacity sitting idle waiting to be used while you pay for it. You only pay for what you use at that point; you still have that capacity available to you, but you're not managing any of it — that's all handled under the hood by Google. Similarly, it handles all the resiliency: if your service crashes, or there's a data center failure or something like that, it will automatically recover. There might be a tiny loss of service for a second, but it's going to switch over, and it's all handled for you. So on the whole I think Google Cloud Functions — functions as a service — is a really great, low-cost solution, and if you're running Node.js-type applications it's a great option for you.

Now, in a future video I'll look at Google Cloud Run, and the reason is that it gives you a bit more control at the container level, whereas with Google Cloud Functions you're somewhat restricted to running express-style services. So if you want to do something a bit different — maybe you want to run something like Fastify, or you've got other things you want to run in your containers, or you want a really portable container that you can run multi-cloud without being hooked into Google (even though I don't think there's a lot of lock-in here — it's how I'd write applications anyway) — then Google Cloud Run is probably a great option, and it actually gives you a bit more control over some of the deployment side as well. We'll cover that in another video. Anyway, I hope you've enjoyed this video, I hope the non-functional elements we've covered have made sense to you, and we'll speak in the next video.
Info
Channel: Chris Hay
Views: 123
Keywords: chris hay, chrishayuk, google cloud platform, google cloud functions, cloud functions, cloud functions gcp, scaling cloud functions, apache bench, gcp
Id: 6zmXG588v2c
Length: 56min 34sec (3394 seconds)
Published: Mon Jul 26 2021