[MUSIC PLAYING] STEPHANIE WONG: Hello, everyone. Welcome to Cloud OnAir, live
webinars from Google Cloud. We're hosting webinars daily. You are watching "From
Engineer to Engineer," a show where you can hear from
Google Cloud customer engineers about how to solve complex
problems on GCP based on real stories
from the customers. My name is Stephanie Wong. I'm a developer advocate
at Google Cloud. And today we'll be talking
about networking 104, everything you need to know about
load balancers on GCP. And today, joining me is Ryan
Przybyl, customer engineer and networking specialist. So just to mention,
you can ask questions anytime on the platform. And we have Googlers on
standby to answer them. So take it away, Ryan. What can you tell us
about load balancers? RYAN PRZYBYL: Thanks, Stephanie. So if you've seen any of the
previous videos we've done, you know this is
part of a series. So so far we've covered VPCs. We've covered a lot of routing
and peering, some firewalls. And now we're on to number
104, where we're going to cover load balancers today. Just as a note, I've been giving
some thought to session 105. And I think I'm going to
cover DNS for that one. So if DNS is something
that interests you, stay tuned and we'll
get to that next. So this is our
product family when it comes to load balancers. The easiest way to think
about these is really, are you dealing with
external load balancers? I.e. things that
touch the internet. Right? Outside of Google Cloud. Or are you dealing with
internal load balancers, things that you run inside
the cloud environment? The other sort of
dynamic to look at these is, are they global in nature? Or are they regional in nature? So all these products kind of
fit in these various boxes. Load balancers are really
important across the entire GCP suite of services,
whether you're running Compute Engine,
Kubernetes Engine, you're doing cloud storage. Whatever it is you're
doing, load balancers can be part of the solution
and part of the network architecture. So I'm going to
cover some of these. I can't cover everything in
depth in the time we have. But I'll cover what I can. And I'm also going to show
you how to actually set up a load balancer as well. So let's get started with
network load balancers. Right? So these are load
balancers that are really operating at the network layer. Right? So when I say that,
they're operating at layer three and layer four. Right? These are external
load balancers. Right? So we talked about
external versus internal. These are facing the internet. And these are regional
load balancers. So what I mean by
that is they operate within Google Cloud regions. So in this example,
I'm showing a number of users that are
running a number of apps that are running inside GCP. In each of these they're
running in a different region. So you can see MyApp.com
is running in US West. Test.com is running in Europe. And Travel.com is
running in Asia. So those are various
regions for Google Cloud. In each case, I
have a load balancer running in that region
that is receiving traffic from customers on the
internet and pushing them to, in this case, MyApp.com
or Test.com on the backside. So let me go over
sort of network load balancers at a very
high level so you can understand the features
that come along with these. So they are regional. They are highly available. They do run across
multiple zones. They are designed to really load
balance TCP and UDP traffic. So as we'll talk about next,
our layer seven load balancers, these are really operating
at the network layer. Right? So when you're dealing
with higher layer protocols like SSL or
HTTP, all that stuff just passes through here. Right? So if you're running
things like SSL, your back ends need to
be able to terminate that, because the load balancers
aren't going to do it for you. One of the common
questions we get asked is about client side IP. In terms of the
network load balancers the client IP is preserved. So when we talk about the
layer seven load balancers, that isn't the case. And we get into the concept we
call X-Forwarded-For headers, and ways that you
can manipulate that. But in terms of this, the
network load balancer, that client IP is
actually passed directly through the load balancer. So it's not
terminating sessions. It's not changing anything. Because of that, you can
actually use the VPC firewall constructs that we talked
about in our previous video to actually enforce
security policy. So if there is a regional
IP block that you don't want to allow, or something that
you want to white list, you can use the firewall
rules to actually control that because the
client IPs are being preserved so it passes through. In terms of how it actually
load balances the traffic, it can use either a two-,
three-, or five-tuple hashing mechanism. So if you're using
a five tuple, it's based on source IP, dest
IP, source port, dest port, and protocol. If you're using a
three tuple, it's source IP, dest
IP, and protocol. And if you're using
just a two tuple hash, that's just basically
source IP and dest IP. It also maintains
session affinity based on the IP address. So session affinity
is something that comes up a lot when we
talk about load balancers. So just know, in terms of
the network load balancer, it's really basing
that on the IP address. It does support
HTTP health checks in terms of the back ends. And these are very
high performance, highly scalable devices. Right? They can handle a million
plus requests per second. So it's not something you
really have to worry about, like how scalable
are these things? STEPHANIE WONG: Yeah. I think you've touched on
a bunch of things here. You're able to still
enable that security by using the firewall
rules that you would apply to the VPC subnet itself. RYAN PRZYBYL: Yep. STEPHANIE WONG: So
that doesn't change. And then also, this
is kind of the way that you're able to scale
your entire environment. RYAN PRZYBYL: Yep. Exactly. And because it's
a network layer, it's very transparent in
terms of the traffic that's coming in and getting
pushed to the back ends. STEPHANIE WONG: Right. RYAN PRZYBYL: So
a lot of customers really like that function. So I'm going to move into
sort of our layer seven load balancers. So what we call our HTTP load
balancers or our Google Cloud load balancers. First, I want to talk
about DNS load balancing. This is a topic that comes
up a lot with customers, who ask why we don't use DNS
load balancing as our preferred mechanism,
our recommended way to solve this problem. So the challenge with
DNS based load balancing is kind of what I've
illustrated here. Right? So if you're running
multiple back ends, and they're running in various
regions, what you have to do is you have various DNS
records that are pointing to these various regions. There's a lot of
things in there that are outside of your control. Right? So, for example, you
could have a failure in, say, San Francisco. Right? You've got to update
the DNS potentially. You could have people that
have stale DNS entries. So even if they are
pounding away on this and it's not available, they
don't have the most current DNS records. There's lots of things that
are outside of your control, per se. Right? And when things
fail, guess what. You've got to go back
and update this stuff. So, for example, if
San Francisco fails, you have to make a change
so that you redirect all the traffic to another region. So it's not saying
that you can't use DNS based load balancing. It's just not the way that
Google has chosen to do it. We looked at this
model and said, when you're running
something globally, how can we do this
that's different? How can you improve
on this model? STEPHANIE WONG: Right. RYAN PRZYBYL: So
this is Google's sort of answer to this. So this is our layer
seven load balancer. And this is a really
powerful construct. Right? I always like to tell customers,
this is like an Indy car that Google built that we
just gave them the keys for. The reason I say that is
because this is the same load balancer that we use
for Gmail, for YouTube, for our search engine. We didn't build this as a
product to say, hey, Cloud. Go sell this. We built this because
we needed to just to make Google function. So really it is this
Indy car that we built in Google infrastructure
that we've said, hey, go take it for a spin around the track. STEPHANIE WONG: Yeah. I was about to
ask whether or not this is built on the
original infrastructure that we were using previously for
our billion plus user products. RYAN PRZYBYL: Exactly. So the way this
works, and the way this is very different
is, to start off, you have one VIP globally. So if
you're running MyApp.com in that regional
construct, you'd have a different IP
for every region. In this case, you're using
200.1.1.1. Right? So whether you're in
Asia, whether you're in Europe, whether
you're in America, you're always
hitting that same IP. Right? So what happens is
when you come in, we're advertising a
global set of blocks. So the various ISPs, the
way that you get to Google, or when I say you,
the way customers get to Google to sort of access
your MyApp.com is they're going to come in from
the internet somewhere. So it's going to hit one
of our load balancers. And our load
balancer is actually going to look at all
the back ends you've configured on the backside. And it's actually
going to figure out which one is closest to it. So maybe you're running
one on the east and west coasts, and you have somebody
come in from Europe. It's going to push it to the
west or to the east coast. If you're running
it in all regions, then if it comes
in in Europe, it's going to land it in Europe. So net, net it gives your
customer better performance, because they actually
get routed to the back ends that are closest to them. STEPHANIE WONG: Right. And this kind of reminds
me of CDN as well. RYAN PRZYBYL: Yeah. It's very similar. So our CDN actually kind
of lays parallel to this. When you have CDN
content, this is going to be the first thing you hit. And then it redirects
it over to a CDN to serve what the
customers are asking for. But this is sort of in
concept how it works. You can see Maya, Bob,
and Shen in this example, they're all hitting the
same IP, and they're all getting pushed to serving back
ends in their various regions. Now, this is probably one
of our most popular things that we sell in terms
of load balancing, because it just is so powerful. Lots of customers, even if
they're running their own load balancer stacks love
the fact that we can have an AnyCast VIP globally
that they can leverage here. So they might stack this
load balancer on the edge and then put their load
balancers behind it. STEPHANIE WONG: I see. RYAN PRZYBYL: When
we talk about DDOS, when we talk about all
sorts of other stuff, this is what sits at our edge. This is what helps protect
against a lot of that stuff. So, again, it's a very
powerful construct. STEPHANIE WONG: And
just to reiterate, you mentioned a single AnyCast IP. Why is that so beneficial and
such a differentiator for us? RYAN PRZYBYL: So
the big thing there is you don't have to deal
with the DNS piece of it. Right? So in the case I showed you
before, you have an IP address for one region, an IP
address for another region. An IP address for
another region. If a region fails, you
have to update DNS records. You have to make sure those
DNS records get propagated out. Right? STEPHANIE WONG: Right. RYAN PRZYBYL: In this
case, you only have one IP. Right? So you never have to do any sort
of updating to the DNS records for this to function. You're always hitting,
in this case, 200.1.1.1. Right? So no matter where
you are in the world, you're always
hitting that same IP. And you're letting Google
handle what region to push it to that's closest to your
customer to give you the best customer experience. STEPHANIE WONG: Right. In terms of our load
balancer, and you've mentioned this because we have
regional load balancer as well, but we have high
availability built into it. RYAN PRZYBYL: Exactly. You can think of these as
many, many processes that are running on Google's edge. Right? So as part of how we manage DDOS
and how we sort of handle that, you can think of
these as a process. So when we need more of
these, what do we do? Well, we spin up more
of these processes. So it's, in a way,
I don't want to say infinitely scalable, because
nothing is infinitely scalable. STEPHANIE WONG: Right. RYAN PRZYBYL: But it's
very highly scalable, because as we need more,
we instantaneously spin up more of these load balancers
to handle the incoming load. STEPHANIE WONG: Yeah. And this is all based
on our Google front end. And that's kind of the
technology of that. RYAN PRZYBYL: Yep. And sometimes you hear
us use the term GFE. It stands for Google Front End. That's our internal terminology
for our global cloud load balancer. STEPHANIE WONG: Yes. RYAN PRZYBYL: So I want to get
into sort of the data model. How do you actually
build these things? Right? How do they function? So on the left hand side
you have the internet. Right? This is where all your
customers are coming in. The first thing is going to
be a global forwarding rule. Right? How are you going
to forward these? Right? So www.stephanie.com might
forward to one back end. And
www.stephanie.com/buymycoolstuff might forward to a
different back end. Right? So when we talk about
those forwarding rules, this is what we're
talking about. How do you forward
based on the URLs? OK? This is going to direct
them to that target proxy where you actually
build the URL maps. Right? STEPHANIE WONG: Right. RYAN PRZYBYL: And then those
map to back end services. Right? So it could be a web page. It could be a shopping cart. It could be anything that
you put on the internet. Right? So that is your
back end service. And then those back
end services really live in what we call
managed instance groups. Right? So I'll talk a bit more about
those when I set those up. But effectively you can
consider those a scalable, highly available way of
running that sort of back end. Right? So whether that be that
web server that you're trying to get to, or
that shopping application that you're running. Right? They're really running
on top of that managed instance group. And as we talked about earlier,
we've got firewall rules. Firewall rules are in place in
here to help control traffic. Now, in this case you're not
using the firewall rules like we were at the network
load balancer, because the client IP,
as I mentioned before, isn't passed through. The client session actually terminates
at our load balancer. So it does SSL termination. It does other stuff. It terminates the
session there, opens a new session on the backside. Right? So everything is
private IPs inside. The only public
facing IPs are really what's facing out
to the internet and what the customers
are talking to. Right? Your end customers, that is. Not you in particular. STEPHANIE WONG: And just to go
back to that for just a second. You just mentioned internal IPs. And that reminds me of what
you're about to talk about, which is internal
load balancing. At what stage does it kind
of go from external to internal? RYAN PRZYBYL: A good
way to think about that, it's a very good question. So let's take a shopping
cart application. Right? So your shopping
cart application may be published
on the internet. So you put that behind
the load balancer. And the load balancer is the
public IP for your application. Customers come in
from the internet. They hit that application. Right? Now that application
has to access inventory. It has to access a shopping
cart sort of back end. It has to access all these
other things to function. OK? It will use the
internal load balancer to then dovetail into all those
other applications running on the backside. So you're using the
global load balancer up front to access that web page
or that shopping cart. And the shopping cart is using
the internal load balancers to access all those other
things on the back end that it needs to function. STEPHANIE WONG: Right. RYAN PRZYBYL: OK. So let's run through some of
the features that are really tied to our layer
seven load balancer. So we talked about the
single global AnyCast VIP. Right? And this can be IPv4 or IPv6. Right? And it runs globally. Worldwide capacity. We talked about this. Right? These are processes that
run out at Google's edge. They're highly scalable. And like I said, we use these
for things like our search engine. So we know they're tried. They're tested. They run really well. Cross region failover
and failback. So this is kind of what I
was talking about earlier. You might have instances
running at all regions. But if, say, something
fails in Europe, guess what. The load balancer
knows Europe is not in commission at the moment. I'm going to fail that to,
say, the east coast of the US. Cause that's the
next closest region. Or if you're in the west
coast of the US and it fails, it's going to say, oh, I
know my next closest is the central region. So all that intelligence
lives with the load balancer. And it figures out, how do I
get it to the closest back end to ultimately give your
customer the best experience? STEPHANIE WONG: Awesome. RYAN PRZYBYL: We talked
about auto scaling in terms of how we
use these with DDOS and various other things
so it's incredibly quick. It's also a single point to
apply global policies. Right? So we talked about URLs, various
sort of security policies you can overlay on that. We're not going to talk
about Cloud Armor here. But Cloud Armor sort of
works in conjunction with this, if you want to whitelist
or blacklist IP blocks. All of that stuff
works in conjunction over the top of these
layer seven load balancers. STEPHANIE WONG: OK. RYAN PRZYBYL: And
again, super robust. Millions of queries per second. Right? It wouldn't function
for our search engine if we didn't have that sort
of capability in there. STEPHANIE WONG: Right. RYAN PRZYBYL: OK. So, as you said, I'm going
to pivot to the internal load balancers. Right? So these are things that
are not facing the internet. They're not facing externally. These are facing, typically
you can think of it like VM to VM inside your
cloud environment. So one of the key things
to understand here is there really is
no load balancer. Right? When you go and configure
a load balancer, you're actually configuring
the control plane. Right? Our SDN control plane. It's not like you're
building a box that's running on top of a VM. Right? Because that will
create a bottleneck. STEPHANIE WONG: Right. RYAN PRZYBYL: So by
basically allowing you to program the SDN
directly, when packets come in and they need to
be load balanced, the SDN just sort of
looks at them and goes, oh, I know that there's
four back ends for this. I'm going to pick one. I'm going to send
it to the back end. Right? It doesn't actually forward
it to a load balancer. And a load balancer doesn't
make a load balancing decision and then forward it
to one of the VMs. Right? So why we chose to
do that is, again, we don't want a single point
of failure in there. We don't want something
that can fail. When we start talking about
the SDN and the control plane, if it's failing we have larger
issues that we're dealing with. STEPHANIE WONG: Right. RYAN PRZYBYL: So the
data model on this looks a little bit different. Right? So in this case, you
have a client VM. So like I said, your
shopping cart application. You have IP addresses. Right? So back ends. What VMs are they? What IPs? What ports and protocols
are you using on them? The serving capacity
and the instance health, we'll talk about
managed instance groups when I show you how to actually
configure these things. But effectively, that
managed instance group is going to be health checked. Right? So what do I mean when
I say health check? Like we're actually
probing it to say, is the instance or
instances healthy? If the instance in the
group is not healthy, let's auto heal it. Right? So we'll actually
spin it back up. And if, let's say, you're
running two, and one fails, we'll spin up another one. And now you've got
your two again. It also allows you to
dynamically scale these. Right? So if you think
about, let's say, Black Friday is a
really good example. Right? I'm running a whole
bunch of back ends. And I get a whole bunch of
customer traffic hitting my shopping app. Well, you need more back ends? You can set the
managed instance group to scale this stuff horizontally
to build more back ends for you dynamically to allow you
to scale that traffic up. The great thing about it is,
it dynamically scales up. When the traffic goes
away, it dynamically will scale that back down. STEPHANIE WONG: Right. RYAN PRZYBYL: So you
set all that stuff up. STEPHANIE WONG:
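As a rough illustration of the scale-up and scale-down behavior Ryan describes for managed instance groups, the sketch below computes a desired group size from average CPU utilization. It is not the actual Compute Engine autoscaler: the function name `desired_instances`, its parameters, and the default target, floor, and ceiling values are all assumptions made up for this example.

```python
import math


def desired_instances(current, avg_cpu_pct, target_cpu_pct=60,
                      min_instances=2, max_instances=10):
    """Return the group size a simple target-utilization autoscaler would want.

    Every name and default here is an illustrative assumption,
    not a Compute Engine setting.
    """
    # Total CPU demand, expressed in "percent of one instance".
    total_demand = current * avg_cpu_pct
    # Smallest instance count that brings the average back to the target.
    wanted = math.ceil(total_demand / target_cpu_pct)
    # Never drop below the floor or grow past the ceiling.
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    # Busy period: three instances averaging 80% CPU.
    print(desired_instances(3, 80))
    # Quiet period: four instances averaging 20% CPU.
    print(desired_instances(4, 20))
```

The floor and ceiling are what keep the group from shrinking to nothing when traffic disappears or growing without bound during a spike.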
That's incredible. Because, again, if you can
scale down to one, let's say, at minimum, then you
are achieving a lot more cost savings that way. RYAN PRZYBYL: Yep. It's a really good point. I always recommend running two. So if one fails, there's
always one available. So I kind of consider
two the minimum. But yeah, you're right. You can actually set up
just one if you want to. STEPHANIE WONG: Yeah. And here it's like, these
are homogeneous VMs. Like let's say an app server. And it gives you, A,
failover capabilities, and the ability to auto
scale based on health checks, or whatever configuration
that you set. Right? Based on VM utilization or
CPU utilization, for example. RYAN PRZYBYL: Exactly. So now we're going
to talk about some of the features for these
network load balancers. These are the internal
load balancers. Right? So just like the external load
balancers, they work on a VIP. Right? So there's going to be an
IP address assigned to them. So when traffic is coming
from your front end, like your shopping cart app,
and it's going to the back ends, it's going to go to a VIP. Right? So you may have five VMs
that are the back ends, but they're all nested
behind that single VIP. Right? So what really happens
is the SDN will say, oh, I know that VIP is
configured as a load balancer. I know these five VMs
are on the backside, and it's going to pick
one of them, send it. The VM is going to receive it. It's going to see, oh, this
isn't my IP address per se, but it's a VIP, and I know that
I'm a back end for a load balancer. So I can receive that packet. I can process that packet. And I can respond
to that packet. We talked a little bit
about health checks, TCP, HTTP, HTTPS health checks. This is really sort
of what it uses to check and make sure
those back ends are healthy. In this case, the
client IP is preserved. So we talked about that. Because you're dealing with
VM to VM here in most cases. Right? So we're maintaining
VM A's IP address when you send to those back ends. Right? STEPHANIE WONG: Right. RYAN PRZYBYL: So
that's really what we're saying when we say
the client IP is preserved. It is really the VM IP. And we talked about there's
really no middle proxy. There's no actual sort
of choke point in this. Right? Because you are configuring
the SDN directly. Right? So that's how we deliver super
highly scalable capacity here to do this, where there is
no sort of choke points, or things that can break-- STEPHANIE WONG: Yeah. It's not a server. It's not a VM. It's not a physical device. RYAN PRZYBYL: Yeah. If you've configured F5
load balancers or something in your data center,
it's a physical device that you put in there. STEPHANIE WONG: Yeah. Exactly. RYAN PRZYBYL: In
this case, there's nothing physical about it. It's just a control
plane configuration. STEPHANIE WONG: Right. RYAN PRZYBYL: Right. So it's very elegant
in its simplicity. STEPHANIE WONG: It's a different
way of thinking, honestly, with the SDN. RYAN PRZYBYL: Yeah. There's a lot of stuff that
Google does that really we embody this theory
of why make it more complex than we have to? If we can simplify, and make
it more elegant and simple, let's go. Let's do it. STEPHANIE WONG: Yeah. Absolutely. RYAN PRZYBYL: OK. So that's my high
level overview of this. Now I'm actually going to take
you into configuring a load balancer. So since our global load
balancers, the layer seven ones, the Indy
car Ferraris I was sort of talking
about are probably a very common used
application, I'm going to take you through how
to configure one of those. So if you have ever
been in our console, you'll recognize this as sort
of the home page of the console. So in configuring a
global load balancer, you have to actually do
a few things before you configure the load balancer. We talked about managed
instance groups. Right? I didn't really touch
on them super in-depth. But I'm going to show you
how to actually configure a managed instance group. Because that's what you use to
put behind the load balancer, or what I used in my case. OK? So let's go and look at that. So when you click
on Compute Engine, you're going to instance
groups and instance templates. Before you can build a
managed instance group, you have to have a
template for the VM that's going to be in the group. Because the group basically will
spawn copies of that particular VM that you specify. Right? So it's not like you can have
five different types of VMs. Right? If you say VM A looks like
this, the managed instance group might be five VM As. Right? The instance template is going to
define what VM A looks like. OK? So that's the first
place we're going to go. Now, I've pre-built one,
because it takes some time to update these things,
specifically when you go and do the actual
load balancing configuration, cause like I said, we have
thousands of these processes running out there. And you could
imagine how long it takes to push all these new
configs out to all this stuff. Right? So I intentionally
pre-configured this stuff so you wouldn't have to
wait for me for five minutes and watch the little
thing spin around. STEPHANIE WONG: Yeah. RYAN PRZYBYL: But I'm
going to show you how to create an instance template. So here we go. So this is an instance template. And as I said, you're basically
configuring a VM here. Right? So you're going to
give your VM a name. You're going to
give your VM a type. Right? So whether it's N1
standard or F1 micro, if you've done anything
in Compute Engine these are all things
that are familiar to you. You can set things like
boot disks, identity and access management. These are the service accounts. All this stuff can be set. I'm not going to go too
much into detail on these. You can set automatic
firewall rules that will be created to
allow HTTP and HTTPS. And then you've got this
ability to go in here and sort of define some of
the networking parameters. So I'm going to go
specifically into networking. So in this case, I've
created a network. In this case, it's load
balanced demo network. So I would pick that. And then I would pick a
subnet that's in that network. Right? So what I'm telling
it is, all these VMs are going to live in that
VPC, in that particular subnet that I've created. STEPHANIE WONG: OK. Can you actually have a
managed instance group with VMs in different regions? RYAN PRZYBYL: You
could run things where they're in different regions. In this case, I set
everything up in US central. Right? Because this is a
global load balancer. So you can build
those back ends. As I was saying, you can
build the instance group so it actually spans-- STEPHANIE WONG: Right. RYAN PRZYBYL: You'd actually
build probably an instance group in like the
central region. Another instance group
in the west region. STEPHANIE WONG: Right. OK. Individual instance groups
in different regions. And have the global
load balancer distribute between those. RYAN PRZYBYL: Yep. So there was a whole bunch of
other configurations in here. I'm not going to go through
every one of these in depth. You can play with
these at your will. But this is how you
create a template. So, again, this is
sort of the first step. Right? So I'm going to go back. So this is the specific template
that I built in this case. So you can see I'm
running an N1 standard. It shows it's in
use by something. I have some firewall
rules set up for it. It's got an ephemeral
external IP address. Service tiers I'll talk
about in just a minute, whether it's premium or not. So these are all
the configurations I just sort of showed you. But this is the instance
template that I used. So once you have an
instance template, now we're going to create the
managed instance group itself. So here's the managed
instance group that I created, but to create your own you go up
here and create instance group. So you have the option
of managed or unmanaged. I talked about things
like health checking. I talked about things
like auto scaling. I talked about things
like auto healing. That's all part of the
managed instance group, because you're letting Google
manage all that stuff for you. Right? So in this case, I created
a managed instance group. So I'm going to give
it a name, whether I want it to live in a single
zone or multiple zones. So in my case, I configured
something in US central one. You can configure various
zones in there, the A, B, C, and D redundancy zones. And then here you see it
requires instance template. This is why I had to
build the instance template before I could build
the managed instance group. STEPHANIE WONG: OK. Got it. RYAN PRZYBYL: So here is where
I would grab that template that I just built, click on it. Say this is the template I
want you to use as you build and scale this managed
instance group. Auto scaling, because I've
got a managed instance group. So I'm setting auto scaling on. It's auto scaling based
on CPU utilization. But there's different
things that you can do. So in this case, I just
picked CPU utilization. I specify the percentage. So once it exceeds
an aggregate-- so let's say I'm
running three VMs-- once the aggregate CPU
utilization exceeds 60%, it will spin up another
one, spin up a fourth one, and add it to the pool. In this case, I've told it the
maximum number of instances I want to run is 10. So I might say here, let's say
a minimum of 3, maximum of 10. STEPHANIE WONG: OK. Good to know so you don't wake
up and there's 1,000 running. RYAN PRZYBYL: Yep. The cool down period is
actually how much time it takes to spin this up. So depending on your VM and
what you have configured on it, so if you're running a
bunch of startup scripts, or you're bootstrapping that, or
you're loading a bunch of stuff on the VM, the VM could
take longer to start up. So this is effectively
how long you want it to wait before it
starts polling that thing to get information out of it. Right? STEPHANIE WONG: Oh, OK. RYAN PRZYBYL: So right now the
default is set for 60 seconds. And then I talked
about auto healing. So this is, like I mentioned,
if the device were to fail, so if the VM fails, so one
of your three, let's say, were to fail, it won't
just go down to two. It will recreate that one. Right? Because, again, we
defined the template. So it can just go to
the template, say, I need to create another
one of these devices. Create another one. And boom, you're back to three. Right? STEPHANIE WONG: OK. RYAN PRZYBYL: But that's why
I always kind of recommend running a minimum of two. STEPHANIE WONG: Yeah. RYAN PRZYBYL: Right? So if one fails,
one's always running. But nothing's saying you
couldn't just run one. If it fails, then say you've
got a bunch of stuff on there, and bootstrapping it takes
three minutes, for three minutes you'd be completely down. STEPHANIE WONG: So
basically you would just have another VM
at least running, so that it could fail over
more quickly than having to spin up a fresh VM. RYAN PRZYBYL: Exactly. So you wouldn't have to
wait the three minutes while it brought up another VM. So that's why we sort
of say best practice is to run two of these. That's kind of the whole
purpose of the managed instance group: to allow for
failures, but still have resources to handle traffic. STEPHANIE WONG:
Have a backup there. Yeah. RYAN PRZYBYL: So in this case,
I can set the different types of health checks that I want. And then I would
just click Create. And that would create this
managed instance group. So I'm going to go back here. So as I said, I've already
created a managed instance group. So we can click on it. We'll look and
see what I set up. So in this case, I'm
running two different VMs. Right? And you can see the
templates that I used. That was the template
I built initially. You can see when
I created them-- I created these
back in May-- you can see I specified us-central1-c
and us-central1-b. So I'm spreading these across
two zones in our us-central1
region for redundancy. STEPHANIE WONG: OK. RYAN PRZYBYL: And
then you can see there's internal IPs and
external IPs for these. Right? Now, I do want to show you, if
you go over to VM instances, you will see these show up. Right? So these are those sort of VMs. OK? So what I did to show you
sort of load balancing, in most cases, you would build,
say, a template like that. Let's just say it's a web server. OK? You would build
three copies of it. Well, if I'm showing
you load balancing, and it's load balancing the same
three copies of the web server, I can't really show you
that it's actually working. Right? STEPHANIE WONG: Right. RYAN PRZYBYL: So what I did
is I built these two VMs. And I went and I
loaded Apache on one. And I loaded an
Nginx, basic Nginx proxy config on the other. That way you can actually see. STEPHANIE WONG: Awesome. RYAN PRZYBYL: So what I'm
going to show you-- so if I grab this 47 IP address. Right? Let me go over here,
you can see there's my Apache that I set up. STEPHANIE WONG: OK. RYAN PRZYBYL: And if you go
back here and you grab the 50, there's my Nginx. STEPHANIE WONG: Perfect. Yeah. RYAN PRZYBYL: So
now I can actually show you that it's
actually load balancing. When I show you the
load balancing config-- STEPHANIE WONG: We'll
be able to tell. RYAN PRZYBYL: And demo this
thing, you'll be able to tell. I'll show you how it's
flipping back and forth. STEPHANIE WONG: Yes. RYAN PRZYBYL: But if
it was in normal cases, you wouldn't do this. You would, say, build three
copies of your same web application so that it's load
balancing across all three, and it's completely
transparent to your end users or your clients
what's actually happening on the backside. But again, it doesn't
look like much. If I just show you
the same three pages, I could be tricking
you and really just have one thing running. STEPHANIE WONG: Exactly. RYAN PRZYBYL: So I
didn't want to do that. OK. So as I said, if
you go back to VMs you'll actually see the
VMs running in here. Right? So I just want to
point that out. So we built the templates. We built the instance groups. Now, we're actually going to
build the load balancer itself. Right? So we're going to go to this. And, again, here's the load
balancer already set up. But I'm going to go through and
create a load balancer for you. Right? So, as I said, we're
creating a layer seven, a GCLB load balancer. So that's just this. So I'm going to start
that configuration from the internet to my VMs. So these are all the
configuration profiles that you have to set
up to actually set up the load balancer. Right? So we're going to use all the
other stuff we already created. But when we talk about
the load balancer config, this is what we're
really talking about. STEPHANIE WONG: OK. RYAN PRZYBYL: So
let's go through it. So first, we need to
set up the back ends. Right? So in this case, you
could use buckets, if you had buckets set up. But in my case, I
am running services. So I'm going to create
the back end service. Right? Or I already created
that back end service. Right? So I showed you the managed
instance group, which is really the back end service. This was the managed
instance group I created, the web back ends. So in this case, I
can just click on it. But let's go and
create a service. Right? So you're going
to give it a name. But you can pick whether you
want to use instance groups or endpoints. In this case, it shows
the instance group that I had already set up. It's an HTTP load balancer. So our port number's
going to be 80. STEPHANIE WONG: OK. RYAN PRZYBYL: In this
case, I set those up, the managed instance groups,
as utilization, if you recall. Remember I said 60%? STEPHANIE WONG: Yes. RYAN PRZYBYL: So it's pulling
all the information and saying, OK, utilization. Where do I want the
CPU utilization to be? So this is sort of an easy
way to just sort of set this up using all the other
stuff I've already configured. STEPHANIE WONG: Perfect. Yeah. RYAN PRZYBYL: Right? So this is really
just, like I said, configuring that back end. OK. Let me cancel out of that. OK. So now, we talked about
host and path rules. Right? So this is where I was telling
you like www.stephanie.com is one page, and then
www.stephanie.com/buymycoolstuff is directing to
another instance group. STEPHANIE WONG: Correct. Right. RYAN PRZYBYL: Right? So this is where you're
setting all of this stuff up. Right? So in this case, you can say the
default would be any unmatched. Right? So anything that comes
into Stephanie.com, regardless of whether
it's /images or /buymystuff, it's
going to go to the back end. So in this case, let me go back
here and do this real quick. Let me grab that. OK. So I set that and set that up. So now you can see
it populates it here. STEPHANIE WONG: OK. RYAN PRZYBYL: So you could
have Stephanie.com, a back end service for that. You could have
Stephanie.com/buymystuff being a different app
that you're running, which would be a different
managed instance group. Right? STEPHANIE WONG: Right. RYAN PRZYBYL: So then you
would just add a path rule. Here. We'll call it
web.stephanie.com/buymystuff. And then I could set a
different back end service. Now, in this case I've only
created one back end service. Right? STEPHANIE WONG: Yes. RYAN PRZYBYL: But you
could create other ones. Right? STEPHANIE WONG: If you
have other ones running. RYAN PRZYBYL: So this
is where you're mapping. We talked about those URL maps. STEPHANIE WONG: Right. RYAN PRZYBYL: This is
where you're actually setting all of those things up. STEPHANIE WONG: Got it. RYAN PRZYBYL: OK? So let me just go over and
delete this real quick. Get rid of that. OK. So now we're talking
about the front end. Right? So this is the actual
load balancer itself, that front end load balancer. So we're going to
give it a name. It's an HTTP or
HTTPS load balancer. I'm going to just make this
and keep it really simple. Make it an HTTP load balancer. Premium versus standard tier. This has to do with
Google's backbone. Right? So I told you I'd
mention this earlier. I told you I'd talk about it. So really what this is is
Google has a philosophy of we want to control your
traffic as much as possible in terms of the user experience. Right? So if you select premium
network, what that means is if you have something
in, say, the east coast, but the user's on
the west coast, we're going to use
Google's backbone to deliver that traffic all
the way to the west coast. Right? Because when it's
on our backbone, we control that quality. Right? We can make sure that
that packet is not getting bogged down
in the internet or being congested by things. So that's what premium tier is. Now, standard tier was designed
to help cost manage this stuff. Right? So what happens
in standard tier-- let's take our east
coast origination device. It would say, dump it off
to an ISP in the east coast. That ISP would
pick up that packet and take it all the
way to the west coast. But again, you're not
using Google's backbone. You're just using the
internet at large. STEPHANIE WONG: Right. OK. RYAN PRZYBYL: So that's
the difference here. So I'm going to
leave it as premium. STEPHANIE WONG: OK. RYAN PRZYBYL: Say it's IPv4. It's got an ephemeral address. Port 80. STEPHANIE WONG: OK. RYAN PRZYBYL: That's
pretty much it. Right? So I would review
and finalize this. So once I would
click Create, it's going to take a few minutes
to create all this stuff. And it's going to have to
roll it out to all those load balancers out there. STEPHANIE WONG: Right. RYAN PRZYBYL: This
is where I didn't want you to sit and watch this
thing spin for the next 5, 10 minutes. So I'm going to go back. But I will show you this is the
load balancer that I created. Right? So we talked about how
to set up all this stuff. Right? HTTP, here's the IP that
was reserved for it. STEPHANIE WONG: OK. RYAN PRZYBYL: So we'll
grab that in a second. I set it up as premium tier. And this shows you sort
of all the back ends that are behind it. Right? So it shows you it's an instance
group, where it's running. Two out of two
instances are healthy. Right? So I set up all the health
checking when I did this. So it's constantly
polling this stuff to make sure it's healthy. CPU utilization at 60%. Max utilization at 80%. That's pretty much
the configuration. STEPHANIE WONG:
And that's the IP that you're
directing traffic to. RYAN PRZYBYL: Correct. This is the external IP address
that's facing the internet. STEPHANIE WONG: Right. RYAN PRZYBYL: Right? STEPHANIE WONG: So users
would be hitting that. RYAN PRZYBYL: Exactly. MyApp.com would sort
of use this IP address. STEPHANIE WONG: OK. And, again, a single anycast IP. That's the only one
that you would need. RYAN PRZYBYL: Yep. So when you're in this screen,
you can quickly look and look at your back ends. Right? So this is going to show you
my back end that I've built. Right? It's a back end service. There's the name of it. I can click on it
and get some details. It's an HTTP back end. This shows your
load balancer again, that actual front
end load balancer. STEPHANIE WONG: Right. RYAN PRZYBYL: So I'm going to
grab this IP at this point, copy it. Let me see if my load
balancer checked in. Here's what your config looks
like if everything is healthy. See this green check? STEPHANIE WONG: Yep. RYAN PRZYBYL: Right? One back end service,
one instance group, everything's working. Right? So you get this nice,
green check mark. Right? In that instance group
you can look here. If you click on
this, you can sort of see graphical
representations of traffic. I've got really nothing
hitting on this. But it's one instance
group back there. STEPHANIE WONG: Yeah. RYAN PRZYBYL: And if you recall,
when I set up the instance group, I set it up for two. So I was running two. Right? So as you recall,
one's the Nginx. One's the Apache. OK. Now, I'm going to
close these out. OK. So what I did is I copied. Let's see. Where is it? I'm going to copy this IP. So remember, this is the
IP of the load balancer. Go back here. Going to run that bad boy. There's my Apache. STEPHANIE WONG: OK. RYAN PRZYBYL: If I refresh
it, there's my Nginx. There's my Apache. I just keep hitting refresh. STEPHANIE WONG: Yeah. RYAN PRZYBYL: You can see it's
just bouncing between them. STEPHANIE WONG: Wow. It's pretty much like 50-50. RYAN PRZYBYL: Yeah. STEPHANIE WONG: That it's kind
of switching between them. RYAN PRZYBYL: It's kind
of going back and forth. STEPHANIE WONG: Yeah. RYAN PRZYBYL: But you
can see, in this case, I'm trying to demonstrate it. Like you wouldn't really
set it up this way. You'd want your web page
to come up every time. Right? Or your shopping cart app. STEPHANIE WONG: Yeah. But it's a
representation of how you would be balancing the load
between the number of back ends that you have. RYAN PRZYBYL: Exactly. This is a really good
way to just visualize how it's actually doing this. And I just keep hitting it. And you can just see it
continue to bounce around between these things. STEPHANIE WONG: Right. RYAN PRZYBYL: Right? STEPHANIE WONG: And
so the front end load balancer that you just
created, that's being handled by our global load balancer. RYAN PRZYBYL: Yep. The GCLB as we call it. STEPHANIE WONG: The
internal load balancer is what we first created
using the managed instance group and the template. RYAN PRZYBYL: Yeah. In this case I didn't
use the ILB at all. I just have this. This is really just a web
page sitting behind a GCLB. Right? Now, if this was like
a shopping cart app, or this was, say, a store, like
Etsy, or something like that, and I wanted to click on
things to go buy things, that would then use
the ILB potentially to dovetail to other
things on the backside. STEPHANIE WONG: Got it. RYAN PRZYBYL: But
for purposes of this, I'm just showing you that front
end, and how it's changing and load balancing the traffic. STEPHANIE WONG: Got it. Understood. RYAN PRZYBYL: Right? But if you were, like I
said, Etsy or something like that, this would always
look like Etsy's home page every time. STEPHANIE WONG: Got it. RYAN PRZYBYL: So
that's pretty much how to configure a
global load balancer. STEPHANIE WONG:
Thank you so much. That was super helpful. I know it was a highly
requested topic. Much more to talk about in
the next networking 105. So that's it for the
presentation portion of the talk. So please stay
tuned, because we're going to come back for
frequently asked questions. We'll be back in
less than a minute. Thanks again. All right. So we're back with some FAQ. The first being whether
our load balancers actually require
pre-warming to scale. Because I know that's a concern
that some people may have. RYAN PRZYBYL: Yeah. We get that question a lot. And some of that comes
from legacy products that are out there in the industry. Right? Where you've got
to define things. Right? I think the key answer
is, no, you don't have to. Right? But part of this is because,
again, we built these load balancers not as a
product for cloud to sell, but for things like our
search engine to function. So as you can imagine, if
there was a disaster someplace in the world in the next 10
seconds, I can't predict that. Right? STEPHANIE WONG: Yeah. RYAN PRZYBYL: And we
would get just tons of searches of what's
going on in country X that just happened. Or I heard something
great on the news that I need to research. Right? So it would be impossible
for us to sort of do any sort of pre warming. Right? So this is why these things
that were built as a product, they have some of those
limitations in it. But when we built our
global load balancer, this just wouldn't
have worked for us. It just wouldn't have worked
for our search engine. It wouldn't have worked for
YouTube when things go viral. Right? STEPHANIE WONG: Yeah. RYAN PRZYBYL: So the
ultimate answer is, no. You don't have to do it. But a lot of that is just because of
the Google engineering that went into this, and
how this was not developed as a product for
cloud, but rather something we use to just make Google's
infrastructure work. Right? It is that sort of
linchpin device for us. STEPHANIE WONG: Yeah. And I think you mentioned
in the presentation that we're straying away
from focusing or basing it off a DNS approach. We're using our global
front end for the external. For the internal,
it's not proxy based. It's using our software
defined networking. So that's kind of what enables
us to be able to not require the pre warming. It's not based on physical
servers, as we talked about. RYAN PRZYBYL: Yep. STEPHANIE WONG: So the next one. OK. So we mentioned premium
versus standard networking. And you talked about how premium
utilizes the Google network backbone. And it gives us higher
bandwidth and throughput. And standard is
for cost savings. So for the front
end specifically, front end load balancer,
can you do premium tier for both global and regional? RYAN PRZYBYL: Yeah. You can set up either
premium or standard tier for either type of load balancer. Right? And like I said, it's
really done to sort of cost manage traffic
more than anything. Right? Google has built this
great big network. And we want you to
be able to use it. Right? So we love for people to use
premium tier, because it just gives a better user experience,
in our view of the world. Right? But we also understand
that customers are trying to cost
optimize traffic sometimes. Right? And so that's why we created
sort of that standard tier, where you can say, hey, Google. You have this great backbone. But it costs more
money than I want to spend on this traffic. So dump it off to the internet
as quickly as you possibly can. And let it sort of go
through the internet, and go through all of the
cost structure that is very complex in terms of that. Right? STEPHANIE WONG: Yeah. Depending on the use case. RYAN PRZYBYL: Yeah. But you can set it up in a
lot of different fashions. You'll see premium
versus standard in a lot of the load
balancer configurations. Right? Specifically, the
external facing ones. Because the internal
ones, remember, it's all just
riding on Google's network. It's meant to be inside
of what you can see. So it's all riding
internal to Google. STEPHANIE WONG: Right. RYAN PRZYBYL: It's not something
that ever touches the internet. So as I said early on
in the first slide, you can think of
internal services versus external services. The premium versus standard choice only
applies to external load balancing services. STEPHANIE WONG: OK. Understood. Great. Next one. So the load balancer
is high availability, or highly available. What can you do to increase the
availability of your back ends? RYAN PRZYBYL: So this is
where the managed instance groups come in. Right? This is the whole point of
the managed instance groups. Right? You're building highly
available and highly redundant managed instance groups. Right? So as I was talking
about with the GCLB, you could build managed instance
groups into every region, and it'll find the closest one. Right? And that's what gives you
that sort of highly available, highly scalable-- I like to talk about
Black Friday as a good use case for this. Right? If you're in the
retail industry, if you do anything
with purchasing, that is the day you
have to engineer around. STEPHANIE WONG: Right. RYAN PRZYBYL: Right? Everybody engineers
around Black Friday. But that doesn't make sense. Right? Why do you want to
run 1,000 servers, or why do you want to run 1,000
load balancers that get used, call it, one week a year? STEPHANIE WONG: Yeah. RYAN PRZYBYL: Right? So that's where the managed
instance groups come in. It gives you that scalability
and that protection from a highly
available standpoint where if something fails,
we spin it back up. Because you're letting Google
manage that infrastructure. Right? STEPHANIE WONG: Yeah. RYAN PRZYBYL: So
that whole concept of using a managed
instance group is really designed around making
highly available back end infrastructure. STEPHANIE WONG: Awesome. Great. OK. I think this is the last one. How about connection draining
for back end instances to reduce disruption
to end users? Like let's say you want to move
from one back end to another without any kind of poor
experience for the user. RYAN PRZYBYL: Yep. So this is another
common question. Right? So when we do auto scaling
and things scale up. So let's use our
Black Friday example. Right? Midnight happens. A sale goes on. And you get bombarded
with all this traffic. So you scale up to 10
servers on the backside. Right? There is a variable
in the configuration that I didn't show
that basically says, what is the draining period. Right? So I think it defaults
to 300 seconds. STEPHANIE WONG: OK. RYAN PRZYBYL: So
what's going to happen is, as traffic
starts to ramp down, so let's say maybe 4
o'clock in the morning, you got your initial barrage. You scaled up to
meet that demand. Now things start to
sort of fall off. Right? When it sees the utilization
go down, what it's going to do is it's going to say, OK. No new connections. I'm
going to drain this back end. Right? So I'm not going to put any
new connections on there. I'm going to wait 300
seconds before I take it out of the pool. Right? STEPHANIE WONG: OK. RYAN PRZYBYL: So
let sessions finish. Let sessions drain off of it. And then, because it's not
putting any new ones on it, it's putting new ones
over here, once it's drained, it'll pull that one down. And it'll keep doing
that and scaling down as your load decreases. STEPHANIE WONG: That's great. Because then they don't
experience an abrupt cutoff, and then we're still kind of
spinning up this instance. It's directing traffic. RYAN PRZYBYL: Yeah. It's very intelligent in terms
of like it knows, oh, I'm in a cool down period where I
need to scale this stuff back. And it does so in a
very intelligent way. It doesn't just say, oh,
I'm going to go back to two. Dump all these other five
that I spun up, and strand all these customers. STEPHANIE WONG: Yeah. So it's slowly redirecting
traffic and users to the other one. RYAN PRZYBYL: Yep. STEPHANIE WONG: That's great. RYAN PRZYBYL: And
like I said, you configure that
sort of stuff when you're setting this stuff up. STEPHANIE WONG: Awesome. This is super helpful. And thank you so
much again, Ryan. Thank you, everyone,
for tuning in today. And again, don't forget. Please visit the
Cloud OnAir website for more content from
Google Cloud experts. And we'll see you
again next time. [MUSIC PLAYING]
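
The connection draining behavior described in the FAQ above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how the Google Cloud load balancer is implemented: the `Backend` and `Pool` classes, the least-connections picking rule, and the `now` timestamps are hypothetical names invented for this sketch, and the 300-second timeout is simply the figure quoted in the talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

DRAIN_TIMEOUT_S = 300.0  # figure quoted in the talk; treated as an assumption


@dataclass
class Backend:
    name: str
    active: Set[str] = field(default_factory=set)  # in-flight connection ids
    draining_since: Optional[float] = None         # None means "in rotation"


class Pool:
    """Toy model of a backend pool with connection draining (hypothetical)."""

    def __init__(self, backends: List[Backend],
                 drain_timeout_s: float = DRAIN_TIMEOUT_S) -> None:
        self.backends = list(backends)
        self.drain_timeout_s = drain_timeout_s

    def connect(self, conn_id: str) -> str:
        # New connections only go to backends that are not draining.
        # Least-connections is a simplification chosen for this sketch.
        candidates = [b for b in self.backends if b.draining_since is None]
        if not candidates:
            raise RuntimeError("no backend available for new connections")
        target = min(candidates, key=lambda b: len(b.active))
        target.active.add(conn_id)
        return target.name

    def close(self, conn_id: str) -> None:
        # A session finishing on its own is what lets a backend drain early.
        for backend in self.backends:
            backend.active.discard(conn_id)

    def start_drain(self, name: str, now: float) -> None:
        # Mark the backend as draining; existing connections are left alone.
        for backend in self.backends:
            if backend.name == name and backend.draining_since is None:
                backend.draining_since = now

    def reap(self, now: float) -> List[str]:
        # Remove draining backends that are idle or whose timeout has expired.
        removed = []
        for backend in self.backends:
            if backend.draining_since is None:
                continue
            idle = not backend.active
            expired = now - backend.draining_since >= self.drain_timeout_s
            if idle or expired:
                removed.append(backend.name)
        self.backends = [b for b in self.backends if b.name not in removed]
        return removed


if __name__ == "__main__":
    pool = Pool([Backend("vm-a"), Backend("vm-b")])
    print(pool.connect("c1"), pool.connect("c2"))
    pool.start_drain("vm-b", now=0.0)
    print(pool.connect("c3"))      # lands on the backend still in rotation
    print(pool.reap(now=100.0))    # vm-b still has a live session
    print(pool.reap(now=300.0))    # timeout reached, vm-b is removed
```

The point of the model is the ordering Ryan describes: stop sending new connections first, let existing sessions finish, and only then take the instance out of the pool.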