[UPBEAT MUSIC PLAYING] PRAJAKTA JOSHI: Hi everyone. It's a pleasure to be here
at Next with all of you. I'm Prajakta. I'm a product manager
in Cloud Networking. MIKE COLUMBUS: Hey everyone
thanks so much for being here. I'm Mike Columbus. I am a manager in the
Network Specialist team. So we work with
customers like yourselves for all things networking on
your journey to Google Cloud. PRAJAKTA JOSHI: So
what we wanted to do was to start this talk by
introducing you to our Cloud Load Balancing family. We are going to spend most
of this session getting to know five members: global HTTP(S) Load Balancing, SSL Proxy, and TCP Proxy, and then we've got two other flavors that are regional, which are Network LB and Internal L4 LB. And one of the things that many
of you know but some of you may not, is that these load
balancing flavors actually underpin a lot of things,
especially load balancing on the Compute Engine side, or whether it's Cloud Storage, or the GKE side, which is the managed Kubernetes side, or even self-managed Kubernetes on GCP. So we thought, let's start
with the basics, which was the first flavor that we have. And then Mike, give us a tour
of Network Load Balancer. MIKE COLUMBUS: Sure, thanks. So around two years
ago, we published a paper that revealed
the architectural details of the Maglev. Now, the Maglev has
been around for a while. It's our in-house
developed, all software load balancer that runs on
standard Google hardware. This load balancer
has been handling the majority of Google's
traffic since 2008 and is a key part
of our architecture. Maglevs handle traffic as it comes into our data centers. They direct traffic across
our front end infrastructure. And they're also exposed
to you as our Network Load Balancer for Google Cloud. So when you create a Network Load Balancer, we use forwarding rules that direct TCP and UDP traffic, using hashing, across regional back ends that can be autoscaled. So it's a network-based
load balancer. We look at the source and destination IP, port, and protocol. And you can, you
know, manipulate the hashing algorithm
as you see fit. When you create a
forwarding rule, you get a VIP, or a
Virtual IP address, that originates from a regional network block in the region where you tie that forwarding rule to your back ends. So this VIP is anycast out
our global points of presence, but the back ends are regional,
meaning you cannot have back ends that span multiple regions. This is a pass
through load balancer, so your back ends are going
to see the original client request. It's not doing any TLS
offload or proxying. So traffic gets directly
routed to your VMs. And that's where you
can use the Google firewall to control or filter
access to the back end VMs. So in this diagram, you see we
have three regional back ends, three forwarding rules. We have Maya in California,
Bob in New York, and Shen in Singapore. And they're all connecting into
their back end resources--my app, test, and travel. But if Shen were to connect into the US-West back end, again, we would ingress that traffic closest to Shen in Singapore, because that range is anycast out our network. And then we would
route that traffic to the regional back end. So let's take a little
bit of a deeper look under the hood of the Maglev. So on the left here, you see a
traditional middle proxy or edge proxy load balancing design, which is typically deployed in active-standby pairs, or highly available pairs. They're scaled up as
more capacity is needed. And typically, if you have
massive spikes in traffic, this needs to be
pre-warmed to prepare for the influx of traffic. On the right hand, you can
see the Maglev architecture. So again, this is an all
software-based solution that runs on our
standard servers and it's horizontally scalable. The way this works is,
in a traditional sense, a load balancer is primary
or active for its VIP, right? In the Maglev architecture,
we announce all VIPs to our edge routers via ECMP. So we'll distribute
flows based on hashing-- so equal cost routing-- to our Maglevs. The Maglevs will then
use consistent hashing. What this means is that as Maglevs, or load balancers, are added to the pool, we'll get consistent results when that back end is selected. The Maglevs then encapsulate that traffic based on the virtual network ID for the customer and send it to the back end service. And then we use direct server return, so the back ends send the return traffic directly to our edge routers. So the Maglevs are not in the path for the return flow.
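To make that consistent-hashing property concrete, here is a minimal Python sketch using rendezvous hashing over a connection 5-tuple. It only illustrates the idea of stable back-end selection as the pool changes; it is not Google's actual Maglev lookup-table algorithm, and the back-end names are made up.

```python
# Minimal sketch of consistent hashing over a connection 5-tuple.
# Illustrative only; not Google's Maglev algorithm.
import hashlib

BACKENDS = ["maglev-1", "maglev-2", "maglev-3"]

def five_tuple(src_ip, src_port, dst_ip, dst_port, proto):
    return f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}"

def pick_backend(flow, backends):
    # Rendezvous (highest-random-weight) hashing: every backend scores the
    # flow and the highest score wins.  Adding or removing one backend only
    # remaps the flows that backend would have won, so most existing
    # connections keep hashing to the same machine.
    def score(backend):
        return int(hashlib.sha256(f"{backend}|{flow}".encode()).hexdigest(), 16)
    return max(backends, key=score)

flow = five_tuple("198.51.100.7", 52311, "203.0.113.10", 443, "tcp")
print(pick_backend(flow, BACKENDS))                 # stable pick for this flow
print(pick_backend(flow, BACKENDS + ["maglev-4"]))  # most other flows unaffected
```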
So I want to recap what features we have with our Network Load Balancer. Again, it's a regional solution. So we have high availability by deploying your back ends in multiple zones. It does not look at Layer 7; it load balances TCP and UDP. The back ends will see the
original client request. So there's no need to pass
an X-Forwarded-For header or use proxy protocol to
inform the back ends of what the original client request was. And you can control this
access with our VPC firewall. We balance based on
2-, 3-, or 5-tuple hashing. And you can get affinity by using 2- or 3-tuple hashing, so that if you hash on source IP and destination IP, a given client will always choose the same back end. We health check the back ends. And this is a very high
performance solution. So we published a paper. We handled 1 million requests
per second with no pre-warming. As for what it does not provide: there are no global back ends for the Network Load Balancer. It also only supports IPv4, so there's no IPv6 support. There's no Layer 7 traffic routing, and no TLS termination or offload. So with that, I'll hand
it over to Prajakta. PRAJAKTA JOSHI: Thanks, Mike. So that is how-- so we had
the network load balancer. And then, given
all of the reasons that Mike spoke
through, that's how we decided to go on and
build a Global Load Balancer. Now, most of you are familiar
with the kind of load balancing that you see in
other public clouds. And the load balancers
are regional. And what that means
is, per region, you need a virtual IP
for your load balancer. And then your back ends are
simply in that region as well. So now imagine you
wanted to set up your service in three regions. And then you would need
to go put three VIPs. And then, obviously, you
need a DNS Load Balancer or a Global Server Load Balancer
to go pick one of those VIPs for your given client. And this has actually
several issues. So as an example, if
something were to go down, first of all, your
DNS load balancer has to go and understand
something went down. The second thing is your
local DNS to the client could go and cache the
IPs, and that could result in suboptimal selection. And the other thing
is, imagine you ran out of capacity in one region. The regions are silos. So the load balancer in
one region cannot leverage the capacity in another region. This obviously would not have
worked for Google Services. And so this is when
we thought, why don't we rethink how global
load balancing is done? And the idea was to not depend
on the DNS load balancer to go do load balancing for you. So now imagine,
first of all, you don't build out
the load balancer using managed VMs, or
instances, or servers. Instead, we take all of Google's global edge infrastructure and we put the load balancing logic there. And this was along with a
bunch of different systems that interoperate to do load
balancing for your clients. Which means it doesn't matter
where you put your application instances, you get a
single front end VIP. You get worldwide
capacity because now that VIP is fronting
all of your capacity across all of your deployments. And then you get
cross region failover and so on and so forth. And one of the really nice
things about the Load Balancer is that it's very tightly
coupled with our auto scaler, which means
as your traffic grows, it will scale up your
back end instances. And then as your traffic ebbs,
it'll scale them down as well. It's also-- one really
nice side effect of it is the VIP becomes,
or our global VIP becomes a very nice place to go apply
all your global policies. And the other thing is you
don't have a choke point, because these are not just
point implementations with VMs and so on. So a really nice
way to think about, you know, what does this
Global Load Balancing really look like? So often we have customers
who will start off with a single region and they
will go deploy back ends there. They will configure
the load balancing VIP. Now, it's very likely
that as your service grows in popularity, you will
need to bring up application instances in another region. And so as you can see here, once
Bob in New York comes on board, all you need to do is go
add application instances in a region in US-East. You don't need to
change the VIP. You don't need to put a DNS
load balancer, because you still have only one VIP. In this case, it's 200.1.1.1. And then you can
just expand wherever Google has data centers by
simply putting instances there. But as you can notice, there
was no change to the VIP. There was no change
to your DNS server. And you definitely did not
need any of the things related to DNS load balancing. One of the other things
which I spoke about was the capacity that you get. So in this case, for
instance, let's say your instances in
Singapore are overloaded. So Shen in Singapore would
be very seamlessly directed to any of the other
available instances. And it's also possible that
the California instances for your app are overloaded. And so your users could be
overflowed back into New York. And once the capacity
is available-- because, basically, the
autoscaler scaled up-- then the clients would
be brought back. So a lot of people ask us,
like, what does this look like under the hood? So because of all of our
mental models of load balancers are these boxes,
this is actually what the global load balancer
looks like under the hood. We have a tiered edge. So we do part of the load
balancing and termination at our Tier 1. And then we do some
of the load balancing, the remainder of the load
balancing logic at the Tier 2. So that's what you see, the
two tiered architecture. And then your
application instances are in the last, which
says instances there. That's where you would be
running your application software. The other thing
that a lot of people ask us is, so you have
all of these instances, and how does the load balancer
go distribute traffic? What it does is it fills
up the local region to capacity for a given client. Then it'll fill up basically
all the zones within a region. Then, if you're out of
capacity in a region, it will overflow to
the closest region. And then, within the zone
itself, it round-robins. You can totally
forget all of this because you don't need
to worry about this. This happens automatically. But the key takeaway is as long
as you have available capacity in the region that's
closest to your user, your user will land there. You will overflow or fail over only if there's some issue--a capacity
issue, or availability issue in the region
closest to the user. The other thing that
we wanted to walk you through is, this is
what's under the hood. So what does the data model
for the Global Load Balancer look like? So you have essentially the VIP, which is the global forwarding rule. Then you have the proxy. So if you're running HTTP traffic, your proxy will be HTTP, or it's HTTP(S), or SSL, or TCP, and so on and so forth. Then you've got the back end side of it. This is where you're actually putting your application instances. And so a service maps to a back end service. And then that could have application instances within instance groups in one or more regions, depending on where your users are.
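As a rough sketch of how those objects chain together, the following Python outlines the gcloud calls that create each layer, from health check up to the global forwarding rule (the VIP). Resource names are hypothetical and the flag set is trimmed to the essentials; treat it as an outline rather than a production script.

```python
# Minimal sketch of the global HTTP(S) load balancer object model, expressed
# as the gcloud calls that create each layer.  Names are hypothetical.
import subprocess

def run(*args):
    print("+ gcloud compute " + " ".join(args))
    subprocess.run(["gcloud", "compute", *args], check=True)

# Back-end side: health check -> backend service -> instance group backends.
run("health-checks", "create", "http", "web-hc", "--port=80")
run("backend-services", "create", "web-backend-service",
    "--protocol=HTTP", "--health-checks=web-hc", "--global")
run("backend-services", "add-backend", "web-backend-service",
    "--instance-group=web-ig-uswest1", "--instance-group-zone=us-west1-a",
    "--global")

# Front-end side: URL map -> target proxy -> global forwarding rule (the VIP).
run("url-maps", "create", "web-map", "--default-service=web-backend-service")
run("target-https-proxies", "create", "web-proxy",
    "--url-map=web-map", "--ssl-certificates=www-cert")
run("forwarding-rules", "create", "web-vip", "--global",
    "--target-https-proxy=web-proxy", "--ports=443")
```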
So this was a whirlwind tour. But what I wanted to do--Mike, why don't you take us on a tour of this through a demo? MIKE COLUMBUS: I'd be happy to. Thanks, Prajakta. So today we're going
to be interfacing with an HTTP(S) Global
Load Balancer Deployment. It's live up and running. And I wanted to
take a few minutes to talk through the components
and how they're configured, and really solidify the object
model that Prajakta just spoke of. And what's really cool is if
you're one of the select few who are online during this
conference on the network, you can connect to this fancy
front end we built for you. So if we can switch to the demo. All right. So here we're in
Google's Cloud Console. We're going to navigate
to Network Services. And this is where
we'll configure load balancing DNS and CDN. In this exercise,
we'll be taking a look at our load balancers. You can see we have
three in this project. The first two
we'll get to later. The third one is what we're
going to go through today. This is an HTTP(S)
Load Balancer. So we're going to drill
in a little bit on what's configured. So in the Cloud Console, it
really breaks this object model down into three components. The first is the
front end, right? Host and path rules,
which is basically like, we'll look at the URL
request and route that to the appropriate
back end service. And then the back end service. I'll go through each component
in its configuration here. So we just spent a bunch of
time talking about a single VIP. There's two here. You might be saying,
Mike, what gives? It's because the
first VIP is for IPv4. The second VIP is for IPv6. So basically, for the domain
name, www.gcpnetworking.com, we're going to take
in HTTP(S) traffic and direct this to
our target proxy. So when you create an HTTP(S)
front end global forwarding rule, you need to
have a certificate. So we support SNI. But I kind of cheated here. This is a little
bit of a sneak peek. I used a Google-managed cert. So this is going to manage the
lifecycle of the cert for me-- super easy. I just, you know, when
I deployed the cert, I configured the domain. And using Cloud
DNS, I pointed an A and quad A record to these IP
addresses, and it deployed. SSL policy--we'll get to that in a little bit. Further on in this talk, we're going to additively do some additional things to this environment. We'll secure the environment. And the network tier is Premium. This is our default tier. This means that
these VIP ranges come from a global pool of
addresses that will anycast out our global pops, right? So kind of back to the DNS thing
that Prajakta mentioned, right? One A record, one quad A record,
route clients via anycast, just BGP routing to
the closest front end. So from there, we'll talk
about host and path rules. So in this environment,
it's pretty simple. You don't need to configure
these for the HTTP(S) Load Balancer. It's kind of important, right? That's why you might want
a Layer 7 Load Balancer. What these do is that we
look at the incoming URL and direct the traffic to
the appropriate back end service based on this matcher. So you can do path matches. In this environment,
it's pretty simple. Let me zoom in here further. Make it a little easier to see. So anything unmatched will go
to our Web back end service. Anything slash secure is going
to go to our Secure back end Service. The back end
services themselves. So these are made up of a
collection of instance groups. The instance groups
can be deployed in any one of our 20 regions. In this environment,
I picked US-West 1-- it's fairly close--
and Europe-West 1. But in addition to adding
your instance groups into your back end
service, it's also a collection of configuration. So you're going to want to group
your applications appropriately into back end services
for this reason. So the first thing that's
configured in the back end services is your
endpoint protocol. So we pitched TLS everywhere. I was a little bit lazy. I did HTTP. But it's fairly common
to use self-signed certs in the back end
because, you know, your clients are connecting
into the front end proxy. You'll notice the named port. Where this is important--
and this is a configuration that's part of the
instance group-- is that this is
basically a key pair. So if you wanted the Load
Balancer to direct traffic to your instances on the
back end on, you know, a nonstandard port, you'll set
a named port in the instance group, associate
that to the port that you want to
receive traffic on. And the back ends will receive
traffic from the front end or from the Google Load
Balancer on that port.
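A minimal sketch of that key pair, with hypothetical group and service names--the instance group maps the port name to 8080, and the backend service refers to the name rather than the number:

```python
# Named ports: the instance group maps a name ("http") to a serving port,
# and the backend service targets the name.  Names are hypothetical.
import subprocess

subprocess.run([
    "gcloud", "compute", "instance-groups", "set-named-ports",
    "web-ig-uswest1", "--zone=us-west1-a", "--named-ports=http:8080",
], check=True)

subprocess.run([
    "gcloud", "compute", "backend-services", "update", "web-backend-service",
    "--port-name=http", "--global",
], check=True)
```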
or long lived HTTP connections, you're probably going
to want a timeout of greater than 30 seconds. This is a fairly static page so
I didn't muck with this at all. We enabled Cloud CDN. It's a simple check box in your
back end service configuration. So if you had, you
know, dynamic content that you didn't
want to cache, you could leave it
off for a back end service associated with that. And then, you know, enable
it for static content. The other thing I
forgot to mention is in your host
and path rules, you can also direct to a back end service or to a back end bucket. So I wanted to just
mention that real quick. The back end
service health check is configured on the
back end service. So we're using a basic HTTP
check for our back end services to make sure that
they're healthy. And then there's some
advanced configuration. So we can configure session
affinity and the connection draining timeout--so basically, if we're going to be removing a VM, how long do we want to keep those existing connections active. We can associate
a security policy. It's a little bit
of a teaser there. We'll do that later. And also identity aware
proxy, which we'll talk about. The other thing is, you can
also add custom headers. So if we wanted
to insert a header or have the HTTP(S)
Load Balancer inform us of say, geolocation
or some custom header, we can insert those on a per-back-end-service basis. So this back end
service consists of three instance groups. You'll see everything's
healthy, which is great. You've got to love live demos. For back ends, we have an instance group in Europe-West 1 and US-West 1. You'll notice we
set the auto scaling to 80% of our target, which is
300 requests per second per VM. So auto scaling
will be triggered, in this case, 80% of
300, which is 240, assuming that capacity is 100%. This is an additional
knob that's really important for
Canary or A/B releases. So you might notice this new
web, you know, instance group. This is really like if
I want to spin up a test and direct a subset of traffic
to this instance group. I set the capacity for 10%. So 10% of 300, so about
30 requests per second will go to this
new instance group. So it's great for canary releases and additional control.
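A minimal sketch of those knobs, using hypothetical group names--RATE balancing at 300 requests per second per VM, with the capacity scaler dialed to 10% on the canary group:

```python
# Backend capacity knobs from the demo: RATE balancing mode with a per-VM
# max rate, and a capacity scaler for the canary group.  Names are made up.
import subprocess

def add_backend(group, zone, capacity_scaler):
    subprocess.run([
        "gcloud", "compute", "backend-services", "add-backend",
        "web-backend-service", "--global",
        f"--instance-group={group}", f"--instance-group-zone={zone}",
        "--balancing-mode=RATE", "--max-rate-per-instance=300",
        f"--capacity-scaler={capacity_scaler}",
    ], check=True)

add_backend("web-ig-uswest1", "us-west1-a", 1.0)  # full-capacity backend
add_backend("new-web-ig", "us-west1-a", 0.1)      # canary: ~10% of its rate
```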
Now, the next thing I want to do is contrast the load balancing configuration with the
configuration of an instance group itself. So we'll pick an instance
group, which is US-West 1. And let me just zoom
out a little bit. And we'll edit this group. So in the instance
group itself, here is where you specify
the key pair. So we're mapping
HTTP to port 80. We could do, you know, 8080 on
the back end if we wanted to, or 8443. This instance group is built
off a template, which basically calls in a custom
image, trying to keep the boot time relatively short. We have auto scaling on based
on HTTP(S) Load Balancing usage. We can also use CPU
usage or custom metrics. But we want the auto scaler. Because we see the
requests per second, we want the auto scaler
to trigger auto scaling. We have a minimum of one
instance, a maximum of 10. And the cool down period
is relatively important. What this means is
it's the time we're going to wait before we collect statistics on a new instance. So if you have long-running
startup scripts, you're going to want to know
how long your instance takes to boot and set appropriately
in your cool-down period.
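A minimal sketch of that autoscaling setup, with a hypothetical group name; the 60-second cool-down here is an assumption, so set it to match your actual boot time:

```python
# Autoscale a managed instance group on HTTP(S) load balancing utilization,
# 1-10 instances, 80% target.  Group name and cool-down are placeholders.
import subprocess

subprocess.run([
    "gcloud", "compute", "instance-groups", "managed", "set-autoscaling",
    "web-ig-uswest1", "--zone=us-west1-a",
    "--min-num-replicas=1", "--max-num-replicas=10",
    "--target-load-balancing-utilization=0.8",
    "--cool-down-period=60",
], check=True)
```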
The other thing is auto healing. So auto healing is the ability to have an additional health check that will recreate
the VM if it goes unhealthy or it's unresponsive. Typically, your auto
healing health check, you want to relax a
little bit because it's a more expensive
operation to reboot a VM or rebuild a VM versus
just taking it out of the healthy available
pool for your back end. But that's there. And it's turned on here with
an initial delay of 30 seconds. This is similar to
the cool down period. We're going to wait 30
seconds before we health check this back end VM. So we've gone through
the configuration. Another thing I have set
up here is some load. So I want to show just a little
bit of monitoring our Load Balancer. So jump back in here. So here's our web
back end service. I have two load generators,
one in North America, one in Europe. Both are sending about
1,000 requests per second. And what this graph shows
us is that traffic arriving at our North American front end locations is always being directed to our US-West 1 back end service, or instance groups. Now, this goes to the
distribution, right? We're always going to select
the healthiest back end with available capacity. You can see a small
subset of traffic is being directed to
our Canary release site, which is our new web
back end instance group. And then in Europe,
all those requests are being served by the
Europe-West 1 instance group. And then the last thing
we'll do is, let's just-- we can connect to the demo site. So hopefully some of you
were able to access this. So it's a pretty
simple web page. The client IP is the
original client IP so originating from
probably some NAT Gateway here at the Moscone. Load Balancer IP is
the Load Balancer VIP. The proxy IP is
basically the connection, so the actual IP address that
the VM back end will see once that connection is proxied. And these are known
ranges of addresses that you're going to want to
allow through your firewall, your VM firewalls. You can see the host name
we're connecting into. We're in San Francisco. We're connecting to US-West 1. I'd be a little alarmed
if this was Europe. And you can see the
region and zone. So with that, let's
jump back to the slides. Let's talk a little bit
about container networking. So for those running
containerized workloads on Google Cloud, I
want to talk about how we've enhanced the experience
and really made containers first class citizens with
regard to our Load Balancing offerings. So I'm really excited to talk
about Network Endpoint Groups. So what Network
Endpoint Groups are, they're a collection
of network endpoints. So fundamentally, this
maps to a port and IP pair. So as a precursor to
Network Endpoint Groups, VMs can have multiple
ranges associated to them. So we use alias IP ranges. And in that sense, a VM can
host multiple containers with VPC routeable IP addresses,
along with serving ports. So now, with Network
Endpoint Groups, we can target these IPs
along with the serving port, as an endpoint, and health check and load balance accordingly--much more accurately than without Network Endpoint Groups. So the diagram on
the left here-- I apologize, it's
a little choppy-- shows a Kubernetes deployment
without Network Endpoint Groups. And one of the
things you will see is that the Load Balancer
targets nodes, not endpoints, not containers themselves. So what happens is traffic
arrives at the Load Balancer. We select a back end node. And then, with
Kubernetes Engine, we have kube-proxy, which basically configures a bunch of iptables rules almost as a secondary load balancer to distribute
traffic to the back end pods. So depending on how many
pods-- so let's assume you have an equal distribution
of pods across your nodes. Well, you have only a 1-divided-by-n chance, where n is the number of nodes, that the pod selected is on the local node your traffic arrives at, so most traffic gets forwarded on to another node. So in this case, traffic
comes into the node. iptables will do some masquerading and send that traffic to a pod located on another node. It also source-NATs. So you have these extra hops. Then the traffic returns
from the pod back to the original node, and
then back to the Load Balancer and then back to the client. So it's fairly suboptimal. Now, you can turn
on a mode which is known as Local
Only, which will just target the pods
local to the node that the traffic arrives on. But even then, you're not going
to get an even distribution. On the right hand diagram, we
show Network Endpoint Groups. In this model, the load
balancer sends traffic directly to the port and IP
pairs for the containers. This is enabled with a
simple annotation, right? This was an alpha feature. But with one simple
annotation, you can eliminate a lot of hops. And I'm going to show a
quick demo on this later.
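A minimal sketch of that annotation on a GKE Service--the cloud.google.com/neg annotation used for container-native load balancing. The Service and app names are made up, and the manifest is emitted as JSON, which kubectl apply -f also accepts:

```python
# One-line change that turns on container-native load balancing (NEGs)
# for a GKE Service: the cloud.google.com/neg annotation.
import json

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "neg-demo-svc",
        "annotations": {
            # Ask GKE to create zonal Network Endpoint Groups so the HTTP(S)
            # load balancer can target pod IP:port pairs directly.
            "cloud.google.com/neg": json.dumps({"ingress": True}),
        },
    },
    "spec": {
        "type": "ClusterIP",
        "selector": {"app": "neg-demo"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

print(json.dumps(service, indent=2))  # pipe to: kubectl apply -f -
```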
The next bit I want to talk about is multicluster ingress.
application running on Kubernetes Engine
in different regions-- so you have different
regional clusters setup but the same
application-- you can now deploy a
multicluster ingress. Which basically, you can
register your container clusters to and create
a unified ingress that can load balance client traffic
using all that global load balancing magic to the
appropriate container cluster. So this is an enhancement to our kubectl interface. So with kubemci, you can create this new ingress that will map multiple clusters
to a single Load Balancer. All right. So here's the demo
environment again. I want to show Network
Endpoint Groups. And I'll prove this. There's no smoke and
mirrors going on here. So I have one cluster, which
is Cluster A. That is deployed with Network Endpoint Groups. So we have a simple
web application that's just returning its host name. We have load
generators, so we're going to blast a bunch
of traffic to it. And we're going to observe some
monitoring metrics on this. The other cluster is deployed
in the same zone, US-West 1A-- same number of nodes,
same application, same load testing
characteristics. So we're sending 50
concurrent clients connecting into this environment. But this is not using
Network Endpoint Groups. All right. So let's jump back
into our demo. All right. So back in the Cloud
Console, I just want to give you a quick
tour of the environment so you can't call me out later. So here you can see
our two clusters. They both have three nodes. They're basically identical. As far as the deployments
we have in these clusters, you can see our Network
Endpoint Group demo app has five replicas in our
Cluster A, which is our Network Endpoint Group cluster. The non-NEG or, you know, the
traditional app is running in Cluster B. And then if we look
at the services, both of these clusters are,
again, running five pods. And then we have ingress of
the HTTP(S) Load Balancer. So if we jump over to the
HTTP(S) Load Balancer screen, you'll see that in our network
endpoint group cluster, you notice that the back end
service is targeting a network endpoint group. So these are built zonally. So if we had a
regional cluster, you would see a network
endpoint group per zone. But again, these are
port and IP pairs. And if we jump back to
the non-NEG cluster, you can see that
the Load Balancer is targeting the nodes, right? So you know, IP tables is
involved, masquerading packets. So if we look at
Stackdriver monitoring, there's a couple of
interesting observations. So I'm capturing or
charting two metrics. The first is
requests per second. And the other is
aggregate network traffic on the back end nodes. So even though the load
testing is the exact same, you'll notice we're
able to send about 10% more connections to the
Network Endpoint Group cluster. I think this is-- don't take my word for
this-- but I believe this is because the transactions
are finishing faster so the performance-- the traffic generator is
able to pump more traffic to the Network
Endpoint Group cluster. This is key here, though. You'll notice that the
Network Endpoint Group cluster has about half
of the aggregate traffic of the non-Network
Endpoint Group cluster. So this is substantial, right? This is all a factor of
eliminating those hops. But the other thing
is, like, imagine if you're running a
regional cluster where you have nodes in different zones. This could amount
to huge cost savings because you're not getting
charged zonal egress between the nodes, right? So pretty quick demo there. Hopefully that was helpful. I'll hand it back
over to Prajakta. PRAJAKTA JOSHI:
Great thanks, Mike. So what we did is
so far, we wanted to provide you a very
close look at Global Load Balancing and then our
Container Load Balancing. The next half of this
presentation we'll just quickly walk you
through all of the features. Many of these actually
have their own sessions. But the idea is
that you're exposed to a lot of these features. So the first one of those
is Internal Load Balancing because not every
service of yours needs to have a public-facing VIP. So that's the first one. You can see a very traditional
internal Load Balancer where you've got a
client in RFC 1918 space, Load Balancer VIP in RFC
1918 space, and a back end. This is also a traditional
3-tier app where the first tier is essentially public facing. And then the internal
load balancer is used for rest of the tiers. The most interesting
bit, actually, about the internal load
balancer is that there is really no internal load balancer. So between your clients
and your back ends, there is really no middle
proxy sitting and doing load balancing. We have Andromeda, which is our
network virtualization stack. And then that programs the client, along with health information, so that the client itself can
go pick the optimal back end. So that way there's
no choke point in your network with the internal load balancer.
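A minimal sketch of standing up an internal load balancer--region, subnet, and resource names are hypothetical placeholders:

```python
# Internal TCP load balancer: regional backend service plus an RFC 1918
# forwarding rule with the INTERNAL scheme.  Names are illustrative.
import subprocess

def gcloud(*args):
    subprocess.run(["gcloud", "compute", *args], check=True)

gcloud("health-checks", "create", "tcp", "ilb-hc", "--port=80")
gcloud("backend-services", "create", "ilb-backend-service",
       "--load-balancing-scheme=INTERNAL", "--protocol=TCP",
       "--health-checks=ilb-hc", "--region=us-west1")
gcloud("backend-services", "add-backend", "ilb-backend-service",
       "--instance-group=app-ig", "--instance-group-zone=us-west1-a",
       "--region=us-west1")
gcloud("forwarding-rules", "create", "ilb-vip",
       "--load-balancing-scheme=INTERNAL", "--region=us-west1",
       "--network=default", "--subnet=default",
       "--ip-protocol=TCP", "--ports=80",
       "--backend-service=ilb-backend-service")
```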
The other thing is, you know, when you start with private
services, the next set of things that you're
going to go tackle is essentially securing
the edge, because all of your internet-facing services
are going to get DDoSed. And these would be either
infrastructure DDoS attacks, or these would be even
application level attacks. So we have a bunch. Like, when you actually
secure your network, you want to do it all the way
from VPC to your edge network. And we have several great
sessions on each of these. One thing I want to point
out is one best practice is to put TLS wherever you can. We don't charge extra for TLS,
so that should be your default. The second thing is, if you have
this need for multiple certs, we have a feature that lets you
put multiple certs on a given load balancer. And then many of you
complained about the toil of procuring certs. So we do have managed
certs which are in alpha. They will be available in
beta this year at NEXT. [APPLAUSE] The other thing is, many of you
said, can I turn off TLS 1.0 because I need to pass
my PCI compliance test. So we also have custom SSL
policies available for you. Either you can outsource-- you
can select what kind of policy you want, and you can
outsource management to Google. Or then you can go create
your own custom ones if your IT department
has certain regulations around those. The other feature that
I also want to call out is as a best
practice, you should be turning on Global LB with
TLS, all of the features we've talked about, Cloud Armor. And then if you are
controlling access based on user identities,
you would also layer in Identity-Aware Proxy. So all of these work
together pretty well. And then also, if
Cloud Armor says drop traffic and Identity-Aware
Proxy says allow traffic, it's a combined thing. So it will essentially
drop the traffic. So they all work pretty
nicely along with each other. What I wanted to do now is--and of course, we're not going to deep dive into the features
of Cloud Armor. We had several
sessions on those. It goes all the way
from IP blacklist, whitelist to sophisticated
Layer-7 type defense. But we wanted to do a
quick demo with Mike. MIKE COLUMBUS: All right. Back again. So today for this
session, we're going to be tying some security
policy to our existing demo that we just went through. So I pinned network
security here. So we're going to be configuring
an SSL policy as well as a Cloud Armor policy. And I'll show you how that gets
attached to your Load Balancers to provide additional
protection. So we'll start with Cloud Armor. You can see I have a Cloud
Armor Security Policy here. This is a simple policy, but you
can always add to these later. So this is titled, Allow
Conference Net Block. So the idea here is that this
is an IP Whitelist Blacklist policy. So the rule set
is fairly simple. We're going to allow any-- so I got the Moscone
network ranges. Hopefully these are accurate. Hopefully you get
online to test these. So basically, we're
going to allow this. So you can see some
IPv4 ranges in there, along with an IPv6 range. And then we're going to
deny everything else, right? So if you have an
application that, you know, you want to make sure you're
explicit as far as which IP addresses are
allowed to access that, this allows you
to obviously filter that at the edge. And don't worry about
it on your back end VMs. We take care of it. So once the rule's created, you
know, we've added our rule set, which is fairly simple here
with the deny at the end. The order that the rules are
evaluated is from low to high. So low is the first. So we created a
priority 100 rule here so we have some room to grow. We attached this
policy to targets. So you can attach this
policy to a single target or multiple targets. I already attached this policy
to our secure back end service. So the idea is if you are
on the local network here, you should have
access to www.https, www.gcpnetworking.com/secure/,
and make sure there's a tailing slash there too for access. If you're not, you're
not going to get access. So one way I can prove this is
I'm on the network right now. Let's open up a new tab and
connect to our Edge Security demo. And you can see we got there. The page looks a
little bit different. It's about the extent of
the customizations I did. But again, this is connecting
into the other back end service. One way to test that
this doesn't work is to use our Cloud Shell. So Cloud Shell is
basically a jump host pre-installed with our
software development kit. So basically, it's an easy way
to interface with your project programmatically. It's a Linux host. So we'll just cURL to the
secured back end service. And who thinks I'm going
to be able to connect? I hope not. And you can see I got a 403. So the actual error
response is configurable. You can send a 403, a
404, Not Found, or a 502. So in this rule, I
had configured a 403. So that's Cloud Armor. A very simple example
of Cloud Armor. And the key takeaway here
is that it's attached to your back end service. So if we wanted to
connect to, you know, the kind of the naked URL
here, it would work just fine. It's because we don't have the
policy tied to that back end service. SSL policies, as
Prajakta mentioned, is a powerful tool to
enforce compliance. So say if we wanted to
follow PCI guidance and say, all right, I only want
my clients to connect into my environment with
a minimum of TLS 1.2. It's kind of like
Cloud Armor, right? You create the policy,
but in this case, you attach it to the target
proxy and the global forwarding rules. So we created a policy that
specifies minimum TLS version is 1.2 with the
restricted profile. But, you know, you can
create custom profiles and explicitly allow
or disallow ciphers. We also have a modern profile to
support a wide set of clients. Then we also have a
compatible profile, which is a little bit more loose. Or you can create your own. So for this, we created a
minimum TLS version of 1.2 and we attached this to the Load Balancer.
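A minimal sketch of that policy and its attachment, reusing the hypothetical proxy name from earlier:

```python
# SSL policy: minimum TLS 1.2, RESTRICTED profile, attached to the proxy.
import subprocess

def gcloud(*args):
    subprocess.run(["gcloud", "compute", *args], check=True)

gcloud("ssl-policies", "create", "min-tls-1-2",
       "--min-tls-version=1.2", "--profile=RESTRICTED")

gcloud("target-https-proxies", "update", "web-proxy",
       "--ssl-policy=min-tls-1-2")
```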
So now, if we do a cURL command and specify the TLS version--let's try to connect
with version 1.0. You see we get an SSL
protocol version error. If we try 1.1, same error, right? We only support 1.2 or above. So when we specify 1.2 for the TLS version, it works, no problem. All right. So with that, I'll hand
it back to Prajakta. PRAJAKTA JOSHI: So again, the
next set of features are-- I'm going to put some
of the work on you in that, after the session,
do go and check out all of these features
because they do bring really good benefits to your services. The main thing: after you're done putting your services in place and then you've
secured the services, the next thing you
want to do is optimize. And I think the first thing that
people talk about optimizing is for latency. So the first, we already
spoke about the Global LB, we spoke about the two
tiered architecture. But the main takeaway
is your SSL termination happens at the edge. And that saves a lot
in terms of latency. The second thing is that once
your first handshake is done, then essentially after that,
the connections are maintained. And so your next handshake,
like you're not truly doing a handshake all
the way to the end. But for all of your
subsequent connections, you're essentially
saving on round trips. So you got this just by putting
the Global Load Balancer. One of the really good
ways to just shave down more of your latency is, if your content is cacheable, to turn on CDN. So we have Cloud CDN, which
is our native offering. And you can turn it on
with a single checkbox. So if you have a
Load Balancer, you can just turn on a checkbox,
as you can see there. And then you can enable it. There is also a public
report, at www.cedexis.com/google-reports/, where Cedexis measured Cloud CDN against other providers. And then you can see
that it does really well in terms of low
latency, high throughput, and availability. The other thing is
something that we've been thinking about at
Google for a long time, like, how do you go and
make the Web faster? And some of the key points
I wanted to call out. So we spoke about
the infrastructure, we spoke about the Global LB. There are two additional things
that I wanted to look at. One is HTTP2, which essentially
goes and improves HTTP. And then there is QUIC, which
is a protocol which is over UDP. And in fact, if you're using a
Chrome browser during the day, you've already used QUIC. And, you know, it does
some really cool things. So if you think of why
your latency is there, it's because of things
like head of line blocking. So there's a bunch of features
built into QUIC, which actually stands for Quick UDP Internet
Connections, that help you bring the latency down. I'm going to show you how
you would configure it. A lot of times, people
will say we support HTTP2. The first thing you need to
see is from where to where. So we support HTTP2 from
client to the Load Balancer, and then Load Balancer
to the back end. And you would need
that to run gRPC. Same way with QUIC. We support from client
to the Load Balancer. And we're the first
Cloud to actually support that, the first major public Cloud to do that.
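Enabling QUIC negotiation on the client-to-load-balancer leg is a single setting on the target HTTPS proxy. A minimal sketch, reusing the hypothetical proxy name (depending on your gcloud version this setting may sit behind the beta track):

```python
# Let the load balancer negotiate QUIC with clients that support it.
import subprocess

subprocess.run([
    "gcloud", "compute", "target-https-proxies", "update", "web-proxy",
    "--quic-override=ENABLE",
], check=True)
```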
And what it really gives you is this. So when we turned
on QUIC for our CDN, you can see the sharp
jump in throughput because you've brought
down latencies. You can essentially
have more connections. And then what you see on the
right is how did it bring down this latency? And you can see
that the handshake looks much smaller when you
have QUIC versus without QUIC. So we have a blog
post that we posted which describes all of
these details as well, but that is one of the features
you should be looking at to bring down your latency. Now, these are some of the
features that Mike already spoke about, which
is network tiers. So when your traffic
egresses out of Google Cloud, you're paying the outbound
data transfer costs. You should by default
use the Premium Tier, because that carries your return traffic over Google's global network. But you may have services
which, for instance, are a free tier for you, or
you may have services that are not mission critical. And if you want to bring
down the cost there, you can actually go configure
the Load Balancer to use, just like other public
Clouds, just standard ISP transit, which is
not Google's network, to go back to your user. And so the main
thing we want to do is that we wanted you to
have choice per workload. So you can actually have this
configuration per Load Balancer if you want to. Or, if you just want to put
it at the project level, which is our thing which holds
all of the load balancers, you can do that as well. This one is, again, something
that Mike spoke about, which is when you configure
your application instances, we have a feature called user
defined request headers which lets you put things such as
client geolocation, or RTT, or TLS information, and then
send it to your back end. So then your back
ends can use that, and they can use it in their application logic to make optimized decisions.
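A minimal sketch of those headers on a backend service--the header names are illustrative, and the {client_city} and {tls_version} placeholders are the load balancer's substitution variables for those fields (depending on your gcloud version this flag may require the beta track):

```python
# User-defined request headers: the load balancer fills in client geo and
# TLS details before forwarding to the back ends.  Names are illustrative.
import subprocess

subprocess.run([
    "gcloud", "compute", "backend-services", "update", "web-backend-service",
    "--global",
    "--custom-request-header=X-Client-City:{client_city}",
    "--custom-request-header=X-TLS-Version:{tls_version}",
], check=True)
```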
So, Mike, recap everything we spoke about. MIKE COLUMBUS: All right. So we tried to--as she walks off the stage. [LAUGHTER] All right. Some of these might actually
be new, but just kind of a cheat sheet, right? So the first thing
in general, right? Choose the optimal
Load Balancer. We have a few
different solutions. And there's not one
size fits all, right? So factors you want to consider,
do I need IPv6 support, right? Do I need TLS offload? Is it an internal versus
external application? Regional versus global? Do your back ends need
true client IP, right? These are going to guide you
to the right Load Balancer. Make sure you don't just start
deploying without knowing what you're getting yourself into. The other thing is optimize
for latency, right? So we have 20 regions. Deploy your workloads in the
region closest to your users. And if you're using a
Global Load Balancer, deploy to multiple regions
closest to your users. And use modern protocols, right? Like QUIC, HTTP2, right? All these things are
great to, you know, improve performance and also
help in lossy environments where the connections
might not be great. Secure your service. So there's a few
bits here, right? As Prajakta mentioned,
use TLS everywhere. Standardize on that. Use the VPC firewall to
augment your security policies. So health checks come
from known ranges. Any of our Proxy Load
Balancers, traffic comes from known ranges. Lock that down. These things are predictable. Secure at the edge with
Cloud Armor, right? Make sure you're using that functionality in addition to your other security measures--have a defense-in-depth strategy and offload some of that to our edge. The other thing is--and this is a common
one-- so your VMs do not need public IP addresses in any
of our Load Balancer solutions, right? Even though, you know,
that GFE IP, if you recall, when it connected back
into the back end, lock your firewall down
to that range, right? There's no need to expose your
back ends ends with a public IP address. Optimize for auto
scaling, right? Know the profile of
your application. Know how you want to auto scale. If you have a
dual-headed service, if you have an
instance group that's a back end of two
load balancers, really think through how you
want to scale that, right? Do you want to use
a custom metric? Because the load balancers
will scale independently. Understand and optimize
health checks, right? Try to health check as close
to the application as you can. And, you know, as I
mentioned earlier, it's a less expensive operation to remove a VM from a back end service's pool and then use the auto healer to reboot that VM should it
become unresponsive, but relax those
timers a little bit. The other thing I'll say here
is that our health checks can be quite chatty. So if you're doing logging
on your back end VMs, make sure you're
aware of our ranges so you can treat them
accordingly on your logs. For the ILB or Network
Load, Balancer, there's UDP considerations. So these load
balancers for UDP-- so if you think you're going
to get fragmented UDP packets, be careful with 5-tuple hashing. The reason is you can lose the
destination port as packets arrive on our load balancer, and
using a different load balancing algorithm. The other thing is with UDP--
this is a common thing I see-- make sure you're listening
on the actual VIP. So what happens is if you
put-- with the Network Load Balancer, the ILB, the
VIPs are known by the VMs. They get installed
as a Linux route. You want to listen on
that VIP on your back end VMs for UDP traffic. Use 5-tuple hashing
for the best entropy. A lot of times, people will load
test the Network Load Balancer, the ILB, and they'll send
the same source, dest port IP and protocol. And they're like, why isn't
this stuff getting distributed across my back ends? You know, you would want to throw a variety of traffic at it to see traffic get balanced, because it's using hashing. Use 2- or 3-tuple hashing
for session affinity. For the Global Load
Balancers, secure at the Edge. Use Capacity Scaler
for A/B deployments. It's a very powerful tool
where you could, you know, set capacity at
additional measure to start directing
traffic to new builds. Set the appropriate
connection time out. Use cookie or IP based affinity
for session of persistence. So with that, I'll have
Prajakta take us home. PRAJAKTA JOSHI: Thanks, Mike. So you know, we spoke about
everything that exists today. And the next question is,
what's next in load balancing? And we had an entire session
on it just prior to this one. But if you attended any of the
sessions in this conference, there were two main trends. One was monoliths
to microservices. And then the second
one is almost everybody has at least an on-prem deployment and Google Cloud, or Google Cloud and another cloud, or maybe all three. And delivering services
that span this, or even taking your
monolith--when you chop it into microservices, sure, you get
agility and so on and so forth. But it's still really hard
to deliver those services. And so Cloud Services
Platform, which was an announcement in
our keynote yesterday, is attempting to solve
those problems, where instead of giving
you point products, we're putting them all
together into a platform so you can actually
go and then build these hybrid or
multi-cloud services. I wanted to add two new
bits that we announced in the previous session. One is the Traffic
Director and then how it operates with Envoy,
which is a service proxy. I'm not going to go
into the details of it. But the simple thing I wanted
to show is this: we brought in the notion of a service mesh, and that's where a lot of this load balancing is headed. Just like in the old days, when you took hardware boxes and you separated the control and the data plane, you're doing exactly the same to services, where your data plane
is the service proxy and then your control plane
is your service management infrastructure. One of the most popular service
proxies that we see in use is Envoy. And our previous session
was centered around that. So if you get a chance, do
look at the Envoy proxy. And we like it because, first
of all, it's high performance. It provides a bunch
of features that would be required for your services. And then the third thing is
that also, it is very modern. So it's supporting things
like HTTP2 from the get go. So the moment you
can put a proxy next to your existing
service or application logic, the way
you build services can be fairly uniform because
now you can do it the same way. This is the
announcement that I want you to go and look at later,
which is Traffic Director. So what exactly does
Traffic Director do? Let me show you a picture here. You're so used to global load
balancing with our global load balancer. We spent a big chunk of
time talking about that. We want to bring that same
capability to microservices. And so what you can see here
is there's a shopping cart app. You've got the front end,
the shopping cart app, and the payment. But your instances
for your microservices are in two different regions. So one is in US Central, so
our data centers are in Iowa. And then the other one
is actually in Singapore. And what this means is that it's
not that they are two services. They're literally one service
with instances in both regions. And then the Traffic
Director is the one that's managing those
proxies that you see sitting on each of those instances. And that's the one that's
feeding intelligence about global load
balancing to those proxies. So it's a departure
from the traditional way where the server side
did the load balancing. Now the intelligence is
with the Traffic Director, which is feeding
your client side proxy to then go pick
the optimal back end for the next service. So this is what you see. And then the reason
it's important is this. This exactly matches
what I spoke about for global load balancing,
but in the context of micro services. So you can overflow,
you can do failovers, and your services will be much
more resilient in this case. And like everything
else, I mean, if there was one takeaway
from this conference, you already know that Google
is big on Hybrid Cloud. And so in the future,
we do want to provide Traffic Director
for your work loads which are not in Google Cloud. And then the last thing
that we also want to do is with all of this, we fit
into the CSB architecture. So you do have this really
neat and nice platform that lets you go
and build services at scale and without much toil. So it was a whirlwind deep dive. But thank you for coming, and
we hope you got something out of this session. Thank you. [APPLAUSE] [UPBEAT MUSIC PLAYING]