[MUSIC PLAYING] SRINATH PADMANABHAN:
To get started, I just want to tell you
all something that I truly feel from the heart. I really love these
kind of events, and I want to thank each
and every one of you for being here because these
events are really, really useful and effective ways
for us to talk to customers. So every time we
come to a Google Next-- like for instance,
when we came here last year, we had this opportunity
to come talk to all of you and listen to some of your
feedback about what's working well, what's not,
what can be changed, what can we do better, what
should we continue to focus on, and we take all of
this feedback and then we go back to our day jobs
and we start looking at all of this, and of course
we take a little bit of a break in between, but we go back the
next week and we start looking at all
of these lists of things that we heard from you in
terms of what's working well and what's not. And we use this to kind of be
a roadmap for the next year. So we really focus on areas
where you say we're doing well and areas where you
say we need to improve and try to make
something that really helps solve those problems. And that's why I am really
excited to come back here today and talk to you about
what we learned last year and what we've been able to
deliver in terms of things that help simplify
your lives and help build on our partnership. So with that, I
want to start off by saying, what were
some of the key things that we heard from
all of you last year? So we heard various
things from you, but if I were to distill
them down onto one slide, these are some of
the key pain points that you talked about. One of the things we
heard a lot was you wanted us to meet
you where you were. You wanted us to make it easier
for you to get onto the cloud or to reach the cloud. And as you might have
noticed in today's keynote, a lot of what we've
done over the past year has been in that
very area, where we talk about bringing
the cloud to you rather than kind of having
you move things over or looking at things
in a different way. The second thing that we've
heard a lot from people was, help simplify
my transition. It doesn't matter
whether you're trying to transition a whole
bunch of workloads or you're trying to transition
just small pieces of things that you're running
somewhere into Google Cloud. So you asked us to help
simplify the transition, and that has been one of
our key focuses as well and I'll definitely go into
some interesting ways we've been able to do that. We also got two really
nice pieces of feedback. You said that you really
like how we price things and how flexible we
are in terms of costs and making offerings
available in a very, very cost-effective manner. And we also heard from
you that we do really well when it comes to
security and trying to provide a nice footprint and
a nice kind of ecosystem that lets you build your security
story on top of that. And so we've doubled down
on both of these areas and tried to do more
to help make these more useful and better for you. So with that, I'm
going to start off by talking about something
that we at Google Cloud really feel is a big
differentiator for us and it's something
that we focus on and we want to
continue to focus on, and that is our
global infrastructure. So Google Cloud has been
building this infrastructure for many, many years,
from even before when we were Google Cloud. So we've been building
these submarine cables and we've been building
these data centers and we've gotten to this
point where, this year, we're so proud to announce that we
have 17 regions that are up and we have 125 points of
presence around the globe. And this goes to
make that transition to the cloud much
easier because each and every one of these
points of presence is one additional
location that's closer to where your customers are-- your customers who are consuming
your applications which run in Google Cloud. And what we do to tie
all of this together is we have this
extensive network of submarine cable
and terrestrial cables that connect all of
these points of presence and our data centers
and various other pieces that we provide for
you together to make it one huge infrastructure
investment that delivers value to all of you. And over the years,
building these sub-sea cables and making all of these investments in the cable world has gotten us to the point where we have an extensive team that's completely focused on submarine cables. And I'm really,
really proud to talk about the next slide, which
is something very unique that we did this year. So, in January of
this year, we became the first non-telco
company to announce a completely private
sub-sea cable that connects two continents. And this was the Curie sub-sea
cable, named after Marie Curie, and it connects Los
Angeles to Chile. And I'm also very, very excited
to announce that-- well, not announce, but last week,
we announced the Dunant sub-sea cable. This is again named after the
founder of the Red Cross, Henri Dunant, and it connects the
east coast of the US to Europe. So this is the first
trans-Atlantic cable that's actually, again,
owned completely privately by a non-telco entity. So this just goes to
show some of these things that we're trying to
do to bring more value from our infrastructure to you. So just looking at
the numbers in terms of what we've been doing with
respect to infrastructures, I'm going to point out
a few key numbers here. We've been spending
a lot of money in terms of making this
infrastructure that underlies our Google Cloud, and
over the past three years, we've spent over $30
billion in terms of expenses to build out Google's
infrastructure. In addition, we've also
got twenty cloud regions that we've announced. We have 125 points of
presence, which I talked about. We're going to go into
the dedicated interconnect locations shortly as well,
but the really interesting statistic here is the 100,000
miles of fiber optic cable that we have. This is, if you look at all
of the fiber optic cables that we've deployed across
the globe, it's 100,000 miles. That's, just to
put it in context, roughly 40% of the distance from here to the moon. So it's a lot of investment
in terms of these cables and what I hope to
do is to tell you a little bit about why
this adds value to you and your deployments
running on Google Cloud. So all of this infrastructure
is the underlay that lets us build our SDN
ecosystem on top of that, using things that you've
probably heard about, like Andromeda and
Espresso, and if you want to learn more
about this, I definitely recommend that you go and attend
one of two sessions that we have later in the
conference which dive much deeper into this area
and about how it helps improve performance of
different workloads that are running in the cloud,
and in terms of networking, how it makes it so efficient. But if I were to point
out three key things here, the first thing is, this lets
us provide this global network and the construct
of the global VPC that I'll talk about in a
minute, which basically takes away a lot of your sprawl and
toil from managing a vast cloud deployment. It gives you a global
reach by getting you to a point very, very close
to where your customers are and it also gives you various
benefits, like live migration. With live migration,
what we're able to do is be able to keep
your VMs up when we need to do something like,
say, a firmware update that needs to be done on the
underlying infrastructure. And some of these things
really make it very easy to deploy on Google
Cloud and help your workloads keep running. And finally, because
of the way we have all of this infrastructure
investment laid out and we have this SDN
architecture built on top of that, it also
helps things scale seamlessly because every single piece
of our networking offering is built in such a way that
there are no chokepoints. Everything is built in
a distributed manner to make it very, very
easy for you to scale. And we've continued to innovate
in this area with something that we've announced
late last year, which was Andromeda
2.1, which has actually reduced our intra-zone latency
by 40% from Andromeda 2.0, and if you look at it overall, it's nearly an 8x reduction in latency from when we originally introduced Andromeda. So we talked about
this infrastructure. Now let me go a
little bit deeper into how this infrastructure
helps you build your cloud deployment. And we're going to look
at some interesting things we've done over the
past year and how we help solve various
of those problems that we talked about
in the beginning. So to begin with, I want
to talk about two pieces-- our Virtual Private Cloud
and our connectivity options. So when you look at Google
Cloud's VPC, the global VPC that we provide, it's very,
very unique in the sense that it provides you a
connectivity by default across all of your
workloads that you have running in Google Cloud, no
matter which region you're on. So what I have there is
a simple one sentence, where if I were
to describe this, the way I would do that is
you may have 17 regions-- you may have VMs running in
each and every one of these 17 regions-- but you have one VPC,
which means you have to manage one set of policies. You have to manage
one set of security. You need to do all of your
security configurations and everything once. And it really makes
it much easier for you to manage your
truly global deployment.
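To make that concrete, here's a minimal Terraform sketch of the idea; every name and range here is hypothetical. One custom-mode network, subnets in two regions, and any VM attached to either subnet gets cross-region connectivity under a single set of policies by default.

```hcl
# One VPC, many regions: a custom-mode network with regional subnets.
resource "google_compute_network" "global_vpc" {
  name                    = "my-global-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "us" {
  name          = "apps-us-west1"
  network       = google_compute_network.global_vpc.self_link
  region        = "us-west1"
  ip_cidr_range = "10.10.0.0/16"
}

resource "google_compute_subnetwork" "asia" {
  name          = "apps-asia-southeast1"
  network       = google_compute_network.global_vpc.self_link
  region        = "asia-southeast1"
  ip_cidr_range = "10.20.0.0/16"
}
```

Now the other really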
nice thing about this is, since this is built in
this way, the only way we are able to make something
like this work is because of the infrastructure offerings
that we provide that gives you the kind of performance
that you really need to pull something
off like this. Now let's look at what are
some different things that you can do to connect to the
global VPC from your network. So we've had various
offerings, and this year, we talked about two key offerings,
which was the Google Cloud Interconnect offerings. And when you look at Google
Cloud Interconnect, what it does is these are
various offerings that help you connect from
wherever you are, whether it's a data center
or whether you're looking at a co-location facility. It gives you a
convenient way for you to connect from where
you are to Google Cloud. And the way you
would do this is you would connect to one of our many
dedicated Interconnect points of presence, and
once you do that, you will be on the
Google Cloud network for the rest of the way. The other interesting thing
I want to quickly touch upon before diving deeper into
Google Cloud Interconnect is we have also been
working on simplifying all of your connectivity
options, including VPN. We've had a lot of
efforts that we've invested in working
with our partners, and with them, we've been able to deliver this whole slew of integration guides, making it extremely easy for
you to spin up any kind of VPN connectivity when you want
to connect to Google Cloud. So I mentioned
Dedicated Interconnect. So let me tell you
what we think makes Dedicated Interconnect unique. So when you look at
a connectivity option where you need to
connect to a cloud, let's imagine that you
have a data center here in San Francisco and
you have workloads that are running in
different regions. Let's say you have one
running in Los Angeles, you have one running
in Singapore, and you have one running
somewhere in Europe. So just pick any three
data centers of your choice and imagine that you
have this deployed. Now if you were trying to
do a dedicated connectivity from your data
center to the cloud, to each one of these
regions, usually you would have to deploy
one set of connections to each and every
one of these regions. With Dedicated Interconnect,
what you are able to do is you're actually able to
connect from your data center to the nearest region. So in this case, you would
just connect to the Los Angeles region, and from there, you
can ride Google's backbone to get to whichever other region you need to reach.
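As a rough illustration, the cloud side of a Dedicated Interconnect is a Cloud Router plus a VLAN attachment in the region nearest your data center. This Terraform sketch assumes hypothetical names, ASN, and interconnect URI; the physical link itself would already be provisioned with Google.

```hcl
# Cloud Router in the region closest to the on-prem data center.
resource "google_compute_router" "la_edge" {
  name    = "la-edge-router"
  network = "my-global-vpc"
  region  = "us-west2"
  bgp {
    asn = 64512
  }
}

# VLAN attachment that hangs off an existing Dedicated Interconnect.
resource "google_compute_interconnect_attachment" "la" {
  name         = "dc-to-la"
  region       = "us-west2"
  router       = google_compute_router.la_edge.self_link
  interconnect = "projects/example/global/interconnects/my-dedicated-link"
}
```

The other really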
nice thing about this is it greatly
simplifies the sprawl and toil that come
from trying to build these kind of deployments with
multiple connections going to all of these
different regions. So this map shows all of
our Dedicated Interconnect locations. So no matter where
you are, there are these various locations
where you need to get to and we bring you onto
the Google Cloud network right from those points. We also work with
various partners and each and every one of these
partners helps us provide you with a much more simplified way
of connecting to Google Cloud. So for instance, if
you were to connect to any one of
these partners, you get the flexibility to choose
the bandwidth that you need, and they will help you
connect from wherever you are to Google Cloud. So we talked about
how we've been trying to simplify your
connectivity options when you're trying to go from
your on-premise to the cloud. Now let's look at
what we've been doing in terms of helping you
deliver applications better. So when you talk about
application delivery, these are a few offerings
that we usually talk about, and I'm going to spend
a few minutes talking about what we've done in each
of these over the past year. So first, let's take a look
at Google Cloud's Global Load Balancing. So we built Google Cloud's
Global Load Balancing based off all of the lessons
that we learned over the years in terms of building
and delivering applications like Gmail and Google
Search and YouTube and all of these different
offerings, and you'll also notice that that's
actually true for each and every one of
these things that we look at in the application
delivery portfolio. And when you see this, we
got a lot of great feedback in terms of how the Google
Cloud Load Balancer works and how it makes it
extremely simple to manage a scalable load
balancing deployment, and we've kind of tried to
take that over the last year and go to the next step. So our Google Cloud
Load Balancer, for those of you who aren't
familiar with it, supports Global Anycast IP
load balancing, which means rather than
having to use your own DNS service to load balance
to multiple IP addresses depending on where your
traffic is coming from, the load balancer gives you a single Global Anycast IP, so that if you have a user,
for instance, in this example who is trying to get onto
the Google Cloud network to reach your application
from San Francisco, they would still be using the
same IP address, as you would see for a customer who's trying
to connect to your Google Cloud deployment from Singapore, but the services would
be provided by the data center that's closest to them. So this does various things,
and the first and foremost is it greatly simplifies what
you need to do on the DNS front in terms of configuring
and managing a deployment like this.
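For a flavor of what a global load balancer with one anycast IP looks like, here's a hedged Terraform sketch of the minimal chain; all names are hypothetical, and the regional instance groups you'd attach as backends are omitted.

```hcl
# A single global anycast VIP in front of backends in multiple regions.
resource "google_compute_global_address" "web_ip" {
  name = "web-anycast-ip"
}

resource "google_compute_health_check" "web" {
  name = "web-hc"
  http_health_check {
    port = 80
  }
}

resource "google_compute_backend_service" "web" {
  name          = "web-backend"
  protocol      = "HTTP"
  health_checks = [google_compute_health_check.web.self_link]
  # Instance groups from several regions would be added as backends;
  # the LB sends each user to the closest healthy one.
}

resource "google_compute_url_map" "web" {
  name            = "web-map"
  default_service = google_compute_backend_service.web.self_link
}

resource "google_compute_target_http_proxy" "web" {
  name    = "web-proxy"
  url_map = google_compute_url_map.web.self_link
}

resource "google_compute_global_forwarding_rule" "web" {
  name       = "web-fr"
  ip_address = google_compute_global_address.web_ip.address
  port_range = "80"
  target     = google_compute_target_http_proxy.web.self_link
}
```

Now the really cool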
thing that we've done in this area
in the past year is we've enhanced a lot of
different protocol supports that we've added over the
past year, and one of them specifically that I want to call
out is our support for QUIC. So QUIC is a transport protocol that helps you gain a lot in terms of connection setup time and the latency it takes for your users to reach your application. So for instance, we see
about an 8% decrease in terms of page
load times globally, and especially if
you're in an area where there is a lot of
latency or you're in a high-latency environment, it goes up much higher-- it's about 13% in those kinds of environments.
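Enabling QUIC on your own HTTPS load balancer is a single setting on the target proxy. A sketch, reusing the hypothetical URL map from the earlier example, with placeholder certificate files:

```hcl
# Hypothetical certificate; in practice you'd reference real cert material.
resource "google_compute_ssl_certificate" "web" {
  name        = "web-cert"
  private_key = file("key.pem")
  certificate = file("cert.pem")
}

resource "google_compute_target_https_proxy" "web_https" {
  name             = "web-https-proxy"
  url_map          = google_compute_url_map.web.self_link
  ssl_certificates = [google_compute_ssl_certificate.web.self_link]

  # Negotiate QUIC with clients that support it; others fall back to TCP+TLS.
  quic_override = "ENABLE"
}
```

So we do have a great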
session about load balancing that's happening later
in the conference as well, where we go much
deeper into this area and many other things about
our load balancing offering. Google Cloud DNS is built to
work with our load balancer so it works out
really, really well in this particular deployment,
where it integrates seamlessly with our Global Anycast
load balancing offering. It also has very, very simple, scalable record management, and it supports DNSSEC so that you can verify the integrity of your records when your clients are trying to connect to your services.
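As an illustrative sketch, a DNSSEC-signed Cloud DNS zone in Terraform is just a flag on the zone; the zone name and domain here are hypothetical.

```hcl
resource "google_dns_managed_zone" "public" {
  name     = "example-zone"
  dns_name = "example.com."

  # Sign the zone so resolvers can verify record integrity.
  dnssec_config {
    state = "on"
  }
}
```

Now let's look at Google Cloud's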
Content Delivery Network. So like I mentioned, our
content delivery system as well is built off of a
lot of the things that we've learned over the
years from YouTube and Google Search and all of
our other offerings. And what we've been
able to do here is to try and take
some of those lessons and bring to you a very,
very scalable global approach to delivering your
content across the globe. So one thing that we've
done over the past year is we recently announced support
for very, very large-sized objects, which makes it great
for applications where you're trying to deliver either media or gaming content. So this is again something great
that we're very, very proud of, and we do have a
session about CDN as well if you're
interested in learning more about this offering. We announced network
tiers at Next last year. Network Service Tiers are now in beta. With the Premium Tier, your traffic travels Google's network all the way from the source to a point of presence that's very, very close to your end user. So by doing this, it's able to provide you with much better latency and much better performance than you would be used to when you're using the public internet. So last year we heard from
you that there are certain workloads where this extra performance is not necessarily what you are looking for and you would prefer more cost-effective offerings, so that's where we introduced the Standard Tier, which is also currently in beta.
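The tier is chosen per resource; for instance, a hedged Terraform sketch of reserving a Standard Tier address for a test frontend (name and region hypothetical):

```hcl
resource "google_compute_address" "test_ip" {
  name   = "test-frontend-ip"
  region = "us-central1"

  # STANDARD hands traffic to the public internet near the region,
  # trading some performance for lower cost; PREMIUM (the default)
  # keeps it on Google's backbone end to end.
  network_tier = "STANDARD"
}
```

And usually when you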
look at customers and when they use
premium network versus standard network,
our customers really like to use the premium network
for any live applications or applications that are being
accessed by their end users, whereas in a lot
of cases, they like to use the standard network for
things like a test deployment where you're trying
to build something and making changes in your
development environment. So those kind of
environments usually use the standard
network, as opposed to the premium network for
the production environment. Now we talked about
connecting to the cloud and we talked about
delivering your applications to your customers. Next, we're going to
look at various things that we have been doing
in the area of securing your deployments. So we got a lot
of great feedback in terms of how we've been
making more and more offerings available in terms of
security for our users, and in this area, we focused
on a few specific things to try and make those very, very
user-friendly and very, very easy. So the first one is
VPC Service Controls. So what VPC Service
Controls does is it greatly mitigates any
kind of exfiltration risk you may have by
providing you controls which are context-aware
and let you enforce policy for accessing
files not just based on the identity of the user
but also based on the context. So this gives you much more
visibility into your data, which is the lifeblood
of your business, and it really helps you control
where this data is flowing.
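As an illustrative sketch (the organization and project numbers are hypothetical), a VPC Service Controls perimeter in Terraform might look like this; data held in the listed services can't be copied out of the perimeter:

```hcl
resource "google_access_context_manager_access_policy" "policy" {
  parent = "organizations/123456789012"
  title  = "example-policy"
}

resource "google_access_context_manager_service_perimeter" "perimeter" {
  parent = "accessPolicies/${google_access_context_manager_access_policy.policy.name}"
  name   = "accessPolicies/${google_access_context_manager_access_policy.policy.name}/servicePerimeters/no_exfil"
  title  = "no_exfil"

  status {
    resources = ["projects/111111111111"]
    # Data in these services cannot leave the perimeter.
    restricted_services = ["storage.googleapis.com", "bigquery.googleapis.com"]
  }
}
```

Network monitoring is one area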
where we got a lot of feedback from our customers, saying
that when you look at a cloud deployment and you look
at network monitoring offerings that are
available there, one big concern for our customers was that the level of visibility you get in the cloud doesn't map one to one to what you're used to on-prem. And so we took that
and we went back and we tried to
build this offering and we announced
VPC Flow Logs, which is now generally available,
and with VPC Flow Logs, what you can do is you get
five-second interval updates. So this is almost as
responsive as what you would see in your data centers. And the really nice thing about
doing something like this is it lets you have the same
level of visibility that you would when you're
deploying things in your data center and it
integrates seamlessly with all of your deployments
that you have on-prem as well, because we partner with many
of our security partners to provide visibility
into these logs. So what you could
do is you could export these logs to
your partner ecosystem that you're already using and
have a single pane of glass where you can visualize both
your traffic in your data center as well as what
you're using in the cloud. And what helps us make
this even more flexible is that we have a very,
very rich set of parameters that we provide to you in terms
of filtering this flow log. So we let you annotate
based on things like the geolocation of the client who is trying to access the service, the region where your VM is running, the subnet, and various other attributes that let you get exactly the logs you're looking for, keep a close eye, and get great visibility
into the security posture of your deployment.
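Flow logs are enabled per subnet. A minimal Terraform sketch with the five-second aggregation interval mentioned above; names and ranges are hypothetical:

```hcl
resource "google_compute_subnetwork" "apps" {
  name          = "apps-us-east1"
  network       = "my-global-vpc"
  region        = "us-east1"
  ip_cidr_range = "10.30.0.0/16"

  # VPC Flow Logs at 5-second aggregation; sample half the flows
  # and keep all metadata for downstream filtering and annotation.
  log_config {
    aggregation_interval = "INTERVAL_5_SEC"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}
```

The next thing we're going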
to talk about is Cloud Armor. Cloud Armor is our denial
of service defense system that we've built and we
announced in March this year, and in fact, we have
a session later today where Nick will also be
presenting about Target and how they have been using
the Cloud Armor service as well. So there are a few
key things I want to point out about Cloud Armor. So Cloud Armor is built
to integrate very, very closely with
our load balancer, so what tends to happen is we are able to absorb the brunt of any kind of Layer 3 DDoS attack right at the load balancer. Then beyond that, we have
denial of service defense that is built to protect
you against application attacks and various
things like SQL injection and cross-site scripting. So Cloud Armor provides
you with a rules language that lets you program protections beyond SQL injection and cross-site scripting, so you can tailor your security policy to fit exactly what you are doing in your own deployment.
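For a flavor of the rules language, here's a hedged Terraform sketch of a policy that denies requests matching the preconfigured cross-site scripting signatures and allows everything else; the policy name is hypothetical:

```hcl
resource "google_compute_security_policy" "edge" {
  name = "edge-policy"

  # Deny requests that match the preconfigured XSS signatures.
  rule {
    action   = "deny(403)"
    priority = 1000
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('xss-stable')"
      }
    }
  }

  # Default rule: allow whatever no higher-priority rule caught.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
  }
}
```

And the nice thing about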
the way Cloud Armor is built is that a lot of
this defense actually happens at the
perimeter, which means it makes it much easier
for you to manage the scale of your setup
while using something like Cloud Armor. So we talked about all of
these different offerings. We talked about our connectivity
offerings, the security that we try to provide to
you with your networking deployments, how you can get
your applications delivered to your customer. So in this next
section, I'm going to talk about the
workload of your choice. So we have been having more and
more customers adopt Kubernetes and we have a lot of customers
who have been adopting GKE, and they're really adopting
container technologies and making them their own. So what we've been
focusing on is we've been trying to make
Google Cloud the cloud of choice to run your micro-services. So we are trying to
provide native support for every single networking service and offering that we have for Kubernetes Engine. Like, for instance, this
year we have support-- Kubernetes Engine supports
Global Load Balancing, it supports our denial
of service product, which is Cloud Armor. It supports IAP
and our CDN, and we are committed to making
sure that every one of our networking offerings
is available for you with Kubernetes Engine as well. And we've also
taken the next step in terms of securing Kubernetes workloads so that you can look at your security the same way whether you're looking at containers or compute. So we announced
support for things like shared VPC,
private clusters, we support network policies,
distributed firewalls, VPC flow logging-- all of these are
supported natively now with Kubernetes, which
makes it extremely easy for you to manage your security
policies whether you're doing Compute Engine or
Kubernetes Engine deployments.
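A hedged Terraform sketch of a cluster using two of those features, private nodes and Kubernetes network policies; names, ranges, and region are hypothetical:

```hcl
resource "google_container_cluster" "private" {
  name               = "private-cluster"
  location           = "us-west1"
  network            = "my-global-vpc"
  subnetwork         = "apps-us-west1"
  initial_node_count = 3

  # Nodes get internal IPs only; the control plane keeps a public
  # endpoint here so admins can still reach it.
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = "172.16.0.16/28"
  }

  # VPC-native (alias IP) networking, required for private clusters.
  ip_allocation_policy {}

  # Enforce Kubernetes NetworkPolicy between pods.
  network_policy {
    enabled = true
  }
}
```

And if you're looking to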
connect your different services, I'm sure you already
heard about some of this in today
morning's keynote and you're going to hear a lot
more over the next few days in terms of various sessions
where people are going to talk to you about
Istio and Kubernetes and what they're
doing in these areas to really make it easy to
use, manage, and secure. And to this point,
we also have support for all of these things
in different ways in our networking offerings,
so, for instance, our networking team also has gRPC in its portfolio, which lets you connect your services through RPC calls and makes it extremely easy to build a service without putting the brunt of the effort of building and deploying the networking aspects on your developers. So what you can do is
you can use ecosystems like Istio and gRPC and make
it available to your developers so that they can focus on
building the applications without having to worry a
lot about the networking and securing the network
communication parts, which can be taken care of, then, by
your security admins, who would use tools like Istio
and gRPC to make sure that this communication
is secured. We also have a lot of partners
in the areas of security and we're working with adding
many, many more every day. We keep talking to
different partners and we try to see
how we can integrate with them to make a better
offering available to you every single day. And so this is just a quick
snapshot of various partners that we have in
the security space, and I'm sure you
recognize a lot of names there, so we do
work with them very, very closely to make
Google Cloud a very, very partner-friendly offering when
it comes to network security. So with that, I
just want to recap by saying thank you so
much for all the feedback that you gave us
in the last year and it's really helped
us empower your success by trying to come
and make it easier for you to move to the cloud
by meeting you where you are and providing you these
secure and flexible yet simple and reliable
offerings that can really work for any workload that
you're looking to use. And with that, I
definitely want to also say I'm really looking forward
to talking to all of you and hearing what you have in
mind for us for the next year and what you would like to
hear about when we come back to Next 2019 next year. So now I'm going to
call Nick on stage to come and talk about what
a lot of these offerings meant from a
customer perspective. NICK JACQUES: Thank you. All right. Hi, everyone. My name is Nick Jacques. I am a lead engineer
at Target and I work on our cloud platform team. I'm really excited to be here
today and chat with everyone about some of the things
that we're doing in GCP. So just a little bit about
who I am and what Target is. So Target is a rather popular
retailer here in the US. We have 1,800 stores,
several distribution centers, a few headquarters locations, and then we also have one very big
virtual store, Target.com. I've been working on our cloud
platform team for about 2 and 1/2, 3 years now,
and prior to that, I was on a couple of our
different infrastructure teams dealing with enterprise
storage and virtualization, private cloud build-out, all
kinds of good stuff like that. The reason why we thought it'd
be really appropriate to share some of this with you is Target
started its journey in GCP in early 2017. So it's been a
little over a year since we really have
migrated to GCP in earnest, so we thought that
kind of lined up with the discussion about what's
new over the past year in GCP. Just to give you a little
timetable, in very early 2017 is when we started
migrating our workloads from another cloud
provider into GCP, and we were able
to complete that I want to say around
late July, mid-August, so right in time
for peak season. We had a very successful
peak season with no issues and everyone was really
happy about that. Early this year,
we had discussions around where we were going to
migrate our commerce workloads. So we kind of split these
into two categories. What non-commerce means for us
is, when you're on Target.com or if you're using
one of our apps, it's basically
anything that doesn't involve purchasing products. So if you're searching
for products, if you're swiping through
looking at images of products, things like that, that's
what we call non-commerce. And then when we actually have
you add items to your cart, check out, and then
pay for those items, those are kind of
the commerce things. Those tend to be a little more
difficult to do because there's quite a bit of regulatory
compliance involved with them. So we started migrating our
commerce workloads in earnest right around the middle
of this year, probably around March or April, and
actually, we're almost 100% complete with that so we're
pretty excited that we have almost all the things that
power Target.com in GCP right now. So just to give you kind of
the gamut of what's in there, we have quite a
bit that's there. A lot of these services that
I list under Target.com are actually shared and our
mobile apps use those services to display content as well,
but we have everything from our adaptive frontend-- so
basically giving you responsive design depending on what
you're using-- desktop, laptop, tablets, mobile devices,
things like that-- and also our backends. Our backends kind
of spread the gamut between things that are
very monolithic-- they're kind of standalone
apps-- to some very complicated applications
that have upwards of three or four dozen micro-services. We also operate a
data persistence layer and provide replication services
to and from our data center and a platform for
logging and metrics. These last three items-- the data persistence, replication, and logging and metrics-- are part of what we refer
to as platform services and that's part of what my team
helps offer our application teams or our tenants in GCP. We provide a suite of services
that any application team can use and basically
provides common functionality around any cloud
that we deploy to. So it abstracts a lot
of the difficulties away from our application
teams in terms of figuring out how to do service discovery,
where to store data, how to replicate data,
things like that, and really lets them focus
on building their apps and then deploying those
apps and not really having to worry about some of
that underlying infrastructure. And then, as I
mentioned, we also operate some of our
commerce stuff in GCP, and so those things are PCI
DSS compliant, which obviously means that there is a great
deal of regulatory compliance that's attached to it,
and those environments are highly segmented. For mobile apps-- so in
addition to everything that we just talked
about for Target.com, we also have some
unique mobile offerings, so we have things
like Cartwheel, which is part of
our mobile app that lets you discover coupons
and various discounts that are available in the store. Wallet, which is
actually really great. It allows you to take all those
coupons and discounts that you discover with Cartwheel
and pay with your red card in a single scan of a
barcode so it's really handy. I use it all the time. And then we also provide some
other interesting services, like in-store mapping. So we can tell you where
a particular product is that you're searching
for in a store and then actually show
where you are in the store and you can kind of find
your way to that product. There are a few other
mobile apps as well. And then we also host a
variety of API endpoints. So we actually host our
developer documentation at developer.target.com on
GCP, as well as a variety of API backends in GCP. So these are everything
from item availability, creating shipping labels,
creating barcodes, things like that. We also host a variety
of static content. So think images,
JavaScript files, all kinds of stuff like
that, in GCS buckets, and we also use Cloud DNS. So I sat for a while and
tried to think about, how can I best communicate
all the various features that we use in terms of
networking and infrastructure in GCP? And I thought for a while
about this and ultimately what I decided to do is turn
to a tool that a lot of people use in large companies when
they're making decisions, which is a particular geometric
shape that tends to have magical properties. So I looked at this--
and I definitely didn't make this on a plane. This is very official. And what I saw was that word
cloud, for the eighth year, is dominating the information
density representation. So with that in mind, here
are some of the features that we use in terms of
networking and infrastructure for Target. So there's quite a few up here. I'm not going to go through them
all now because we'll actually kind of just naturally
go through those as we work through these slides. So briefly what I
want to talk about are the environments
that we have. So as I mentioned, our
non-commerce environment has to do a lot with basically
browsing and discovering products and services
on Target.com. So the way those are structured,
we have several environments. So we have a dev environment,
a non-prod for staging and performance
purposes environment, and then we have a
production environment. And you can think of
those environments basically as single
monolithic projects. Inside of those projects,
there is a single VPC that contains many
different regional subnets, and what we've done is something
kind of interesting here. One of the challenges
we've previously encountered in prior
cloud providers and just due to the
nature of Target itself is IP address exhaustion. So particularly in
a cloud, you want to be able to scale and
not run into any issues where you've exhausted a
subnet and now what do you do. So to get around
those challenges, because these
tenants are actually quite large in size and
scale to many thousands of instances, we actually kind of divorced the application subnets from our data center routability. So what that means is
all of our applications that operate in our non-commerce
environment are basically air-gapped from the data
center, but what that means for us is we've created several
very large subnets so we don't have to worry about partitioning
off specific applications and specific subnets
or things like that. It's just a very large
general-purpose hosting environment. The way that we allow
the applications to talk to services in our data
center or replicate data across is that platform
services suite of components that I mentioned earlier. Those services are data
center routable and so that's how we replicate
that data across, and then the application
teams can consume it in GCP. If we take a look, then, at
our commerce environment, that environment is
actually quite different. Yes, it's still hosted
in GCP, and yes, we still have our platform
services there, but many things beyond
that are different. So, for compliance
purposes, we wanted to make sure that we had a
highly segmented environment here. So what we've done is we've
actually created many projects and we've used shared
VPC and VPC peering to give us that really
highly segmented environment and help us partition off each
individual application that runs there.
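The wiring for that is small: one project is marked as the Shared VPC host, and each application's project is attached to it as a service project. A hedged Terraform sketch with hypothetical project IDs:

```hcl
# Designate the host project that owns the network and its policies.
resource "google_compute_shared_vpc_host_project" "host" {
  project = "commerce-host-project"
}

# Attach each application's service project to the host network.
resource "google_compute_shared_vpc_service_project" "checkout" {
  host_project    = google_compute_shared_vpc_host_project.host.project
  service_project = "commerce-checkout-svc"
}
```

In this model, all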
of our instances are data center routable,
and we made that choice because this
environment is actually at least an order
of magnitude smaller than our non-commerce
environment, and many of those applications
have direct requirements for synchronous or asynchronous
calls to our data center to complete things like
payment transactions and so on. So we needed to give
those applications a way to directly hook
into the data center. As I mentioned, these
tenants are highly segmented and basically everything
in this environment is subject to
regulatory compliance. So obviously we wanted to make
sure that we got it right. Going through some of
the common patterns that we have across both
of these environments, we currently use IPsec VPNs
to connect our data centers to these particular
environments. One of the things
that we're looking at, which was mentioned just a bit
ago, is partner interconnect, and that's something
that we're investigating and we'll be taking
a deeper look at that either later this
year or early 2019. We also use Cloud DNS
basically across all of our environments
in GCP and that hosts the authoritative DNS
for our internal resources as well as our
external resources. Our developers are
a really big fan of this because
this means that they can use a common consolidated
Terraform repository to create DNS entries
for their applications and they don't need to cut
tickets to another team to create DNS entries
and wait for another team to complete those tasks. So we really empower developers
by letting them create their DNS entries themselves.
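A DNS entry in that kind of self-service repository can be as small as this hedged Terraform sketch; the zone, hostname, and address are hypothetical:

```hcl
resource "google_dns_record_set" "app" {
  managed_zone = "internal-example-zone"
  name         = "myapp.internal.example.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["10.20.1.15"]
}
```

For any type of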
situation where instances need to reach out
to the internet, and this is actually
fairly common given that API.Target.com is
surfaced on the internet, we provide either a
NAT or a proxy service. So these services
are instrumented with packet capturing
software and they basically allow us to police
traffic that's egressing our cloud environment
and take a look at what's going on there so that we can
react to it should anything happen. As I mentioned
several times already, our platform services
are consistent across all of these environments. We also use the HTTPS load
balancers almost universally. I would say that
there are probably six load balancers that are
not the L7 load balancers. We love them. They're great. They give us an
Anycast IP address, we operate services
in multiple regions, and all we do is wire those
services up with the same load balancer and away we go. We don't have to worry about, is
this IP address in this region and how do we
geolocate our users and make sure that they're going
to the region close to them. It's all handled through
that load balancer. The way that we let our
application teams manage their deployments and manage
the instances they're running on is through an open-source
tool called Spinnaker. It's a really great
tool and I would highly encourage you to
check it out if you're able to use it in your
particular environment. It really takes away a lot
of the pain of creating a deployment pattern. You don't have to worry
about having people login to an instance and run
scripts or anything like that. Basically, you create some sort
of package install deliverable, Spinnaker will bake that into
an instance template for you, and then you can deploy that
as many times as you would like and deploy it in
multiple instance groups and it makes things really
easy, so I would highly recommend that. One of the things that
we use fairly heavily is private IP Google access. The reason why we use
this heavily is almost every instance in
GCP that we operate does not have a public
or external IP address. Our NATs and proxies are
there for that purpose, and so what we want to do
is keep all that traffic, especially traffic
that's destined to APIs, for example, to GCS, just
internal to our VPCs. So we use private IP
Google Access for that.
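That, too, is a per-subnet flag. A minimal Terraform sketch with hypothetical names:

```hcl
resource "google_compute_subnetwork" "internal_only" {
  name          = "internal-only"
  network       = "my-vpc"
  region        = "us-central1"
  ip_cidr_range = "10.30.0.0/20"

  # Instances with no external IP can still reach Google APIs,
  # such as GCS, without leaving the VPC.
  private_ip_google_access = true
}
```

We collect and analyze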
logs across all of these environments. And then finally, we use
Cloud Armor and SSL policies to make sure that our
load balancers at the edge match our security posture
that we'd like them to have. So for instance, only accepting
TLS version 1.2 connections, things like that.
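An SSL policy like that is a few lines of Terraform; this hedged sketch (name hypothetical) would then be referenced from each HTTPS proxy:

```hcl
resource "google_compute_ssl_policy" "modern" {
  name            = "tls12-only"
  profile         = "MODERN"
  min_tls_version = "TLS_1_2"
}
```

So we've talked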
through a lot of lists. What I want to do
is give you a visual that we can take a look at. So at a very high
level, this is what our non-commerce
environment looks like. So what we have
in our application hosting area that's not
really pictured here are several subnets that
span multiple regions. But roughly what happens
is we have a Layer 7 load balancer that fronts
our various endpoints. So it could be an application
with multiple URL routes. It could be just a
single-purpose app or a micro-service. And then we have that partition
for platform services again. So again, there's things
like service discovery that we host in there, data
persistence, data replication, things like that. Then, as pictured, we have a VPN
that connects this environment to our data center,
and you'll notice that we've attached only the
platform services subnets to our data center. The application
hosting subnets really only talk to our platform
services and that's about it. Then finally, the part that
we haven't touched on yet is, what do we do with
requests that are inbound? How do we actually
service requests from either our guests or API
calls or things like that? How this works is we
have our clients-- they might be our guests,
they might be a third party that's using our APIs-- and the first step is
they'll hit our CDN provider. So our CDN provider is
effectively our edge. And through there, in a variety
of caching and web application firewall layers, eventually
we'll traverse back into GCP and we'll hit one of
these load balancers that represents a particular
service that we're hosting. So that's kind of things for
our non-commerce environment in a nutshell. If we go over to our
commerce environment, pay really close attention
because the picture is going to look fairly
similar, but what you notice is we have a lot
more boxes down here on the left-hand side. So what we've done is we've used
shared VPC and what we've done is we've partitioned out
each of these applications into its own service project. So what that means
is our application teams, through Spinnaker, have
full rein in their service project to do deployments
that they would need to do and they can look at
all of the logs that are collected via Stackdriver. This partitioning puts
us in a really good place wherein application
teams can't look at other app teams' logs, right? You can only see
the logs that are relevant to the workloads that
are operating in your project. So that's been working
out very nicely for us. We've done the exact same thing
with our platform services as well, so no one is
immune to the segmentation and partitioning. So each of our
platform services teams that operates each one of
these particular services that we provide has their
own service project as well. And then what we've done and
the great part about shared VPC is we consolidated all
of our network policy into the host project. So what that means is all of
our custom routes are there, our VPN connections
are there but apply to all of these service
projects as well, and then our firewall
rules are there as well. So the great thing about this
is we have a centralized place for firewall rules, we don't
need to go hunting around to see if a particular
application team has changed a firewall rule, and in
fact, the application teams can't change a firewall
rule because the service projects won't allow it. So all of those firewall
rules are hosted and kind of consolidated in
the host project and then we provide our
developers again a Terraform repository through which
they can enact those changes.
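A rule in that centralized repository might look like this hedged Terraform sketch; the project, network, tags, and ranges are hypothetical. Note that the rule lives in the host project and selects instances by tag; nothing attaches to a load balancer itself:

```hcl
resource "google_compute_firewall" "allow_checkout" {
  name    = "allow-dc-to-checkout"
  project = "commerce-host-project"
  network = "commerce-shared-vpc"

  allow {
    protocol = "tcp"
    ports    = ["443"]
  }

  source_ranges = ["10.0.0.0/8"]   # on-prem ranges
  target_tags   = ["checkout"]     # applied to tagged instances
}
```

In this particular case,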
they are heavily reviewed and we ensure the safety before
we put those into effect. And then the rest of
it is very similar. We have VPNs that connect
us to the data center. In this case, on this GCP side,
anything can traverse the VPN and come back to the
Target data center. And then we have a
firewall on our side to make sure that only
the appropriate traffic is coming back or going to GCP. And then our internet path
over on the upper right is basically the same
as non-commerce-- we still use our CDN to direct
the traffic inbound for us. OK. So this was a really high level. What I want to do now is kind
of zoom in to just the GCP area a little bit
and show you what's going on there in a
little more detail. So the first thing
we'll take a look at, we sometimes jokingly refer
to this as a load balancer sandwich, but for our
application teams, what we do is we obviously
have the L7 load balancer at the edge
for public ingress, and then we actually have
internal load balancers that we use for connectivity
either inside the VPC or from our data
center up into GCP. Below, what you'll see
in the Platform Services section is things
got a lot bigger. We talked about a
lot of these items already so I'll just gloss
over this really quickly, but a lot of these services
are common to each environment. And as I mentioned, we
have proxy and NAT services available for these
workloads that need to reach out to
the internet, where we do police that egress. If we take a look
in the lower right and look at the Network
section, there are some things that we use that are kind
of inherent across all of these environments. So we use Cloud VPN and
Cloud Router together. And what we actually
do in these scenarios is we use BGP to announce
the routes across. So this is really great. This saves us a lot
of time from having to try to figure out how we
announce these static routes and putting all these
change requests in. It makes things really easy
for us and it makes failover really easy for us as well.
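Roughly, the BGP side of that looks like this in Terraform; the names, ASNs, link-local addresses, and the pre-existing VPN tunnel are all hypothetical:

```hcl
# Cloud Router exchanges routes with the on-prem edge over BGP, so
# neither side maintains static route lists.
resource "google_compute_router" "dc" {
  name    = "dc-router"
  network = "my-vpc"
  region  = "us-central1"
  bgp {
    asn = 64512                  # private ASN for the cloud side
  }
}

resource "google_compute_router_interface" "dc_if" {
  name       = "if-dc-tunnel-1"
  router     = google_compute_router.dc.name
  region     = "us-central1"
  ip_range   = "169.254.1.1/30"  # BGP session addressing
  vpn_tunnel = "dc-tunnel-1"     # assumed existing tunnel
}

resource "google_compute_router_peer" "dc_peer" {
  name            = "dc-peer"
  router          = google_compute_router.dc.name
  region          = "us-central1"
  interface       = google_compute_router_interface.dc_if.name
  peer_ip_address = "169.254.1.2"
  peer_asn        = 64513        # on-prem ASN
}
```

We have Cloud firewall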
rules, as I mentioned, which are inherent across
all of these environments, and there are many
of them, and then we use custom
routes to do things like pull traffic destined to
the internet through our NATs. So there really isn't a
good way for applications to egress out any other
way than through our NAT. In the upper right--
we'll kind of just gloss over these a little
bit, but we use many other GCP offerings. We can see a sampling up there. So we use IAM very heavily. We use logging and Cloud
Pub/Sub, which has actually been very helpful to us. So what we do is we
have logging sinks on a lot of our Stackdriver logs and we pull those into
Pub/Sub, and then what we do at our data center
is subscribe to that topic and ingest that. So instead of clogging up our VPN tunnels to and from GCP, we basically
transmit that out of band and pull it into our data
center, where it's logged, stored, and analyzed.
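A sketch of that export path in Terraform, with hypothetical names and filter; the sink's writer identity still needs publish permission on the topic:

```hcl
resource "google_pubsub_topic" "exported_logs" {
  name = "exported-logs"
}

# Route matching Stackdriver log entries to the topic so an on-prem
# subscriber can pull them out of band.
resource "google_logging_project_sink" "to_pubsub" {
  name                   = "logs-to-pubsub"
  destination            = "pubsub.googleapis.com/${google_pubsub_topic.exported_logs.id}"
  filter                 = "resource.type = \"gce_instance\""
  unique_writer_identity = true
}
```

And then finally, the last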
piece that I want to touch on are the Cloud Armor
and SSL policies that we apply to all
of our load balancers at the edge to make sure that
we have consistency in the way that those endpoints
are accessed. So just quickly going through
some highlights, things that have been really great
for us over the past year, the global L7 load
balancers, as I said earlier, have been really great. We're big fans of it and
I don't think we really have any complaints
about it whatsoever. Inter-region networking
has been great for us. So this was something
that we were able to adopt when
we moved to GCP and it's basically built in. It's really quick. It's really easy. We have absolutely no
complaints about it, and for some of our services-- so for example, a lot
of our data persistence is stored in Cassandra. We can just replicate
across regions without having to worry
about regional VPCs and peering those VPCs
together, or even going to the level of running
VPNs in those VPCs and then connecting
VPNs together to get traffic across them. So that's been great. Our IPsec VPNs have
been absolutely stable and we've had a really
great success with those. The only reason
that we're looking at moving to Interconnect
is we need a little more bandwidth than what the VPNs
can currently provide us with. Cloud Armor has been fantastic. That has been in place in
our commerce environment since day one and
we're currently finishing a rollout in our
non-commerce environment as well. A little tidbit that
isn't talked about very often is the NTP server that is
built into the metadata server. So this is great. What it does is it gives you
access to Google's NTP servers, and you actually get
the time smearing that Google provides with their
public time.google.com server as well. So the great thing
here is we don't have to configure anything. This is all kind of
baked in and good to go and we don't have to egress out
to the internet for NTP sync. So that's been really handy. Private IP Google access,
I touched on that already. That's been really great for us. Another really great feature
that thankfully we've only run into one
time is the ability to expand your subnets live. I think a lot of folks know
that no matter how you plan and how many
whiteboards you draw on and how many spreadsheets
you fill out, trying to set up your IP
space for cloud deployment, something unexpected
always happens. And in those cases, what's
actually really easy to do in Google is if you haven't
consumed the space that you'll be expanding to, you
can flip that /24 to a /23 or beyond, as long as that space is available, and you can do that live.
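In Terraform terms, that's just widening the mask on the existing subnet; this hedged sketch assumes the adjacent space is free:

```hcl
resource "google_compute_subnetwork" "apps" {
  name    = "apps"
  network = "my-vpc"
  region  = "us-east1"

  # Was "10.40.0.0/24"; widening the mask expands the subnet live,
  # in place, as long as the new range is unused.
  ip_cidr_range = "10.40.0.0/23"
}
```

So it's a really good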
way if the unexpected has happened for you to get a
little headroom for yourself. And then finally, Cloud DNS. Cloud DNS has been
great for us, and our, as I mentioned
earlier, developers love it to kind of
be in control of DNS and do that in a
self-service fashion. So then I want to just wrap up
with our lessons learned here. One of the biggest ones
is that everyone should have and exercise DR plans. We have an internal tool that
we use that basically allows us to go through all of
our L7 load balancers and disable traffic
to a given region. We use this in
emergency situations or when we have
issues with a stack in those particular regions,
and for the most part, it's been working really well. Typically within
about 15 or 20 minutes after an incident
is called, we're able to drain all of the ingress
traffic to a particular region away and send that
traffic to other regions, and we just do that with
a simple config change to the load balancers, so that's
absolutely great that we're able to respond in that way. If you're operating a hybrid
cloud, lots of planning involved and lots
of communicating with teams across
the organization. We've been very lucky to
have really great partner teams at Target, all the way
from our infrastructure teams that help us set up the VPNs
through our platform services teams that we've partnered
with for that data replication and storage, all the way
up through our app teams. And one of the things
that we've done that's been really critical is
sat down with our application teams and made sure that, as
we're planning things out, they're included and we take
their feedback into account as we're building out these
new features for them. One interesting item is,
be careful of attaching the same backend service
to multiple load balancers. We encountered this last
year when we had teams that, when they're operating
in that load balancer sandwich model, created a
single backend service and attached it to both. And what happened in that
particular scenario is the signals from
the load balancers were a little cross-wired
and they were not able to scale that service based
on RPS from the Layer 7 load balancer. So if scaling based on
RPS is important to you, avoid attaching
your backend service to multiple load balancers. One other interesting
piece that has been a little bit
of an adaptation for our application teams
and our security teams is that, with the exception of Cloud Armor, the firewall rules that you apply are applied to the instance itself (and this applies to the network load balancer, too). So for ILBs, for instance,
there is no firewall rule that you attach
to the ILB, right? It's the firewall
rule that you attach to the instance
that actually allows that traffic through the ILB. So a little bit of
working to get that communicated across
all of our teams. And then finally,
just be wary no matter what you're doing-- if
they're static or dynamic announcements-- of what routes
you announce back on premise. We're lucky to have BGP
route filters in place on the Target side
of things, but for a period of time we did wind up announcing address space that is actually our internal DMZ space in our data centers, which we also use in GCP. And that's part of
that kind of separation that we operate our
non-commerce applications in. So luckily we had those
filters in place and nothing bad happened but
just be wary of that. So I think that
about wraps it up for me and I'll
turn it back over. Oh, sorry. So kind of tying in with
a couple of other things, I'm sure you'll be hearing
plenty of announcements through the rest
of this conference and the rest of this year. Some things that Target
is looking forward to, just a variety of enhancements
up here on the screen. We use ILBs quite heavily and
Cloud Armor quite heavily, so we're looking forward
to some enhancements there. We're also looking forward to
some additional Interconnect locations. And one of the big items
that we're looking for is just some
general enhancements around VPC performance and
some of the managed services that might be announced
later this year. So with that, I'll
send it back on over. Thanks very much. [APPLAUSE] [MUSIC PLAYING]