[MUSIC PLAYING] TINO TERESHKO: Hello. Thank you for joining. My name is Tino Tereshko,
and I'm a product manager on Google BigQuery. Did anybody see that keynote? The CTO of Go-JEK arrived
on stage on a moped. That was amazing. Well, I'm joined onstage today
with folks from Home Depot, Rick and Kevin. I am going to
disappoint you guys. They're not going to make
their arrival on a lawn mower. They're just going
to walk onstage. Sorry. We didn't come prepared. But in all seriousness,
this is a fantastic story. I'm looking forward to hearing
Rick and Kevin talk about it, because one thing
is to be a startup, to be born in the cloud and
to build your infrastructure with bespoke requirements. The other is to have a complex,
multinational organization with online, with mobile, with
brick-and-mortar presence, and hundreds of thousands of
SKUs and professional services, and many, many years of
technology, innovation, and really smart engineers. And how do you move that
into the modern world? That's what Rick and Kevin
are going to be talking about. And so what we're
going to talk about is the concept of
a data warehouse. It's a really novel idea, right? We're going to
take all our data, and we're going to
put it into one place, and then we're going to
correlate and analyze it, and we're going to be data driven. Awesome. But the reality is that
data warehousing is really difficult. It's challenging. The complexity piles
on as your business grows, especially, right? It gets really, really difficult
to start getting business value out of technology, so
much so that technology can stand in the way
of your business, which is the last thing
you want it to do. Technology should be an
accelerator of your business. So with the advent of the cloud, cloud-native technologies, and their scalability and high level of manageability, maybe it's time to rethink what a
data warehouse really means. So BigQuery is the world's
only serverless data warehouse. Serverless, of
course, is a buzzword. But what do we mean by that? We mean that we entirely
abstract away hardware. We provide a very high level of manageability and automation, API first. We provide virtually unlimited scalability, with low effort and low maintenance to reach and maintain peak performance, and the ability to share data without moving the data around, bringing people to the data rather than
the other way around. And we really have evolved as
a service over the past six years. These are some of
the things that we've delivered as features
to our customers over the past years
that really make it easier for folks to run
a data warehouse on top of BigQuery. Well, today, we announced
a number of really, really interesting features. And I'll walk you guys
through some of these. Of course, you've heard about
declarative machine learning in BigQuery, which
brings machine learning to the analyst. Folks don't need to know
TensorFlow, or Keras, or any of these really, really
complex technologies. They can just use plain, old SQL
to prototype and train machine learning models.
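For readers following along, here is a minimal sketch of what that declarative approach looks like. The project, dataset, table, and column names are hypothetical, not Home Depot's; the point is that the model is trained and scored entirely in SQL, here submitted through the Python client library.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

# Train a model with plain SQL -- no TensorFlow or Keras required.
train_sql = """
CREATE OR REPLACE MODEL `my_project.analytics.purchase_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['purchased']) AS
SELECT device, pages_viewed, time_on_site, purchased
FROM `my_project.analytics.sessions`
WHERE session_date < '2018-07-01'
"""
client.query(train_sql).result()  # blocks until training finishes

# Score new rows with the same SQL skills.
predict_sql = """
SELECT *
FROM ML.PREDICT(MODEL `my_project.analytics.purchase_model`,
                (SELECT device, pages_viewed, time_on_site
                 FROM `my_project.analytics.sessions`
                 WHERE session_date >= '2018-07-01'))
"""
for row in client.query(predict_sql).result():
    print(row)
```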
Let's talk about one of those features-- clustering. In this particular
example, I have a query that is selecting
a date and user ID. Well, clustering
increases data locality for high-cardinality
fields so that if you apply clustering
on your tables, you get vast improvements in cost,
vast improvements in efficiency or performance. In some cases, the performance
can be 10 or 100 times better than without clustering.
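As a rough illustration (table and column names here are hypothetical), clustering is just a clause on the table definition; queries that filter on the clustered column then scan far less data.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Create a partitioned table clustered on a high-cardinality column.
ddl = """
CREATE TABLE `my_project.analytics.events_clustered`
PARTITION BY event_date
CLUSTER BY user_id AS
SELECT event_date, user_id, event_name, value
FROM `my_project.analytics.events`
"""
client.query(ddl).result()

# Filtering on the clustered column lets BigQuery skip most of the data blocks.
query = """
SELECT event_name, COUNT(*) AS events
FROM `my_project.analytics.events_clustered`
WHERE event_date = '2018-07-24' AND user_id = 'u-12345'
GROUP BY event_name
"""
for row in client.query(query).result():
    print(row.event_name, row.events)
```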
And actually, at 3 o'clock, Jordan Tigani and Lloyd Tabb are going to be doing a live
demonstration of clustering. You should not miss it. We also announced
a new BigQuery UI. So it's material design. It's in the Cloud Console. It's localized to all the
different regions that don't necessarily speak English,
with other great benefits-- for example, Data Studio
Explore, which allows you to pop the results of your queries out into Data Studio for interactive
pivoting and further analysis. One feature that we haven't
really discussed much, but we've had for a
while, and that Home Depot will talk about in a bit more detail, is hierarchical reservations. So complex enterprises that need isolation and resource guarantees, and that would like to have more control, more knobs in their enterprise data warehouse, are able to create these
hierarchical reservation trees that really
guarantee resources for specific purposes. But also, the beauty
of this is that these aren't silos of resources. If a data science project
is idle for some reason-- the data science team
went on vacation and those resources are unused-- then transparently, seamlessly, with subsecond latency, those resources become available for the rest
of the organization to use. So as other teams continue
to scale on BigQuery, the entire
organization benefits. You get economies of scale. And of course, one
very challenging aspect of running a data
warehouse is ingest. We hear this from
customers all the time. How do I get data into
my data warehouse? How do I get it so that this
data is fresh and available right away, and it doesn't
compete with my query capacity? And we've really
invested heavily in our ingest capabilities. Our batch ingest is free. It's powerful. We have customers
loading petabytes of data into BigQuery every single
day without affecting their query performance
even one bit. And of course, we
have a streaming API on the other side,
which gives you access to your data in real time.
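As a small sketch of those two paths (bucket, table, and field names are hypothetical): a batch load job from Cloud Storage, which doesn't consume query capacity, and a streaming insert, which makes rows queryable within seconds.

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my_project.sales.transactions_raw"  # hypothetical table

# Batch load from Cloud Storage -- load jobs are free and don't use query slots.
load_job = client.load_table_from_uri(
    "gs://my-bucket/exports/transactions_2018-07-24.csv",
    table_id,
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
    ),
)
load_job.result()  # wait for the load to complete

# Streaming insert -- rows become available for querying within seconds.
errors = client.insert_rows_json(
    table_id,
    [{"order_id": "o-1001", "sku": "123456", "amount": 19.99}],
)
if errors:
    raise RuntimeError(errors)
```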
But enterprises heavily rely not just on native bespoke tools
provided by cloud vendors, but on their partner ecosystem. So if you're a large
organization that has a history of technology innovation, you are probably using lots and lots of partners, lots of vendors. And we will continue to invest
in our partner ecosystem. And finally, we have
lots of customers of all kinds of scales, all kinds of complexities, all kinds of industries. And they're all
eager to discuss, to share their stories and their
journeys with you guys here. And potentially, hopefully,
next year, some of your logos will be up here as well. Well, without further
ado, I'm going to welcome Rick
and Kevin onstage to share their story with us. [APPLAUSE] RICK RAMAKER: Good
morning, everybody. Everyone enjoying the
conference so far? And thanks to you, Tino,
for kicking us off here and helping us put our
presentation together. So my name's Rick Ramaker. I've been with Home
Depot about seven years. And I'm part of
the IT team that's responsible for data and
analytics for the enterprise. So my team is responsible for
all the data engineers that are moving all the
data into the warehouse and making it available
for consumption across all the different business areas. Home Depot is doing a lot of
things on the Google Cloud. We were here last year. You heard us talk a little
bit about our dot-com sites. We've got a lot of other groups
using the Google Cloud as well. Today, we're going
to talk a little bit about our journey in the
data and analytics space. So first slide-- so hopefully,
everyone here knows Home Depot. So we're the number-one
home improvement retailer. We got a bunch of cool stats
on the slide about the size of our organization. If you haven't been
to Home Depot lately, I encourage you to check it out. It has changed quite a bit. We got a lot of cool, and
new, and innovative products on the floor. We're growing our
services business. So if you're not a
do-it-yourself-type person, and you want a do-it-for-me,
we have that available as well. We are starting to put
lockers in our stores. So if you buy online and
pick up in the store, you can just pick up your
products in the lockers. Lots of good things going on. We're expanding in
the softline business. Lots of really cool things
happening at the Home Depot. But in short, we're a
big, we're a complex, and we're a very data-driven
organization for a lot of our decision making, which
kind of makes my job really, really cool-- most days. [LAUGHTER] So our plan is to
talk a little bit about what led us to migrating
our warehouse to the cloud. Kevin will jump
on stage and talk through a lot of the technology
and architecture decisions. And then we'll wrap up
with some of the wins and the learnings along the way. So to kick it off here, a
little bit about the Home Depot analytics landscape. So we run our business on the
EDW, on our on-prem EDW today. So we have hourly
sales reporting that needs to go
out to the field. We have all of our
supply chain forecasts for making sure all the right
products make it to the stores. We have all of our
marketing campaigns that go out to
all our consumers. That comes off the
data in the EDW. All our event reporting for
Black Friday, red, white, and blue sales, all of
the store performance scorecards for Monday morning,
reporting at the stores-- all of this is
coming off our EDW and needs to be there on time
and be accurate every day. So we have very strict SLAs. And we have a wonderful
business community that lets us know whenever
we miss it or the data there isn't accurate. So that's a little
bit of our world. But it is very operational
and part of what we do all day, every day. Our business community is also
very invested in this space. Self-service is very much encouraged with our business teams. We kind of have a two-fold model. We have more of
that are running the stores and running the merchandising. They're typically using Excel. They're using MicroStrategy
to get the data they need to do their business. And then we have the data
analysts and data science community. They're utilizing
Tableau, R, SAS, Python, all the different
tools, to drive more of the ad hoc solutions
on a regular basis. But one thing that we realized
is that the demand was growing, and it wasn't going to stop. So we were having a hard
time managing all that on our on-prem environment. We were told our
on-prem environment was one of the busiest that's
out there in the United States. And we're kind of proud
of that, but it also presented some
additional problems. So that led us to a decision point for where we wanted to go going forward. So our on-prem solution
had served us very well. It helped us get to a
consolidated single EDW. We were finding a ton of value. But we were spending a lot of
time on capacity management. And every time it came
to refresh the hardware, there was a large cost outlay
that we had to work through. And two to three
years prior to this, we had to do a
capacity expansion. And that was about five to
six months worth of planning. It was a three-day outage. There was a lot of work
to make that all happen. And then we used up all that
capacity in about a year to two years. And that capacity was gone. So we were now to the point
where we had to do another capacity expansion, with all of that
work ahead of us. And then on top of that, our
business teams were saying, all
more and more complex analytics in that space? So that's what led us to having
to do something a little bit different, and really, a
migration to modern analytics. So we started taking
a look at what that would look like to move
to a modern analytics platform. And we knew we wanted a world-class platform that would let us use managed Python notebooks and get into machine learning. We wanted to make sure our
analyst community stayed very self-sufficient. And we wanted to
make sure we were solving for a lot of the
challenges on the development side as well. So those were our objectives. So we did go through a
pretty in-depth analysis of the different platforms
that are available. And we did POCs with
the-- you name them, we did it with them,
for the most part. We also did a very detailed ROI. So yes, cost was a
big piece of this. And we did do all our homework
there, from an ROI perspective, to make sure we understood all
the different cost differences. And we also spent a lot of
time collaborating across all of the key stakeholders. Security was probably
one of the biggest ones we worked with, along with
our infrastructure teams, as well as a lot of the different
business teams as well. And as a result of
all of that analysis, obviously, that led us to
select the Google Cloud Platform, and
specifically BigQuery, to help us with this journey. So I'm going to turn
it over to Kevin. And he's going to
walk through some of the technology and
architecture decisions that we made. KEVIN SCHOLZ: Excellent. Thank you. [APPLAUSE] So as Rick said, this was a
once-in-a-lifetime opportunity. If you look at large enterprises
and legacy data warehouses, these are large
investments, a lot of data, a lot of people basically
relying on this. So since we were about to pick up and change everything that we'd known and move it, we basically said, let's start with our wish list. What can we do? So we put together this
wish list of items. And we categorized it for you. So what we decided
to do was, if we were going to go
through this effort, we were going to go all in. We weren't just going
to pick up some stuff and move it, do some
[? IAS ?] stuff, and try to move things in. We decided, if we're
going to do this, we're going to
modernize everything. So we right away said,
we don't want to do a port. We knew this was a huge step. So if we're going to get
to a modern platform, it required that level of
trust and that level of a leap to get there. The next thing we did
was, as Rick said, we run the business on this. This is every day. It's required. So we needed a date. A lot of agile practices, you
try to work on iterations, and you deliver when you can. But we have real dates. We have real timelines. And we don't want to have people
between systems for that long. So we chose a time. And we gave ourselves a
couple of years to get there. This is not a minor undertaking. But we knew, as part
of that as well, that we needed to be agile
in what we were doing. We were going to learn as we go. We also were working
with the Google team on features that weren't
available yet in BigQuery. And we were pushing
in terms of things that we needed to run our
business on the platform. So we knew part of that
would be to be agile. And we would have to
change along the way. When we look at the people
that were involved-- so we have not
only our analysts, but our folks that
run the platform-- we said, what do we
want to do with them? So if we're going to
make this big change, we have to first
adopt new practices. So part of Home Depot's
general change has been, we want to adopt full-stack
teams, localized, very common. But in a data
warehouse space, that's kind of a novel or a newer idea. It's happening
more and more now. But a couple of years ago,
it wasn't very prevalent. We also wanted to invest in
all the newest technologies-- so the cloud itself, as well
as all the things surrounding that. And we wanted to make sure
that no one was left behind. So we wanted to have a learning
path for all of the folks that knew maybe the
old set of tools or the old set of
technologies to move over to the new platform and have
the time to learn that platform. When you look at the tech, we
made some big decisions here. Our teams on prem that
run our data-- in our data centers, all of
our technologies-- they do a great job. But that's a large investment
and a large amount of time. So we wanted to use as much
managed services as we could. So as Rick said, when we
evaluated different cloud offerings, that
was a huge thing. We didn't want to have all
the stress of the DevOps. It's hard enough to
make sure the data is right, delivered on time,
and with all the changing requirements. And the biggest
takeaway from that is, we wanted to scale
up and down faster. Rick showed you the chart
of the CPU use, right? If we wanted to grow
that very quickly, that's a multi-month effort-- large dollar amounts. With the cloud, you're able
to scale it much quicker. So that was a huge win for us. One of the things we
also decided to do was around ETL tools. So there's a lot of
great tools out there-- Informatica, Ab Initio--
pick your favorite one. We decided, as part
of this change, as well, that if we're going
to be modernizing the platform, that we were going to modernize
the way that we brought the data into the platform. So we took a much more
developer-centric approach. So today, where we are, most of the ETL and the data pipelines are code driven, and
they're developer driven, and they're using modern tools. And we don't really focus on
some of those classic tools. There's nothing wrong with them. It's just the
choice that we made. So from this wish list,
we basically said, OK, how can we achieve this? So I'll fast-forward to the answer. Everything on here, we did. So luckily, we had great support
from our leadership team. And it drove us
to an architecture that we'll walk you through. So we're going to take a
little bit of time with this and walk you through all
the different pieces of how we did this. So let's start with
the capture part. So looking at this
from left to right-- so down at the bottom,
our OLTP sources. OLTP is our operational
systems, where you place orders, sales through
the front-end registers, all of the interactions
with the company. All of those are going to remain
in those operational systems. They may move over time,
but they are where they are. Today, we have
thousands of databases, file sets all over the place,
in prem, from customers, from partners, as well
as streaming messages. We have a lot of things that
are moving all the time. All these things
have dependencies. And they're all scheduled. And they have tight
timelines on them. So our on-prem scheduling is
Tivoli Workload Scheduler. So one of our
requirements is that we have to wait-- when
a workload is done, we have to be able
to pick that up from a dependency point of view
and then be able to move it. For our main scheduling,
though, within the cloud, we didn't want to
bring that product out. So we use Jenkins, and we script that environment to basically act as the bridge for the way that we're moving data. So here, you'll see Hadoop
with a fancy name data factory. That's our name. So one of the things
we wanted to do with all the complexity
in the data center, with all these databases,
with all these queues and all these files moving around-- If you look at
that surface area, there is a lot of systems, and
firewalls, and a lot of things you have to jump through. So we didn't want to
expose all of that, and all that complexity,
and all that work each time a team needed to move
a different data source out to the cloud. We invested in
Hadoop a long time ago. So we have a large on-prem
Hadoop infrastructure. So the teams knew how to
move data to that platform. It was also very well
secured, isolated, segmented, and it scaled
really well on prem. So we made the choice of using
that platform as basically saying, get the data here. And all the teams
knew how to do that. It's relatively easy
within the company. If you can get it
there, we can then create a factory or a
pipeline to move it out. It's bidirectional. So that, basically,
is hiding all of our complexity
of our data center and all the complexity
of those other systems. I'd say, when we first
started, we had a lot of teams. We built, probably, four or
five different versions of this. But we've centralized
on one version, because the more we can
have one way of doing it, it makes it easier for
all those dependencies. And we don't have teams
spending time on plumbing code. It can basically
move a lot faster. So the capture side--
that's kind of the story. We are making use of that
large infrastructure we have. So our next part
is really, hey, how do we get it out to the cloud? So there's many ways companies
can peer with Google. We use our own way. It doesn't really matter the
company we're going through. But we have large,
10G connections through multiple connection
points from our data centers and our partners into Google. The first thing we
do is, we basically wanted to get the data to
the cloud, so a classic data lake, right? The buzz word around
getting it there. But for us, what
that really means is, source-similar data
from those systems, in the format that
it was, typically landed into a GCS storage
bucket or into a BigQuery table in its raw format. So that's source-similar,
minimal changes. So I'll give you
a sample change. If it's coming out of an
older mainframe system-- the EBCDIC to ASCII
translations have been done, any character set adjustments. Those things, we take care of.
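For example, a minimal sketch of that kind of character set adjustment in Python (the bytes here are just an illustrative EBCDIC code page 037 record, not a real Home Depot extract):

```python
import codecs

# A fixed-width EBCDIC (code page 037) field from an older mainframe extract.
raw_field = b"\xc8\xd6\xd4\xc5\x40\xc4\xc5\xd7\xd6\xe3"

# Decode to a normal string before landing it, source-similar, in GCS/BigQuery.
decoded = codecs.decode(raw_field, "cp037")
print(decoded)  # -> HOME DEPOT
```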
But once it lands there, it's not been modified. It's basically as it was
from that source system. So that's a
relatively easy step. We've got that on-prem piece. We move it out. Now, we're really into
the processing phase. We've got it there. So the next part is classic
ETL or ELT transformations that you have to do
on those data sets, as well as we have
streaming data sets. So we have a lot of
sources of sales, of inventory, of
order stuff that's basically moving out of those
source systems in real time. So we wanted that to
flow into the system, as well, in real time. So we have all these
different ways. Most of our processing that
we do today-- a lot of it is SQL within BigQuery itself. We use Dataflow. We use some custom Java
running in App Engine. We have a wide variety of ways. Again, with the full-stack
development teams, they have some freedom to
choose amongst those platforms based on what's the right
use for their use case. So the processing
is much more fluid. What we tell the teams is
basically, start with BigQuery. If it works there, it's easy. It's very easy to do it in SQL. There's a lot less
code to write. You can go a lot faster. If you need something
more custom, then basically head that
direction if you need to. For streaming, what
For streaming, what we did is, on prem, we use a lot of legacy
or older systems, like WebSphere MQ or
ActiveMQ, different things. We also have some Kafka, as
well, in the newer systems. We basically wrote
a component that runs on our Hadoop platform
that moves that out to Google Pub/Sub. So all of the
streaming stuff ends up with all those payloads
in Pub/Sub messages. So from there, that's where a
lot of the App Engine pieces pick those up. And then they begin to
transfer them into BigQuery. So that's how the
streaming works.
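A minimal sketch of the publishing side of such a bridge, with hypothetical project, topic, and queue names: payloads consumed from an on-prem queue are republished to Cloud Pub/Sub, where downstream consumers write them into BigQuery.

```python
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "sales-events")  # hypothetical names

def forward_message(payload: bytes, source_queue: str) -> None:
    # Attributes let consumers route or de-duplicate without parsing the body.
    future = publisher.publish(topic_path, data=payload, source=source_queue)
    future.result()  # block until Pub/Sub acknowledges the publish

forward_message(b'{"order_id": "o-1001", "status": "CREATED"}', "wmq.orders.queue")
```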
But then we had two really big parts of that that we
wanted to go into. First was what we call
our building blocks. So Home Depot-- it's
a complex enterprise. We have a lot of domains-- orders, customers,
finance, supply chain, merchandising-- there's
a lot of different areas. So the building blocks for us
are those independent areas of data that are arriving. So there's a large
domain and a series of subdomains under those. But they're landing
into those, basically, nonintegrated,
domain-specific areas. But as they're landing,
they're landing in BigQuery. And they're in optimized
performance structures that they're able to
read from very quickly. And they're able to join. They're still able
to be used for any of the downstream processing. But they're nonintegrated
with other parts of the different domains. So the next phase of
that is, we really take what we call an
ADS, Analytical Data Set. It's really working backwards
from your analyst or your user in terms of what
different domains they want brought together. So think about it as either a materialized view, or a view without actually materializing-- just a query set that's driving against a series of those. But most of the time,
it's a materialized view. So we create another
set of tables. We then bring all the
different building block domains together. We'll collapse dimensions. We'll do things to make
it very performant. We also use nested
structures here. So we'll talk a
little bit about how we use nested structures
inside of BigQuery to do that. But the idea here is these
ADSes, these Analytical Data Sets, and these building
blocks are the direct things, the direct data sources,
that our analysts are going to use to basically
query the data sets and drive the downstream. So they have to be performant. Almost all, if not all
of them, are basically in BigQuery today, with
very few exceptions. So the next part of
So the next part of that is our use case. So how do our analysts use this? So we've kind of got
two different camps. We've got our typical
business analysts that are using more structured
or reporting tools, standard BI things, Tableau. AtScale, if you're not
familiar, is an OLAP cube that reads off of BigQuery. MicroStrategy, which is a more
classic reporting tool, and SSAS, which is cubes using
Microsoft technology. Our data scientists are
doing much more active, newer things with ML and Datalab. And we still have a lot of R and
SAS in the environment as well. So they're basically reading
out of these data sets. They have requests. And we want these
to be self serve. So the Tableau teams--
once we get them out there, we have Tableau running
at Google as well. So they are able to get
instances running out there. They can connect
to the data sets. They can then build
self-service reports. If they need data
or other components, they can look at the
different data sets, pull from any of the values. They can read from them. And then if they don't
have the data set, they'll work with
our analyst teams. And we can help them get the
new data sets brought in. OK, so kind of a quick review
of what products we use. So we use GCS. We use Pub/Sub heavily,
Dataflow, BigQuery. We do use Datastore. From our compute side, we
use App Engine, GCE, and GKE as well, StackDriver
for monitoring. Analytics, we're
using Datalab, which is one of Google's products. And we're beta testing
Datahub, Dataprep, and BigQuery UI, which was, I
think, announced today. But we've been using
that for a while. On the Home Depot side, we
have written our own tools to basically supplement
some of these things, to make it easier for
our internal teams to use. We call them the Data
Pipeline and Analytics Engine, some SQL editing tools. And we basically provided
a data catalog so that teams can find the data. We have a large set of data. One of the things that
makes Home Depot complex is the wide number of
businesses we do under one roof. If you think of paint
and tool rental, there are large,
complete businesses under our roof, outside of just
picking up the merchandise and going out the front. So those domains make
that really complicated. So finding a lot of the data
and connecting it together can be a challenge
for the business. One small callout here-- when we first started
looking at this, and we showed you in
the earlier slide, we did have a large
Hadoop platform. We do not use Dataproc. We basically don't have
much of the Hadoop stuff. We rely on BigQuery
for that use case. So we didn't really bring a lot
of the Hadoop stuff forward. Nothing wrong with the product. It was just, hey, we
said we're doing a leap. So we jumped onto BigQuery. We do have some of
our data scientists in some of the teams
that do use the product. There's nothing wrong with it. But our core EDW
doesn't use it a lot. OK, so what else did we do from
an architecture point of view? So if you guys are familiar
with any of the EDW things, we did do some changes here. So one thing I'll show
you, our slot hierarchy. Tino showed you how that worked. We take credit for
pushing a little bit to get that out there. And we'll show you our slot
map and what it looks like. So we'll show you
that in a second. The advantage of that
is, what we've found is that it allows the
teams within the company to have a fixed amount monthly
that they're willing to-- x number of slots
they can pay for. It can be dedicated to
them when they need it. So if they need more,
they can buy more. And the best part with the
hierarchy is if, like he said, if they're not using it, other
parts of the company can share. So everybody who goes
into that platform or into that
structure-- everybody wins, because you'll at
least get what you paid for. And chances are,
you'll get more. And we use that excess capacity
amongst all the different teams all the time. So it allows you to plan for
peak, but everybody wins. Another part-- we
made a big decision. In our older system, we used
a lot of surrogate keys. So you're coming out of
the operational systems. There's nothing wrong
with surrogate keys. But we were putting surrogate
keys on to every table. And then we had translations
back to natural keys. So we made a decision
when we moved over here that we were going
to have everything in its natural key format. We have a couple
of structures that still use surrogate keys,
when you're combining disparate domains, possibly from
different business units where the keys don't match very well. So they're not eliminated. But we've basically reduced
them as a standard way that we went in. And we'll talk a little
bit about nesting. So we do use nesting. So if you're not
familiar, in BigQuery, you can use a nested structure. It allows you-- my favorite
example-- think of an order with-- you bought three things. Those three line items
can be all within one row as a set of repeating values. And you can query it within SQL. And it's one row that allows
you to have more performance in terms of data elimination. And you can get
right to that row.
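Continuing the hypothetical orders example from the ADS sketch above, the nested line items live inside the order row and can be flattened on demand with UNNEST:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Query the nested `lines` field; UNNEST expands the repeated line items
# only for the rows the filter touches.
query = """
SELECT o.order_id, line.sku, line.quantity, line.extended_amount
FROM `my_project.ads.orders_ads` AS o,
     UNNEST(o.lines) AS line
WHERE o.order_date = '2018-07-24'
"""
for row in client.query(query).result():
    print(row.order_id, row.sku, row.quantity)
```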
We also use nesting, in some cases, in those analytical data sets to collapse some of the dimensions, or some of our master data tables, into those. So there might be org
structure, or codes tables, or other things that can
be present in the data set. So you don't have to do joins. So we use nested structures. And we collapse them
into that row as well. The other thing we do is-- the last one is really,
Google projects and views. So we use Google projects-- as a company, we have a
lot of Google projects. And we use them
to our advantage. So a lot of our data
sets, as they come in, we have data projects
broken up by domain. That's where all of the ETL
pipelines, all of the ingest, and all of that works. It's all contained. It's managed by IT. It's delivered out to
those other systems. But it's a place that
basically controls the data. And you saw, in some of
those building blocks, they land there. Anyone can read them. So we grant access to them. But it's all there. But it isolates and
separates the different teams so they don't step
on each other. So we have about
nine of those areas. And we can add a new one if
we need as the company grows. We use views as well. So BigQuery provides views. But we use projects as views to
allow us to control the access. So we can put different
folks, in terms of IAM roles, or things, into different
groups, into those projects. And we can control the access. There might be PII fields
that they can or can't see, or other data sets where we can use those projects. So we make use of GCP projects, and views as part of BigQuery, extensively to basically monitor it, so we don't have
one big project with all of our data in it. It's just too much. So we make use of that.
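A minimal sketch of that pattern with hypothetical project and table names: a view in a separate, access-controlled project exposes only the non-PII columns, and with BigQuery authorized views the analysts can be granted access to the view without read access to the underlying data project.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Expose a domain table to analysts through a view in a separate project,
# leaving the PII columns (name, address, email) out of the select list.
view_sql = """
CREATE OR REPLACE VIEW `analyst-views-project.customer_views.orders_no_pii` AS
SELECT order_id, order_date, store_id, order_total
FROM `data-project.orders_domain.order_header`
"""
client.query(view_sql).result()
```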
So here's our hierarchy. It's a little bit crazy. So I'll walk you through it. So the way it works, if you're not familiar with hierarchies-- there are two ways. In BigQuery, you can
do pay-as-you-query. It's a pretty standard way. Or there's the BigQuery slot structure. With this, it's a fixed dollar amount that you can have per month. And you can basically lay
out a series of hierarchies. So what we've done is,
we have different parts of the company in different--
they're all off of a root node. So the way hierarchies work
is, siblings share first. So anything that's across, if
there's extra slots available, it'll share to siblings first. So if I go to the lowest part
of the tree on the bottom left, for example, those two
nodes in the bottom-left corner will share first. So if either one
of them need it, they share amongst themselves
before it would go up. And that works for
every part of the tree. So what we have is all
the different groups in the company, from
our dot-com teams, to our engineering teams,
to our security teams, that all have different
slot allocations-- everybody's off of a root
node, because everybody wins. There's no loss here. You always get a
guaranteed minimum. And there's a good
chance you're going to get more than you paid for. And that's what we found. And it works really well. So if you're
familiar with Hadoop, it's very much like
the capacity scheduler. It works the same way. But it's really valuable for us. Also, with a guaranteed minimum, it allows you to control your SLAs and gives you more fine-grained control.
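Purely as an illustration of the idea (this is not a BigQuery API, and the names and slot counts are hypothetical), the reservation tree can be pictured like this: each node has a guaranteed minimum, and idle slots flow to siblings first, then up toward the root.

```python
# Illustrative only -- not a real BigQuery API. Each leaf owns a guaranteed
# number of slots; idle slots are shared with siblings first, then upward.
reservation_tree = {
    "name": "root",
    "slots": 0,
    "children": [
        {"name": "general-sla", "slots": 0, "children": [
            {"name": "sales-reporting", "slots": 1500, "children": []},
            {"name": "supply-chain", "slots": 1500, "children": []},
        ]},
        {"name": "data-science", "slots": 2000, "children": []},
        {"name": "dot-com", "slots": 2000, "children": []},
    ],
}

def guaranteed_slots(node: dict) -> int:
    """Minimum slots a subtree can count on: its own plus all descendants'."""
    return node["slots"] + sum(guaranteed_slots(c) for c in node["children"])

print(guaranteed_slots(reservation_tree))  # total purchased capacity: 7000
```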
So what we do then is, we attach projects into those different buckets. So like I said, Home Depot,
we have a lot of projects. And we can attach them at
different resource nodes. That resource node then grants
them the amount of slots that they need. And if a team needs
a different one, and they want to buy-- let's say
we've got a general SLA bucket. That's where we put
all of our workloads that have guaranteed SLA. If we have a team that has a
project that needs a tighter SLA, and we decide we want
to break that down and buy a little bit more slots, we
could create another node, move that project over, and
they can get that guarantee. So I think, when the Google team
saw this, they went, oh my. This is complex,
but it works great. And a good story-- the
day we turned this on was the best day
we had in the EDW, because before that,
we had basically everybody sharing one big pool. And it was-- you couldn't
control the minimums. So everyone was
competing with everyone. The day that we
turned this on, we had huge workloads. Everybody's jobs went
through very cleanly. And it was painless. So let's talk the
other big part of what anyone considering
this type of move would hit just like we would. So security is a huge
part of everything. I'm not really going to walk
through the details of what we did. I'm going to give you our
strategy and a strategy that you could use as you're
considering this move as well. So almost every company is going
to have a good, extensive set of security guidelines that
drive based on your business. What we did is, we looked
at the data classifications, the separation of duties. We looked at our DLP and
exfiltration requirements. And we basically
developed a strategy for how we were going to move
to the cloud, what we were going to move to the cloud,
and the mechanisms by which we secure it. A big part of that
is, what we found in talking to a lot
of other retailers, a lot of other folks
considering this move, is a lot of the
IT teams that are more legacy-based companies,
when they look at the cloud, this is a scary thing. All of a sudden, we're going
to have all this data together in a large area. It's a big deal. So we built a partnership
with a security team. And to tell you, having them
use the platform as well is a huge plus. So it breaks down the barrier. They're able to do more stuff
in terms of security footprint using the same toolset
that we're using for this. And it basically
worked really well. What we found also is,
the adoption rate went up. The fear went down. The understanding
really went up. Actually, "the fear went down" is the wrong way to put it-- it's really the understanding of the way the technology works that went up. But the one thing
we did work through is, on prem, we have
a lot of native tools that we use for security. So when you look
at the cloud, we're basically looking at what
the new tools are, basically, out at the cloud, and
how to adopt those. So there might be, you have
x tool on prem that you use, but a tool at Google might
be equivalent or better. And we would look at
those different tools. So think about the
strategy that you would look at when you're
considering a move like this. And work with your
security teams on that. And our end of this was
really better adoption, so faster adoption. But we still put a huge
focus and spent a lot of time with our security teams. It's never out of focus. It's just a better approach
to go towards that. OK, so if I could
give you some stuff, some ideas in terms
of how you would start a migration
similar to this-- so first, be agile, because what
you think you're going to do, it's going to change. We've changed multiple times. And being agile has proved
excellent in that case. Evaluate your tools. You're probably using
existing tools today. Look at those tools. Bring the ones forward
that make sense. Maybe adopt something new. I think our single entry
point out of our data center was a huge plus for us. It made it a lot simpler. Keep your schema. So if you have an existing set
of schemas, bring them over. Try them in BigQuery. Don't change anything. See if it works. Optimize if it doesn't. When we started, we kind
of took an inverted approach. I would recommend you just
bring it over and move it. Try it. If it doesn't work, then
worry about performance. There's lots of dials
and things you can move. We did not start by
copying the data. We actually tried to
go back to the sources. I would recommend copying from your existing system instead. Use managed services. We love them, couldn't
recommend them more highly. And work closely with your
third-party companies. So a lot of the tool sets may or
may not be ready for the cloud. So I'll bring up
Rick to bring you through lessons learned and
some of our performance wins. [APPLAUSE] RICK RAMAKER: Thanks, Kevin. Now, we'll go back to
some of the fun stuff. So what is the result of all of
this great work that Kevin just walked through and
the team went through? And I will say, in short,
it is game changing. And all the additional enhancements that are coming in are making it an even better decision for us. So you see some
of the performance wins we've seen here. Yeah, I stacked the
deck here a little bit. These are our three best
improvements that we have. Your mileage may vary a little
bit on your actual performance you see. But these are real. We were seeing things that
were running for 8, 9, 12 hours that are now completing
in a matter of minutes. And we had a ton of workload
that we couldn't even run on our on-prem
environment, because we just didn't have the capacity there. And that was a cool
experience, to be able to see that stuff get
completed and be able to run. We actually had a-- it even
created a little bit of a problem for us, because a
lot of the teams were so enthralled with all
the new stuff they could go do, we had all kinds
of folks working on new stuff on
the cloud, and we had to get everybody--
don't forget, we have to migrate
all of our stuff off the existing platform
by the end of this year. So that was a good problem to have. So that's where we are from
a performance perspective. From a capacity
management perspective, it's also been very
game changing for us. I can tell you, we had an outage
in the middle of the night. We ran out of memory
on one of our VMs. In the old world,
that would have been a lot of work to figure
out how to get that fixed. In the new world, it was a
matter of bringing down the VM, increasing the amount
of memory, popping it back up-- all stuff you guys
have probably seen and heard. But for us, that was a
pretty cool experience. I'll also say, we have a
major deployment going. Last week, even-- it
was just last week or two weeks ago, and we
were running a little bit behind in getting all of
our historical data loaded into the cloud for this release. We worked with our Google team. They swung a bunch of slots
over to us with about eight hours' notice-- so thank you for that-- and we used those slots
to complete that migration on time. And we had all that data loaded
in about a day and a half, versus how long we
were planning on that. So from a capacity
management perspective, it's been very helpful for us. And then, from a delivery
side, we are real, and live, and running on the cloud at
a number of key areas today. We have all of our pro
reporting-- we have a sales force
out in the field working with a lot of our major
contracting companies we work with. All of their reporting
is available to them running on the cloud. All of our services business,
where if you want to have someone do it for you
versus doing it yourself-- they're doing a measure for you at your home-- all of their information is available to that team on the cloud. Our CEO Dashboard that
goes out every week, with tons of metrics
about running our business-- all of
that is available and runs on the cloud. We have our sharing of data
with all of our vendors that sell into Home Depot and whose products we sell. All of their data, we pipe it out to 75 vendors today so we can collaborate
on the data. All of that is available
on the cloud as well. And we have all of
our clickstream data from all of our websites, about
800 terabytes' worth of it. All of that, also on the cloud. So this is real. This is happening. And we're using it every
day for what we're doing. Last thing I want to touch
on is some of our learnings. And there are many of them. We could probably have a whole
presentation on what not to do. But instead, we'll focus on
some of our key learnings. I would say, one
of our biggest ones was just recognizing
the complexity of the change management
of migrating to the cloud. So before we started this, our
team, from an IT perspective, was your traditional
data warehousing team. So we had a lot of ETL experts. We had a lot of PII experts. We had a lot of SQL
experts that were all experts in the specific
tools that we were using. And they were some of
the best that we had. And we had data modeling teams. We had DBA teams. We had all the
different teams that were more of the typical
center of excellence models that many
organizations utilized. And when we decided
to move to the cloud, that changed a lot, and a lot
of it from a team perspective. So the team now-- and a lot of
the folks are here in the room. They're awesome. They're Java, they're
Python, they're SQL, they're full stack,
they're SRE mindsets. And that was a
major, major change to move that team from where
we were to where we are today. And my advice on that one
is, put the team first. So this is really, really hard
for the team-- all these words, all these new technologies. Easy to say, really
hard to execute it. So we did 10% days,
gave people a chance to learn different
things on the platforms and take their time
to go and learn those. We did a ton of training for
them in many different areas. You're going to make mistakes. That's OK. Learn from the mistakes
and keep moving. And give the developers
the opportunity to make their own choices. We didn't dictate, you must do
things this way and that way. They tried a lot of
different things. And we landed in a
pretty good spot. I will say, you need to
balance that sometimes, too. At one point, I think we had
eight data pipelines built. And we realized that
we probably don't need eight data pipelines. So we worked through it and
got to a good spot on that. I'll also say, your
productivity velocity also takes some time to develop. You're not going to be
as productive as you were or want to be on day one. And that's OK. So take your time
to get to the point where you have that
velocity set for that team. And we didn't even
bother putting together a road map on our deliveries
until that velocity was set. But at that point, once
you have a good velocity, definitely do get
a roadmap in place. That really flips the
switch from learning mode into, truly, delivery mode. And we did that about
the end of last year. And now we're really
executing against our roadmap. Second thing I would
throw out, from a learning perspective, is to realize that this change is just as big for
time on the IT side, because we're the IT team,
and we did a lot of work to get us through this. But at some point, all
of our business partners also had an equivalent
amount of change. And there's a number of those
folks in the audience with us here today. So we built out an
analytics enablement team. It's a small team, probably
about 3, 4, 5 people. But their full-time
job all day every day was to help out that analyst
community with this migration as well. So they helped out
with making sure we're aligned on the
naming standards, and how many
projects do we need? And when do we hit the views
versus when do we hit tables? What training is required? And just helping out with a lot
of the overall communication. We're a big company. It's hard to catch everybody. I'm sure most of you guys
are using a tool like Slack for your communication. That's been our number-one tool
for communicating to folks. And it also helps all
of the other people going through this journey
to help each other, too. So we don't have to be the ones
that answer all the questions. All of that's been a
huge, huge help for us. And then find your early
adopters and communicate those wins. We've got certain folks
in the organization that were all in on this early. And they were great at helping
move this journey forward. Last thing I'll call
out to you is just, the new Google features
that are coming are really game
changing as well-- I mean, the
improvements in the-- with clustering and partitioning
has been super helpful. The joint performance is
improving all the time. We had to flatten
some structures to get the performance early on. We have to do less and
less of that today because of the joint performance. So thanks, Tino,
for all of that, and so forth-- the slot
reservations and so forth. We're still partnering with them
on a little bit more visibility on what's happening
on the platform. That's probably the
next thing we're trying to get some
help from Google on. But for the most part,
we're getting what we need. So in closing, as I said,
this technology works. It's really been game
changing for what we're doing in our environment. We have buy-in across
the organization. Everyone is all in on
getting us migrated by the end of this year,
end of fiscal year. Our partnership with
Google has been fantastic. We've learned a lot from them. Hopefully, they learned
a little bit from us. And I think we're set up for
analytics for years to come. So step one, get
your data in place. And then hopefully,
next year we can talk about some
really cool things we're doing on the ML side. [MUSIC PLAYING]