[MUSIC PLAYING] YULAN LIN: Welcome
to Cloud OnAir, live webinars from Google Cloud. We are hosting
webinars every Tuesday. My name is Yulan Lin. And today, I'll be talking
about custom visualizations in Data Studio. You can ask questions
anytime on the platform, and we have Googlers on
standby to answer them. Let's get started. So today, I'm
going to be talking about how you would build custom
visualizations in Data Studio. So I'm a developer advocate. I work on Data Studio. And if you want to follow along
with anything I'm saying here or learn more, go
to developers.google.com/datastudio/visualization. Just a quick overview
of the things that I intend to
talk about today. I want to give a quick intro to
Data Studio for those of you who may not be as
familiar with it, and then talk about what
community visualizations are, what this custom
visualization feature is, give a conceptual overview. So what are some of the
things, the terminology you need to know to understand it. And then walk you
through how to build one. There might even be some
live coding involved. And then some conclusions and
next steps for you all to take. So Data Studio is Google's
business intelligence platform. It's free. It allows you to
create dashboards that are really easy to read,
to share, and to customize. And so when we think
about the things that Data Studio
allows you to do, you can connect to a whole
variety of data sources. So if you go to
datastudio.google.com/data, you'll see a whole bunch
of connectors where you can connect to data sources,
things that we've written here at Google, things that
our partners have written. And if you don't see
what you want there, we also provide ways for
you to write your own custom connectors so that you can
connect to different data sources. We also provide ways for
you to analyze your data. What does it look
like to find insights to combine different
data sources to do some calculations? We also find ways
to visualize data. So different chart types,
tables, pivot tables. And then finally, sharing. Who needs to see the report? We make that really, really
easy for you to distribute it among the people
that need to see it. So enough talking about it. I'm going to pop
over to Data Studio and just show you really
quickly what that looks like. So this is just
datastudio.google.com. And I'm going to
create a blank report. And I'm going to use the
sample Google Analytics data that everyone has access to. And up here you can see that
there's a bunch of chart types. So you can make a time series,
a bar chart, combo chart, pie chart. So I'm just going to make a
pie chart because I just had too much pie over the weekend. Area chart. And maybe add some text. So while I'm here, I
just want to point out one thing, which is that we
support a whole bunch of chart types up here,
but there's no way that we can support every chart
type under the sun, everything that you would want to be able
to tell stories with your data. And if you look here
on the styling, too, we provide a lot of
different styling options, but there's not
always a way for you to do everything that you
could possibly dream up. And this is where community
visualizations come in. So what if you
want other charts? What if you want other options? And so this is what
community visualizations are really great for. And to show you an example
of one that I've built, we're going to go
back to Data Studio. And I'm going to enter
a component ID, which is basically telling Data Studio
I've written this other code. Please go load it. Please go fetch it so that
I can make charts with it. And so as it's thinking
about the report-- once you've written
the code, you can distribute this ID, this
location to whoever else might be interested in using
the community visualization or in the custom visualization. And as it continues thinking-- and so it would just be
the same as configuring any other chart in Data
Studio, where you can select the dimensions you want, the
metrics you're interested in, and the styling. Since it doesn't really want
to load for me right now, I'm actually going to pop
back over to my slides, tell you more about it, and hope
that it likes me a little bit more in a few minutes. So a conceptual overview of
community visualizations. What's actually
happening under the hood when I click that load button,
when I enter in that ID? So when you load a
community visualization, the end user is providing
a component ID, which is basically saying,
hey, Data Studio, this is where to
find the metadata about my visualization. So Data Studio then goes
to read the package data and it says, OK, this is what
I know about who wrote it. This is where I can find
the other code and all the resources that
I need for it. Then Data Studio will load
their visualization resources. So it goes and finds,
OK, you told me that this is where I
can find the JavaScript, the CSS, the configuration. I'm going to load them in now. And then once it's loaded, the
iframe tells Data Studio, hey, I'm ready for some data
and styling information. And then Data Studio provides
that data and style information so that the code you've
written, the JavaScript code you've written can
render the visualization. So transforming that data
and styling information that the end user has
provided into a really cool visualization. So what are the big steps
involved in building and using community visualizations? So the first is that you
have to write the code. So writing the code
involves a couple different things, which
I'll walk through in a sec. Then you have to upload the
visualization resources. So right now, we only support
Google Cloud Platform. But broadly, the idea is
put the resources somewhere where Data Studio can get to
them so that we can load them into your report. And then finally, loading the
visualization in Data Studio. What does it look like to take
all the code you've written, and see it integrate
with the dashboard? So first I'm going to talk
about writing the visualization code and the different things
that are associated with that. So there's three steps to
writing a visualization. You have to write a
configuration JSON. You have to write the
visualization code, and package the visualization. So we're going to dive
right in and think about what the configuration
JSON is, why it's important. So the configuration
JSON defines two things. It defines data options
and styling options. So it's essentially
a JSON with two keys. It's a data key, and that
configures the data side of the property panel. So it's stuff like, how many
metrics does my visualization support? How many dimensions does
my visualization support? Do I want someone to
only be able to use one dimension and one
metric, or should it be one dimension and between
one and three metrics, and actually, these metrics
should be named something really specific to the use
case so that the end user knows what kind of metric to put in? So all of that happens
in the data section. And you'll see an
example of that later. With styling, it's
stuff that you would see on the Style
tab of the property panel. So things like, I want
to name a checkbox "Should I show labels?" Or I want the end user
to be able to change the color of my bar
chart, and I want them to use a fill picker for that. We support a whole bunch
of styling icons that allow the end user to
interact with them, and then Data Studio will
send whatever they pick back to your visualization. So that's what the
configuration JSON does. And this config does two things. One, I just talked about. It impacts how the
property panel renders. So it impacts how
many dimensions the end user can put in, how
many metrics they can put in, what styling options
are available to them. But the other thing it does
is it allows the developer to know what keys to use
to access the data later. So not only am I saying, provide
me one dimension, but I'll say, provide me one dimension. And under the hood, I'm
going to call it dim1 so that when I go to access it
later in my visualization code, I'm going to say, I'm going
to use dim1 in my code. Same with styling. So if I define a color picker
for the color of my text, I'll call it maybe text color. And then I can go
back in my code later and say, give me
the text color thing that the end user selected. The next thing is to
write the JavaScript code. So what goes into the viz code? There's three
components that are necessary for the actual
visualization code itself. One is the Data
Studio helper library. We've provided a
JavaScript library that abstracts out some of
the interfacing with our API. And so that's one piece of it. The second is any JavaScript
visualization library you might be
interested in using. There's a million out there. You all probably
know of D3, Chart.js. You might even
have some that are internal to your own company. And finally, there's the
viz code that you write. And so these three combined
go into the viz code. Optionally, there's
a CSS file where you can provide additional
styling guidance to the iframe. And the code that
you would write within that that's not
the helper library, that's not visualization
packages, is going to broadly look like this. The const dscc
require @google/dscc, that's just saying that I'm
including the Data Studio helper library that helped
me write a visualization. And then I define a callback. I'm going to call
it draw chart here. And I'm just going to say,
hey, give me the height. Give me the width so I know
the dimensions of the canvas that I get to work with. And then subscribe to the
data so that this draw chart function is called every time
there's an update to the data or to the style information
that the end user is selecting inside the property panel. Another required argument to
this subscribe to data function is a transform just to
understand what data format would be most useful for you. And if you go to dev site-- and you'll see that URL later-- the specifics of what
that object looks like and what information
you get is all there. So next, you have to upload the
visualizations to Google Cloud Platform. So you just have a Google
Cloud Platform bucket. You put your stuff there. And again, broadly, the idea is,
I need to put my code somewhere that Data Studio can get to it. And so right now, it's
Google Cloud Platform. So you put it in. And then using the
location of where you've put things in Google
Cloud Platform, loading the visualization
into Data Studio. So loading a visualization. Let's see if Data Studio likes
me just a little bit more now. And I'm going to try to
load a Sankey diagram. I'm going to clear
the canvas so there's a little bit more space. And you see here that I've
defined that this visualization can take two dimensions. So I'm going to go ahead and
put in continent, gender, and then play with
some of the styling. So as it decides to load-- I'm actually going
to refresh it. So not only did I define
in this visualization that I wanted the end user
to be able to configure two dimensions and
a metric, but I also wanted them to be able to select
different styling options. For example, the node
color and the link color. And since this doesn't seem
like it wants to load right now, I'm going to show you some of these
visualizations that I've built. So this is the one
I am trying to load. But here you see this
is the Sankey chart. I have two dimensions. I have a metric. And then here in
the style panel, there's a bunch of
different things you can do with the colors. So I've said you can
play with the node color, whether or not you show the node
labels, change the font size, and the label offset so
that whoever else wants to use this visualization can
configure all of these things without ever having to touch
the visualization code. So that's what that looks like. Going back to that
tab, you'll see it's finally decided to load. So that's how you would load it. So now what I want to do-- might be a dangerous
proposition when I'm doing this on the livestream-- is to show you how I would
build a visualization. And so generally when
I build visualizations, I'm starting from something
that already exists. So yesterday, I took some time
to build a histogram using D3. So I'm just going to show you
what that looks like locally. So it's a histogram. I have buckets. I used the counts of
things in those buckets to determine the bar height. But what I'm going
to do now is I want to be able to load this
visualization in a Data Studio report. So to be able to do that, I need
to define a couple of things. Remember what I was
talking about earlier. The first thing to do is to
define the configuration. So with the histogram,
naively, the simplest thing I can think of is I want
an index dimension that just deaggregates my data. And so I just tell Data Studio
I'm going to call it index. This is the ID that I'm going
to use to access it in my code later. I'm going to do this
label thing called index. And this is the label that shows
up in the property panel when the end user loads
it so they know they're loading it into
something I'm calling index. It expects that it's
a dimension type. And under options,
I'm saying, you have to have at least one
field populated in here, and you have to have a maximum
of one field populated in here. And then I'm doing the same
thing for a metric and saying, this is the thing I'm going to
figure out the distribution of. And again, I want
only one of these, so I'm going to
say min max of one. And so this project is
set up with a little bit of template code and a little
bit of scripting so that I can just deploy things. Run-- And so what
I'm doing here is I'm just telling Data
Studio to build it. And so once I have
done that, I would be able to see those
things show up in the data where I would be able to say-- scrolling up-- that there's
one dimension, and it's called index, and one metric,
and it's called sessions. And then the thing I would do in
order to convert the histogram code from something that was
run locally to something that is running maybe
in Data Studio is to add the Data Studio
community component library-- that's the helper library-- and to use that to
interface with the data instead of maybe
hard coding it, like I did for the initial demo. So this is just the setup
for what I would do here. And this is where I'm
converting the data from whatever Data
Studio expects to what my visualization expects. And I'm just saying,
hey, take the data from whatever it was
before into what I know my visualization will take. I'm going to populate
this drawViz, and just copy paste all the code
that I had previously into it. And then this is the
subscribeToData call, where I'm telling Data Studio,
hey, draw this visualization. Run this drawViz
function every time I get a new set of data
and styling options. So here I'm saying I want D3. And then for drawViz,
I'm just going to copy over all of
that code, essentially. And instead of height, I'm
going to say, hey, Data Studio, give me the height
of the iframe, because the iframe could change,
because a user might decide that they want to move things
around in their report, or resize things. And so the height
of the visualization isn't necessarily
something you want to fix. And dscc-- And then change
the width to do that. And then instead of the
data being some kind of hard coded sample data. And so that's really all
it takes to take something from a visualization that
you've used elsewhere or written to run locally into something
that you can load into a Data Studio report. So let's go back here. Let's see. And I think that's
what I called it. And so what that
looks like-- here you can see that
there's an index, and that's what supposedly
is deaggregating my data. This particular dataset
isn't quite set up that way. And then a metric. And so I can interact
with it, play with it as if I was interacting
with any other Data Studio visualization. And then here in
the Style panel, I have this thing
called a color picker. And if I really wanted
to, I can pick this now. I haven't set up
the visualization to take the styling
information from it. But then I could go back in. And the way I would
do that in the code is just to figure out where
I set the color of the bar. I think it's fill. And then here, I would just
call it, I think, vizData.style. And I think I
called it barColor. And then that
would be the only-- it's a one line change
in the code for me to be able to take the color
that the end user selects, and have it show up
in the visualization. So going back. Just built a visualization. What are some of the
takeaways from this? One is that the community
visualization, this custom visualization feature gives
you a blank canvas and data to play with. Essentially, we're handing
you an iframe and saying, what cool things can you
build with your data? And so you can do things even
with Web Audio, and such, and WebGL. And the other thing is that
the configuration files allow you to expose options to the end
user, which means that your end user doesn't need
to know how to code, doesn't need to know D3 or
any other JavaScript chart libraries in order to
use what you've built. And if you expose
certain options, they can have a lot of
freedom and flexibility, even within that, without
ever having to go back and dig through the viz code. So thanks. I hope you all
go build a visualization. You can find all of the
documentation and tutorials on how to do it at
developers.google.com/datastudio/visualization. If you build something
cool, share it on social media with
the #DataStudioDevs. And you can find me
on Twitter @y3l2n. Stay tuned for live Q&A. We will
be back in less than a minute. Welcome back. So let's answer a
couple questions about community
visualizations in Data Studio. So the first question is, can
I use non-JavaScript libraries, like Python's matplotlib
or maybe R's ggplot2? And the answer to that is no. We only allow you
to load JavaScript within an iframe because of the
nature of how things are set up. Next thing is, how can we
only let certain users use my visualization? Right now, we don't
have a way to do that, except by limiting who knows
where the resources live and who knows that
visualization ID. But that is something we're
looking forward to working on in the near future. The next thing is,
what data sources work with community visualizations? And that's basically anything,
as long as the data source owner allows it, because
we want to make sure that people are aware that these
are created by the community and by third party folks. And for the next thing, where
can you find code examples for community visualizations? If you want to see examples
of community visualizations or see working examples
from the documentation, we have a GitHub. So if you go to
github.com/googledatastudio/communityvisualizations, you'll
be able to see some examples of things that are
up on our website. And then finally, how do
you test the visualization as you build it? And so that's a thing I
was kind of already doing: as I was building
it, I was loading it. And so a lot of it is
just loading it as you go. If you wanted to
maintain separate dev and prod deployments so that
people who are using your code aren't broken as you
try to add new features, I would just recommend
having two separate locations of your resources, and just
managing them yourself. That's all for today. Please tune in next Tuesday
for the next Cloud OnAir, and thank you all for watching. [MUSIC PLAYING]
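
The flow described in the talk (the configuration declares IDs, and the visualization code later reads data and style by those same IDs) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code shown in the session: the config object is a simplified stand-in rather than the exact configuration schema, the sampleMessage shape is invented in place of whatever the helper library actually passes, and bucketCounts is a hypothetical helper. Only the names index, barColor, and drawViz come from the talk.

```javascript
// Simplified, hypothetical config: one dimension, one metric, one style option.
// The real configuration JSON has its own schema (see the dev site).
const config = {
  data: [
    { id: 'index', label: 'Index', type: 'DIMENSION', options: { min: 1, max: 1 } },
    { id: 'metric', label: 'Metric', type: 'METRIC', options: { min: 1, max: 1 } },
  ],
  style: [
    { id: 'barColor', label: 'Bar color', type: 'FILL_COLOR', defaultValue: '#4285f4' },
  ],
};

// Invented stand-in for the data and style information passed to the callback.
const sampleMessage = {
  rows: [
    { index: 'a', metric: 1 },
    { index: 'b', metric: 2 },
    { index: 'c', metric: 2 },
    { index: 'd', metric: 5 },
    { index: 'e', metric: 9 },
    { index: 'f', metric: 10 },
  ],
  style: { barColor: '#ff0000' },
};

// Hypothetical helper: count how many values fall into each equal-width bucket.
function bucketCounts(values, bucketCount) {
  const counts = new Array(bucketCount).fill(0);
  if (values.length === 0) return counts;
  const min = Math.min(...values);
  const max = Math.max(...values);
  // Fall back to a width of 1 when every value is identical.
  const width = (max - min) / bucketCount || 1;
  for (const v of values) {
    // Clamp so the maximum value lands in the last bucket.
    const i = Math.min(Math.floor((v - min) / width), bucketCount - 1);
    counts[i] += 1;
  }
  return counts;
}

// Callback in the spirit of drawViz: look up data and style by the config IDs.
// It returns a description of the bars instead of drawing, so it runs anywhere.
function drawViz(message) {
  const metricId = config.data.find((field) => field.type === 'METRIC').id;
  const styleOption = config.style[0];
  const values = message.rows.map((row) => row[metricId]);
  const color = message.style[styleOption.id] || styleOption.defaultValue;
  return { bars: bucketCounts(values, 3), color };
}

console.log(drawViz(sampleMessage)); // bars: [3, 1, 2], color: '#ff0000'
```

In an actual community visualization, the callback would be registered through the helper library's subscribeToData rather than called directly, and the bar heights would be handed to a charting library such as D3 instead of being returned.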