Cloud OnAir: How to build custom data visualizations in Data Studio

Captions
[MUSIC PLAYING] YULAN LIN: Welcome to Cloud OnAir, live webinars from Google Cloud. We are hosting webinars every Tuesday. My name is Yulan Lin. And today, I'll be talking about custom visualizations in Data Studio. You can ask questions anytime on the platform, and we have Googlers on standby to answer them. Let's get started. So today, I'm going to be talking about how you would build custom visualizations in Data Studio. So I'm a developer advocate. I work on Data Studio. And if you want to follow along with anything I'm saying here or learn more, go to developers.google.com/datastudio/visualization. Just a quick overview of the things that I intend to talk about today. Want to give a quick intro to Data Studio for those of you who may not be as familiar with it, and then talk about what community visualizations are, what this custom visualization feature is, give a conceptual overview. So what are some of the things, the terminology you need to know to understand it. And then walk you through how to build one. There might even be some live coding involved. And then some conclusions and next steps for you all to take. So Data Studio is Google's business intelligence platform. It's free. It allows you to create dashboards that are really easy to read, to share, and to customize. And so when we think about the things that Data Studio allows you to do, you can connect to a whole variety of data sources. So if you go to datastudio.google.com/data, you'll see a whole bunch of connectors where you can connect to data sources, things that we've written here at Google, things that our partners have written. And if you don't see what you want there, we also provide ways for you to write your own custom connectors so that you can connect to different data sources. We also provide ways for you to analyze your data. What does it look like to find insights to combine different data sources to do some calculations? We also find ways to visualize data. 
So different chart types, tables, pivot tables. And then finally, sharing. Who needs to see the report? We make that really, really easy for you to distribute it among the people that need to see it. So enough talking about it. I'm going to pop over to Data Studio and just show you really quickly what that looks like. So this is just datastudio.google.com. And I'm going to create a blank report. And I'm going to use the sample Google Analytics data that everyone has access to. And up here you can see that there's a bunch of chart types. So you can make a time series, a bar chart, combo chart, pie chart. So I'm just going to make a pie chart because I just had too much pie over the weekend. Area chart. And maybe add some text. So while I'm here, I just want to point out one thing, which is that we support a whole bunch of chart types up here, but there's no way that we can support every chart type under the sun, everything that you would want to be able to tell stories with your data. And if you look here on the styling, too, we provide a lot of different styling options, but there's not always a way for you to do everything that you could possibly dream up. And this is where community visualizations come in. So what if you want other charts? What if you want other options? And so this is what community visualizations are really great for. And to show you an example of one that I've built, we're going to go back to Data Studio. And I'm going to enter a component ID, which is basically telling Data Studio I've written this other code. Please go load it. Please go fetch it so that I can make charts with it. And so as it's thinking about the report-- once you've written the code, you can distribute this ID, this location to whoever else might be interested in using the community visualization or in the custom visualization. 
And as it continues thinking-- and so it would just be the same as configuring any other chart in Data Studio, where you can select the dimensions you want, the metrics you're interested in, and the styling. Since it doesn't really want to load for me right now, I'm actually going to pop back over to my slides, tell you more about it, and hope that it likes me a little bit more in a few minutes. So a conceptual overview of community visualizations. What's actually happening under the hood when I click that load button, when I enter in that ID? So when you load a community visualization, the end user is providing a component ID, which is basically saying, hey, Data Studio, this is where to find the metadata about my visualization. So Data Studio then goes to read the package data and it says, OK, this is what I know about who wrote it. This is where I can find the other code and all the resources that I need for it. Then Data Studio will load their visualization resources. So it goes and finds, OK, you told me that this is where I can find the JavaScript, the CSS, the configuration. I'm going to load them in now. And then once it's loaded, the iframe tells Data Studio, hey, I'm ready for some data and styling information. And then Data Studio provides that data and style information so that the code you've written, the JavaScript code you've written can render the visualization. So transforming that data and styling information that the end user has provided into a really cool visualization. So what are the big steps involved in building and using community visualizations? So the first is that you have to write the code. So writing the code involves a couple different things, which I'll walk through in a sec. Then you have to upload the visualization resources. So right now, we only support Google Cloud Platform. But broadly, the idea is put the resources somewhere where Data Studio can get to them so that we can load them into your report. 
And then finally, loading the visualization in Data Studio. What does it look like to take all the code you've written, and see it integrate with the dashboard? So first I'm going to talk about writing the visualization code and the different things that are associated with that. So there's three steps to writing a visualization. You have to write a configuration JSON. You have to write the visualization code, and package the visualization. So we're going to dive right in and think about what the configuration JSON is, why it's important. So the configuration JSON defines two things. It defines data options and styling options. So it's essentially a JSON with two keys. It's a data key, and that configures the data side of the property panel. So it's stuff like, how many metrics does my visualization support? How many dimensions does my visualization support? Do I want someone to only be able to use one dimension and one metric, or should it be one dimension and between one and three metrics, and actually, these metrics should be named something really specific to the use case so that the end user knows what kind of metric to put in? So all of that happens in the data section. And you'll see an example of that later. With styling, it's stuff that you would see on the Style tab of the property panel. So things like, I want to name a checkbox. Should I show labels? Or I want the end user to be able to change the color of my bar chart, and I want them to use a fill picker for that. We support a whole bunch of styling icons that allow the end user to interact with them, and then Data Studio will send whatever they pick back to your visualization. So that's what the configuration JSON does. And this config does two things. One, I just talked about. It impacts how the property panel renders. So it impacts how many dimensions the end user can put in, how many metrics they can put in, what styling options are available to them. 
But the other thing it does is it allows the developer to know what keys to use to access the data later. So not only am I saying, provide me one dimension, but I'll say, provide me one dimension. And under the hood, I'm going to call it dim1 so that when I go to access it later in my visualization code, I'm going to say, I'm going to use dim1 in my code. Same with styling. So if I define a color picker for the color of my text, I'll call it maybe text color. And then I can go back in my code later and say, give me the text color thing that the end user selected. The next thing is to write the JavaScript code. So what goes into the viz code? There's three components that are necessary for the actual visualization code itself. One is the Data Studio helper library. We've provided a JavaScript library that abstracts out some of the interfacing with our API. And so that's one piece of it. The second is any JavaScript visualization library you might be interested in using. There's a million out there. You all probably know of D3, Chart.js. You might even have some that are internal to your own company. And finally, there's the viz code that you write. And so these three combined go into the viz code. Optionally, there's a CSS file where you can provide additional styling guidance to the iframe. And the code that you would write within that that's not the helper library, that's not visualization packages, is going to broadly look like this. The const dscc = require('@google/dscc') line, that's just saying that I'm including the Data Studio helper library that helps me write a visualization. And then I define a callback. I'm going to call it drawChart here. And I'm just going to say, hey, give me the height. Give me the width so I know the dimensions of the canvas that I get to work with. 
And then subscribe to the data so that this draw chart function is called every time there's an update to the data or to the style information that the end user is selecting inside the property panel. Another required argument to this subscribe to data function is a transform just to understand what data format would be most useful for you. And if you go to dev site-- and you'll see that URL later-- the specifics of what that object looks like and what information you get is all there. So next, you have to upload the visualizations to Google Cloud Platform. So you just have a Google Cloud Platform bucket. You put your stuff there. And again, broadly, the idea is, I need to put my code somewhere that Data Studio can get to it. And so right now, it's Google Cloud Platform. So you put it in. And then using the location of where you've put things in Google Cloud Platform, loading the visualization into Data Studio. So loading a visualization. Let's see if Data Studio likes me just a little bit more now. And I'm going to try to load a Sankey diagram. I'm going to clear the canvas so there's a little bit more space. And you see here that I've defined that this visualization can take two dimensions. So I'm going to go ahead and put in continent, gender, and then play with some of the styling. So as it decides to load-- I'm actually going to refresh it. So not only did I define in this visualization that I wanted the end user to be able to configure two dimensions and a metric, but I also wanted them to be able to select different styling options. For example, the node color and the link color. And since this doesn't seem like it wants to load right now, going to go to-- going to show you some of these visualizations that I've built. So this is the one I am trying to load. But here you see this is the Sankey chart. I have two dimensions. I have a metric. And then here in the style panel, there's a bunch of different things you can do with the colors. 
So I've said you can play with the node color, whether or not you show the node labels, change the font size, and the label offset so that whoever else wants to use this visualization can configure all of these things without ever having to touch the visualization code. So that's what that looks like. Going back to that tab, you'll see it's finally decided to load. So that's how you would load it. So now what I want to do-- might be a dangerous proposition when I'm doing this on the livestream-- is to show you how I would build a visualization. And so generally when I build visualizations, I'm starting from something that already exists. So yesterday, I took some time to build a histogram using D3. So I'm just going to show you what that looks like locally. So it's a histogram. I have buckets. I used the counts of things in those buckets to determine the bar height. But what I'm going to do now is I want to be able to load this visualization in a Data Studio report. So to be able to do that, I need to define a couple of things. Remember what I was talking about earlier. The first thing to do is to define the configuration. So with the histogram, naively, the simplest thing I can think of is I want an index dimension that just deaggregates my data. And so I just tell Data Studio I'm going to call it index. This is the ID that I'm going to use to access it in my code later. I'm going to do this label thing called index. And this is the label that shows up in the property panel when the end user loads it so they know they're loading it into something I'm calling index. It expects that it's a dimension type. And under options, I'm saying, you have to have at least one field populated in here, and you have to have a maximum of one field populated in here. And then I'm doing the same thing for a metric and saying, this is the thing I'm going to figure out the distribution of. And again, I want only one of these, so I'm going to say min max of one. 
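[Editor's sketch] The configuration described here might look roughly like the object below, written as JavaScript so it can be printed as JSON. The ids, labels, types, and min/max options come from the talk; the outer section wrapper with its "elements" list, the section id "concepts", and the metric id "metric" are assumptions based on the public config reference as best recalled, so check the current documentation before relying on them. The fieldIds helper is hypothetical and only illustrates that the ids are the keys the visualization code uses later.

```javascript
// Hypothetical configuration for the histogram: one dimension, one metric.
const histogramConfig = {
  data: [
    {
      id: "concepts", // assumed section id, not named in the talk
      label: "Concepts",
      elements: [
        // "index" is the id used to access this field in the viz code later.
        { id: "index", label: "index", type: "DIMENSION", options: { min: 1, max: 1 } },
        // The metric whose distribution the histogram shows (id is assumed).
        { id: "metric", label: "metric", type: "METRIC", options: { min: 1, max: 1 } },
      ],
    },
  ],
  style: [], // styling options would be declared here
};

// Hypothetical helper: list the ids that the viz code will use as keys.
function fieldIds(config) {
  return config.data.flatMap((section) => section.elements.map((el) => el.id));
}

console.log(JSON.stringify(histogramConfig, null, 2));
console.log(fieldIds(histogramConfig));
```

Setting min and max to 1 on both elements is what restricts the property panel to exactly one dimension and one metric.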
And so this project is set up with a little bit of template code and a little bit of scripting so that I can just deploy things. Run-- And so what I'm doing here is I'm just telling Data Studio to build it. And so once I have done that, I would be able to see those things show up in the data where I would be able to say-- scrolling up-- that there's one dimension, and it's called index, and one metric, and it's called sessions. And then the thing I would do in order to convert the histogram code from something that was run locally to something that is running maybe in Data Studio is to add the Data Studio community component library-- that's the helper library-- and to use that to interface with the data instead of maybe hard coding it, like I did for the initial demo. So this is just the setup for what I would do here. And this is where I'm converting the data from whatever Data Studio expects to what my visualization expects. And I'm just saying, hey, take the data from whatever it was before into what I know my visualization will take. I'm going to populate this drawViz, and just copy paste all the code that I had previously into it. And then this is the subscribe to data, where I'm telling Data Studio, hey, draw this visualization. Run this drawViz function every time I get a new set of data and styling options. So here I'm saying I want D3. And then for drawViz, I'm just going to copy over all of that code, essentially. And instead of height, I'm going to say, hey, Data Studio, give me the height of the iframe, because the iframe could change, because a user might decide that they want to move things around in their report, or resize things. And so the height of the visualization isn't necessarily something you want to fix. And dscc-- And then change the width to do that. And then instead of the data being some kind of hard coded sample data. 
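[Editor's sketch] The conversion step described above can be illustrated without Data Studio itself. The code below is not the code from the demo: it assumes the payload rows arrive shaped like { index: ["a"], metric: [12] } under tables.DEFAULT (the shape the helper library's object transform is documented to produce, as best recalled), and the function names metricValues and histogramCounts are hypothetical. D3 rendering is left out so the sketch runs on its own; the wiring to the helper library is shown only in comments because it needs the iframe.

```javascript
// Flatten rows shaped like { index: ["a"], metric: [12] } into plain numbers.
function metricValues(rows, metricId) {
  return rows.map((row) => Number(row[metricId][0]));
}

// Count values into equal-width buckets; the counts drive the bar heights.
function histogramCounts(values, bucketCount) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / bucketCount || 1; // avoid a zero-width bucket
  const counts = new Array(bucketCount).fill(0);
  for (const v of values) {
    // The maximum value is clamped into the last bucket.
    const i = Math.min(Math.floor((v - min) / width), bucketCount - 1);
    counts[i] += 1;
  }
  return counts;
}

// Callback run on every data or style update; size comes from the iframe.
function drawViz(vizData, size) {
  const values = metricValues(vizData.tables.DEFAULT, "metric");
  return { counts: histogramCounts(values, 4), height: size.height, width: size.width };
}

// Wiring inside Data Studio (commented out: needs @google/dscc and the iframe):
// const dscc = require("@google/dscc");
// dscc.subscribeToData(
//   (vizData) => drawViz(vizData, { height: dscc.getHeight(), width: dscc.getWidth() }),
//   { transform: dscc.objectTransform }
// );

// Hard-coded sample standing in for what Data Studio would send.
const sample = {
  tables: {
    DEFAULT: [
      { index: ["a"], metric: [1] },
      { index: ["b"], metric: [2] },
      { index: ["c"], metric: [2] },
      { index: ["d"], metric: [9] },
    ],
  },
};
console.log(drawViz(sample, { height: 300, width: 400 }));
```

With this sample the four buckets hold counts of 3, 0, 0, and 1. Passing the height and width in on each call, rather than fixing them, matches the point made above that the iframe can be resized at any time.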
And so that's really all it takes to take something from a visualization that you've used elsewhere or written to run locally into something that you can load into a Data Studio report. So let's go back here. Let's see. And I think that's what I called it. And so what that looks like-- here you can see that there's an index, and that's what supposedly is deaggregating my data. This particular dataset isn't quite set up that way. And then a metric. And so I can interact with it, play with it as if I was interacting with any other Data Studio visualization. And then here in the Style panel, I have this thing called a color picker. And if I really wanted to, I can pick this now. I haven't set up the visualization to take the styling information from it. But then I could go back in. And the way I would do that in the code is just to figure out where I set the color of the bar. I think it's fill. And then here, I would just call it, I think, vizData.style. And I think I called it barColor. And then that would be the only-- it's a one line change in the code for me to be able to take the color that the end user selects, and have it show up in the visualization. So going back. Just built a visualization. What are some of the takeaways from this? One is that the community visualization, this custom visualization feature gives you a blank canvas and data to play with. Essentially, we're handing you an iframe and saying, what cool things can you build with your data? And so you can do things even with Web Audio, and such, and WebGL. And the other thing is that the configuration files allow you to expose options to the end user, which means that your end user doesn't need to know how to code, doesn't need to know D3 or any other JavaScript chart libraries in order to use what you've built. And if you expose certain options, they can have a lot of freedom and flexibility, even within that, without ever having to go back and dig through the viz code. 
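[Editor's sketch] The one-line style change mentioned earlier, reading the bar color from vizData.style.barColor, could be wrapped in a small helper with a fallback. The id barColor is from the talk. The exact shape of the style entry is an assumption: the sketch accepts either a plain string value or an object with a color property, and barFill is a hypothetical name.

```javascript
// Read the bar color chosen in the Style panel, or fall back to a default.
// Assumed shapes: { value: "#rrggbb" } or { value: { color: "#rrggbb" } }.
function barFill(vizData, fallback) {
  const entry = vizData.style && vizData.style.barColor;
  const value = entry && entry.value;
  if (typeof value === "string") return value;
  if (value && typeof value.color === "string") return value.color;
  return fallback; // nothing selected, or an unknown shape
}

console.log(barFill({ style: { barColor: { value: { color: "#ff0000" } } } }, "#4285f4"));
console.log(barFill({ style: {} }, "#4285f4"));
```

In the histogram code this result would be used wherever the bar fill is set, so the end user's pick shows up without any other change to the drawing logic.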
So thanks. I hope you all go build a visualization. You can find all of the documentation and tutorials on how to do it at developers.google.com/datastudio/visualization. If you build something cool, share it on social media with the #DataStudioDevs. And you can find me on Twitter @y3l2n. Stay tuned for live Q&A. We will be back in less than a minute. Welcome back. So let's answer a couple questions about community visualizations in Data Studio. So the first question is, can I use non-JavaScript libraries, like Python's matplotlib or maybe R's ggplot2? And the answer to that is no. We only allow you to load JavaScript within an iframe because of the nature of how things are set up. Next thing is, how can we only let certain users use my visualization? Right now, we don't have a way to do that, except by limiting who knows where the resources live and who knows that visualization ID. But that is something we're looking forward to working on in the near future. The next thing is, what data sources work with community visualizations? And that's basically anything, as long as the data source owner allows it, because we want to make sure that people are aware that these are created by the community and by third party folks. And for the next thing, where can you find code examples for community visualizations? If you want to see examples of community visualizations or see working examples from the documentation, we have a GitHub. So if you go to github.com/googledatastudio/communityvisualizations, you'll be able to see some examples of things that are up on our website. And then finally, how do you test the visualization as you build it? And so that's a thing I was kind of already doing is as I was building it, loading it. And so a lot of it is just loading it as you go. 
If you wanted to maintain separate dev and prod deployment so that people who are using your code aren't broken as you try to add new features, I would just recommend having two separate locations of your resources, and just managing them yourself. That's all for today. Please tune in next Tuesday for the next Cloud OnAir, and thank you all for watching. [MUSIC PLAYING]
Info
Channel: Google Cloud Tech
Views: 4,695
Keywords: Cloud OnAir, AI Web series, Data Studio, Data Analytics, Machine Learning
Id: zQszCuOjr2o
Length: 25min 52sec (1552 seconds)
Published: Tue Dec 11 2018