How to Use Shapefiles in Tableau: COVID-19 Data Example | UTA Libraries

Captions
[light upbeat music plays] Hello and welcome to another DAVis workshop! This workshop is going to be about using shapefiles in Tableau. If you want to access the files that are used in this workshop, visit this URL: libguides.uta.edu/DAVis. Scroll down to the "Tableau Workshop Files" box. Choose the "Tableau: Shapefiles" tab, and you will download this Excel file, and visit this link, and click "Download" and "Shapefile." It will download a zipped file to your computer, and you will leave it that way. So, let's get started.
We are going to use COVID-19 data for this exercise. To connect to the data, you're going to click Connect "To a File," and you're going to select the DFW cities file. What this file shows are coronavirus 2019 cases in the various cities in North Texas. We have them arranged by county, by city - I'm going to change this to make sure that it recognizes that that is the city - the number of cases, number of deaths, and number of recovered in each city. So, this is my data, and Tableau allows for states, zip codes, countries, and other polygons already existing within Tableau; however, it does not include cities in its database. Since Tableau does not include cities, I went ahead and used a public city shapefile, which comes from the Texas Department of Transportation. I came here to download the shapefile, and you can also get familiar with the variables that exist in the file by clicking on any of these boxes.
OK, so, once you have that shapefile downloaded, you're going to go into Tableau, and you're going to add a new data source. In this case, Tableau considers it a spatial file; so, we are going to click to connect to a spatial file. In this case, I'm going to choose the Texas Department of Transportation city boundaries. Notice that it's still in the zipped file format. We're going to choose that, and we're going to open it.
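A note on the zipped download mentioned above: a shapefile is a bundle of files that travel together, with `.shp`, `.shx`, and `.dbf` as the mandatory parts. The following is a minimal Python sketch, not part of the workshop, for checking that a zip archive contains those parts before you connect to it. The function name and the file names in the usage example are hypothetical.

```python
import io
import os
import zipfile

# The three mandatory components of a shapefile.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_source):
    """Return a sorted list of required extensions not found in the archive.

    zip_source may be a path or a file-like object. (Hypothetical helper,
    for illustration only.)
    """
    with zipfile.ZipFile(zip_source) as archive:
        found = {
            os.path.splitext(name)[1].lower() for name in archive.namelist()
        }
    return sorted(REQUIRED_EXTENSIONS - found)


# Usage with an in-memory archive that is deliberately missing its .dbf file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("cities.shp", b"")  # placeholder contents
    archive.writestr("cities.shx", b"")
buffer.seek(0)
print(missing_shapefile_parts(buffer))
```

An empty list means the archive has all three required parts.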
Now, if your files are not overlapping, meaning the data don't have a particular field where they join (let's say one is in zip codes and the other is in counties), then you can do a "Create a Join Calculation," where the first equals 0 and the second equals 1, and that will allow you to do a full outer join where those data do not align with one another, and that will allow you to map them on top of each other. However, in this case, we are going to use data that do align, and, in this case, they align based on city name. So, in the first dataset it's called "Location," and in the second, it's called "City Nm." And now it's going to join based on matching those city names. So, you can see here where the Location is DeSoto, City Nm is also DeSoto, and it appends that information. I'm gonna go ahead and do a full outer join. The reason I'm doing that is because I want to keep the other cities in Texas as a just-in-case. If you are 100 percent sure you're only going to want those North Texas cities, you can do a left join that would only include the data from basically that left dataset.
Once your data are connected, you can go to your first sheet, and you can start using those data points! From your shapefile, you can double click Geometry or drag it onto the sheet, and it will show you all of the shapes that you have in that file. Now, one thing about using a shapefile in Tableau is that it automatically aggregates measures. So, it considers all of these shapes one singular thing. So, you can see when I hover my mouse, it highlights everything. We want to break it up by each city, of course. So, then, we go up here to City Nm and move that to "Detail." Now I can look at each city as its own separate thing. This will come in handy when I want to start looking at coronavirus cases by city. So now I can start using my other dataset, which has the actual cases, deaths, and recovered. So, for example, I could drag Cases to "Color."
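To make the difference between the left join and the full outer join described above concrete, here is a small Python sketch, not part of the workshop. It uses the two field names from the transcript ("Location" and "City Nm"); the rows, the case counts, and the function name are made-up placeholders, not values from the actual files.

```python
# Placeholder rows standing in for the Excel file (left) and the shapefile (right).
cases_rows = [
    {"Location": "DeSoto", "Cases": 12},   # made-up count
    {"Location": "Garland", "Cases": 30},  # made-up count
]
shape_rows = [
    {"City Nm": "DeSoto", "Geometry": "polygon-A"},
    {"City Nm": "Garland", "Geometry": "polygon-B"},
    {"City Nm": "Austin", "Geometry": "polygon-C"},  # a city outside North Texas
]


def join_on_city(left, right, how="left"):
    """Join rows where left["Location"] equals right["City Nm"].

    how="left" keeps only cities present in the left dataset;
    how="outer" also keeps unmatched rows from the right dataset.
    """
    right_by_name = {}
    for r in right:
        right_by_name.setdefault(r["City Nm"], []).append(r)

    joined, matched = [], set()
    for l in left:
        matches = right_by_name.get(l["Location"], [])
        if matches:
            matched.add(l["Location"])
            joined.extend({**l, **r} for r in matches)
        else:
            joined.append({**l, "City Nm": None, "Geometry": None})

    if how == "outer":
        for r in right:
            if r["City Nm"] not in matched:
                # Unmatched shapes get nulls for the case columns.
                joined.append({"Location": None, "Cases": None, **r})
    return joined


left_result = join_on_city(cases_rows, shape_rows, how="left")
outer_result = join_on_city(cases_rows, shape_rows, how="outer")
print(len(left_result), len(outer_result))
```

With these placeholder rows, the outer join keeps the extra city's shape with null case values, which is why those cities still appear on the map until they are filtered out.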
And now I'm starting to color my dataset by how many cases there are. Please note, it assigns a color to everything. So, even if there are null or zero cases, it is still assigning it a color. In order to filter for that, I can move Cases to "Filters." I'm going to do SUM because some cities exist in multiple counties, and I have that data in my dataset. So, I want to add up all of the cases for the entire city, regardless of which county it falls in. So, I'm choosing SUM, and then I am going to say it needs to be at least 1. So, it needs to have at least one case, and then hit OK. So, basically what happened is it's limiting back to North Texas, so you can see that all my data are in North Texas, and you can see which cities have how many cases.
I can add things like the City name to the labels so that they're labeled. Typically, when I do this, I'll kind of remove extra ones that... All of this information is available if you hover over it, and sometimes, since some of these shapes are kind of oddly shaped, you might want to get the wording in the middle so that it's clearly shown. An example of that is Dallas. Dallas is this dark shape which kind of crosses Garland. So we'll want to move Dallas onto, kind of, its main area. Now, you can more clearly tell that that's Dallas. You can keep doing this to clean up your map. Once you get it to, kind of, how you'd like it to look, you can increase the size of the text. You can change the font, make it a little bit easier to read. I also sometimes like to make the font match what I'm showing. So, I can do Cases to "Size," and have a little bit of a difference between the size of the font to emphasize those with more cases. And just like that, I have a map of cases in DFW. Now I can edit the map. I can make it dark. You can edit the map and change the colors how you like them to be. Edit the tooltip. So, let's create another chart! This time, we're gonna do deaths.
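The SUM filter described above (add up cases across counties for each city, then keep only cities with at least one case) can be sketched in Python as follows. This is an illustration under assumptions, not what Tableau runs internally: the column names "County" and "City", the counts, and the function names are hypothetical.

```python
from collections import defaultdict

# Placeholder rows: one city split across two counties, one city with zero
# cases, and one city with no case data (as left over from a full outer join).
rows = [
    {"County": "Dallas", "City": "Dallas", "Cases": 5},   # made-up count
    {"County": "Collin", "City": "Dallas", "Cases": 2},   # made-up count
    {"County": "Tarrant", "City": "Arlington", "Cases": 0},
    {"County": None, "City": "Austin", "Cases": None},
]


def sum_by_city(rows, measure="Cases"):
    """Total a measure per city, skipping null values."""
    totals = defaultdict(int)
    for row in rows:
        value = row.get(measure)
        if value is None:
            continue  # nulls contribute nothing to the sum
        totals[row["City"]] += value
    return dict(totals)


def keep_at_least(totals, minimum=1):
    """Keep only cities whose total meets the minimum."""
    return {city: total for city, total in totals.items() if total >= minimum}


totals = sum_by_city(rows)
print(keep_at_least(totals))
```

Summing before filtering matters here: a city split across counties is judged on its combined total, not on each county's row separately.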
So, I'm going to drag Geometry onto the screen, I'm going to drag City Nm onto "Detail," then I'm going to drag Deaths onto "Size." And this time, I'm going to do Circles to show the number of deaths. Once again, just to be accurate, since we don't have data for the rest of the state, I'm going to only include those with more than 0, or at least 1. You can add the name of the city, increase the size of those bubbles. I can make the color. Since I want this color to match my other map because I'm gonna put them together, I'm going to go in and copy the color from the first map. I'm going to go here. I'm going to find out what this hex value is, I'm going to copy it, and then I'm going to put it into my new map. OK. I'm going to add the number of deaths to the label. To put the number in the center of the circle, I'm going to click "Label" and then under Alignment, I will change it to centered and middle. I will also make the font bigger. I'll make it bold and white. ...maybe a little smaller.
I'm going to also duplicate this. You can't tell, but there's actually a line here. And I'm going to go to "Map Layers" and make this one dark as well, so we can see. Can't really see the numbers here. So, I'm going to make them black. On my second chart, I'm going to remove the numbers. I'm going to change it to a shape. OK, I'm gonna change the color. Move this from "Color," and make it white. Make it a little bit bigger. And for this one, I'm going to add a label of city name. I want the label to be back where it normally is, and then I'm going to overlay these. I'm going to go up here and click "Dual Axis," and now you can see they have like a little white glow, and the number is in the center, and the name of the city is underneath. Let's see what happens if we allow Mark Labels to overlap. I think that's okay, a little bit of overlap. It's not bad. I'm going to rename these maps by right-clicking the tab and calling it... Cases and Deaths in DFW.
Now, I put my charts together on a dashboard. This can be shared via Tableau Public. It can be shared via Tableau Server or however you'd like to share it with your intended audience. Thank you so much for watching!
Info
Channel: UTA Libraries
Views: 1,087
Rating: 5 out of 5
Keywords: UT Arlington, Central Library, UTA, Library, Libraries, Media, Education, Higher Education, University, College, data, covid-19, coronavirus, tableau, shapefiles, data visualization
Id: sOT1OqIIk00
Length: 14min 43sec (883 seconds)
Published: Wed Apr 08 2020