Beautiful, Interactive, and Portable Maps using Folium and Live API Data - Ariel M'ndange-Pfupfu

Captions
Yeah, how's everyone doing? Last talk, right? One more, is there? Okay. Well, if you need to get up, stretch, or walk around in the back, that's all fine. I do that all the time; it helps me pay attention. It's going to be pretty casual here. Unlike the previous presenters, who work on these famous projects and super polished applications, I kind of hacked this together myself, so this is very much a DIY project. But the themes are very similar to what we've been talking about this whole afternoon. I think we could have renamed this part of the track the "wrappers around JavaScript" track or something like that, right?

How many of you love the capabilities of JavaScript when it comes to making visualizations? In general it's awesome. How many of you like writing the JavaScript code? Yeah, exactly. So as Python developers we sit in our chairs and say: I want that from JavaScript, and I want to make web applications too, and I want data in Python objects. We can be very demanding, I think. But the result is this great synergy of all these different technologies: you get to put them all in one project, deploy it, and carry it from beginning to end. That's what I want to show. I had an idea to use API data from the Washington Metropolitan Area Transit Authority (WMATA), as well as Walk Score, incorporate that with maps, and make a little application that helps us visualize a city a little bit differently.

All of this is in a GitHub repository. It's up here; hopefully you can see the address so you can follow along. There are a couple of things you won't be able to do until you get an API key, but WMATA, for example, has a guest API key, so if you just want to play around with this tonight you can copy it from their website and hit their API with it. That's
obviously going to rate-limit you if you try too hard, but yeah.

What I want you to take away from this talk really depends on what you're bringing to the table. I'm going to jump into detail on a bunch of different things, ranging from Flask deployment on Heroku, which we just heard a little bit about, to geocoding, to choropleths, whatever. If you have some expertise in these areas, feel free to focus in on that. Otherwise, just pay attention to the big picture and how we're trying to bring these ideas back into the main project we're working on. So you can focus on the details, you can focus on the big picture, you can focus on both, or you can focus on neither and just enjoy the pretty pictures. I enjoy doing that too sometimes.

Cool, so I think we're ready to jump in. All the basic boilerplate is out of the way: package management, Git, Jupyter. I think we can skip all that stuff.

How many of you think maps are art? That's a pretty high percentage. I often think about whether there is a distinction between art and practical visualization. In some sense maps can be like data frames, just like pandas, but indexed by latitude and longitude, and then with a whole bunch of features. But I think they allow for a lot more abstract thinking than just looking at a data frame.

What do you think this is a map of? It's actually a map of submarine cables, the underwater fiber-optic and other kinds of cables that carry data all around the world. The people who made it actually put out a really cool comparison chart between a map like this and trade route maps from centuries ago, and you can notice the similarities and differences. So I look at something like this and I think: yes, it's showing me data, but it's showing it in a way that immediately brings up all of these other contexts, right? The context of history, the context of
geography, the context of the world. Just by looking at where these cables are, you can get a sense of where people are. So obviously maps are very powerful; I probably don't have to spend any more time convincing you of that.

So let's jump into how we can make maps that are as beautiful as I think this one is, in Python. The answer is going to be, like everything else, a wrapper, and we are going to wrap a JavaScript mapping library called Leaflet. I'm just going to run all the things here; it's my favorite button in Jupyter, "restart kernel and run all".

So the JavaScript library is called Leaflet. If you've been watching these other talks and wondering how they deal with these cool sliders and widgets, where they click on stuff and things happen, all of that is JavaScript. JavaScript is the engine that connects what the user is doing in a web browser to some back-end logic, and it can instantly communicate between the two, so that you don't have to refresh a web page to get a new map or a new visualization. It just instantly updates. If you look in the Leaflet documentation, you'll see things like events: it's listening for somebody to move the map, and it's going to trigger some action when somebody stops zooming, or something like that. These are all the ways we can get interactivity into our apps in JavaScript, and we're going to have that same interactivity in the Python version.

I'm going to split this up into a few mini projects, just to prevent it from being too overwhelming. The first thing we're going to do is look at some basic bus tracking. I'm going to skip through some of this; there are some exercises in this notebook that you can try at home. When I say bus tracking, I mean I want something that will show me a map of where buses are in DC right now. That's what I want. So the inputs are some location of interest
and some search radius around that location, which is also helpful for limiting the scope of this, and our output is the location of, and information about, the buses in that area.

The first thing we're going to need to do, and this is the first of those little subtopics I talked about at the beginning, is something called geocoding. I don't actually know what it's short for, but I think of it as geographical information encoding: I want the geographic information encoding of a thing. So if I have some place, like Union Station in DC, I want to be able to ask a geocoder, "What is the geographic information associated with Union Station?" and it should spit back something like this. This is the address of Union Station, this is the city, this is how confident I am about this being right, country, icon. Some of the most important fields are probably latitude and longitude, for doing mapping. Other nice things are neighborhood, place ID, postal code, a whole bunch of stuff.

So where does all this information come from? Yes? Yeah, well, they probably have their own geocoder at the geographic intelligence agency. There are a lot of different geocoding providers. Google has one; I think you can't get it for free anymore. This is the OpenStreetMap geocoder; that's the "osm". This geocoder package, which is a Python library I imported at the top, is just something that makes it easier to manage the requests to and from these geocoders. So geocoder.osm is a shortcut for a whole bunch of HTTP requests to and from the OpenStreetMap URLs. It's a nice convenience function.

This data is only as good as OpenStreetMap is, and actually the Google geocoder is a lot better at figuring out what you mean by these location strings. I can type "Capital One headquarters" into the Google geocoder and it will probably figure out that I'm talking about this building,
but if I try to type that into the OpenStreetMap geocoder, it might say, "Sorry, I couldn't figure out what address you're talking about." We might see an example of that later in the app. We almost certainly will, because I'm sure it won't work.

So we have all this information now; we've got all of these fields. The idea now is to make a map. In folium, maps are hierarchical objects: you make a map object and then you add things onto that object. You can think of it as making a canvas and then putting things onto that canvas. So all I'm going to do here is make an object called bus map, and it's a folium Map. I'm going to initialize it with a location, which is the latitude and longitude I just got from the geocoder, and I'm going to start it at a certain zoom, because otherwise you might end up looking at the whole earth, or at a single block. Then I'm going to add a child to that map, which is a folium Marker. These capitalized names, Map and Marker, are imported from folium. The marker also has a location, but I can also give it pop-up text, give it a custom icon, and color that icon. Then I'm just going to type bus map in the Jupyter notebook and I'm going to see a map like this.

I really should not scroll using the touchpad on something like this. This is what I see: this is Union Station in DC, and I've got a little pop-up marker here saying, okay, here's the address. It's not magic, but it is kind of nice, because this map is scrollable and I can go however far I want. I could even go all the way to Capital One headquarters, if I could find it and knew my Virginia geography, which I don't. So we are right around there. And if I wanted to go look at other countries, I could just keep scrolling. I can do whatever I want in this map: I can zoom out, I can zoom in. This is the benefit of tapping into that Leaflet interactivity. Because this is
JavaScript, I can update this in the notebook, and the information is dynamically retrieved from the internet in the background. Some people were talking today about tiles, and you can see these tiles loading on the edges; it's dynamically updating that. So that's pretty cool.

The other nice thing is that this is in a notebook. How is this in a notebook? It's because folium is really just spitting out HTML, and notebooks can render HTML. So this is basically just a big blob of HTML code, which is largely JavaScript, to be honest, that you can embed anywhere. You can embed it in a notebook; you can embed it in a website. That's the "portable" part of what I'm trying to talk about. We could look at what this looks like, if you're kind of masochistic and you want to see what this is. There's no point zooming in, because it's worthless, but all of this is a map: every one of these things represents a polygon or a marker or a tile or something like that. That's what I mean by a JavaScript blob. But I don't have to touch it, I don't have to write it, I don't have to maintain it. I can just enjoy the benefits.

That also allows us to save these files. I could save this as HTML and email it to somebody, and they would have the same scrollable, zoomable map, with the pop-ups and the text that comes up when you click on it. All of that would still be there. So that's folium, the basics.

All right, now let's talk about pulling in some extra data and essentially doing some joins on live API data. For those who went to the web scraping talks yesterday, you might have noticed there were some really good talks about responsible scraping of the web. I have not built in those checks here, so this is very much "do not necessarily copy this", but the
principles are the same. I'm going to start up a requests session; requests is just Python's way of interacting with HTTP requests. I'm going to point it at the WMATA API, and I'll be nice and stop retrying after two times if they don't respond. And I'm going to structure this request the way they want. Just go to their API site; it's very clear and easy to use. You can figure out how they want the request to be sent, put your API key in the right place, and make the right keys and values to get something out.

But the really nice thing to do would be to not make the same API call twice for the same data. Every time I run this notebook, why should I hit the WMATA API again for the same exact information? So one of the things I built into this project, as I was assembling the Frankenstein's monster, is some ability to cache API calls. Especially when an API is free, you shouldn't be hitting it more than you really need to. It's like the tragedy of the commons, which was in the keynote from the very first day. Eventually somebody's going to say, "All right, too many people are abusing this free API, and we're going to take it away."

So ediblepickle is one way you can do this in Python; there are many ways to do it. The way ediblepickle works is that it gives you a decorator, which you can wrap around your functions, and then the output of that function will be pickled and cached in a location that you specify. It's cached based on the function inputs. You can see that here: I basically templated arguments 0, 1, and 2 for this function, which correspond to latitude, longitude, and radius. So it's going to make a file named with some latitude, some longitude, an underscore, "radius", some number, and then ".buslist", and it's going to put those in a cache directory. Then ediblepickle, when I run this function
again, it's going to look in that directory and see if there's a file that matches the inputs the function was just called with. If there is, it will just load the pickle, unpickle the result, and return that, as opposed to executing the function again. So it's very simple memoization-type stuff. You can play around with this yourself by adjusting this keyword: if you set refresh equals True, it forces it to recalculate the function; if you leave it False, it will just get the result from the cache. Because this is cached, I know that there's going to be a bus heading to the Kennedy Center on Massachusetts Avenue. This one has been there every time I run this demo notebook, because I don't need to hit their API again for the live data right now.

Here's just an example of what the JSON looks like when you get the response from the WMATA API. I'll probably gloss over this in the interest of time, but you can see that with this get buses function, I put in the latitude, longitude, and radius, and I get back a whole bunch of bus objects. One of the bus objects might look like this, with a vehicle ID, location, where it's headed, whatever. Then all I do is a for loop, and I add to the map every bus that I found in this area. That's why I'm making these little polygon markers. And so I can see that these are different bus routes going different places. Pretty cool. That's the bus tracker part. Is everyone okay with that, how we pulled together this API and caching and the map? All right, not too bad.

So let's do aggregations, right? That's what everyone has to do next; that's what you do in most data analysis. Let's aggregate. When you're talking about maps, aggregations sometimes boil down to drawing lines on those maps and thinking about regions. So how do we do that in folium? The motivation for this part is really just looking at some census data. Census data is a really nice, clean data set that has
geographical information and properties about each of these geographical areas. So what I'm going to do is load a GeoJSON file that I just found on the internet. GeoJSON is JSON that describes a region. What does that mean? Well, before, I had a latitude and longitude and some information about that point. Now I have a whole bunch of coordinates, and these coordinates taken together basically draw out a polygon: each coordinate is a vertex on the perimeter of the polygon. And then I have some properties about that area, like what its population is. This one's population is 709. I have these for every neighborhood in DC; that's what this GeoJSON file gives me.

So how do I map this now? One thing I might want to do is color these areas based on these properties. Let me assign a color to each population, so that a higher population gets a deeper shade of green or something. Color mapping is an interesting decision you have to make when you do this. ColorBrewer is one really nice way to think about it. There's probably not enough time to go into the details of the whole process, but suffice it to say you can figure out some nice palettes for the information you have. Depending on whether it's sequential, diverging, or qualitative data, you might want to choose different colors, and it also depends on how many classes you have. This is three, for example. If I crank this up to nine, you can see how their palette handles that and where they make the breakpoints. They're trying to do this to maximize legibility: you can still see the difference between each of these classes. And it just looks cool. It's better than randomly assigning colors, red, green, blue, yellow, whatever.

We can access this in our notebook as well. We import these linear color maps, and we can choose a linear green
blue scale or a purple red scale. We'll scale them according to the minimum and maximum of the population and the poverty rate, which are two of the attributes we have in the census. Then we make a folium map, add a little caption that shows a population scale so that people are not so confused, and then just call folium's GeoJson, put our JSON data in, and tell folium that we want to style this region based on this lambda function. When we say the fill color comes from pop colors, pop colors is our color map, and it's translating between the population number and some color hex code. That's what is showing up here.

What you get out of this, after I add some layer control, which lets me click these layers on and off, is something that looks like this. I can look at just population, and this is the green blue range. You can see the scale at the top, which shows that population zero is in green and five thousand is in blue, and you can see all of the regions, with dashed lines, that came from the GeoJSON. The tiles are different because you can specify the tile type in folium, so this is a less colorful map, so that the colors are more evident when you color the regions. You wouldn't necessarily want to draw these colored regions on the first map that we had.

The layer control lets you click these layers on and off, so I can look at just the poverty rate by region, I can look at just the population, or I can overlay them. The cool thing about this is that you can overlay blue and red and get an intersection type of query, where you're looking for purple regions where the poverty rate is high and the population is high. Then you can look at this from a bird's-eye view and get a sense of your data. But we can do so much more, because this is in Python, right? So
what if we want to make some custom properties for these regions? Well, let's just open this up in pandas. Let's take the pandas DataFrame we made a little bit earlier: we cycled through all of these neighborhoods in the GeoJSON, extracted the properties, and made a pandas DataFrame of them. Down here we can create a population density variable that normalizes by the area of the region. Then we can update the GeoJSON: we go back into that massively nested GeoJSON structure, add a density property to each of these regions, and remap it. Then we'll see something like this. If you go back and compare these maps, you'll notice there's a lot less purple over here, east and south of the Anacostia, because even though there's quite a lot of population out here, the regions are bigger, so the density is lower. So you can think about different ways to structure these queries to get at the exact kind of information you want, and you can use whatever processing you want in Python to do that and then have it reflected in the map. Is everyone still with me? All right. So the sky's kind of the limit for this kind of thing.

One last example I'll show you in a couple of minutes is Walk Score. By this point in the project I was thinking that there are so many ways to visualize this, so many ways to look at what the city actually means. Maybe I shouldn't limit myself to the political boundaries that the census gives you or that the city gives you. You always see this diamond for DC. So how can we visualize cities in a different way?

This is just a side note from the choropleth stuff. This is something else you can do in folium as well: add a slider that cycles through different values as time changes, and link it to the timestamps, so that you can kind
of scroll through time and see how things change on your choropleth. Pretty cool.

All right, so, different ways of visualization. Have you seen this map before? This is a map of the US, but the only things being mapped here are bodies of water. There are no streets, no state boundaries, no coastlines. It's just rivers, lakes, and streams. I'm thinking about Walk Score along the same lines. This one is from The New York Times; they came out with it recently. It's every building in the US. Again, there are no streets on this map, just buildings. This is Chicago. I used to live in Chicago, and I love the grid. You can see, I hope, just how grid-like it actually is. We could go back to DC, maybe. It doesn't matter. The point is that we're mapping the city not based on what we think the city boundaries should be, but based on what the elements of the map are saying.

So think about Walk Score. Walk Score is a product that tells you, wherever you are, how walkable that area is. Are you close to stuff? Is it easy for you to get around? Are you close to transit, close to grocery stores, things like that. So what if we mapped our city based on what the Walk Score is for any given area, as opposed to the boundary of the city itself? And furthermore, what if we filtered that based on bus routes? This last example basically takes WMATA API information about what routes buses take and colors it by the Walk Score for every stop along that route. Not every stop, sorry: for every continuous location along that route. It's not exactly continuous, but there are a lot of data points.

I don't think I need to go into the details of what you end up with, because it's very similar. We're just going to define a different kind of API query to WMATA, define the request to Walk Score, make sure that
we cache everything, and then all the magic happens here. You can see that we're iterating through all the points that we get and getting the Walk Score for each point, and then there's some other boilerplate. What we get out is this part, where we can select a route and get the walk scores for that route. Hopefully we get something like this: a series of latitudes and longitudes and the Walk Score for each one. Then we can use folium's ColorLine to take a certain column straight from our pandas DataFrame and treat each of those column entries as lat/long points on a line to be plotted.

What you get out of that is something like this, where I've added the population back in, but that's toggleable. This is Route L2 in DC, which goes from Friendship Heights down to Farragut Square by the White House. You can see the Walk Score all along that route, and you can see where the city is not very walkable, where it is walkable, and where the buses go. There's an exercise in this notebook after this: you can cycle through a bunch of the routes, and you can look up the routes that go over the Anacostia River to Southeast. You can maybe even categorize these bus routes by looking at their walk scores. Are they bus routes that connect places where people are already close to stuff, or are they routes that take people in and out of areas that are not walkable? Is this a bus that's meant to get people into and out of a zone that doesn't have a lot of stuff, or is it just a way to carry people around on their way to work? You can imagine making heat maps using this; you could do anything you felt like with this data. So that's basically what's in this notebook.

Does that make sense? Cool. It's pretty straightforward, and it's a little bit more hacked together than using
something like Dash, which would be really awesome, but I just wanted to show how easy it is, and how much power you actually have in Python to pull all this stuff from all these really cool data sources, libraries, and technologies out there, mush it together, and make a product.

The last thing I'll say, and then I'll stop for questions because it's been half an hour, is that this is a product. Because this is in Python, we can deploy it on Heroku. Heroku is one of the last remaining free hosting services. They have this thing where they'll spin up your application only when people visit the site, so if I want to go look at this application online, it would be in my best interest to refresh this page to make sure the application is actually alive right now.

What I did is take the bus tracker part of this and put a front end on it that asks people for the location and the radius. The application reads that from the form, makes the API call in the background, and makes the folium map. The folium map is just JavaScript, so Flask can go ahead and render that JavaScript. So what that looks like is, if I want to put in L'Enfant Plaza and a search radius, there it is. These are actually live buses now, going to Dulles, going to Foggy Bottom. Partially that's because Heroku is not very good at caching. Heroku's file system is kind of ephemeral and read-only, so you can't save stuff on Heroku's file system and expect it to be there after the Flask request context has ended. That means caching is trickier: you have to save stuff on S3, or figure out other solutions. But because I figured nobody will ever really use this app, I'm kind of okay with it.

After this talk is over, if you're interested in the code, it's all on GitHub, like I said. There are two parts to it: the demo is the notebook I walked through, and the tracker is
the Flask application. And yeah, I'm happy to take any questions about any of this.

Let's see if we can get it to break. It's Sunday, people. What is going on here? So let's put in a location we know won't work, like maybe the National Zoo. I don't think OpenStreetMap will be able to parse "National Zoo" to an actual address. So it hits the geocoder error page.

So, how granular can you get with the shapes when you're not at the state level but maybe at the neighborhood level? Yeah, that's one of the challenging parts about working with this kind of data: you need the shapes. It would be extremely painstaking to go and look up latitudes and longitudes for some custom shape that you wanted to draw around your house or something like that. So often we rely on shapefiles from authoritative sources. For the census data that I showed you, they also have shapefiles for different regions: they have voting districts, they have other kinds of political boundaries. You can get these shapefiles and use tools, there's some mention of them in the notebook, to convert them to GeoJSON so that you can use those shapes in Python. There are a lot of GIS-related Python tools that will help you do that, but you basically need to find that file. For the neighborhood boundaries that I showed you, it's because I found somebody who had gone to the trouble of making a shapefile for neighborhoods. But that's the only limit; you can go as granular as you want.

Have you ever tried to put colored points? Yeah, so I showed you a few of the things which I ended up thinking looked the best, but folium has a lot of different options for putting markers on maps, for making heat maps, for making shapes on maps, all sorts of stuff, and it's extremely
customizable, because it's just hitting the Leaflet API, so anything Leaflet can do, you can do in folium also. I just happen to think certain things are cooler than others, so I wanted to do them, but you can do all sorts of stuff.

Yeah, so I haven't tried doing that, but you saw how we used a custom color palette. We could just replace this color palette with whatever palette matplotlib has and do it that way. So, something like this. The way this color is getting onto the map is through color and fill color. These are attributes of some CSS style, probably somewhere deep in the guts of this thing, that chooses what color to make the regions and what color to make the points, and you can change these to be set programmatically based on whatever you want. You can have them be static; you can have them depend on some Python function; they can depend on whatever matplotlib tells you; or you can have them depend on what the user inputs, so the user says, "I want the color scale to be this," and then it will redraw the map that way. So all of that is completely customizable. [Applause]
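The programmatic coloring described in that last answer can be sketched in plain Python. This is a minimal illustration, not the code from the talk's notebook: the helper names `lerp_hex` and `make_style_function`, the two endpoint colors, and the sample feature are all assumptions made up for this example. It only relies on the fact that folium's GeoJson layer accepts a `style_function` that takes a GeoJSON feature and returns a dictionary of style options such as `fillColor` and `color`.

```python
def lerp_hex(start, end, t):
    """Blend two (r, g, b) tuples into a hex string; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    r, g, b = (round(s + (e - s) * t) for s, e in zip(start, end))
    return f"#{r:02x}{g:02x}{b:02x}"


def make_style_function(prop, vmin, vmax,
                        start=(237, 248, 177), end=(44, 127, 184)):
    """Return a function mapping a GeoJSON feature to a style dictionary.

    prop is the (hypothetical) property name to color by; vmin and vmax
    set the ends of the color scale.
    """
    span = (vmax - vmin) or 1  # avoid dividing by zero on a flat scale

    def style(feature):
        value = feature["properties"][prop]
        return {
            "fillColor": lerp_hex(start, end, (value - vmin) / span),
            "color": "black",      # outline color
            "weight": 1,
            "dashArray": "5, 5",   # dashed region borders
            "fillOpacity": 0.7,
        }

    return style


# A made-up feature, using the population value mentioned in the talk.
feature = {"type": "Feature",
           "properties": {"population": 709},
           "geometry": None}

style = make_style_function("population", 0, 5000)
print(style(feature))

# With folium installed, the same function could be handed to a layer:
#   folium.GeoJson(geojson_data, style_function=style).add_to(some_map)
```

Swapping the two endpoint colors, or replacing `lerp_hex` with a lookup into a matplotlib or ColorBrewer palette, changes the whole map without touching the layer code, which is the kind of customization the answer is pointing at.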
Info
Channel: PyData
Views: 25,530
Rating: 4.927835 out of 5
Id: xN2N-p33V1k
Length: 38min 17sec (2297 seconds)
Published: Thu Jan 03 2019