[Visualization Nights] Building a Web-based Geo Exploration Tool with deck.gl - Shan He

Video Statistics and Information

Captions
Hello, hi everyone. My name is Shan. I work on the visualization team with Nico. Nico just introduced deck.gl, and today I'm going to talk a little bit about a tool we built with deck.gl. It's a web-based tool that helps us explore geolocation data on a map.

Talking about maps: when people ask me what I do in terms of visualization, what I like to say is, "OK, I draw maps." That's the short story. The longer version is that I make tools to draw maps, but sometimes I also make tools to make tools to draw maps. You can see how it goes down.

So why do we want to do that? Because location data can embed and encode many layers of information, and by just randomly throwing it on a map, sometimes, or most of the time, you don't actually learn anything from it. That's why we need some kind of exploration tool or web application to help us better understand what is in our location data. Location data is our biggest asset. We have millions of trips happening every day, and we have billions of GPS pings on our platform.

This is just one example, showing how we have been doing since our first launch in 2010, when we did our first trip in San Francisco. It shows every single trip throughout the course of six years. I think we made our one millionth trip in... wait for it... it already passed. We made our one billionth trip in December of 2015, and then just six months after that we made our two billionth trip in 2016. That's two billion trips in total on our platform in just six years.

The reason I'm showing this is not to brag. I just feel that if I tell you a sentence like "two billion trips in six years," you don't necessarily get it. You don't necessarily know what actually happened throughout those six years, not until you see this interesting map, which uses a stereographic projection of the whole world. Seeing actual trips happen throughout six years, you get an idea of what it actually means.

The same applies on a city level. When I tell you "five million trips per day," you think, OK, so what? What does five million even mean? But when I show you just this one image of New York, which is made of all the app activity on our platform in just one single day, you kind of just get it. You know New York, you know Manhattan, and you know why downtown Manhattan is lit up: just because of the population density. I don't have a pointer, but you also know those bright spots are the airports: LaGuardia, another airport over there, and one other airport in New Jersey. So when five million as a number is turned into a map, you kind of get the number. The one-dimensional number is brought to life by just one single image.

Let's go into even more layers that a map can embed. Let's go to the street level. I love to show this map to people. You might think it is a map of San Francisco, but every line, every street on this map is actually made of actual trips on our platform in just one single day. You probably know San Francisco well. In the center, that's SoMa, where a lot of trips happen, but it also has a lot of tall buildings, so we see a lot of noise, GPS errors, in the center of San Francisco.

Here is another map, made for New York, with the same amount: one day of trips. This is Mexico City. If you don't know Mexico City, you're like, why does this look like a spider web? The reason there are a lot of curved lines at the bottom left is that those are actually mountains, so the roads there are much more winding. Just by seeing cars following those roads on this map, you understand the city a lot better. And I guess there's someone from France I talked to, so yes, this is Paris. I kind of already put the name of the city there. This is London, and this is Jakarta,
Indonesia. This is my favorite, Rio de Janeiro. I had to practice my Spanish. This is actually my favorite one: you see this bay, and those roads kind of look like two different animals fighting with each other. This one is Sydney, and the same logic applies. You see what looks like a bridge right in the middle. Actually, it's a tunnel, and we lose a lot of GPS signal there. That's why the lines are very random.

When we bring all those static trips to life, we get this. We made a video called "A Day in the Life of Uber." This is animating one day of trips in London. I'm just going to be quiet and let you watch it. I guess they can drive there, but I don't know. We probably have a lot of GPS errors here, so it's tough.

OK, so at this point you're probably bombarded by all the pretty images I threw at you, and you're asking me the uncomfortable question: so what? Everyone working in visualization probably gets asked this a lot. At the end of the day, what I hate the most is someone coming to me saying, "Hey, I love the map you did. Can I use it as my screensaver?" If you say that to me, I'd probably feel like I failed, because I want my maps to tell you a lot more than just a pretty image. We make maps to gain insight from location data. I want you to learn more than just pretty lines and colors.

So let me talk a little bit about this process of gaining insights. Sometimes the process can be very painful. In theory, you think you write some code, you run it, and something amazing is just going to happen on your screen. But in reality it's more like this. Most of the time your code probably doesn't run, and once you finally get it running, it shows something random that you don't even understand. This is the thing we don't talk about. I show you one map and you think it looks great; I might have spent a day or even a
week working on that map just so that it brings something to your eye. I always describe making visualizations as a test-to-failure process. We start by drawing some sketches. We make some mock-ups, drawing pretty images in Photoshop or Sketch, and then we make prototypes. We build two or three and run them by the PM. They hate one of them, and then the designer hates the other. In the end we have to do the third one again. When one of your prototypes is finally good enough, we put it into production. But that's when you're lucky. Most of the time you probably just get stuck in this infinite loop where you always have to go back and start from scratch.

One example I like to show of this painful process: when we first launched uberPOOL, which was probably three years ago, our head of economics came to me saying, "Hey, we have all this super cool data. Can you make some visualization that tells a good story about it?" So I looked at the data and thought the theory is pretty simple. When you use uberPOOL, instead of using two cars to drive two passengers, we just send one car, pick up both of them, and drop them off one by one. If you compare these two methods, one with uberPOOL and one without, on one side you have only one car on the road, and on the other side two separate cars on the road. Of course you think uberPOOL is going to cause less congestion on the road and provide a cheaper, more efficient ride.

So if that's the hypothesis, all I need to do is make some visuals and compare people using uberPOOL with people riding separately. That's what I did. I gathered one day of uberPOOL trips, ran them through an OSM engine, and generated two separate trips for each one, to simulate what the map would look like if those two people had not shared a ride and had taken two separate trips instead. So I went on to do my first
try. I used D3 and canvas, just brutally drawing all the lines on the map and then animating them. The result was not very exciting. In fact, it didn't tell me anything. The two sides look almost identical; there are no real differences between the two approaches. I got frustrated. I thought it would look different, because on the left side you actually have twice as many trips as on the right side, so the map should tell me something.

So maybe it's not lines; maybe I should just do points. I went back, turned all my data into points, and loaded them into a JS library, showing all the points on a map and trying to compare. As you can see, this doesn't tell me anything at all either. They also look nearly identical. I think the problem is that when I overlay ten points on top of each other versus twenty points on top of each other, visually there isn't much difference. You just get one solid color that varies a little bit. So maybe that's not the right approach. I shouldn't just expect things overlapping on top of each other to look different.

So I went back again for my third try. I thought about what the one major number is that I want to reflect in my visual. The one major number that varies is actually the number of cars on the road. So what if I calculate, for the whole day, every five minutes, how many actual trips are passing each single road segment, map that number of trips to a color, and use that color to paint the road segments, reflecting it back onto the trips themselves? Maybe that will show me the difference. And this is how it turned out. I was really excited to see this, because this is exactly what I was looking for: the difference in traffic volume when people use two different transportation modes. By actually calculating the traffic and
then drawing it back onto the trips, I got this result. When I compare them next to each other, you can see there's an obvious visual difference. One doesn't have that much yellow, which is high traffic volume; the other has a lot of yellow. So this is what I got for my third try, and as I always do, I made a video about it: animating one day of uberPOOL trips on one side, and on the other side how people would travel using separate trips. It's very obvious that when people travel separately, you create a lot of traffic on the road.

When I showed this to our head of economics, he was like, "Oh my god, this is too good to be true." But that's actually what happened. It's not like the data isn't in there; you just have to find the right way to tell the story. You have to find the right visual representation to encode it and to find the insight inside the data. It was a fun exercise, and it took me two weeks. I didn't tell you, but at almost every trial I was like, "OK, I'm just going to give up. There's nothing there." But I suffered through it, I made something, and it turned out to be great.

What I want to say here is that with so many iterations in this process of finding insight from data, maybe there is some tool we can build. That came to me later: maybe there's some tool I can build that gives me all those different functions, like applying different filters to my data, choosing different visual representations, and encoding numbers with different visual channels. I think that's why we built this tool called Voyager. Voyager is a tool that can take a bunch of location data sets, turn them into different layers, apply visual channels to them, and help people explore geospatial data and find interesting stories in it. We built Voyager for fast visual exploration of millions of location data points in the browser. And Nico just showed
some images of it. Needless to say, because it's built on top of deck.gl, you can encode a couple hundred thousand to millions of points and do real-time, on-the-fly aggregation and filtering, so that you can very quickly test different visual representations and see which ones give you insight.

This is the interface. I'm going to go right into the user flow. In the standard user flow, you throw in your data set, which can be CSV or GeoJSON, and Voyager provides you with different sets of filters based on what's actually in your data set. If you have numerical data, it gives you a range slider. If you have timestamps, it gives you a time slider, and so forth. With the filtered data, Voyager provides a bunch of layers to choose from, like the ones Nico just showed, such as a scatterplot layer, an arc layer, and a hexagon layer, plus all the options for how you can encode numbers into visual channels, and then it draws the map for you. With that help, from one single data set I can create all these different, diverse visualizations just by interacting with Voyager.

OK, demo time. Actually, it's video demo time. I got a couple hundred thousand trips from our database, 200,000 trips in total, in CSV format. As you can see in the CSV, you have some columns that look like a timestamp, a datetime; some columns that look like a geo pair, like begin trip lat and begin trip long; and some columns with numerical values, like trip fares. So what I did is I just threw it into Voyager. Because I have the trip start and trip end points in my data set, Voyager automatically recognizes the points and draws a point layer for you. Without much work, you already have this layer of scatterplot points for all the trip starts and ends. The blue ones are all the trip ends, and the yellow ones are all the trip starts.
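The filter and layer detection described above can be sketched roughly as follows. This is a minimal illustration, not Voyager's actual code: the function names, the widget labels, and the `_lat` / `_lng` column-suffix convention are all assumptions made for the example.

```python
from datetime import datetime


def infer_filter(values):
    """Guess a filter widget for one CSV column from its string values.

    The widget names returned here are hypothetical labels.
    """
    def all_parse(parse):
        try:
            for v in values:
                parse(v)
            return True
        except ValueError:
            return False

    if all_parse(float):
        return "range_slider"      # numerical data
    if all_parse(datetime.fromisoformat):
        return "time_slider"       # timestamps
    return "multi_select"          # anything else is treated as categorical


def find_point_pairs(columns):
    """Find '<prefix>_lat' / '<prefix>_lng' column pairs (assumed naming)
    that could each be drawn as a point layer."""
    names = set(columns)
    suffix = "_lat"
    return sorted(
        c[: -len(suffix)]
        for c in columns
        if c.endswith(suffix) and c[: -len(suffix)] + "_lng" in names
    )
```

For a header such as `begintrip_lat, begintrip_lng, dropoff_lat, dropoff_lng, fare`, `find_point_pairs` would report two candidate point layers, one for trip starts and one for trip ends.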
And bear in mind, this is 200,000 trips with double the number of points, which is 400,000 points, that you can render in a browser without much lag at all.

After that, the main point of the exercise is to try to find something in the data. The question I'm looking at here is: is there any difference between the way people travel in the morning and in the evening? Of course there is, but how do I show it? So what I did is I applied a time filter to all my trips. The video didn't play. OK. So I applied a time slider filter to all the trips, and I filtered a day of trips down to just a couple of hours in the morning, 7 a.m. to 10 a.m. You can see here it's very obvious. Since the blue dots are all the drop-offs and the yellow dots are all the pickups, in the morning in San Francisco all the drop-offs are concentrated around the SoMa area, Montgomery Street and [inaudible] Street, because I guess that's where people go from Caltrain to downtown. And all the yellow points are in the surrounding areas, like the Marina, the Mission, and Noe Valley.

So that's the morning. Then I applied the same filter to the evening and got this image. All the patterns are completely reversed. In the evening I have all the pickups downtown, so all the yellow is downtown, and the blue ones are where the yellow used to be in the morning. I was really happy to see this. And bear in mind, this is just one swipe of a slider, and you can already filter down your data to draw this kind of map.

Then I went on to explore. What if I want to link the start and end of each trip? Would that show anything? Remember we have this arc layer. An arc layer is created by just selecting the start point as the starting point and the end point as the end. OK, let me turn off the other point layer. So yeah, there
you go. This is just all the trips with their starts and ends linked. You don't see much, so we built an interaction called brushing. With brushing, you can move your mouse and highlight only the trips that start and end inside the region where your mouse is pointing. It's pretty cool, because when I mouse over, that's Oakland Airport, and there are a lot of trips starting from there. You can see most of the trips we have are from downtown to the airport; you see this very thick branch down there. This interaction also happens on the GPU. We write it into the shader, so we're not using JavaScript to filter data on every mouse movement. It's all calculated inside the shader, and that's why it's so fast.

Just looking at scattered data doesn't show aggregated results, so we also connected Voyager with the deck.gl hexagon layer to show heat maps. All I need to do is select my heat map layer and then select the points I want the heat map to aggregate. This performs on-the-fly binning, putting all the points into hexagon cells. We can also change the resolution of the cells to make it a lot more fine-grained, and change to different color palettes, because who doesn't like purple? And of course everything looks better in 3D. Once we add the third dimension to our heat map, we map the number of points to the height of the hexagons.

I want to pause here. You can see right here is where a lot of trips are happening. Those are the airports, and here I think it's a Caltrain station. So you can see there are anomalies in the data set, where most of the trips come from transportation hubs. But what if we want to hide those top anomalies to show more patterns within the lower range of the distribution? We can just scrub the percentile slider. It will hide the
top-percentile bins, recalculate the colors for the lower-percentile bins, and show you more patterns underneath.

With all that shown, I wondered: what if two years ago I'd had Voyager? Would I still have had those two weeks of painful exploration? So just out of curiosity, I actually did it today. I found the old data set I used to create the uberPOOL visualization and just threw it into Voyager. First of all, I plotted the same point cloud with all the GPS points, and again I don't see anything. But as soon as I turn on the hexagon layer, where I'm actually aggregating the points, it becomes very obvious that the highlights are around the highways. When I turn on the height, the density, the aggregate value, is a lot more visible. This is when people take separate trips. It's very interesting: you see that along a couple of roads there is a lot of traffic, and those roads that most cars take are highways. On the bridge there's actually not that much, because I guess at that time you couldn't really pool to the East Bay. And when I compare them, this only took me about five minutes. I just threw it into Voyager and drew those two images with 3D hexagons.

Since Voyager also allows me to filter by time, I tried to recreate my road traffic animation with the time filter. I dropped in all the GeoJSON I had once calculated as paths, and colored each road segment by its traffic count, which was already computed in Python, and I got this image. This is all the trips in one day, but when I filter it down by time and animate it, I can immediately create this animation of how road traffic changes throughout the day.

So what is behind Voyager? We use React, we use react-map-gl, and we use deck.gl, which Nico already showed you. Those are just three simple libraries that we put together just
so that we can quickly draw maps and explore geospatial data. With that helping us, we can create more and more geospatial mapping visualizations that actually tell interesting stories about the city. There's only one Shan, so if you ask for a video for a city, you used to have to give me a week; now you only need to give me two days. All right, that would be it. Thank you. [Applause]
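The third uberPOOL attempt in the talk (counting, every five minutes, how many trips pass each road segment, then mapping that count to a color) can be sketched as below. This is a rough illustration under stated assumptions, not the speaker's code: the input format of `(trip_id, segment_id, unix_seconds)` tuples, the function names, and the blue-to-yellow color ramp are all hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 5 * 60  # five-minute windows, as described in the talk


def count_traffic(passes):
    """Count distinct trips per (segment_id, window_index).

    `passes` is an iterable of (trip_id, segment_id, unix_seconds) tuples,
    an assumed input format. A trip is counted at most once per segment
    per window, even if it produced several GPS pings there.
    """
    seen = set()
    for trip_id, segment_id, ts in passes:
        seen.add((segment_id, int(ts // WINDOW_SECONDS), trip_id))
    return Counter((segment, window) for segment, window, _ in seen)


def traffic_color(count, max_count, low=(30, 40, 120), high=(255, 220, 0)):
    """Linearly interpolate an RGB color from `low` (little traffic) to
    `high` (heavy traffic). The two endpoint colors are arbitrary choices."""
    t = min(count / max_count, 1.0) if max_count > 0 else 0.0
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))
```

Each road segment would then be painted with `traffic_color(counts[(segment, window)], max(counts.values()))` for the window being animated.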
Info
Channel: Uber Engineering
Views: 9,553
Keywords: Uber, Uber Engineering, Engineering, Software, Data Visualization, Uber Data, Data, Maps
Id: pjzCMQBqjFw
Length: 26min 3sec (1563 seconds)
Published: Mon Jul 17 2017