The What-Where-How-Why of GPU computing with R

Video Statistics and Information

Captions
All right, to business. Okay, thanks for sticking around, guys. I know it's the end of the day, and I have a train to catch anyway, so I can take this real quick. I'm here today to talk about GPUs. My name is Kelly O'Briant. I'm a developer advocate for MapD, very recently, but before that I spent some time calling myself a data science product engineer. Like Jared said, I also do a lot of things for R-Ladies, specifically R-Ladies DC, and some cloud stuff for R-Ladies Global, which is pretty cool. I publish a series called .rprofile on the rOpenSci blog, which I like a lot, so check that out if you haven't seen it. I just want to hop right into it.

I'm here today to convince you to be interested in GPUs if you aren't already, or if you've heard about GPUs and GPU computing and you don't know what it is yet. So let's start. I want to talk to you about what GPUs are. When we introduce GPUs to people, oftentimes you'll do it, or you'll see it done, in conjunction with talking about what CPUs are. GPU is, as you may have guessed, an acronym, and it stands for graphics processing unit. GPU processors were designed for the purpose of rendering images, animations, and videos to your computer screen, but in the last decade we've been seeing them applied to a lot more applications.

It's common to see them alongside CPUs, as I said, so I've drawn up the classic CPU/GPU core comparison. A core is just a unit that receives instructions and does the processing of calculations and performs things based on those instructions. CPUs are often called the brain of the computer. They have multiple cores, maybe a couple, but those cores are really fancy, they're awesome, they do awesome stuff, and they're optimized for doing serial tasks. GPUs, on the other hand, may have hundreds or thousands of these cores. But don't get too excited, because they don't do all the awesome stuff that CPU cores do. They do very basic tasks, but there are a lot of them, and
they're optimized for doing parallel tasks.

Here to explain the difference between serial and parallel task execution are the MythBusters. If you haven't seen this video, it's really great. I just created two GIFs of it. On the one side you see CPUs, where they have this robot that aims and fires paintballs, aims and fires, to eventually create this rendering of a smiley face on a big canvas. On the other side they've created this monstrosity of a machine that has thousands of pipes with a single paintball in each pipe. They hit a button, and all at once all those paintballs shoot at the canvas and create a rendering of the Mona Lisa. I don't know if that's a perfect explanation of the difference between GPU and CPU, but it is very fun, and if you watch the video online you can hear Adam Savage just maniacally laughing after he hits the button, which is really great.

But let's talk about why people are so excited about GPUs. When I first heard about GPUs it was from my dad, actually, and I think I was in college. He's an aerospace engineer, and he does star camera stuff and some other stuff I don't understand, and he asked me, "Hey, do you know CUDA?" And I was like, "Dad, why would I know CUDA? What is that?" But I heard about it again in graduate school, and then again over and over. Recently JJ gave a really awesome keynote at this past rstudio::conf, and in it he talked about how GPU computing just revolutionized the field of computer vision through deep learning. If you haven't seen that, it's also available online. I highly recommend it. It's long, but it's worth it.

So people are doing really cool stuff with GPUs, and here are some other things that they've done. I want to talk about what problems GPUs are good for and outline what applications people have found for them. First off, they started out as graphics cards. Then people discovered you could do high-performance computing with them. From there we discovered applications in machine
learning and revolutionized deep learning, and from there, even further, general-purpose computing for big data analytics and finally database operations. Today I want to talk about these last two pieces further on.

But first, just to sell you on this: I know that a lot of people get up here, people from the data science industry, and they say, you know, we're all dealing with big data. I want to be real. I know a lot of people in this room probably aren't dealing with big data day to day. But this was one of the quotes that came out of my .rprofile project, my interview project with rOpenSci. The interview was actually conducted by Sean Kross, but it was with Dr. Julia Stewart Lowndes, and I love this quote from her. She says: "I learned to code in a total panic because my data was too big to be opened in Excel. So because I was trying to figure out science and coding at the same time, I conflated research with data science. I was confusing my thesis questions that no one had ever asked before with data questions that many people had asked before and solved."

So if anything comes from you attending my talk today, I hope that even if you aren't ready for big data analytics, maybe someday you will be, and you'll remember the stuff that I've talked about today, and you won't be lost in this total panic where you don't know what to do and you think these problems have never been solved before.

So with that, how much does it cost? I love Wired magazine. I read it all the time. This title from an article in 2013 really got me, though: "Now You Can Build Google's $1,000,000 Artificial Brain on the Cheap." How cheap, you say? Only $20,000. And by "you" I mean Andrew Ng. Yeah. So when I was reading this in grad school, it kind of pre-biased me to think, oh, GPUs are really cool, there are a lot of computer vision applications, but it's probably not for me yet. I'm a student, I don't have this kind of cash. And so I kind of went along with
that assumption for a good number of years. But it's 2018 now, so what has happened since then?

There are ways to get access to GPUs now, and I sort of listed these in order from harder to figure out down to easier to figure out. You can argue with me afterward about whether I've got these in the right order, but this is my personal experience.

First, GPU compute on your own computer. Perhaps there's one in your computer that you can use now. I've read that these things are more difficult to figure out than just that, and some of the graphics cards in your computer might not be optimized to work in that way, so you have to do a lot of googling. I haven't bothered to do this.

Second, you can go out and buy these things. NVIDIA sells them, and you can buy one and own it and keep it in your domicile and take care of it like a pet. I don't know, I don't touch hardware. You good?

Now here we get into the space that I am comfortable with. You can go and rent this hardware from a cloud service provider, and you can do it pretty cheaply. Here's Amazon Web Services: you can get one for 90 cents an hour. They get more expensive than that pretty quickly, obviously, but ninety cents an hour is very reasonable. Here are some from Google Cloud. All the major cloud providers have them. Azure has them; I don't have a slide for Azure. But they're all out there and they're ready to be rented, and you can go and learn the skills to rent them very easily.

But I do want to caveat this with a couple of other thoughts. I'm assuming most R users are going to be doing GPU compute in kind of a hybrid way, where most of your R code, or a bunch of it, will happen on CPUs, so think CPU R, and then maybe you have a set of functions or processes that GPU compute might be useful for. So the hybrid approach would be lots of R on CPU, and then you flip over and do some GPU stuff, but you're not using it all the time for everything. So you have to think about what your usage is going to be, what are your
needs, what are your team's needs? Sometimes those can be very hard to predict. Are you leaving it on all the time? Are you turning them on and turning them off? You have to learn these cloud computing skills, which I'm a big proponent of. It wouldn't be an R talk from me if I didn't get up here and evangelize about cloud computing. So you have to get over your fear of giving them your credit card. It's okay. And then you're probably going to waste a lot of time learning about this stuff, and you're probably going to make mistakes, and that's okay too. But learning about the cloud and how to use R in the cloud really changed my perspective on data science and opened up a lot of avenues for me personally, so I'm a big proponent of it.

The last one is kind of an interesting thing, and it's fairly new, but there are now services that you can subscribe to where you get GPU hardware and software as a subscription service. Here's the service offered by MapD, my employer, where you can subscribe to a monthly tier system: you choose how much data you want to ingest, and then they give you a rate based on that.

So I do want to talk now about GPU-powered SQL engines, and this is kind of the core of what I do as a developer advocate for MapD. I work with their MapD Core open source database, and I try to get people excited about using it and trying it out. MapD has created this SQL database that is GPU powered. The cool thing about that is that not only can they do things with big data and process SQL queries very fast, but you also have access to not only the compute pipeline inside the GPU but also the visual pipeline. So you can do a query in SQL and then very quickly, without even having to copy, transition to the graphics pipeline and create a visual rendering. That is really awesome, because the application for it is doing exploratory analysis on big data. So I want to do a demonstration of our open source database
and a visualization platform that is not open source but is a great addition to sit on top of our open source database and provide you with some of the visual power, showcasing the visual power of the MapD open source database.

So this is 11 billion rows of data that I'm visualizing. The reason we can do this is that the MapD database is doing the rendering server-side. It's kind of a hybrid visualization approach, where all of the fancy visualization rendering gets taken care of and compressed down on the server side to about a hundred-kilobyte static PNG, and then it gets sent client-side so that we can interact with it very quickly, and it feels like real-time human interaction with data. So we can do things like filter very fast, take off filters, zoom to New York City. And obviously you're probably thinking, "Kelly, you can't show eleven billion points on a screen, that doesn't work." So yes, these are layered, but you can zoom around and it will show you things. There's even some really fancy stuff they do where you can show some metadata about different points and brush over certain periods of time.

So yeah, I think this is a really powerful platform. What it's lacking is community support from data scientists, from people like the R community, to come in and try things out and give feedback on what these platforms are missing. There is a lot; it is in very early stages, so that kind of feedback would be really useful. And I think it is really important for the R community to become involved in projects like GPU computing, especially open source projects, so that we can have a say in what happens, how data science is affected by GPU computing, and how we can get it very quickly into our own analyses and use it.

So with that, I want to say please reach out if you have any use cases that you think would be cool fits for this kind of solution. We have our open source projects, which are
available on GitHub, and we have a vibrant community over at our forum. We need R folks, so it would be great to hear some of your voices in our community. And with that, thank you very much.

[Applause]
Info
Channel: Lander Analytics
Views: 3,730
Rating: 4.6883116 out of 5
Id: k1M6iPbsOXw
Length: 16min 18sec (978 seconds)
Published: Wed Aug 15 2018