How to Choose an NVIDIA GPU for Deep Learning in 2021: Quadro, Ampere, GeForce Compared

Video Statistics and Information

Captions
What type of GPU should you buy in 2021? That's the topic of this video. To set the scope, I am covering GPUs for either a desktop system or a laptop, whether you build the desktop yourself or go with an OEM, and specifically for deep learning. I will focus on NVIDIA, because NVIDIA is supported out of the box by TensorFlow, PyTorch, and similar frameworks. Also, if you're using a GPU in the cloud, which is often what I'm developing for from a desktop or laptop, NVIDIA is the type of GPU that is available on those cloud systems.

A GPU, or graphics processing unit, was traditionally built for video games, but the math is very similar between video games and neural networks, so GPUs have really found a home in deep learning as well. By the way, the card I'm holding is an older GPU; it's mainly a prop for the YouTube video, but it has all the right connectors for what I want to talk about when you are choosing a GPU for deep learning. We're going to talk specifically about memory, about cooling, about whether more than one GPU will help what you're working on, and about the different NVIDIA product lines, because there are a lot of choices. And 2021 has brought the introduction of the Ampere line, and that changes everything.

One of the first things I found somewhat confusing when I was reading about GPUs is the different ways they cool. You will generally hear the cooling of a GPU described as either blower style or fan style, and this has a lot to do with how densely you can pack GPUs into a system. A blower, which is the type of GPU I'm holding here, takes air in through a fan-like mechanism and sends it out the end of the card, so it pushes the hot air completely outside of the system. A fan GPU instead has two fans that either circulate air through the GPU or simply draw air through and blow it out into the system case. If you're going to run two of these together, that becomes an important consideration: if you pack two fan-style cards literally right next to each other, one is blowing hot air into the other, or drawing the other card's exhaust into itself. So if you're mounting two fan-type GPUs, they're usually spaced apart in the case, possibly with a link connecting them.

And speaking of NVLink: a lot of GPUs have a connector along the top edge where you can attach a bridge between two cards. That doesn't combine them into the same physical device, but it provides a bridge that can transfer data between the memories of the two GPUs very efficiently. This was used a lot in video games, where you would put two GPUs together and try to have the game see them almost as a single processing unit. In deep learning and machine learning configurations you often run without that link; it's not necessarily needed if you're primarily using the cards for their parallel processing capability. I've used systems both with and without NVLink. It's definitely not a requirement, and not all GPUs even have it; we'll get into that in a moment.
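If you already have a machine in front of you, a quick way to see what the driver and the frameworks can actually see is a short check like the one below. This is my own minimal sketch, not something shown in the video; it only assumes the NVIDIA driver (which provides nvidia-smi) and TensorFlow are installed.

```python
# Minimal sketch (not from the video): list the GPUs the framework sees and
# show how they are connected to each other.
import subprocess

# "nvidia-smi topo -m" prints the connection matrix between GPUs:
# NV1/NV2/... entries mean NVLink; PIX/PHB/SYS mean plain PCIe paths.
topo = subprocess.run(["nvidia-smi", "topo", "-m"],
                      capture_output=True, text=True)
print(topo.stdout)

# TensorFlow (like PyTorch) picks up NVIDIA GPUs out of the box.
import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))
```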
The other connector that's very important on a GPU is the power connector, and it just keeps getting wider and wider as GPUs need more and more wattage from the power supply.

Another thing I definitely want to mention: I have not worked with every GPU setup that exists, and this video is as much a discussion as it is me showing you some of the things that have worked well for me. If you have tried GPUs in parallel for tasks in ways different from what I describe in this video, please post in the comments. I would love to hear what you think, and if you disagree with anything here, definitely let me know in the comments.

First, let's talk about memory, because in my opinion this is probably the most important factor in choosing a GPU for deep learning. This is a chart from NVIDIA's StyleGAN2 repository, and I work with StyleGAN quite a bit for image generation, so I'm very interested in having enough memory to train the various GANs I want to deal with. It's also a good representation of the memory requirements for a lot of other applications. You need enough memory on the GPU to hold your neural network plus one batch. You can play around with that a bit: you can use simpler neural networks or decrease your batch size, and that might help. But look at the basic configurations from the paper: for a 1024 by 1024 GAN, the one that produces those nice realistic faces you've seen, training from scratch over the paper's full 25,000 kimg run requires 13.3 gigabytes on the GPU. And if you've got multiple GPUs, it requires that much on each of them; their memory does not combine. If you're willing to step the configuration down, with smaller batch sizes and other changes, you can get that to 8.6 gigabytes, or, if you really drop your resolution, down to 6.4. You could probably go to even smaller batch-size configurations and push that a bit lower, but the point is that with less memory you're going to be compromising and scaling back the neural networks you're trying to create. If you're designing your own networks, choosing how many layers and so on, you'll probably size them to fit your GPU. But once you hit that memory ceiling, you error out and stop; it's not as if training just gets slower. Not having enough GPUs merely makes things slower, and we'll get to that in a moment.

So let's look at some of the common GPUs, remembering that roughly 13 and 6.4 gigabytes are the targets. Here you can see the current 30 series, the Ampere line: 24 gigabytes on the 3090, the big one, 10 gigabytes on the 3080, the next one down, and then you're right on the edge with the 8 gigabyte cards. You could certainly run the smaller GANs, the 256 by 256 configurations, but anything bigger is going to be trouble even with 10 gigabytes. So you can tell the 3090 is really the one NVIDIA created for deep learning. I don't know that it does a whole lot more for a gamer than the 3080; it has more cores, so it would probably be a better gaming card, but if I were going to buy a 30-series card, I would need the 3090 on memory alone to do some of the things I want to do. The 10 gigabytes would generally work for me, but for the kinds of things I do, I would be running into the ceiling a lot on the 3080.
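To make that comparison concrete, here is a short Python sketch of my own (not from the video) that prints how much memory each visible GPU has, so you can line it up against the 13.3, 8.6, and 6.4 gigabyte figures above. It assumes PyTorch with CUDA support is installed.

```python
# My own sketch (not from the video): compare each GPU's memory against the
# StyleGAN2 requirements quoted above.
import torch

STYLEGAN2_CONFIGS_GB = {"1024x1024 full config": 13.3,
                        "reduced batch size": 8.6,
                        "lower resolution": 6.4}

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {total_gb:.1f} GB")
    for name, need in STYLEGAN2_CONFIGS_GB.items():
        verdict = "fits" if total_gb >= need else "too small"
        print(f"  {name} (~{need} GB): {verdict}")

# Remember: with data parallelism every GPU needs the full amount;
# memory across cards does not pool.
```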
Now, if you're just getting into deep learning, you can probably scale your needs back and fit into either of these, and I'm glad to see even the 8 gigabyte options. If you're constructing basic neural networks, like the ones in my deep learning course, you'd probably be fine with 8 gigabytes.

When you're talking Quadro, memory is generally not a problem. If you look at the quick specs on the new Ampere-based A6000, it has 48 gigabytes of RAM. That's insane; it's well beyond most things I would need. Although for something like BlazingSQL or Dask, where you're parallelizing work across your GPU with RAPIDS, the more memory you can throw at it, the better. The quick specs on the Quadro RTX 5000 put it at 16 gigabytes, so the Quadro line jumps up pretty quickly to the class of RAM you need for deep learning problems.

Now let's talk about multiple GPUs together. There are several considerations here. Are you buying very high-end GPUs just to parallelize and do more and more, or do you want an entry-level GPU that you can maybe add another one to later as your needs grow? We'll talk about both of these scenarios. There are a lot of different ways to use multiple GPUs, but the two primary ways, and I'm pulling this from the Keras/TensorFlow documentation, and PyTorch operates very similarly, are data parallelization and model parallelization. Data parallelization is where you copy the same model onto all of your GPUs and split each batch of data across them; you're really just using the extra GPUs to speed up what you would have done with a single GPU. This is what I use when I'm training StyleGAN2-ADA, so that I can get the model done much more quickly than with one GPU. With data parallelization, the memories of the GPUs don't combine, so you can't take on anything bigger than what a single GPU has the memory to handle. (There's a small code sketch of this data-parallel pattern just below.) Model parallelization, and I will say this is a bit more rare since I have worked primarily with data parallelization, is when you want multiple GPUs to be members of the same neural network. This is usually for a crazily big neural network that would not fit on one GPU to begin with. You could, if you have smaller GPUs with less memory, use it to train a larger model than you would normally have had access to, and there might be some advantages to that, but that's not usually the case, and it would require some engineering on your part to set up properly.

Looking at the GPUs, and we'll see more on this in just a moment, only the 3090 of the 30 series supports NVLink. You can still put multiples of any of these cards in a system and do data parallelization; you just can't move data between them particularly fast. Now, I have not tried this, but I do believe it would be theoretically possible to mix different types of GPU in one system. Your workloads are going to take different amounts of time, and again, you're going to need to do some engineering to make that all work out correctly. But if I were looking to buy something now and upgrade later, I might think about a 3080 with its 10 gigabytes of RAM, keeping in mind that if my model doesn't fit in it, buying a second 3080 is not necessarily going to help me.
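As a concrete illustration of the data-parallel approach, here is a minimal sketch following the pattern in the Keras/TensorFlow distribution documentation that the video refers to. The tiny model is just a placeholder; the point is that the model is replicated onto every visible GPU and each batch is split across them.

```python
# Minimal data-parallel sketch in the style of the Keras / TensorFlow
# distribution docs. The model is a placeholder, not one from the video.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()          # uses all visible GPUs
print("Number of replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope are mirrored onto every GPU, so
    # each GPU still has to hold the entire model (memory does not pool).
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy")

# model.fit(train_dataset, epochs=...) then splits every batch across
# the replicas, which is where the speed-up comes from.
```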
Here's an interesting experiment, and if anybody has opinions on whether this would work, let me know: maybe you have more money when you're buying the second GPU. Maybe you bought a 3080 and you add a 3090 to it. You could potentially use the 3090 alone when you need something with the bigger RAM, and use both of them together just to speed up training when 10 gigabytes is enough to hold your model. Like I said, I have not tried that; it would be a bit unorthodox, but I think in theory it might work if you do some work on your engineering and pipeline. With any of these cards, if you want to buy a second one at some point, remember you're only going to get NVLink at the top of the line. A second card won't let you handle bigger models, but it will fairly easily let you train nearly twice as fast as before.

Will it really train twice as fast? Let's go back to the chart. Notice the training times: to train the state-of-the-art StyleGAN faces, eight GPUs take nine days. This is heavy-duty GPU training to truly build the state-of-the-art face GAN. But look, it scales pretty linearly: four GPUs took about twice as long, and the time pretty much doubles again at two GPUs and again at one. That's nice scalability; you can literally just throw more GPUs at it. Keep in mind this is eight GPUs on a single system. Most systems you will see hold only two GPUs; I know the Lenovo P920 line lets you put a third GPU into the system. But if you start to go above that, to four and eight, you're probably dealing with one of the NVIDIA DGX systems, or you've really done some serious engineering to get that many GPUs onto an actual motherboard.

The other thing I'll mention is NVLink. It's not a necessity, and I have worked on more systems without it than with it. When you're working in the cloud it's generally there, because the AWS EC2 instances make it available, at least the ones I've worked with. If you're doing something like RAPIDS or BlazingSQL, things that support it out of the box, NVLink will considerably speed things up. It's supported in TensorFlow, but you have to specifically make TensorFlow take advantage of NVLink, with an actual command and some engineering in your model, so it's not necessarily going to do a great deal for you straight out of the box. Again, if anybody has other experience with NVLink in the situations I've described, I'd love to hear about it in the comments; this is not an area I've worked with a great deal.
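The video doesn't show the specific command, so as a hedged illustration only, here are two small sketches of the kind of thing it is pointing at: choosing an NCCL-based all-reduce for MirroredStrategy (NCCL is the NVIDIA collective library that can use NVLink when a bridge is present), and pinning a large model to a specific card in a mixed setup like the 3080-plus-3090 idea above. The device index for the 3090 is an assumption.

```python
# Hedged sketches (my own, not commands shown in the video).
import tensorflow as tf

# 1) Pick the all-reduce implementation used to sync gradients between GPUs.
#    NCCL can exploit NVLink when the bridge is installed; otherwise it
#    falls back to the PCIe path.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.NcclAllReduce())

# 2) In a mixed system (say a 3080 plus a 3090), pin a memory-hungry model
#    onto the larger card. Assumes the 3090 shows up as /GPU:1.
with tf.device("/GPU:1"):
    big_model = tf.keras.Sequential([tf.keras.layers.Dense(4096)])
```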
What about laptops? If you're on the go, you don't want to haul a monster tower with you all the time; maybe you'll send data up to the cloud for the heavy processing. So what do GPUs look like on laptops? Some laptops support them really well, some not so well. Laptops have a lot of the same types of GPU available. Many of the Ampere parts have not become available in mobile yet, but that is definitely coming. NVIDIA has made this a bit simpler to understand: if you look at the Quadro RTX 5000 and 4000, the desktop models with exactly the same names perform nearly the same as the mobile versions, so the laptop GPUs are named the same as the desktop ones.

Price is a little trickier to compare here, because you're going to be buying the GPU as part of a system. The one I have the most experience with myself is a Quadro RTX 5000 installed in a Lenovo ThinkPad P53, which Lenovo has been kind enough to loan me for a while to try out for the YouTube channel, and I will say it is quite fast. It seems to have the same specs as the higher-end desktop Quadros. What I have found on laptops is that, if you can afford them, the Quadros are great because they get you the memory that you need, and memory, to me, is really the key feature of a deep learning machine. If we look at picking graphics cards for laptops, this is the selection page of one vendor I've looked at laptops from before, and you'll see they are offering basically the same 20 series that you get in the desktop line. However, if you're not going with a Quadro, at this point you're really not getting the memory I would like to have in a laptop. But this gives you an idea of some of the relative costs, and all of these prices are in US dollars.

Okay, there are all kinds of lines of GPU you've probably heard of: names like Tesla, Ampere (the latest one), GeForce, and Quadro. What does all this mean? Let's take a look.

So which GPU should you get? Well, it really depends. As for the GPU that I would purchase: I work with deep learning extensively, machine learning is my career, and I'm fairly advanced in that career, so my compensation and net worth are probably at the point where I will throw money at these kinds of things more readily than some of you might. For me, on a desktop, if it's a system I'm putting together myself or purchasing, I would probably look at dual 3090s, or at starting with one 3090 and adding the second one if I really needed it. The 24 gigabytes is really enough for what I do, and the cores are going to be quite fast. It is Ampere, so not everything necessarily takes advantage of all the new things this GPU can do in early 2021. If I didn't have quite as much money to spend, I would very much be looking at the 3080 or the 3070. I would really try to get the 3080, because that extra two gigabytes of RAM could make all the difference for certain things I work on, and it's a lot less money than the 3090. Other cards you might think about, if you really want power and want to pay less for it, are the 20 series. You can possibly find these on eBay and through other options as well, since they're now one generation behind, on the Turing architecture. I run a TITAN RTX myself as my primary GPU and I absolutely love the card, so you could potentially look at something like the 2080 Ti. It's very similar to the TITAN, it just doesn't have as much memory, and its 11 gigabytes is going to do a lot for you. If I were buying a laptop for myself at this point, I would probably get a Quadro RTX 5000 or 4000; I just really want that RAM. I would probably get the 5000, because with the 4000 you're pretty close to the 20 series. I would also really be keeping an eye on the 30-series Ampere cards as they start to become available in laptops, and on the A6000's mobile equivalent as it becomes available.
Those are the things I would be watching. If you're a student, say in my deep learning course, I would highly recommend the 2080 at around seven hundred dollars; that is going to do pretty much everything in the class, as would probably the 3070. That's how I see it. What do you think? Everybody, I'm sure, has different opinions here and has probably worked in different scenarios than I have, so let me know in the comments. I'm very curious to hear; this is a discussion as much as me standing on high and telling you what you should do. I'm sharing my experience and my opinions, but I'm very curious to hear from you as well. Thank you for watching this video. Please like and subscribe if this was useful to you, and please follow the channel, because it covers all things artificial intelligence, particularly generative networks like GANs and other cool technology.
Info
Channel: Jeff Heaton
Views: 41,881
Rating: 4.891892 out of 5
Keywords: gpu, nvidia, deep learning, jeff heaton, nvidia 3080, nvidia 3090, quadro
Id: pWzlL51oqRo
Length: 21min 4sec (1264 seconds)
Published: Wed Jan 06 2021