What type of GPU should you buy in 2021? That's the topic of this video. Just to set the scope, I am covering GPUs for either a desktop system or a laptop, whether that's a desktop you build yourself or one you get from an OEM, and specifically for deep learning. I will focus on NVIDIA, because NVIDIA is supported out of the box by TensorFlow, PyTorch, and similar frameworks. Also, if you're using a GPU in the cloud, which is often what I'm developing for from a desktop or laptop, NVIDIA is the type of GPU that is available on cloud systems. A GPU, a graphics processing unit, was traditionally for video games, but the math is very similar between video games and neural networks, so they've really found a home in deep learning as well. By the way, this is an older GPU; it's mainly a prop for the YouTube video, but it has all the right connectors that I want to talk about in terms of what you should be looking for when you are choosing a GPU for deep learning. We're going to talk specifically about memory, about cooling, whether more than one GPU will help what you're working on or not, and the different lines of NVIDIA GPUs, because there are a lot of choices. And 2021 has brought the introduction of the Ampere line, and that changes everything. So one of the first
things that I found somewhat confusing when I was reading about GPUs is just the different ways
that they cool. You will generally hear the cooling of a GPU referred to as either blower
or fan, and this has a lot to do with how densely you can pack these GPUs into a system. A blower, which is the type of GPU I'm holding here, takes air in through a fan-like intake and sends it out the end of the card, so it's blowing the air completely outside of the system. A fan GPU, on the other hand, is going to have two fans here that either circulate air across the card or simply draw air through and blow it out into the system case. If you're going to have two of these together, that becomes an important consideration, because if you pack them literally right next to each other in a fan configuration, one card is going to be blowing hot air into the other, or drawing the other card's hot exhaust into itself. So that becomes a bit of an issue if you're mounting two fan-type GPUs; usually they're mounted more spread out in a case, and then possibly with a link connecting them. And speaking of NVLink, a
lot of GPUs will have a connector up here where you can put a bridge, so to speak, between the two cards. That doesn't combine them into the same physical device, but it provides a link that can transfer data between the memories of the two GPUs very efficiently. This was used a lot in video games, where you would put two GPUs together and try to have the game see them almost as one processing unit. In deep learning and machine learning configurations you might often run without that link; it's not necessarily needed if you're primarily using the cards for their parallel processing capabilities. I've used systems both with and without NVLink. It's definitely not a requirement, and not all GPUs even have it; we'll get into that in a moment.
The other connector that's very important on a GPU is the power. This just keeps getting wider and
wider on GPUs as they need more and more wattage coming from the power supply to power them.
Another thing I definitely want to mention: I have not worked with every GPU setup out there, and this video is as much a discussion as it is me showing you some of the things that have worked well for me. So if you've used GPUs in parallel in ways different from what I describe in this video, please post in the comments. I would love to hear what you think, and if you disagree with anything I have in here, definitely let me know in the comments.
First let's talk about memory, because in my opinion this is probably the most important factor
in choosing a GPU for deep learning. This is a chart from NVIDIA's StyleGAN2 documentation, and I work with StyleGAN quite a bit for image generation, so I'm very interested in having enough memory to train the various GANs that I want to deal with. It's also a good representation of memory needs for a lot of other applications: you need enough memory on the GPU to hold your neural network and one batch. Now, you can play around with that a bit; you can use simpler neural networks or decrease your batch size, and that might help. But just using the basic configurations from the paper, you'll see that the 1024 by 1024 GAN, the one that produces those nice realistic faces you've seen, trained from scratch for the full 25,000 kimg schedule, is going to require 13.3 gigabytes on the GPU. Now, if you've got multiple GPUs, it requires that
much on each of them; it's not like the memory combines. Now, if you're willing to scale the configuration down, with smaller batch sizes and other changes, you can get that down to 8.6 gigabytes, or if you really drop your resolution, down to 6.4. You could probably go to even lower batch size configurations and get that down a bit more, but the point is that with less memory you're going to be compromising and scaling back the neural networks that you're
trying to create. If you're building your neural networks from scratch and defining how many layers and so on, then you're setting the size yourself, and you'll probably fit it to the GPU. But once you hit that memory ceiling, you error out and stop; it's not like it just runs slower. Not having enough GPU compute makes things slower, and we'll get into that in a moment, but not having enough memory stops you outright.
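By the way, if you're not sure how much memory your current card has, or how much of it a training run is actually using, you can query it directly. Here's a minimal sketch using PyTorch (assuming a CUDA-capable NVIDIA card; the same information is also available from the nvidia-smi command line tool):

```python
import torch

# Report total and currently allocated memory for each visible NVIDIA GPU.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    allocated_gb = torch.cuda.memory_allocated(i) / 1024**3
    print(f"GPU {i}: {props.name} | {total_gb:.1f} GB total | "
          f"{allocated_gb:.1f} GB allocated")
```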
So let's look at some of the common GPUs, remembering that roughly 13 and 6.4 gigabytes are the thresholds you need to hit. Here you can see the current 30 series, the Ampere line: 24 gigabytes on the 3090, the big one; 10 gigabytes on the 3080, the next one down; and then you're right on the edge with the 8 gigabyte cards. You could certainly run the smaller GANs, the 256 by 256 ones, but anything bigger and you're going to have trouble even with the 10 gigabytes. So you can tell the 3090 is really the one that NVIDIA created for deep learning. I don't know that it's going to do a whole lot more for a gamer versus the 3080; it's got more cores, so it probably would be a better gaming card, but if I were going to buy a 30 series I would need the 3090 just on memory to do some of the things that I want to do. The 10 gigabytes would generally work, but for the kinds of things I do I would be running into the ceiling a lot on the 3080. Now, if you're just getting into deep learning, you can probably scale your needs back and fit into either of these; I'm glad to see even the 8 gigabytes on those lower cards. If you're constructing basic neural networks, like the ones in my deep learning course, you'd probably be fine with 8 gigabytes.
Now, when you're talking Quadro, memory is not generally a problem. If you look at the quick specs on the new Ampere-based A6000, it's got 48 gigabytes of RAM. I mean, that's insane; that is well beyond most things that I would need. For something like BlazingSQL or Dask, where you're parallelizing work across your GPUs with RAPIDS, the more memory you can throw at it, absolutely the better. Quick specs on the Quadro RTX 5000 put it at 16 gigabytes, so the Quadro line jumps up pretty quickly to the class of RAM that you need for deep learning problems.
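To give a flavor of that kind of workload, here's a minimal sketch of spinning up one Dask worker per GPU with RAPIDS's dask-cuda package and cuDF. This assumes RAPIDS is installed, and the file pattern and column names are just placeholders:

```python
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf

# Start one worker per visible GPU; each worker keeps its share of the
# data in that GPU's memory, so more memory per card means bigger shares.
cluster = LocalCUDACluster()
client = Client(cluster)

# Hypothetical dataset and columns -- replace with your own data.
df = dask_cudf.read_csv("transactions-*.csv")
print(df.groupby("customer_id")["amount"].sum().compute())
```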
Now, let's talk about using multiple GPUs together. There are several considerations here: are you buying very high-end GPUs just to parallelize and do more and more, or do you want an entry-level GPU that you can maybe add a second one to later as your needs grow? We'll talk about both of these scenarios. There are a lot of different ways you can use multiple GPUs, but the two primary ways, and I'm pulling this from the Keras/TensorFlow documentation, and PyTorch operates pretty similarly, are data parallelization and model parallelization. Data parallelization is where you copy the same model onto every GPU and split each batch of data across them, and you're really just using the extra GPUs to speed up what you would have done with just one GPU. This is what I use when I'm running StyleGAN2-ADA
to train faster, so that I can get that model done much quicker than I could with a single GPU. When you're doing data parallelization across two GPUs, their memories don't really combine, so you can't take on anything bigger than what a single GPU has the memory to handle. Model parallelization, and I will say this is a bit more rare, I have worked primarily with data parallelization, is when you want multiple GPUs to act as members of the same neural network. This is usually for a crazy big neural network that just would not fit on one GPU to begin with. Now, if you have smaller GPUs with less memory, you could use this to train a larger model than you would normally have had the memory for, and there might be some advantages to that. However, that's not the usual case, and it would require some engineering on your part to get it really set up properly.
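To make the data-parallel case concrete, here's a minimal sketch of what it looks like in TensorFlow with Keras, using MirroredStrategy. The model and layer sizes are just placeholders; the point is that the model is built inside the strategy scope and every GPU gets a full copy of it:

```python
import tensorflow as tf

# Data parallelization: replicate the model onto every visible GPU and
# split each batch across them. Training speeds up, but memory does not
# pool -- the whole model still has to fit on each individual card.
strategy = tf.distribute.MirroredStrategy()
print("Number of GPUs in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# model.fit(x_train, y_train, batch_size=256, epochs=5)  # then train as usual
```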
So looking at the GPUs, and we'll see more on this in just a moment, the 3090 is actually the only one of the 30 series that supports NVLink. Now, you can put multiples of any of these cards in a system and do data parallelization; you just can't move data between them particularly fast. Now, I have not tried this, but I believe it would be theoretically possible to mix different types of GPU in one system. Your workloads are going to take different amounts of time on each card, and again you're going to need to do some engineering to make it all work out correctly. But if I were looking at this and wanted to buy something now and upgrade later, I might think about a 3080 with its 10 GB of RAM. If my model doesn't fit in that, buying a
second 3080 is not necessarily going to help me. An interesting experiment, and if anybody has any opinions on whether this would work, please share: maybe you have more money when you're buying the second GPU, so you bought a 3080 and you add a 3090 to it. You could potentially use that 3090 on its own when you need the bigger RAM, and use both of them together just to speed up training when the 10 GB is enough to fit your model. Like I said, I have not tried that; it would be a bit unorthodox, but I think in theory it might work if you do some work on your engineering and pipeline.
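If you did go down that road, one simple way to make use of the larger card on its own is to pin the big model to it explicitly. This is a hypothetical sketch in TensorFlow; the device index and the model are placeholders, and the ordering of the GPUs depends on your machine:

```python
import tensorflow as tf

# See which GPUs TensorFlow can see; on a mixed system the 3080 might
# show up as GPU:0 and the 3090 as GPU:1 (ordering varies by machine).
print(tf.config.list_physical_devices("GPU"))

# Pin a memory-hungry model to the larger card only.
with tf.device("/GPU:1"):
    big_model = tf.keras.Sequential([
        tf.keras.layers.Dense(4096, activation="relu", input_shape=(2048,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
```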
But with any of these, if you want to buy a second one at some point, remember that you're only going to get NVLink on the top-end cards. Adding a second card isn't going to let you handle bigger models, but it will pretty easily let you train nearly twice as fast as before. And will it really train twice as fast? Let's go back to this chart. Notice the training times here: to do the state-of-the-art StyleGAN faces, eight GPUs takes nine days. I mean, this is heavy-duty GPU training to truly build the state-of-the-art face GAN. But look, it scales pretty linearly: four GPUs took about twice as long, and it pretty much doubles again with two, and again with one. So this is nice scalability; you can literally just throw more GPUs at it. And that is eight GPUs on a single system; most systems that you will see will have only two GPUs in them. I know on the Lenovo
P920 line you can put a third GPU into the system. But if you start to go up to four or eight, you're probably dealing with one of the NVIDIA DGX systems, or you've really done some serious engineering to get that many GPUs onto an actual motherboard. The other thing I'm going to mention is NVLink. It's not a necessity, and I have worked on systems more without it than with it. When you're working in the cloud it's generally there, because the AWS EC2 instances I've worked with make it available. If you're doing something like RAPIDS or BlazingSQL, things that actually support it out of the box, NVLink will considerably speed things up. It's supported in TensorFlow, but you have to specifically configure it to take advantage of NVLink, with a setting like the one I'll show in a moment, and do some engineering in your model, so it's not necessarily going to do a great deal for you right out of the box. Again, if anybody has other experiences with NVLink in the situations I've described, I'd love to hear about them in the comments; this is not an area I've worked with a great deal.
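As one example of that configuration step, TensorFlow's MirroredStrategy lets you choose the cross-device communication implementation, and NCCL is the backend that can route GPU-to-GPU traffic over NVLink when the bridge is installed. A minimal sketch; whether it actually helps depends on your hardware and model:

```python
import tensorflow as tf

# Ask MirroredStrategy to use NCCL for the gradient all-reduce between
# GPUs. NCCL uses NVLink when the bridge is present; otherwise the
# transfers fall back to going over PCIe.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.NcclAllReduce())

with strategy.scope():
    pass  # build and compile your model here, as usual
```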
What about laptops? If you're on the move and don't want to have a monster tower with you all the time, maybe you're going to send data up to the cloud for processing. What do GPUs look like on laptops? Some laptops support them really well, some not so well. Now, laptops have a lot of the same types of GPUs available. The Ampere cards have mostly not become available yet in mobile, but that is definitely coming. NVIDIA has made this a bit simpler to
understand: if you look at the Quadro RTX 5000 and 4000, the performance of the mobile version is nearly the same as the desktop model with the exact same name. So this makes it pretty simple; the laptop GPUs are named the same as the desktop ones. It's a little trickier to compare price here, because you're going to be buying the GPU as part of a system. The one that I have the most experience with myself is a Quadro RTX 5000 installed in a Lenovo ThinkPad P53 that Lenovo has been kind enough to loan me for a bit to try out with the YouTube channel, and I will say it is quite fast. It seems to have much the same specs as the higher-end desktop Quadros. What I have found on laptops is that, if you can afford them, the Quadros are great because they get you the memory that you need. Memory, to me, is really the key feature
for a deep learning machine. Now, if we look at picking graphics cards for laptops, this is the selection page of one vendor I've looked at laptops from before, and you'll see that they are basically offering the same 20 series that you get in the desktop line. However, if you're not going with a Quadro, at this point you're really not getting the memory that I would like to have in a laptop. But this gives you an idea of some of the relative costs, and all of these prices are in US dollars. Okay, there are all kinds of lines of GPU you've
probably heard of: names like Tesla, Ampere (the latest one), GeForce, and Quadro. What does all this mean? Let's take a look. Great, so which GPU should you get? Well, it really depends. As for the GPU that I would purchase: I work with deep learning and machine learning extensively, this is my career, and I'm fairly advanced in it, so my compensation and net worth are probably at the point where I'll throw money at these kinds of things more readily than some of you might. For me, on a desktop, whether it's a system I'm putting together myself or purchasing, I would probably look at dual 3090s, or starting with one 3090 and adding the second one if I really needed it. The 24 GB is really enough for what I do, and the cores are going to be quite fast. It's Ampere, so not everything is necessarily taking advantage yet of all the cool things this GPU can do in early 2021. If I didn't have quite as much
money to spend, I would very much be looking at the 3080 or the 3070. I would really try to get the 3080, because that extra two gigabytes of RAM could make all the difference for certain things that I work on, and it's a lot less money than that 3090. Other cards you might think about, if you really want power and want to pay less for it, are the 20 series. You can possibly find these on eBay and through other channels as well, since the Turing architecture is now one generation behind. I run a TITAN RTX myself; that's my primary GPU and I absolutely love the card. So you could potentially look at a 2080 Ti or something like that. It's very similar to the TITAN, it just does not have as much memory, and that 11 GB is going to do a lot for you. As for my own personal buying: if I were getting a laptop at this point, I would probably get a Quadro 5000 or a 4000; I just really want that RAM. I would probably get the 5000, because with the 4000 you're pretty close to the 20 series cards. I would also really be keeping an eye on the 30 series Ampere cards as they start to become available in laptops, and on the mobile equivalent of the Ampere A6000 as it becomes available. These are all the things that I would be watching. If you're a student, say in my deep learning course, I would highly recommend the 2080 at around seven hundred dollars; that is going to do pretty much everything in the class, as would probably the 3070. That's kind of how I see this. What do
you think, everybody? I'm sure everyone has different opinions here and has probably worked in different scenarios than I have, so let me know in the comments. I'm very curious to see; this is a discussion as much as me standing on high and telling you what you should do. I'm sharing my experience and my opinions with you, but they are opinions, and I'm very curious to hear from you as well. Thank you for watching this video; please like and subscribe if it was useful to you. Please follow the channel, because it covers all things artificial intelligence, and particularly generative networks like GANs and other cool technology.