GPU Passthrough on Linux and Docker for AI, ML, and Plex

Video Statistics and Information

Reddit Comments

These past few weeks I figured out how to pass through a GPU to a Linux server, so I decided to document the process.

In this tutorial we:

  • Pass through a GPU to an Ubuntu server
  • Install and configure the headless drivers
  • Configure it for Docker
  • Configure it for Kubernetes
  • Install and run nvtop (a nice terminal utility for monitoring GPUs)
  • Run a deep learning workload with TensorFlow
  • Run a Plex transcoding test

Thank you.

👍 67 👀 u/Techno-Tim 📅 Oct 10 2020 🗫 replies

Can multiple containers share a single GPU?

👍 14 👀 u/urw7rs 📅 Oct 10 2020 🗫 replies

docker run --gpus all

👍 6 👀 u/Boozybrain 📅 Oct 10 2020 🗫 replies

Subbed, nice video.

👍 3 👀 u/N1ghtS7alker 📅 Oct 10 2020 🗫 replies

I see Techno Tim, I upvote.

👍 16 👀 u/lmm7425 📅 Oct 10 2020 🗫 replies

Is this possible using LXC containers instead of docker?

👍 3 👀 u/Brru 📅 Oct 11 2020 🗫 replies

Nicely done, sir. Was just listening and realized that to the average person it sounds like you are speaking a made-up language.

👍 6 👀 u/ThePerfectLine 📅 Oct 10 2020 🗫 replies

I just started watching this video and was scrolling through reddit and found your post. Love your content.

Will this work on a system installed on bare metal? And would the process be similar for AMD GPUs?

👍 2 👀 u/[deleted] 📅 Oct 10 2020 🗫 replies

Subbed

Dope video!

👍 2 👀 u/[deleted] 📅 Oct 10 2020 🗫 replies
Captions
So you've got an extra GPU laying around and you want to use it on your Linux server, but your Linux server is virtualized and headless, and you'd like to take advantage of that GPU in Docker or in some of your Kubernetes workloads, for things like TensorFlow, machine learning, deep learning, or maybe even Plex. Well, get out your buzzword bingo card, because I think we just got a bingo.

Hey, welcome back. I'm Techno Tim, and today we're going to talk about using your NVIDIA GPU for Docker and Kubernetes workloads. As a quick reminder, I stream every Tuesday, Thursday, and Saturday, so if you want to continue the conversation about passing through an NVIDIA GPU to an Ubuntu server (that was a mouthful), we can.

So let's talk about using your NVIDIA video card for a Kubernetes workload. Why would you want to use your video card in Kubernetes or Docker? Well, you might have a Docker container that can take advantage of it. There are plenty of containers out there that can use your GPU for things like deep learning, AI, or something as simple as Plex. These containers can offload that processing to the GPU, which frees up your CPU for other tasks. And not only that: some workloads, like AI, deep learning, and data science, are optimized for GPU-accelerated computing, while others just want to use the encoder, things like Plex or Jellyfin. That's where the challenge comes in. Getting Docker containers to play well with NVIDIA's container runtime can be tricky, and if you're virtualizing that Linux machine, it can also be a challenge to pass the GPU through from the host to the guest. I've already done this a few times, so I think I have the process down, and that's what we're going to talk about today.

In this video we're going to use an NVIDIA GPU within a Linux virtual machine. We'll then use that GPU within Docker and Kubernetes. We'll install a few other tools to monitor and measure our GPU, then run a deep learning container to check our work. After that's working, we'll apply this to something a little more practical for me, which is Plex transcoding in Docker and Kubernetes. We'll use the same tools to monitor and measure that transcoding, and then permanently offload all of our transcoding to the GPU. Did you get a buzzword bingo yet? With that out of the way, let's get started.

The first thing you're going to need is a GPU. I've had success with some consumer-grade cards like an NVIDIA 1050 or 1650, but Quadros will also work. Once you have that video card, install it in your server, and make sure your server and processor support IOMMU; that's what lets us pass the video card through to our virtual machine. If you need help with that on Proxmox, I've got a guide on how to pass through your GPU to a virtual machine; follow that same process, but create a Linux virtual machine at the end. Be sure to enable IOMMU, then create your Linux virtual machine and shut it down. (If you're doing this on a bare-metal Linux server, you can skip this part.) If you have virtualized it, we'll need to change a couple of things. In the Hardware tab we'll add our PCI device; in the device dropdown you should see your NVIDIA GPU, so add it. Then SSH into your Proxmox server, where we'll edit one thing in our virtual machine's config; be sure to pick the right virtual machine, which is indicated by its ID. We'll add a couple of flags to the CPU line: I've set the CPU type to host with hidden=1 and flags=+pcid.
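As a concrete sketch of that config edit: the VM's config file lives under /etc/pve/qemu-server/ on the Proxmox host. The VM ID of 100 and the PCI address below are placeholders for your own values:

    # On the Proxmox host -- 100 is a placeholder for your VM's ID
    nano /etc/pve/qemu-server/100.conf

    # The cpu line should end up looking like this:
    # hidden=1 hides the hypervisor from the guest, +pcid exposes the PCID CPU flag
    cpu: host,hidden=1,flags=+pcid

    # The GPU added in the Hardware tab shows up as a hostpci entry,
    # roughly like this (01:00 is a placeholder for your card's PCI address)
    hostpci0: 01:00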
Once you have that set, you can start your virtual machine back up. Then SSH into that virtual machine and run a hardware-listing command such as lspci or lshw; in its output we should see our NVIDIA video card, which is a good sign.

Now we need to install the NVIDIA drivers. This part is kind of tricky because we're running a headless server: we don't want to install any of the X11 stuff, because there's no point in a windowing system on a headless machine. First update apt, then install the latest NVIDIA drivers, which at this point is the 450 series, making sure not to pull in any recommended packages. We'll need the CUDA toolkit (nvidia-cuda-toolkit), the headless driver (nvidia-headless-450), the NVIDIA utilities (nvidia-utils-450), and libnvidia-encode-450. That last one, the encode piece, took a while for me to figure out, but it's really important if you're going to do Plex transcoding. So let's install these, say yes, and then reboot your server.

Now that the driver is installed, we need some additional configuration for Docker. First we'll set our distribution, then add a GPG key, then add an apt source, then install the NVIDIA container toolkit and the NVIDIA container runtime. I've also needed to install one more package, nvidia-docker2; when prompted I typically say no here, because that prompt wants to update our daemon.json file and we're going to create a new one so that our Docker and Kubernetes can take advantage of the NVIDIA runtime. So let's modify that file and replace it with this JSON; don't worry, all of these commands, as well as the JSON, are in the documentation linked in the description. Save it, then restart the Docker process.

After all that, we can test our driver. First, query the video card from the guest machine, which is as simple as running nvidia-smi. It should respond with your video card; this is a good sign, and it means the card is exposed to our virtual machine. Now we want to check that it's also exposed to Docker, so we'll run the same query from inside a container: docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi. This pulls down the image, spins up a container, and runs nvidia-smi inside it, and as we can see, the command succeeds: the container can talk to the virtual machine, which exposes the driver. This means Docker can also communicate with our video card.
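Pulling the steps from this section together, here's a sketch of the whole sequence on the Ubuntu guest. The repository URLs are the standard ones from NVIDIA's container toolkit documentation of this era, the 450 driver series matches the video, and the daemon.json shown is the common "default-runtime": "nvidia" configuration that lets Kubernetes pods reach the GPU; check the documentation linked in the description for the exact versions used:

    # Confirm the passed-through card is visible to the guest
    lspci | grep -i nvidia

    # Install the headless driver stack (no X11), including the encode
    # library that Plex transcoding needs
    sudo apt update
    sudo apt install --no-install-recommends \
        nvidia-cuda-toolkit nvidia-headless-450 nvidia-utils-450 libnvidia-encode-450
    sudo reboot

    # Add NVIDIA's container toolkit repository and install the runtime packages
    distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
        sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt update
    sudo apt install nvidia-container-toolkit nvidia-container-runtime nvidia-docker2
    # (if apt offers to replace an existing daemon.json, the video answers no;
    # we write our own next)

    # Replace /etc/docker/daemon.json with the following so the nvidia
    # runtime is Docker's default, then restart Docker
    sudo nano /etc/docker/daemon.json

    {
      "default-runtime": "nvidia",
      "runtimes": {
        "nvidia": {
          "path": "/usr/bin/nvidia-container-runtime",
          "runtimeArgs": []
        }
      }
    }

    sudo systemctl restart docker

    # Smoke tests: first on the guest itself, then from inside a container
    nvidia-smi
    docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi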
Next we'll install a tool called nvtop. This is a nice tool that helps you visualize your NVIDIA GPU, and we'll use it to make sure our processes are actually taking advantage of the video card. Let's install nvtop and run it; we can see it queries our video card, and as you can see, nothing much is going on yet. On the right side I'm going to open another session to this server so I can launch a deep learning workload, a TensorFlow workload. The nice part is that you don't have to install all the dependencies to run TensorFlow; you can just spin up a Docker container. So let's do that. You can see on the right that it's pulling down the TensorFlow image, a workload that should take advantage of our GPU. Now let's run this AI training script, and once it's running, nvtop on the left reports that we have a process, and that process comes from Docker. You can see right away that the memory usage shoots up, which is a really good sign.

So far we've passed the GPU through to our Linux guest, installed the NVIDIA drivers, exposed the GPU to Docker, and spun up TensorFlow in a Docker container that takes advantage of the video card. If we wait long enough for this training, we should see the GPU usage start to spike. (A sketch of these commands follows.)
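The video doesn't linger on the exact commands here, so treat this as a minimal sketch: nvtop installs from apt on recent Ubuntu releases, and in place of the video's training script, the one-liner below is the standard GPU smoke test from TensorFlow's own Docker documentation:

    # Install and launch the GPU monitor
    sudo apt install nvtop
    nvtop

    # In a second session: run a small TensorFlow computation on the GPU
    docker run --gpus all --rm tensorflow/tensorflow:latest-gpu \
        python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

If the GPU is wired up correctly, nvtop should show the container's process and its memory usage climbing while the computation runs.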
But we can also exercise that GPU using Plex. In a previous video I helped you fully set up Plex using Kubernetes, Rancher, and Docker. If you set it up using my guide, we'll need to tweak a few things before we can take advantage of this video card. The first thing is to install the NVIDIA device plugin, which is what lets us use our NVIDIA device in Kubernetes, and it's really easy to do. In Rancher, go to your cluster and launch kubectl, and apply the NVIDIA device plugin there. You can verify it afterwards by going to the cluster's System project, where the NVIDIA device plugin should be listed. This means our Kubernetes workloads can now take advantage of the video card.

Now let's get back to our Plex installation; there's one thing we need to change. Edit the Plex workload and add one environment variable, NVIDIA_VISIBLE_DEVICES, with the value all. This allows the Plex container to query our NVIDIA devices and use them. If you have problems with all, you can query the device for its UUID and plug that UUID in as the value instead; that's what I would use if I were having problems, but if all works, just use all. Save the workload, and we should now be able to use the NVIDIA video card within the Plex Docker container.

Once you're in the Plex settings, you want to make a few changes. In the Transcoder section, make sure "Use hardware acceleration when available" is turned on, then show the advanced settings and also turn on "Use hardware-accelerated video encoding"; this makes Plex use the GPU when transcoding. It's worth mentioning that this is a Plex Pass feature. Save that, and now let's transcode some video.

On the left I have nvtop open, running inside my Linux server, and on the right I have a 4K video. Let's click play. You can see right away that we're able to use the video card: the GPU is at around six or seven percent while transcoding a 4K file, it's using about half a gig of memory, which is to be expected, and my CPU usage is pretty low, only about 35%. Typically when I transcode, CPU usage goes anywhere from 100% to 800% depending on how many cores you have. So let's really turn it up: you can see I now have quite a few streams running, and the GPU is keeping up. The other cool thing is that if we go into our Plex dashboard and look at these streams, we can see they're being hardware transcoded, which is a good sign; it means Plex is using our GPU.

This is just one more example of what you can do with a GPU you have laying around. The cool part is that because it's running in Docker, we can have multiple Docker processes taking advantage of the same GPU. So if you have Plex, Jellyfin, Emby, or any other Docker container that can use your GPU, you can now give them all access to it. I hope this shows you how awesome it is to take advantage of your GPU not only in the virtual machine but within a Docker container. It opens up a ton of possibilities: if you have multiple Docker containers or Kubernetes workloads that can use your GPU, you can do it all within one virtual machine instead of needing multiple virtual machines with multiple video cards. If you plan on running more than just a couple of workloads, though, I recommend investing in some Quadro cards; those will perform a little better under these types of loads. For me and the workloads I run, I'm going to stick with a consumer card.

So what do you think about running workloads that can take advantage of an NVIDIA GPU? What do you think about applying this to AI, machine learning, or data science? Did you get a bingo in our buzzword bingo game? If so, let me know in the comments section below, and while you're there, don't forget to give this video a thumbs up and consider subscribing if you haven't already. If you have more questions, you can always join my live stream; I stream every Tuesday, Thursday, and Saturday, so if you have a question about this video or any of my other videos, hop into my stream and I'd love to have you. Thanks so much for watching, and until next time: stream on, my friends.

[Live stream tail] ...version controlled, and so the cool thing about that is... oh, "kube cuddle," "kube control," sorry, I had to say kube cuddle. Hey kubectl, should we call you kube-c-t-l, kube cuddle, kube control, cube control, or kube cutoff? Hey, thanks for the follow, appreciate it, welcome. Yeah, so anyways...
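For reference, here are the Kubernetes pieces of the Plex section above as one sketch. The device plugin manifest URL and version are assumptions to verify against NVIDIA's k8s-device-plugin repository (v0.7.0 was current around this video's release), and nvidia-smi -L is the query that returns the GPU UUID mentioned as the fallback value for NVIDIA_VISIBLE_DEVICES:

    # From Rancher's "Launch kubectl" shell (or any kubectl with cluster access)
    kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.7.0/nvidia-device-plugin.yml

    # On the VM: list GPUs with their UUIDs in case NVIDIA_VISIBLE_DEVICES=all
    # doesn't work and you need to pin a specific card
    nvidia-smi -L
    # example output shape:
    # GPU 0: GeForce GTX 1650 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)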
Info
Channel: Techno Tim
Views: 38,492
Keywords: nvidia, docker, rancher, kubernetes, tensorflow, machine learning, ai, gpu passthrough, ubuntu, headless server, nvidia kubernetes, nvidia docker, plex transcoding docker, nvtop, nvenc, plex, jellyfin, nvidia-docker, proxmox, deep learning, nvdec, iommu, gpu passthrough linux, lshw, cuda, container, toolkit, pytorch, nvidia-smi, kubeflow, ml framework, compute, vm, virtual machine, gpu pass through, hardware acceleration, emby, containerization, virtualization, data science, nvidia_visible_devices, k8s
Id: 9OfoFAljPn4
Length: 11min 59sec (719 seconds)
Published: Sat Oct 10 2020