From Beginner to Pro: Deploying Custom YOLOv5 Models for Object Detection with OpenCV

Video Statistics and Information

Captions
It took maybe 10-15 minutes to actually generate the dataset, label it, and then train our neural network, and these are the results that we get. Even if we're moving around here in the image frame, it is still detecting it pretty well, drawing a really nice bounding box around it.

Hey guys, welcome to a new video in this neural networks and deep learning tutorial. In this video I'm going to show you how we can deploy a custom YOLOv5 model with OpenCV. In one of the previous videos we actually built our own dataset for a mug detector: we took a couple of images of a mug, we did data annotation, so we labeled our images, and we also did some data augmentation in Roboflow. Make sure to check that video out first, because in this video we're just going to deploy the model that we created there. I'll link to the video up here so you can check it out: we use YOLOv5 from scratch, build our own dataset, train the YOLO model on that dataset, and also show some of the results and predictions that we get. In this video we're actually going to deploy it with OpenCV, as we've done in a previous video.

But first of all, join the Discord server; I'll link to it in the description. This channel is about computer vision, deep learning, AI and so on. You can also become a member of the channel if you want to support it with a small monthly fee; everything goes into creating more and better quality content here on the channel. Also, if you're a member of the channel, I can help you out with your projects and give you some guidance if you have problems in your own projects and applications. So thank you guys.

Let's just jump straight into the code. I'm going to go over the code quickly, because in the previous video, which I'll link to up here and which you can check out on my channel, we're doing object detection as well. Basically, we're implementing a class where we can get video frames from a webcam or some other camera that you connect to your computer. In the last video we did it with a URL from YouTube: we passed that YouTube video through our model and deployed a pre-trained YOLOv5 model in that way. In this video I'm just going to convert it, so instead of taking the images from a YouTube video, we're opening a video capture with OpenCV, reading in the images from our video capture or webcam, using them as normal NumPy arrays, passing those NumPy arrays through our YOLOv5 model, and then getting the results out and post-processing them as we did in the last video.

So here we have our class again, and we're going to go over it briefly; if you want to know more details about every line of code, make sure to check out that other video on how to deploy a YOLOv5 model in Python with OpenCV. First of all, we're importing torch. In the other video I also show how to install torch with GPU support, because when you're deploying neural networks you often want to run them on GPUs so they're not taking up all the processing power on your CPU. Here we're just creating our mug detector class.
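Before walking through the class method by method, here is a minimal sketch of the core idea described above: open a video capture with OpenCV, read a webcam frame as a plain NumPy array, and pass it straight through a YOLOv5 model loaded via torch.hub. The pretrained yolov5s weights are used here purely for illustration; the custom weights come in below.

# Minimal sketch: webcam frame (NumPy array) -> YOLOv5 forward pass.
# Uses the pretrained yolov5s weights just to illustrate the pipeline.
import cv2
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # pretrained small model

cap = cv2.VideoCapture(0)      # capture index 0: the default webcam
ret, frame = cap.read()        # frame is an ordinary NumPy array (BGR)
cap.release()

if ret:
    results = model(frame)              # forward pass on the NumPy array
    print(results.pandas().xyxy[0])     # detections as a pandas DataFrame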
First we have our initialization function. When we're initializing an object of our mug detector class, we specify the capture index, so which webcam index we actually want to get the images from. Then we also need to specify the model name: this will be the weights file downloaded from, for example, Google Colab after we've trained our model, so it's just the path to the model weights that we load in and then use to process the images we get from our webcam. So here we set these different things up: the capture index, the model, and the different classes that we have, so all the classes that we actually want to do object detection on. It could be that you've trained your model to do cat and dog predictions; then you also need to specify those classes here, so the classes just hold the different labels that you want. Then we also set the device: we use CUDA if it's available, or else we just use the CPU. Again, I refer to the other video if you want to know how to use the GPU for doing neural network inference with OpenCV and YOLOv5.

Then we get the video capture and load the model. In this video we're going to use the custom model; in the last video we just used the pre-trained YOLOv5 small model. If you specify a model name here, we will load in a custom YOLOv5 model that we've trained on our own, and then we need to specify the path to the weights file that I just talked about. We can also set force_reload to true if you get some warnings in the output or terminal when you're running your program. So this will basically just be our custom model that we trained in the other video to do mug detection, and I'm going to show you the results at the end of this video.

Then we set up the score frame function: we basically just pass the frame through our model, and when we pass the frame through the model we get out the results. We then take the labels from our results and also the coordinates for the bounding boxes of the objects that we're detecting in the image frame, and we return those so we can use them later on. We have a class-to-label function, where we're basically just converting from our class indices to labels. Then we have a function to plot boxes, so the bounding boxes around the detections: here we get the results and the frame that we want to draw the bounding boxes on, we unpack our labels and the coordinates for the bounding boxes, and then we basically just have a for loop running through them. We have a threshold value here: if a detection is not more certain than that threshold, then we will not say that this is actually a mug that we're detecting in the image. So if you get too many false positives, or you don't get any detections at all in your image, make sure to tune this parameter; it could be a 50% confidence score level that you want detections to be above, or something like that. It depends on how good your model is, how many objects you want to detect in the image frame, and the certainty of those detections. Then we basically just draw our rectangles around the objects that we're detecting.
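To make the walkthrough above concrete, here is a hedged sketch of those helpers written as standalone functions. The exact signatures, the weights filename best.pt, and the 0.3 default threshold are assumptions for illustration, not the video's code verbatim.

import cv2
import torch

def load_model(weights_path):
    # Load a custom-trained YOLOv5 model from a local weights file.
    # force_reload=True refreshes torch.hub's cache if you see stale-cache warnings.
    return torch.hub.load('ultralytics/yolov5', 'custom',
                          path=weights_path, force_reload=True)

def score_frame(model, frame, device):
    # Pass one frame through the model and return labels + box coordinates.
    model.to(device)
    results = model([frame])
    labels = results.xyxyn[0][:, -1]       # class index per detection
    coords = results.xyxyn[0][:, :-1]      # normalized x1, y1, x2, y2, confidence
    return labels, coords

def class_to_label(model, class_idx):
    # Convert a numeric class index into its human-readable label.
    return model.names[int(class_idx)]

def plot_boxes(model, labels, coords, frame, threshold=0.3):
    # Draw a rectangle and label for every detection above the confidence threshold.
    height, width = frame.shape[:2]
    for i in range(len(labels)):
        row = coords[i]
        if row[4] < threshold:             # too uncertain: skip this detection
            continue
        x1, y1 = int(row[0] * width), int(row[1] * height)
        x2, y2 = int(row[2] * width), int(row[3] * height)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, class_to_label(model, labels[i]), (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    return frame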
In this case we're just going to go with 0.3 as the confidence score, and we return the frame that we want to display with the bounding boxes drawn around the objects that we're detecting.

Then we have our call function: when we call our object later on, this call function automatically gets executed and runs the program in our main loop. We open up our video capture, which is the webcam of the computer, and then we have a while loop that keeps running as long as we're not hitting Escape on the keyboard. First of all we resize our image so it has the same dimensions as the ones we trained the neural network on. Then we start a timer so we can get the number of frames per second. We put a frame into the score frame function, get out the results, plot the results that we get, and then we get this new frame that we show later on down at the bottom. We end the timer, calculate the number of frames per second, and display that on the image frame as well. If we hit Escape at any time, we terminate the program and release our video capture, so the webcam that we opened up to get our images from.

Down at the bottom, after we've created the class, come the last lines of code. Here we basically create a new object and then execute our code, because we have this automatically called function inside our mug detection class. First of all we set the capture index: my webcam is capture index zero, but yours could be one or two depending on whether you have several cameras connected to your computer. Then we specify the path to our model name. If you get some errors when you're running your own code, make sure to go down to the description; I'll have the code on my GitHub. You can just copy-paste it, train your own models or use some custom YOLO models that you can find on the internet, and play around with your own cameras and so on. You can just copy-paste the code, run the Python file, and you'll be able to run it. I'll also upload these weights files as well if you have a mug that you want to do detection on, or follow the other video that I have where you can create your own object detector from scratch and then deploy it here. So we have a full project of generating a dataset, training a neural network, and then deploying that neural network, even on the GPU. So here we create our detector and then run it down here.
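Putting the pieces together, a minimal end-to-end version of that main loop might look like the sketch below. To keep it self-contained it uses YOLOv5's built-in results.render() to draw the boxes instead of the custom plot_boxes helper described above; the capture index 0, the weights filename best.pt, and the 640x640 resize are assumptions.

import cv2
import torch
from time import time

device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Load the custom-trained weights; "best.pt" is an assumed filename.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')
model.to(device)

cap = cv2.VideoCapture(0)                          # capture index 0: the webcam

while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame = cv2.resize(frame, (640, 640))          # match the training input size
    start = time()
    results = model(frame)                         # forward pass on the frame
    frame = results.render()[0]                    # frame with boxes drawn on it
    fps = 1.0 / max(time() - start, 1e-6)          # frames per second for this pass
    cv2.putText(frame, f'FPS: {fps:.1f}', (20, 40),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow('YOLOv5 mug detector', frame)
    if cv2.waitKey(1) == 27:                       # Esc terminates the program
        break

cap.release()
cv2.destroyAllWindows()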
Right now we're just going to run the program. We're going to test out a couple of different mugs and see how the detection works when we're deploying our YOLO model on the GPU. Here we see "Using device: cuda", and we can even see the graphics card I have over here: the NVIDIA GeForce GTX 1060 with 6 gigabytes of memory. So we're using CUDA, we're running this neural network with OpenCV in Python, and we're running it on the GPU. Here we see the number of frames per second we get, we have me in the image frame, and I'll just grab this mug up here that we actually did our object detection on.

This is the mug that we trained our neural network on. We created our dataset with around, I think it was, 70 or 80 images, and then we used some data augmentation, so we ended up with around 200 images. It took maybe 10-15 minutes to generate the dataset, label it, and then train our neural network, and these are the results that we get. Even if we're moving around here in the image frame, it is still detecting it pretty well, drawing a really nice bounding box around it. Even if we're rotating it around, it can still detect this mug. Even though we haven't trained it on that many images, it's still really robust. We can play around with it and move it around in the image frame. We can see that if we can't see half of the mug, it is not able to detect it, and that is because it hasn't been trained on those situations; but if it can see the whole mug and we're moving around in the image frame, it's actually pretty good at tracking it.

Actually, I have another cup here, so we can see that it is able to detect multiple mugs in the image frame, even if we're moving them around, so we're able to detect as many mugs as are actually in the image frame. We can even try with some other kinds of mugs. This is an example of a mug that we haven't trained on before. We see that it sometimes actually sees it as a mug; maybe the handle that you hold the cup by is the reason it gets detected sometimes. But if we're moving it around, we can see that it is not really detecting this cup; it doesn't see it as a cup. Maybe it's too tall, maybe it's the color, maybe it uses the color to extract some features of the mug, but it's not able to detect it. Sometimes, when we show the handle, it detects it in the image frame, but again, we haven't trained the neural network on this type of cup; we have only trained it on this one, and when that one gets into the frame it just detects it immediately. Even if we rotate it around, orient it, rotate and translate it around in the image frame, it is still able to detect it.

The last cup I have that I want to show you is another one that is a bit smaller. It looks a bit more like the other one, but it's a white cup and the other one was a yellow cup. It's really nice: the network hasn't trained on this cup at all, it has never seen it before, so it's actually a pretty nice detection. We see that if we move it up here we lose some detections; maybe most of our images were actually in the middle of the image, so it learned some features there. We can rotate it around and see how it performs; we can see it's still pretty good. Now we can even get some detections at the boundaries of the image frame. It tracks it around pretty well, even if we're rotating it, and maybe if we just see it from the front we get some orange as well; rotating it around we get some more orange. So this is actually a pretty nice mug detector, even though we have only trained it on a couple of images and with a really specific cup, which is this one.

We get around 50 frames per second, and it doesn't affect our CPU at all because we're running the whole neural network on the GPU. If you have a better GPU you will just get more FPS, you can do more processing, you can put more things into your neural network, and you can even have larger neural networks running. This is just a model with around, I think, 7 million parameters in this neural network.
We can see it down here at the bottom: we have 213 layers in this neural network that we're running on the GPU. So this is actually a really nice project that we completed from scratch, where we have our dataset, we trained our YOLO model, and now we're deploying it on the GPU with OpenCV and Python.

Thank you guys for watching this video; hit the subscribe button and the notification bell under the video, and also like this video if you like the content and want more in the future. It really helps me and the YouTube channel out in a massive way. I'm currently doing a computer vision tutorial where we talk about basic image operations, camera calibration, and stereo vision to get depth information in our images; now we're actually creating point clouds from that depth estimation with stereo vision, and then we'll cover point cloud processing techniques, like how we can get pose estimation of point cloud objects in the scene and so on. If you're interested in that tutorial, I'll link to it up here. Or else I'll see you in the next video, guys. Bye for now.
Info
Channel: Nicolai Nielsen
Views: 17,976
Keywords: yolov5, yolov5 neural network, yolov5 custom object detection, yolov5 object detection, yolov5 tutorial, object detection yolo, object detection pytorch, object detection python, opencv object detection, opencv yolov5, opencv python yolov5, object detector, object detection yolov5, opencv, yolov5 dataset, neural networks, detect objects with yolov5, yolov5 opencv, opencv dnn, opencv neural networks, deploy yolov5 model, deploy yolov5 model opencv, how to deploy yolov5
Id: Cof7CNjDppo
Length: 13min 9sec (789 seconds)
Published: Mon Jan 17 2022