Instance segmentation YOLO v8 | Opencv with Python tutorial

Video Statistics and Information

Captions

...and now we have a segmentation surrounding only the ball. This could also work in videos, it could work for objects other than the ball, and we could display multiple objects in real time, for example selecting...

Hi, welcome to this video. I'm Sergio, I'm a computer vision consultant and developer. I build computer vision solutions to help companies improve their process efficiency, reliability and scalability. Today we're going to see an exercise on how to use computer vision to segment an object; specifically, we will see object segmentation with YOLO version 8. If you're a beginner and you don't know what object segmentation or object detection is, let's first see the difference between the two, and later we'll jump right into the code. Let's go.

What is segmentation, and what is the difference between object segmentation and object detection? The most common operations in computer vision are mainly three.

First, object classification. For example, if we take this picture into consideration, object classification means classifying what's in the picture, so as a result we can get: in this picture there is a rugby ball, there are trees, there is grass and so on. That's classification: we get just the information, without knowing the exact location in the picture.

Second, object detection. It means not only detecting what's in the picture but also detecting the position, so we get a bounding box surrounding the specific object, plus the class. So this would be "rugby ball", and then we will have the same for the other objects that our model can detect, so we'll have trees and so on.

What is object segmentation? That's the one we will see today. It means not only having the bounding box but also the exact outline of the object. Plus, of course, we know what it is. So we have the rugby ball, we have the bounding box if we want it, plus the segmentation, and we also have the class, so we still know what this is. And of course object segmentation is the most
advanced one of these, and it's mostly useful if you need for some reason to extract the shape of the object, if you want to calculate the size of the object, and this kind of operation. Let's now jump right into the code.

I have this code right here, let me show you. For the code, I will start coding a file from scratch and you can follow along. You also need to download the files from the link that I will put down below in the blog post, because there will be the YOLO version 8 model for segmentation in there, plus a very simple module that will allow you to do the segmentation very easily. Normally there are things that we need to extract for the segmentation, and I put them into a file that I created and that you can download. So we will import that file and it will make the whole process simple for you.

You also need to install the ultralytics library. It depends on how you usually install Python modules; it would be "pip install ultralytics", and then you press Enter, because we'll be using YOLO version 8 from Ultralytics. And then "pip install opencv-python", because we will be using the OpenCV library for this project.

Once we have the libraries installed, let's create the project. First of all, let's load the image. I'm going to import cv2 and then load our image, this rugby.jpg. You can of course load your own images. When you download the package from the link below, we have the YOLO version 8 segmentation folder and then images inside; we have two images, one is basket.jpg and the other is rugby.jpg. So let's now load rugby.jpg: img equals cv2.imread, and now we need to put the path of the image, images/rugby.jpg. Now, to make sure that we are loading the image correctly, what can we do? We can show the image: cv2.imshow, the title of the window, let's say "Image", and then what do we want to show? We
want to show img. Then cv2.waitKey. If we run this without it, it displays the image but it closes immediately, so we need a waitKey call to keep the image on hold: cv2.waitKey(0). This function keeps the code running until we press a key on the keyboard. So this is the image, and we can now see the rugby ball.

We can shrink the image a bit if we want, because right now it's bigger than the screen. We can say img equals cv2.resize, and we want to resize img. We could define a specific size, for example 500 by 500, but we don't want a specific size, we want to keep the proportions, so we pass None instead. Let's take 70 percent of the width of the image, so fx is 0.7, and for fy we also take 70 percent, so that we can see the full image on the screen. OK, now the image fits the screen.

Now we need to proceed with object segmentation. As I told you, if you do this operation manually with the standard library there are a lot of things that you need to extract. This is an example of extracting the segmentation: a few technical things, not really a lot, that I hid in the yolo_segmentation.py file, which will do that in the background for us. So we just import the file: from yolo_segmentation import YOLOSegmentation.

Before you ask where to install yolo_segmentation from: this is not a module to install, this is a file that you need to download from below. So if you're getting an error, it's because you don't have this file, yolo_segmentation.py, which is at the link below, in your folder. Once you import this, to make sure that you are importing it correctly, you run the script. If you don't get any errors, it means that you are importing the file correctly. If you get an error, for example, let me misspell this, if you get the error "no module named yolo_segmentation", it means the file is not there in the folder with the main.py file.

Let's now perform the
segmentation with this. For the segmentation we first need to create a segmentation detector. Let's call it like this, segmentation detector, and it will be YOLOSegmentation, and right here we need to put the model path. The model path is yolov8m-seg.pt; this one will also be included once you download the files. And let's call this ys, for YOLO segmentation.

Now it's very simple. When we want to get the result from the image, we can say ys.detect, and where do we want to detect the segmentation? On img. We get four outputs from this and we'll see them one by one: the bounding boxes, bboxes, the segmentations, the classes and the scores. I'm not sure about the order, let me quickly check: boxes, class IDs, segmentations and scores. So I mixed up the order: we get bounding boxes, classes, segmentations and scores.

Let's now loop through them, or rather, let's print them first. Let's just print the bounding boxes to keep this very simple, and let's run this. Once we run this, it's loading. It will take longer than just loading an image, because it needs to load the YOLOv8 object segmentation model into memory and then perform the segmentation. Once we print the bounding boxes in this case, we see one bounding box, 538, 358 and so on; we see four numbers. What is this? We will see in a bit, even if you can already guess what it is.

Let's take another image. Let's take for example basket.jpg; this was image one. I want to show this image because there is a difference with basket.jpg. So let's run it on this image too, and we have the output for basket.jpg. I wanted to show you this because now we see many more numbers. Each of these rows of numbers represents the bounding box of a different object. In this case we have not only the basketball, but also people and probably some other objects that the YOLO version 8 model can detect, and that's
why we get these extra bounding boxes. Let's now go further and also look at the classes, segmentations, scores and so on.

We now start looping through them: for bbox, class_id, seg, score in zip. To each bounding box we have associated a class ID, a segmentation mask and a score, so what we are doing is looping through all of them together. For the first bounding box we have its class ID, segmentation and score, and so on, so that we get the information for each of them. So: for bbox, class_id, seg, score in zip of bboxes, classes, segmentations, scores.

Let's now print all of them, so we have bbox, class_id, seg and score. We could also put some labels, "bbox" like this, so that it's clearer, then "class id", then "seg" and then "score". And let's run this. We have bounding box 73, 42, 787, 1042, class ID 0, then we have the segmentation, which is a polygon, and then score 0.95. Then the bounding box of the second object, its class ID and so on.

Let's go slowly through them. We now start using this information, so let's not print them anymore and let's use the bounding box instead. The bounding box is the rectangle surrounding the object. Let's now extract it. You saw that there were four numbers in it, and they are the coordinates of our bounding box. To draw a rectangle we need the top-left point and the bottom-right point. So we have the top left, which is x and y, and the bottom right, x2 and y2: x, y, x2, y2 equals bbox.

Let's now draw a rectangle from this: cv2.rectangle. Where do we want to draw the rectangle? We want to draw it on img. Now let's use the top-left point, x and y, and the bottom-right point, x2 and y2. What color do we want for this rectangle? Let's make it red: 0, 0, 255. Each value goes from 0 to 255, so that's 0 of blue, 0 of green, 255 of red. Thickness of this rectangle, let's say three, or two pixels, so that it's well visible. And now let's run this one. Now you see that in this image we have a lot of rectangles surrounding the people and the ball
that we don't see. OK, here we see some part of the ball. I want to keep this simple for now, so let's focus on our first image, rugby.jpg. Let's focus only on the rugby ball, and later I will show you the rest too. OK, we have this running, so now we have a bounding box surrounding the rugby ball.

We're talking about object segmentation. Right now we have a bounding box, but we can have the segmentation too. I wanted to show that you also get the bounding box, but that's not the goal of this video, so let's now also show the segmentation. After the bounding box we have seg, and it's a polygon with the coordinates. If we print seg, we see a lot of numbers: these are the coordinates of the outline of the ball. So we can use these values to draw the polygon, using the OpenCV function for drawing polygons: cv2.polylines. Where do we want to draw the polygon? On img. Now we need to specify the polygon, so seg, and we need to put this into square brackets; that's how the polylines function takes the coordinates, the NumPy array inside brackets. Then we need to specify the color. Let's make this blue so we can see the difference compared with the bounding box, so 255, 0, 0, like this. And I'm probably forgetting something. Yeah, we need to say whether the polygon is closed, true or false. It's True, which means that the function connects the last point of the polygon with the first point; otherwise it will keep the polygon open. For this one too, let's say thickness two. Now I want to make sure that everything is correct and there is no typo or error in this function. Let's run this one, and we now have the segmentation: you can see the blue polygon surrounding the ball.

Now, object segmentation will also allow you to understand what class this is, so what object this is. We have the different categories that our YOLO detector can detect, and I have the list right here. So this is the list that the YOLO model has been
trained for. We have person, bicycle, car, motorbike, airplane and so on, and somewhere of course we have the sports ball, which is the one that we are detecting. So each of these has a different class ID associated, and we can also display this information. For example, we have the class ID and we can display it: cv2.putText, and let's put the text on img. Now what text do we want to display? We want to display the class ID, so str of class_id. After this we need to give the position where the text should be. Let's put it close to the bounding box. For the bounding box we have the top-left point, which is x and y, so we can put x and y. We don't want the text to overlap the bounding box, so let's put it a bit above. This means taking 10 pixels off the y axis, so it goes 10 pixels above the bounding box. Then we can choose the font: cv2.FONT_HERSHEY_PLAIN. I'm just picking some random font, we don't really care how it looks right now, we care more about the functionality. Size of the text, let's say two. Color of the text, let's make it red like the bounding box. And then the thickness of the text. So now let's run this. I want to make sure that the text, the class, everything is correct, and if so, we should see it on the screen.

And right now we have class 29 here. Class 29, which is the sports ball, and I can show you that. If we go to line 29 of this list, which is the list the COCO model has been trained on, we should have the sports ball. So let's look for it: line 33. So probably either I have the wrong list or there is something wrong with this. Well, apparently it's detecting this as a frisbee. I'm not really sure this looks similar to a frisbee, but it's more a matter of a wrong detection. We have the perfect segmentation, but there is a mistake with the class: we have line thirty, which is index 29, with frisbee. We will see with some other images that the detection will be
correct also regarding the class, but we have a very good segmentation. So despite YOLO somehow mistakenly giving us the wrong class, we have a very precise segmentation of the ball. Also, this is just a simple example that I'm showing you with one specific model, but you need to know that there are parameters that can be tuned to increase the accuracy, whether it's to improve the results, like getting more accurate segmentation boundaries, or the performance in terms of speed. So many things can be done with this. Keep in mind that this is just an exercise to show you how we can access this information and how we can show the segmentation on the screen.

I now also want to go to the other image with what we have right now, so instead of rugby.jpg let's now load basket.jpg. I'm going to shrink this a bit so it fits on the screen: 0.5, so 50 percent. In this case you see that we have the ball, which is detected with index 32, so here everything is correct, because we have the sports ball, not 29 as before. But we are also detecting a lot of class 0, and class 0 is person. If we want, we can choose. If in our project, like in this case, we want to detect only the ball, we can decide: OK, let's display only the ball. We can use an if statement. Right here, after we loop through the results, we can say: if class_id equals 32, then only in that case do we want to display the bounding box, display the polygon and so on. Let's now display just the polygon. Let's make it red and very thick, well visible, four pixels thick around the ball. Let's run this one, and now we have a segmentation surrounding only the ball.

This could also work in videos, it could work for objects other than the ball, and we could display multiple objects in real time, for example selecting only the ball plus a chair, and maybe excluding person. So there are many, many things that we can do with this, and as
you can see, with this implementation the code is very simple and intuitive. I'm going to put all this code online, so everything that you see right here can be downloaded from below. I recommend of course that you play around with this and write everything that I wrote today from scratch, because that's the only way you can better understand each step, and that's how you can work more easily later, instead of just downloading and running pre-made code that I've written.

I hope this was useful. Let me know what else you would like to see and what projects you would like to build with instance segmentation. Also, this should be considered only as an exercise, because of course if you need to implement this in a specific project there might be many more things to look into, for example the installation with the graphics card for faster processing, how to tune all the parameters either for faster processing or for better accuracy, how to integrate all of this into a project, and so many other things.

If you need support or consultation to develop industrial or commercial projects, you can contact me from the link down below, pysource.com/contact. If you want to learn more about object detection with OpenCV and deep learning, I have a full course on pysource.com. This is all for this video, see you in the next one.
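
The steps dictated in the captions can be sketched roughly as follows. This is a reconstruction under assumptions, not the author's downloadable code: the module name `yolo_segmentation`, the class name `YOLOSegmentation` and the return order of `detect` are inferred from the spoken description, and `select_class`, `label_position`, `draw_detections` and `main` are hypothetical helper names introduced here. The text thickness of 2 and the clamping of the label position to the top of the image are also assumptions, since the captions do not state them clearly.

```python
"""Sketch of the steps from the captions: detect, filter by class, draw."""

SPORTS_BALL_ID = 32  # class index reported for the ball in basket.jpg in the video


def select_class(bboxes, class_ids, segmentations, scores, wanted_ids):
    """Return (bbox, class_id, seg, score) tuples whose class ID is in wanted_ids."""
    return [
        (bbox, int(class_id), seg, score)
        for bbox, class_id, seg, score in zip(bboxes, class_ids, segmentations, scores)
        if int(class_id) in wanted_ids
    ]


def label_position(bbox, offset=10):
    """Put the label `offset` pixels above the top-left corner, never above y = 0."""
    x, y, _x2, _y2 = bbox
    return int(x), max(int(y) - offset, 0)


def draw_detections(img, detections):
    """Draw a red box, a blue closed polygon and the class ID for each detection."""
    import cv2  # imported here so the helpers above work without OpenCV installed
    import numpy as np

    for bbox, class_id, seg, _score in detections:
        x, y, x2, y2 = (int(v) for v in bbox)
        cv2.rectangle(img, (x, y), (x2, y2), (0, 0, 255), 2)  # colors are BGR: red
        polygon = np.asarray(seg, dtype=np.int32)
        cv2.polylines(img, [polygon], True, (255, 0, 0), 2)  # True closes the polygon
        cv2.putText(img, str(class_id), label_position(bbox),
                    cv2.FONT_HERSHEY_PLAIN, 2, (0, 0, 255), 2)  # thickness assumed
    return img


def main(image_path="images/basket.jpg", model_path="yolov8m-seg.pt"):
    """Run the whole pipeline; needs OpenCV, ultralytics and the helper file."""
    import cv2
    # Assumed names for the downloadable helper file and its class.
    from yolo_segmentation import YOLOSegmentation

    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    img = cv2.resize(img, None, fx=0.5, fy=0.5)  # keep proportions, shrink to 50%

    ys = YOLOSegmentation(model_path)
    bboxes, classes, segmentations, scores = ys.detect(img)
    detections = select_class(bboxes, classes, segmentations, scores, {SPORTS_BALL_ID})
    draw_detections(img, detections)

    cv2.imshow("Image", img)
    cv2.waitKey(0)  # wait for a key press before closing the window
```

Calling `main()` from a folder that contains the helper file, the model file and the `images` directory should show only the ball outlined. Passing a larger set of class IDs to `select_class` would keep several kinds of objects, as suggested at the end of the video.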

Info

Channel: Pysource

Views: 26,081

Id: cHOOnb_o8ug

Length: 25min 11sec (1511 seconds)

Published: Tue Feb 21 2023