Measure size of objects in real-time with Computer Vision | Opencv with Python

Video Statistics and Information

Captions
[Music] I'm a consultant, developer and course instructor, and I help companies, startups and developers to easily and efficiently build computer vision projects. We are now going to see how to calculate the size of an object from a fixed camera on a flat surface.

Let me give you an example. I have four nuts here, but there will be more, because this will run in real time or from video footage. Let's suppose that we want to check the size of each single object in real time, which we might need in order to classify the objects. If we want the small nuts to be placed, maybe by a robot arm, in one box and the bigger nuts in another box, we can do that automatically with computer vision. In this case the size we want to identify, considering that we have circular objects, might be the diameter of each of them. You can see that we have two bigger ones with a diameter of approximately 1.4 centimeters, while the small ones are around 0.6 to 0.7 centimeters, and we can have this information in real time. I'm taking nuts as an example, but this can be applied to any object, whatever its shape, size or color, with the methods that we are going to see right now.

This video is structured in two parts. In the first part we are going to see how to identify the object, because before calculating the size of the object we need its mask. The mask is the exact outline of each single object: once you have the mask, you can separate the object from the background, and then later you can calculate the size. So in the first stage of this video we get the mask. In the second part, once we have the mask, we will see how to convert this information, which we get in pixels because we are working with a screen and the size on the screen is in pixels.
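A minimal sketch of the sorting idea described above, assuming the diameter has already been measured in centimeters. The helper name, the box names and the 1.0 cm threshold are hypothetical; the threshold was chosen only because the example nuts are about 1.4 cm and 0.6 to 0.7 cm wide.

```python
# Hypothetical sorting helper. The threshold and the box names are assumptions,
# picked to separate the ~1.4 cm nuts from the ~0.6-0.7 cm nuts in the example.
BIG_NUT_THRESHOLD_CM = 1.0


def choose_box(diameter_cm, threshold_cm=BIG_NUT_THRESHOLD_CM):
    """Return the name of the box a nut of this diameter should go into."""
    if diameter_cm <= 0:
        raise ValueError("diameter must be positive")
    return "big_box" if diameter_cm >= threshold_cm else "small_box"


# Example diameters taken from the video description.
for diameter in (1.4, 0.6, 0.7):
    print(diameter, "cm ->", choose_box(diameter))
```

In a real setup the returned box name would be translated into whatever command the robot arm expects, which is outside the scope of this sketch.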
We want to convert these pixels into either centimeters or inches, or whatever unit of measure you use in your country.

Before moving further, I want to let you know that in case you are a developer, a startup founder, a student or a researcher, and you want to learn how to build computer vision projects in the easiest and most efficient way, I created a crash course called Computer Vision Blueprint. It's a one-hour crash course where I teach you the fundamentals of building computer vision software to detect and track any object easily and efficiently: pysource.com/blueprint. And now let's move to this project.

We will start building the project from video footage of the real scenario where we are going to apply this. In this case I want to identify the nuts that are passing on this specific conveyor belt. I took some nuts, and the more you have the better, because you want to make this as precise as possible. The idea is to take a very big sample of nuts, so that nuts the computer has never seen before will also be identified correctly. Let me show you this video: the nuts are moving on the conveyor belt. From this video we're going to extract images that we will use to train an algorithm. There are a few things packed into that sentence, but just follow me carefully; it's easier than it looks.

This is the first part: I extracted a few images from this video, and this is what we will call the dataset, which we're going to use to create a custom model to detect the nuts. Let's take, for example, a few of the pictures that we have right here. What should we do now? We take all of them and annotate the objects that we want to identify. In this case we have the nuts, and we want to annotate the nuts with segmentation. To annotate the images there is a lot of software available. I'm going to use a
free piece of software that works online, called makesense.ai. I'm opening it right here, and now I'm going to put the images inside it: I select all of them and drag them right into the box. We see I have just 25 images. You should have more if the object is complex; for this very simple scenario 25 is okay. I'm not saying it's optimal, but it's decent. Then I select object detection, I create a label which I will call "nut", and then I start the project.

By the way, I'm going very fast through this specific part because this tutorial isn't meant to focus on creating a custom model. I already have other videos for that, and I'm going to leave some sources down below. I also have a full mini course called Mask R-CNN Pro, where you get all the sources to create your custom models, so I'm going to leave that link down below too.

Just to give you an idea of what an annotation is, let me take an example. We want to teach the custom algorithm "hey, this is a nut", so that it can identify nuts that it has never seen before. We do that by annotating them. We use polygon annotation because we want the exact mask, so you will see that I'm now drawing the exact polygon around each object, like this. This is a nut, so I have the nut annotated. I do the same for the second nut right here, and we have to do this for all the images and for all the objects we have. It's a very time-consuming process if you have thousands of objects, but it's necessary and we have to do it.

After this stage, now that we have the dataset, we can create our custom model. I'm going to show you, through the Mask R-CNN mini course, what I created and how I created it. This one right here is from the Mask R-CNN Training Pro course, which you will find on pysource.com. I'm using the
notebook to train a custom model to detect the nuts. I uploaded the images with the annotations into the notebook; there is a lesson where I explain all these steps. At the end, this is the result I got: it's able to correctly identify the nut. This is an example on an image, and from here I download the custom Mask R-CNN model that has been created to identify the nuts to the computer. Then I can use it in real time, with the camera or from video footage.

I downloaded the model, which I had on Google Colab, to the computer. I also have the source code from the Mask R-CNN Pro mini course to detect the segmentation of the objects in real time. We're now going to run it to make sure that the detection is working well with our real video footage. If it's already working well and doesn't require any improvement, we are done with the first stage of the project, which is only detecting and segmenting the outline of the object. Let's now run it and see how it works. I have the code and I click run. It might take half a minute to start because the model is a bit heavy, so let's wait; I'm going to pause this.

Now we're running this on the video footage, and we see that the mask is detected quite well. In this specific case the nuts have a segmentation outline, so the boundaries. We also have some other details. We have the number 1, which is the class. We don't really need this for this simple project. We need the class to identify different objects: class 1 could be the nut, and if you have some other object, like a screwdriver, it could be class 2, another object could be class 3 and so on. So if you want to differentiate between objects you can have multiple classes. We can remove this one, since we have only the nut and there is no point in showing 1 on all of them. And we have 0.99, which is the confidence: how confident the algorithm is that this is a
nut, and the confidence is incredibly high. The reason for such a huge confidence is that the images I used to prepare the model and the video have exactly the same lighting conditions. In real-world scenarios, when you run this for much longer and have, for example, multiple objects or different lighting conditions, also due to the weather, the confidence might be a little lower, but it should still work with very high accuracy. The segmentation is working great, so we stop here with the first part of the project.

Now we move to the second part, where we're going to code a few things: first, how to use the boundaries of the segmentation to calculate the size of the object, and then how to display the size in real time. One thing you might be wondering is why this is so slow. The Mask R-CNN algorithm is quite heavy, so even with the graphics card that I have running now it can process a bit more than one frame per second, and this video has 30 frames per second. That's why it's going so slowly. When you run this in real time from a camera it would be smooth, because we would skip frames when they are not necessary. I will not go further into this, because each scenario must be analyzed and that's not the purpose of this video.

Let's now go to the second stage: let's find the size of the nuts and display it on the screen. Consider that this is a short video, so I can't write everything from scratch as I normally do. I write this specific code from scratch in the course; here we focus only on the most important part of this video, which is finding the size of an object. Let's understand what we have right here. Once we get the information from the Mask R-CNN algorithm, we get a few points. What's happening exactly in this part of the code is
that we are detecting the position of the objects and drawing the information for each single object. We are looping through all the objects: first we have this object right here and we take all its information, then we loop to the second object and take all its information, and so on, one by one. For each object we take the rectangle, which is defined by two points, the top-left point and the bottom-right point, each with its x and y position. We have the class ID, which I showed you; in this case it was 1 because we have only one kind of object, so the class is pretty much useless, at least when we have one kind of object. Then we have the mask. The mask is the exact boundary of the object: for this nut right here, for example, it is the coordinates of the entire polygon surrounding the object.

For some purposes the bounding box might be enough. If, say, we want only the width, the bounding box is pretty much all we need. Sometimes you want more information, for example the area of the whole object, or you have a rotated object, and then you might need more than that. Let me find a rotated object so I can tell you why. Take the Lego brick on the left as an example: this one is a rotated object. In this case the bounding box is not enough, because the normal bounding box in computer vision always has horizontal and vertical sides; they are never rotated. When the object is rotated you cannot use the bounding box to calculate the size, and you need Mask R-CNN to get the exact segmentation of the object. That's just to make the reason very clear.
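To illustrate why the mask carries more information than the bounding box, here is a small sketch that computes the area of a mask polygon with the shoelace formula and compares it with the area of its axis-aligned bounding box. It assumes the mask is available as a list of (x, y) pixel coordinates; the function names and the sample polygon are made up for the example and are not taken from the video's code.

```python
def polygon_area(points):
    """Area in square pixels of a simple polygon [(x, y), ...] (shoelace formula)."""
    if len(points) < 3:
        return 0.0
    total = 0.0
    for i, (x_a, y_a) in enumerate(points):
        # Pair each vertex with the next one, wrapping around at the end.
        x_b, y_b = points[(i + 1) % len(points)]
        total += x_a * y_b - x_b * y_a
    return abs(total) / 2.0


def bounding_box(points):
    """Axis-aligned bounding box (x1, y1, x2, y2) of a list of points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up example: a square rotated by 45 degrees.
rotated_square = [(10, 0), (20, 10), (10, 20), (0, 10)]
x1, y1, x2, y2 = bounding_box(rotated_square)

print("polygon area:", polygon_area(rotated_square))      # 200.0
print("bounding box area:", (x2 - x1) * (y2 - y1))        # 400
```

For this rotated square the bounding box reports twice the real area, and its width (20 pixels) is larger than the real side length (about 14 pixels), which is the problem with rotated objects described above.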
Now, in this specific case of the nuts, I want to keep this very, very simple. I'm going to remove pretty much all the code that we don't need: we don't need the score, we don't need the mask, we don't need the class ID, so I comment those out. We will focus mostly on using the bounding box, because the object is circular and with the bounding box we already get the approximate diameter. This is not an industrial project where we need to be precise; if we needed to do something very precise, I would pay much more attention to all the details. For now it's mostly to give you an idea.

We have y1, x1, y2, x2. Let's draw a rectangle for the specific object with cv2.rectangle. We want to draw the rectangle on img. To draw a rectangle we need two points: the top left, (x1, y1), and the bottom right, (x2, y2), which are the points we get from Mask R-CNN. Then we need the color. The code already generates colors automatically, but let's pick our own and make it red. Colors are given in BGR format, from 0 to 255, where 0 is the absence of the color and 255 is the maximum, in the order blue, green and red. Let's put 25 for blue, 15 for green and 220 for red. Finally we decide the thickness of the rectangle's border.

To make sure that this is correct, I run the code, and we should see a bounding box surrounding each object. Let's run it. Now we have the exact bounding box surrounding each object. What we want to do now is get the width of each of them, and we also want to
display the width. When you have a circular object, as in this case, the width is the diameter. This is not exactly a circle, but let's treat it as circular. Let's now show the width, so I stop the program. How can we calculate the width? We have the rectangle, defined by two points, the top left and the bottom right. Let me show you with a drawing: we have a bounding box that surrounds the object like this. To have this bounding box we have two points: the top-left point, this one, and the bottom-right point, this one, so we have x1, y1, x2, y2. Since we are on the horizontal x axis, to calculate the width we take x2 and subtract x1, and we are left with this length. As simple as that: width = x2 - x1.

Now we want to show that in real time above the object with cv2.putText. We put the text on img. The text is the width, which is all we want to show, so we write str(width). I'm using str because the width is a number but the text must be a string, so we convert it. Then we need the position where we place the text. We can place it just a bit above the rectangle. Let me get back to Paint: we have the top-left point of the rectangle, and we can put the text somewhere here. So we take (x1, y1) as the base position, but because we want the text a bit above it, we subtract some value from y1. Let's say that this is our image, with the bounding box in it, and this is point
(0, 0), at the top left. Then we have 1, 2, 3, 4, 5, 6 and so on pixels going down along y, and 1, 2, 3, 4, 5 and so on going to the right along x. So if we have a value here, we need to subtract something from it when we go up: if here is 0 and here is the maximum, and we have y1, we remove some pixels from y1. If you're a beginner this might be a lot to take in, I understand that, so you might need to watch some more beginner-friendly videos if you have a hard time following. It's not complex anyway, so either follow the blog post associated with this video or just pause the video while you're working on this.

Then we need the font face. I will not pay too much attention to this: cv2.FONT_HERSHEY_PLAIN, any font will do. The size of the font, let's say 1. The color, let's make this one red as well: 25, 15, 220; later we might improve this graphically if we have time. The thickness of the text, let's say 2. So the text has the same red color that I gave to the rectangle, and we should be good with this. Let's run it and see what we get.

We now have the width in real time. The number is so small on screen that later we're going to enlarge it, but there is a width showing, around 150: 147, 153. This is the width in pixels, and on its own it doesn't mean anything. Why? When you work with cameras, and the same goes for the human eye, the farther you are from the object, the smaller the object looks. It doesn't mean that the object is smaller; it's how you see it from your perspective. So we're getting values like 150 for the big nuts and around 100 for the small nuts because we had the camera at a certain distance. If we place the
camera further away, this size will change too. If you want something very dynamic, the project gets complex: you need a depth camera that understands how far away the object is, so it can adapt to different heights and different positions of the object. If you want to keep the project simple, as we have now, you can use a fixed camera, but you need to do some calibration. The calibration in this case is the ratio of pixels to centimeters or inches, depending on the unit of measure you're used to in your country. I'm going to use centimeters; I'm in Europe and that's what we use.

The idea is to take a known length and associate it with a number of pixels. In this case I know that the big nut is 1.4 centimeters, and I see that in this video the big nut measures a bit more than 150 pixels, say 153. So I take 153 pixels as my reference, and based on that we get all the other sizes. At least that's our goal, and we will also test how accurate this is. The calculation would be simpler if the reference were exactly one centimeter; instead I have 1.4, so 1.4 cm corresponds to 153 pixels. To simplify the problem I can say that 14 millimeters correspond to 153 pixels and work out the ratio of pixels to millimeters: 153 pixels divided by 14 millimeters, so roughly 10.9 pixels correspond to one millimeter.

Once we have this, let's also show it. We have the width and we have the ratio of pixels to millimeters. I'm not going to display the text with the width in pixels anymore; instead of str(width), I'm going to display it as "{} mm", using .format. So we're
going to display that, although before I can display it I still need to compute it. We get the width in mm by dividing the width by the ratio of pixels to millimeters, and so we have the mm. I'm also increasing the size of this text, which we almost couldn't see, and later, if we have time, we can improve the graphics a bit to display this more nicely. Let's now run it to make sure that everything is correct; I also want to double-check that the size corresponds to reality, at least to a certain degree.

As I said before, this is a very basic prototype, a project to give you an idea. It is not a final project, where you need to take many more things into consideration. There would be more calibration, and you would also have to consider that the camera has lens distortion, so the same object might be perceived with a slightly different size when it's at the side of the frame than when it's passing right in front of the camera. These are all things that need to be studied and taken into account when you build an industrial project where precision is very important. There may be some tolerated margin of error, but usually it's a very small percentage, so the measurement must be very precise.

Let's show this right now. [Music] We now have 13 millimeters. The number is very long because I haven't rounded it, which is something we can look into, but it seems realistic: we have a big nut of 1.3 centimeters (we might also convert this into centimeters) and a smaller nut of only nine millimeters. Another small nut is coming, which is nine point something millimeters, and so on. We are almost done with the project. As a last step I want to quickly show you how to convert this into centimeters and improve the graphics a bit, so that we can see this
right away at a glance, because it's very tiring to keep looking at such numbers all the time. Let's do that. We convert millimeters to centimeters, which is a very simple operation: cm is the millimeters divided by 10. Let's also round the numbers. Instead of millimeters we now show centimeters, but if it's 1.4 centimeters we don't want something like 1.4978 with all those digits; we want to round it to at most two digits after the decimal point. So we use the Python function round: we round the centimeters to two decimals. Let's also increase the size of the text, let's say to 4.

I'm now also going to put a red rectangle behind the text and make the text white on the red rectangle, so it will be very easy to see. We call cv2.rectangle and put the rectangle on img. The first point is x1 and y1 minus, let's say, 25, or maybe 35 to make the rectangle taller; I will adjust this later. The second point is x1 plus 80, that's 80 pixels, not millimeters, and y1 minus 5, so the rectangle is a bit bigger than the text and the text sits right inside it. Let me draw it: we have a rectangle and we have the text inside, and we need to define this point and this point. That's what I'm doing right now. For the color we use this same red, and then -1 as the thickness because we want to fill the rectangle: a positive number is the thickness of the border, while -1 gives a rectangle filled with the color. Then let's make the text white: 255, 255, 255, the maximum of all the colors, gives us white. Let's run it and see how it works. The text got quite big, which is nice, which is what I wanted.
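The measuring and labeling steps walked through so far can be sketched roughly as follows. This is not the code from the video: the function names, the font scale, the border thickness and the text offset are assumptions, while the 153 px to 14 mm calibration, the colors and the label rectangle offsets come from the walkthrough. OpenCV is imported inside the drawing function only so that the two pure helpers can be tried without it.

```python
# Calibration from the walkthrough: the 14 mm reference nut spans about 153 pixels.
REFERENCE_WIDTH_PX = 153
REFERENCE_WIDTH_MM = 14
RATIO_PX_PER_MM = REFERENCE_WIDTH_PX / REFERENCE_WIDTH_MM


def width_cm_from_box(x1, x2, ratio_px_per_mm=RATIO_PX_PER_MM):
    """Convert a bounding-box width in pixels to centimeters, rounded to 2 decimals."""
    width_px = x2 - x1
    width_mm = width_px / ratio_px_per_mm
    return round(width_mm / 10, 2)


def label_geometry(x1, y1):
    """Corners of the filled label rectangle and the text origin, just above the box."""
    top_left = (x1, y1 - 35)
    bottom_right = (x1 + 80, y1 - 5)
    text_origin = (x1, y1 - 10)  # assumed offset so the text sits inside the label
    return top_left, bottom_right, text_origin


def draw_measurement(img, box, font_scale=2):
    """Draw the bounding box and a '<width> cm' label on img (a BGR image array)."""
    import cv2  # local import: the helpers above do not need OpenCV

    x1, y1, x2, y2 = box
    red = (25, 15, 220)       # BGR
    white = (255, 255, 255)
    top_left, bottom_right, text_origin = label_geometry(x1, y1)

    cv2.rectangle(img, (x1, y1), (x2, y2), red, 2)        # object outline
    cv2.rectangle(img, top_left, bottom_right, red, -1)   # -1 fills the rectangle
    text = "{} cm".format(width_cm_from_box(x1, x2))
    cv2.putText(img, text, text_origin, cv2.FONT_HERSHEY_PLAIN, font_scale, white, 2)
    return img


print(width_cm_from_box(100, 253))  # 153 px wide -> 1.4
```

The ratio only holds for the camera distance at which the reference nut was measured, so it has to be recomputed whenever the camera is moved.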
The rectangle doesn't match the text, though. There is an option, more advanced from the graphical point of view, to get the size of the text and size the rectangle dynamically according to it. To be honest, that would be a waste of time right now, so we just enlarge the rectangle to make this look nicer, and we are done.

We now have the final project, which is looking really good. We're detecting in real time; I'm running this from video footage, but it works exactly the same in real time. Of course you can make this very interactive and perform some action based on the results you get. If you want to perform a different action when there is a bigger object, let's say a 1.4 centimeter nut, you can send a command to the robot arm to pick it and move it somewhere else, and you can do the same for the small nut. If you are doing this for a project with some sort of quality control, you might connect the computer to an alarm, and the alarm will go off as an alert if something is wrong with the size of the object. You can apply this to whatever is connected with the size of an object. Remember that this is a very simple prototype, with a fixed camera position and a flat surface.

This is all for this video. If you want to learn more about how to make computer vision projects, I have courses at pysource.com. There is also a crash course that you can download for free, a one-hour workshop: pysource.com/blueprint. If you have a business or you're a startup and you want us to develop such projects at an industrial level or as prototypes, you can contact us, again at pysource.com. See you in the next one.
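The quality-control idea mentioned near the end of the captions could look something like this sketch. The nominal size of 1.4 cm comes from the video; the 0.15 cm tolerance, the function names and the use of a plain callback as the "alarm" are assumptions made for illustration, not part of the original project.

```python
def is_within_tolerance(measured_cm, nominal_cm, tolerance_cm):
    """True when the measured size deviates from the nominal size by at most the tolerance."""
    return abs(measured_cm - nominal_cm) <= tolerance_cm


def inspect(measurements_cm, nominal_cm=1.4, tolerance_cm=0.15, alarm=print):
    """Call alarm(message) for each out-of-tolerance size and return the rejected sizes."""
    rejected = []
    for measured in measurements_cm:
        if not is_within_tolerance(measured, nominal_cm, tolerance_cm):
            alarm("Size out of tolerance: {} cm".format(measured))
            rejected.append(measured)
    return rejected


# Example run: the big nut was measured at about 1.3 cm in the video,
# which is still accepted with the assumed tolerance; the 0.9 cm nut is not.
print(inspect([1.4, 1.3, 0.9]))
```

Passing the alarm as a callback keeps the check independent of the hardware, so the same function could trigger a buzzer, a log entry or a robot arm command.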
Info
Channel: Pysource
Views: 59,698
Id: xjH0e7kYJsU
Length: 36min 44sec (2204 seconds)
Published: Tue Nov 29 2022