YOLOv7 Segmentation | Concrete Crack Detection | Google Colab | step-by-step Tutorial

Video Statistics and Information

Captions
Before I started working in computer vision I was a civil engineer, so this video is a great opportunity for me to go back to my roots: we will be using the YOLOv7 segmentation model to detect cracks in concrete. Traditionally, YOLO was strictly an object detection model, but it was very influential in the design of architectures like YOLACT, an image segmentation model that borrowed a lot from YOLO itself. This year YOLOv7 was released, bringing image segmentation and pose estimation, and soon after that YOLOv5 followed suit. We already have a video about instance segmentation in that library; you can watch it by clicking the tab visible in the top right corner right now.

Okay, so let's jump into the Jupyter notebook and train a YOLOv7 instance segmentation model on a custom dataset. As usual, we start at Roboflow Notebooks, which is the perfect place to start learning about computer vision. We add new tutorials almost weekly, and we already have everything from classic ResNets to the latest transformers, so if you want to learn more, please visit the repository on GitHub. Let's scroll a little lower: in the model tutorials table we look for YOLOv7 PyTorch instance segmentation and open it in Google Colab. That redirects us, and before we do anything else, let's create a copy and save it to our Drive. That opens the same notebook in a new tab, so we can close the previous one and increase the font size a little so you guys have an easier time following the tutorial.

The first thing we do is execute the "Before you start" cell, which is basically just an nvidia-smi command. It lets us confirm that we have access to a GPU-accelerated Google Colab runtime. For us it works: we have access to a Tesla T4. If that command fails for you, most likely you don't have GPU access. You can fix that by going to Edit, Notebook settings; under Hardware accelerator you probably have None selected (I guess that's the default value), so switch it to GPU and save. I don't need to do it, so I just cancel, and we can proceed to installing YOLOv7.

The installation itself is pretty simple: we kick off by setting up the home directory, and then we install YOLOv7 proper. What happens here is that we first clone the YOLOv7 repository and then switch to a different branch. The way YOLOv7 handles all of the use cases it supports is that, instead of keeping everything in one code base, different branches implement different models. That is a little bit weird, because if I wanted object detection and instance segmentation or pose estimation at the same time, it would be very hard to get; I did that in one of my previous projects and it was a total headache. Regardless, if we open the branches list we see mask, we see pose, and all sorts of other things they are currently working on; the main branch is object detection. Interestingly, if I go to instance segmentation and open the code, there are two different branches: one of them is mask and one of them is u7, whatever that means. Judging by the latest commit dates (four months ago for mask versus two months ago for u7), I decided to go with u7, and that's the branch we will use. (I know, I need to stop saying "over here" constantly; sorry for that, I really don't know what's going on today.) This is why the setup ends with a git checkout line: we check out a specific commit from the u7 branch, so later changes to that branch cannot break the notebook.
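A minimal sketch of what those first cells do, assuming the u7 branch keeps its segmentation code in a seg/ subdirectory (as it did at the time of recording); the notebook itself pins an exact commit hash, which I have replaced here with a plain branch checkout:

    # confirm we got a GPU-accelerated runtime
    !nvidia-smi

    import os
    HOME = os.getcwd()   # Colab's working directory, used as the project root

    # clone YOLOv7 and switch to the branch with the instance segmentation code
    !git clone https://github.com/WongKinYiu/yolov7 {HOME}/yolov7
    %cd {HOME}/yolov7
    !git checkout u7     # the notebook checks out a pinned commit instead

    # install the required Python packages for the segmentation code
    %cd {HOME}/yolov7/seg
    !pip install -r requirements.txt

Pinning a commit, as the notebook does, is the safer choice for a tutorial: git checkout with a hash gives you exactly the code the video was recorded against.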
With the repository cloned and the packages installed, what I usually like to do before I even start training in Google Colab is to confirm that everything installed properly, and there is nothing better for that than running the pre-trained model on an example image. Let's first download pre-trained weights for the YOLOv7 segmentation model and an example image (that one actually comes from my home photo book), and simply run the predict script with those weights on that image, just to confirm that all the piping is configured the way it should be.
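Here is a sketch of that sanity check, assuming the YOLOv5-style script layout of the u7 branch; the weights URL is the yolov7-seg.pt release asset as I recall it (check the repository's releases page if it moved), and dog.jpeg stands in for whatever example image you use:

    %cd {HOME}/yolov7/seg

    # pre-trained segmentation weights
    !wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-seg.pt

    # run inference on a single image; results are saved under runs/predict-seg/
    !python segment/predict.py \
        --weights yolov7-seg.pt \
        --source dog.jpeg

    # display the annotated image (the exact exp* subdirectory is printed
    # by the script above)
    from IPython.display import Image, display
    display(Image(filename='runs/predict-seg/exp/dog.jpeg'))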
Fun fact: YOLOv7 is actually an unofficial fork of YOLOv5, and apparently the authors forgot to switch some logs from YOLOv5 to YOLOv7. I guess that's an Easter egg, especially for those who hold a very definitive opinion about which framework is better. We can display the results, and here you see me and my dog; let me make it a little smaller for the time being. What I immediately noticed is the quality of the masks we get from the model. Not long ago, I guess a few weeks back, we were testing one of the latest models covering instance, semantic and panoptic segmentation, OneFormer, and the quality of the masks we were getting from that model compared to YOLOv7 was amazing. The problem was that the model was super slow: it took between 5 and 10 seconds on a GPU to generate those masks. There is a reason why all those papers include graphs comparing the speed and the accuracy of models. It is a well-known fact that smaller models tend to have lower accuracy but return results faster, which is the primary argument for using them in certain use cases. If I want to build a real-time application, I cannot afford to wait 5 to 10 seconds for an inference result; I cannot even wait half a second, because I need 10, 20, 25 frames per second for the application to count as real time. On the other hand, some use cases are all about accuracy: if we are segmenting medical images, for example, I don't really care whether it runs in 5 seconds or even 30; I care about getting the highest accuracy possible. This is what we call the accuracy versus speed trade-off. By the way, if you want to take a look at that OneFormer video, I highly encourage you to do it; the accuracy of that model is really mind-blowing. You can find it by clicking the card visible in the top right corner right now.

Now the time has come to download the dataset. Like I said in the intro, we will work on a crack segmentation dataset that is available on Roboflow Universe. But before we start, let's talk about the data format required by the YOLOv7 segmentation model. It is quite similar to the typical YOLO format, except this time we are not working with bounding boxes but with segmentation polygons. As usual, we have images and label files: every image has a separate label file, and the label files are in txt form. In our case the images and annotations are divided into three subdirectories: test, train and validation. If we scroll a little lower we can see what a single label file looks like. As usual with YOLO, every row in the txt file is a separate annotation, and the elements in every row are separated by spaces. The first element in a row is the class ID of the given annotation; if we have 80 classes, like in COCO, those numbers go from 0 to 79. The next elements are the subsequent points of the polygon, and every point has an X and a Y coordinate. So that's what we get: class ID, X and Y of the first point, X and Y of the next point, and so on. The last part of the dataset is the data.yaml file, a simple file where we define the class names, the total number of classes, and the paths to each subdirectory: train, test and validation. That's it. The great thing about using Roboflow is that you don't really need to care about the structure of the dataset; you can export it in any popular format, one of which is YOLOv7, and I will show you how in just a second. By the way, if you want to learn more about different data formats and how they are structured, you can read about it on our formats page, which will be linked in the description below.
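To make that concrete, here is an invented example (file names, coordinates and class names are made up; the real dataset exported by Roboflow defines its own). Note that all coordinates in YOLO label files are normalized to the 0..1 range relative to image width and height:

    # train/labels/crack_0001.txt -- one polygon annotation per row:
    # class_id x1 y1 x2 y2 x3 y3 ...
    0 0.412 0.231 0.468 0.254 0.445 0.312 0.401 0.287

    # data.yaml
    train: ../train/images
    val: ../valid/images
    test: ../test/images
    nc: 1                # total number of classes
    names: ['crack']     # class names; ID 0 maps to 'crack'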
Enough dataset theory; let's download the dataset and train our model. The first thing we need to do is provide our Roboflow API key, since we will be downloading the dataset from there. I press Enter here and get a prompt for the API key, so let's go to Roboflow and sign in (if you don't have an account you will need to create one), then go to your account settings, Roboflow API, and copy the private API key. Back in our Jupyter notebook, paste it, press Enter, and we are almost ready to go. The second thing we need to do is provide the information about the dataset we want to use, in our case the crack image dataset. We hit Download and select the format; like I said, we offer many formats, but for this tutorial we are most interested in YOLOv7 PyTorch. Click Continue, and after a few seconds it generates a code snippet that we can copy, paste and be done with. I'm copying just those two lines because I already have the rest in my Jupyter notebook; going back there, I paste those two lines, press Shift+Enter, and the dataset should be downloading momentarily. By the way, I just noticed that one of the demos featured on the Roboflow main page right now is football players tracking, one of my recent projects. I absolutely encourage you to watch it; it's an awesome video in which you will learn how to train a custom YOLOv5 model and connect it to a tracker. You can watch it by clicking the link in the description below or the tab visible in the top right corner.

The only thing left to do is run the train script for around 30 to 45 minutes. From my tests it looks like 10 epochs is enough to get a really decent result on this small dataset, so that's what we'll do: make a coffee or a tea, chill a little, and we will be back soon. Thanks to the magic of cinema we don't need to wait long for the results; as I predicted, that particular training took between 30 and 40 minutes to complete, and we can now visualize the predictions of our newly trained model on a validation batch. Like YOLOv5, YOLOv7 saves a lot of information from every experiment, one piece of which is an image containing, I guess, 16 images from the validation set along with the predictions. Only one image looks a little off; for the rest of them I would say the predictions are quite accurate. Maybe if we trained a little longer the result for that particular image would be better, but for the sake of the tutorial I'm really happy with the result.

Now the last thing I want to show you is how to actually use our freshly trained model to predict on images or videos. This is quite easy, because we just use a different script: for training we used train.py, for evaluation we would use val.py, and here we use predict.py. What the script does is load the model into memory, go through the source (which can be, as in our case, a directory of images), run our model on each of those images, and save the results to another directory. Sounds easy enough, so let's just do it. It takes a little time to load the model into memory, but once it's loaded it's just like magic: 32 milliseconds per image. The actual results are, I think, quite okay; I'm really happy that the small crack got detected. Looks promising. For reference, sketches of those three cells (dataset download, training, and prediction) follow.
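First, the dataset download. The snippet Roboflow generates looks roughly like this; the workspace, project and version identifiers below are placeholders, and the real generated snippet fills in your own values along with your API key:

    # install the roboflow package and pull the dataset in YOLOv7 format
    !pip install roboflow

    from roboflow import Roboflow

    rf = Roboflow(api_key="YOUR_PRIVATE_API_KEY")
    project = rf.workspace("your-workspace").project("crack-segmentation")
    dataset = project.version(1).download("yolov7")   # dataset.location holds the path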
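Then training. A sketch of the command, again assuming the YOLOv5-style layout of the u7 branch; batch size and image size here are sensible defaults rather than values read off the notebook:

    %cd {HOME}/yolov7/seg

    # fine-tune the pre-trained checkpoint on the crack dataset for 10 epochs
    !python segment/train.py \
        --batch-size 16 \
        --epochs 10 \
        --img 640 \
        --data {dataset.location}/data.yaml \
        --weights yolov7-seg.pt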
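And finally, prediction with the freshly trained weights. The run directory (exp) gets a numeric suffix on repeated runs, so check the path the train script printed:

    # run the trained model over a directory of images and save the overlays
    !python segment/predict.py \
        --weights runs/train-seg/exp/weights/best.pt \
        --source {dataset.location}/test/images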
That's all for today. I hope you liked the video; if you did, make sure to like and subscribe, as that really helps the channel grow and gives me a little extra motivation to put more effort into the videos. Stay tuned for more computer vision content coming to this channel soon. My name was Peter, see you, bye!

Info
Channel: Roboflow
Views: 14,032
Id: vFGxM2KLs10
Length: 13min 54sec (834 seconds)
Published: Fri Dec 23 2022