Can we detect objects with the NEW YOLOv6 Model?

Video Statistics and Information

Captions
What's happening guys, how you doing? Welcome to the live stream. Today we are going to be going through YOLOv6, or at least beginning to get up and running with it. Now, I am absolutely pumping out content today: as you would have seen, there's another video that literally got released about ten minutes ago, so if you do want to check that out, I went through the AMA questions and answered around about half of them. There's still a bunch to go. Did your question get answered? Who knows, go and check it out.

So today we're going to be focused on taking a look at YOLOv6, but I also wanted to have a little chat with you about something that I noticed in a presentation I had today. I had a really big presentation to a potential client about using some machine learning capabilities in a particular space, with text-to-speech and speech-to-text, and you know what was funny: I am really excited and pumped about machine learning capabilities, and I think you can probably tell based on how psyched I am every time I do one of these videos, and I just let that flow today. I actually said to the client, "Mr X, I'm super excited about this stuff. I don't know if you are, but I'm really excited, because there's just so much you can do." It changed the complete tone of the conversation, because everyone kind of got excited, because it was really, really interesting. The reason that I wanted to tell you this story is: if ever you're going into a meeting with a client, or you're going for an interview, or you're doing a presentation, or let's say you're trying to get into your master's or something along those lines, bring the energy, because it's addictive and it wears off on people. So don't be afraid to be excited about what it is that you do and what it is that you're interested in, because that passion really shows. Anyway, I figured I really wanted to share that little snippet with you, because it was really interesting today that that sort of came through.

Alrighty, cool. What are we talking about today? Let's have a quick look at the chat. "I saw YOLOv7 was released, will you be covering it? What is the difference?" I haven't looked at v7, I haven't even looked at v6, so we're going to be taking a look at at least one of them; we can always go to v6. I liked Val Cube's comment: "makes a YOLOv5 GUI, YOLOv6 comes out." So the thing with all of these YOLO models is, and I raised this question a while ago and got absolutely obliterated, but I did it for you guys: YOLO is not a brand term. People are taking these different architectures and just calling them YOLO. They're still great, it's just that anyone can go and call their model YOLO, just like anyone can go and append a new version number onto a specific model. So don't be surprised if we end up at YOLO v25 by next year, because people can go and make improvements, release those, and call them the next version of YOLO. I was genuinely curious: is this the same model that was originally developed by, I think it's PJ Reddie, or Redmon? Is it the same model that's been developed by him, or has somebody taken up the reins and built it to the next level? I'm pretty sure the v5 version that we did a tutorial on was by Ultralytics; I'm pretty sure v6 isn't by Ultralytics, and for v7 it'll be interesting to see whether or not they have the same authors. But again, that's just a little nugget of information for you guys, because I was interested in that
as well, and I wanted to try to find out, because I think we made a video on it and I wanted to relay that back to you. I got the information sent on a Discord. Cool, cool. "Be great if you can make a video on burnout." I might do that next week; I think tomorrow I want to try to do something, but I don't know, we'll see what happens. "When will you be continuing the Python finance series?" I think a lot of people are interested in doing more stuff around Python and finance. I'm really keen to get back into it, and I know that I really want to go and build the machine learning trading algorithm, or do something in that space, actually build something. I have not released it, I haven't talked about it, and I haven't got it on GitHub, but I built an entire model and started integrating it into a live trading platform. I can't remember which live trading platform it was, but I never actually went and released it, namely because I wasn't confident enough in what the performance was like. So I'll probably come back to that once I finish the next three transformer videos, which I think are very close now. I want to bulletproof those tutorials before I get them out there; I'm taking a step back and making sure that that is tight, and then we're going to get into it. "Do you want to create it from scratch?" Maybe, we'll see. "I have to do my master's thesis on instance segmentation, do you recommend YOLO or Mask R-CNN?" Take a look at the different performance metrics for segmentation; I can't remember what actually performs best. "Hey, we are working on the sign language translator. We tried using MediaPipe but the duration of the sign had to be kept fixed. Can you please suggest any other methods?" I mean, that's best practice. You could try a sequence-to-sequence model; I'm not too sure how that would work, but you could potentially do something in that space, maybe.

Alrighty, we are going to give this a crack. My guess is we're not even going to get to detection today; this is really just going to be setup, exactly as we did for the voice cloner model, but we'll give it a crack and see how far we get today. The nice thing is that I'm just going to use the voice cloner environment, or rather the PyTorch environment that we set up for the voice cloner, to be able to start with YOLOv6. Alrighty, let's kick this off. Okay, so I'm just going to go and grab the repo; I think it was on GitHub somewhere. YOLOv6, let's see. All right, interesting. So YOLOv5 was published by Ultralytics; YOLOv6, let's see who's authored this one. And then if we go to YOLOv7, it's someone different again. There's a whole bunch of YOLOv7 repos, so that's just something to keep in mind. I'd have to actually go and dig into it a little as to what the differences are, because it kind of depends on which implementation you use as well. I don't necessarily know if there is an officially agreed-upon repository that is classed as YOLOv5, YOLOv7, and so on and so forth, but that's something to keep in mind.

All right, should we take a look at YOLOv6 or YOLOv7? I mean, we're dynamically changing on the fly. So, for whoever asked about instance segmentation: it's important to take a look at your performance metrics, right? Because this chart is giving you the COCO mean average precision for each one of the different models, plotted against the, what is that, batches? So this is how fast it's able to detect
versus the mean average precision. So you can see that YOLOv6 is obviously very quick, that's your v6 small, and what's that? MS COCO. What are we looking at here? Mean average precision dropping versus batch speed... oh, frames per second. Okay, got it, interesting. What's v7 doing? I don't want to do v6 versus v7, it's part of the discovery. YOLOv7 is faster, but is there a comparison to YOLOv6 on here? Does not look like it. Let's just use YOLOv6 for now. I hope I have a camera plugged in that is going to allow this to work. Anyway, YOLOv6.

All right, so we are going to get this started. Let's go and take a look at how we can run this: we need to clone the repo, cd into it, and then install the requirements. So let's kick this bad boy off. I'm going to open up a command prompt and go into my D drive. I already created a repo inside of my YouTube folder, which is where all the magic happens. Let's make this bigger so you can see it. Inside of the voice cloner project I'm pretty sure I've got a PyTorch environment, or a PyTorch installation, and I don't want to have to go and install it again because it takes a ton of time. So we're going to cd into there, cd 12, and then we're going to activate it: .\voicecloner\Scripts\activate. This is really just based on a virtual environment we created in the voice cloner tutorial, so you can take a look at that in terms of creating a virtual environment. I'm going to clear that to make it a bit cleaner, and then what are we going to do? We are going to clone this, so let's run these commands: git clone https://github.com/meituan/YOLOv6. Cool, all right, that looks good. cd into YOLOv6. I'm just going to open up this folder and let's take a look at what it's going to install: torch, torchvision, numpy, PyYAML, scipy, tqdm, addict, tensorboard, pycocotools, onnx, onnx-simplifier, and thop. I've never heard of thop, that'd be interesting. Yeah, what version of PyTorch should we have in here already? So if I run pip list... we should be all right, we've got 1.12.
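For reference, the rough sequence of commands used here and over the next few steps looks something like this (a sketch only: the virtual environment name and folder layout are specific to this machine, and the checkpoint filename assumes the YOLOv6s weights downloaded a little later in the stream):

    :: activate the existing PyTorch virtual environment from the voice cloner video (Windows)
    .\voicecloner\Scripts\activate

    :: clone the YOLOv6 repo and install its dependencies
    git clone https://github.com/meituan/YOLOv6
    cd YOLOv6
    pip install -r requirements.txt

    :: drop a pre-trained checkpoint (e.g. yolov6s.pt) into this folder, then run inference on a single image
    python tools/infer.py --weights yolov6s.pt --source img.jpg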
Okay, cool, that is good. So let's go and run this next command; I'm basically just going through the readme right now: we've cloned it, proceeded into it, and now we're going to run pip install -r requirements.txt. "Hey yo Nick, do you own a rubber duck?" I've actually wanted to go and buy one. Is that from, what is it, CS50, where they talk about speaking to a rubber duck when you're writing code? Well, to be honest, I'm talking to you guys as I'm coding, so that definitely helps me work through my ideas. One Pod Dish asks, "Did you learn machine learning in university or did you learn it by doing?" So I started my master's in data science, and to be honest I put very little effort into it, and eventually... well, I smashed the absolute crap out of it, or at least the subjects that I did do I went really, really well in, but I was like, this is not really for me, because a lot of the stuff that I wanted to do wasn't being included in the course. So I was like, you know what, if I put the same amount of effort into my own self-learning as I was putting into my master's degree, I'd probably get way better benefit out of it, and that's exactly what I did. If I had finished my master's, I don't know whether or not I'd still be doing YouTube, because I'd be like, I've done my master's, I kind of have a solid grounding, let's just go out and make some money. But now I've sort of built this habit of continuous learning, which sort of brings us here.

Okay, we have gone and installed our dependencies. We've got this warning here, but that is perfectly okay. All right, so now we can do inference, and this is going to do inference on a source image, so let's try that. I wish I could make this text smaller... let's actually move it. If anyone's got a suggestion for a replacement command prompt, now is your time; we're going to be doing more command prompt stuff. That's probably a bit better, and bring that there... oh gosh... all right, that's probably about as good as it's gonna get. We can see what we're doing now. There we go, that's about as good as it gets.

All right, so what do we need to do now? We need to run that command down the bottom. So we are going to run this line down here: python tools/infer.py --weights yolov6s.pt --source img.jpg. Is that meant to be separated onto two lines? I'm guessing we just run it as one, right? Because we just want it done on a single image. Okay, so what is it saying? We don't have the weights. Do we need to go and download the weights separately? YouTube, and then if I go into voice cloner, YOLOv6... what's happening here? Assets... that looks like some sort of detection output. Configs... that looks like just configuration for the YOLOv6 models. Data? Nope. Layers? Nope. Models? Nope, these are all the Python components. Quantization, runs... do we have anything in there? "First, download a pre-trained model." Oh, okay, hold on, so we've clearly skipped a step. Let's quickly read this again before we skip any more steps. All right, so I've gone and installed... "First, download a pre-trained model." Yeah, okay, so we need to go and download the weights. Let's just run it on YOLOv6s. YOLOv6s, YOLOv6n... we're getting the 6s, so we are going to want this bad boy over here. All right, so that's downloading, 18 seconds. So presumably what we're going to want is to grab that out of our
Downloads and throw it into the folder that we're working with, YOLOv6. Yes, all right, let's try that. Looks okay. Okay, we don't have an image, so let's get an image now. Soccer match image... grab this one, save image into D, and I'm going to put it in the same folder where I've got the actual YOLOv6 model. So let's just change this... soccer, we'll save that. All right, so we've got soccer there, beautiful. Let's test this out now. So if I go and run it... oh, can you see those commands? Probably... .jpg, yep. Does it say where it outputs? Presumably it's in here. There we go, boom, take a look at that. So I'm guessing this is just trained on the COCO dataset, but there you go, there's another object detection model that you guys can test out, which looks like it's very highly performant.

All right, so let's quickly recap on what we've done so far. In order to get this up and running, you basically just go through the GitHub repo: clone down the repo, cd into it, run pip install -r requirements.txt, and then from here you can run this inference command. Before you do that, you have to go and download a specific set of pre-trained weights, so you can either download yolov6s.pt or yolov6n.pt from this link over here, and then you can go and run it on an image. I wonder if there's any documentation for running it on a webcam. I think we will do custom data, but not in this live stream, for anyone that's thinking about it.

All right, I'm going to get back to the chat in a sec, but at least this shows you a quick way to get up and running. So, source... let's go into that: python tools/infer.py. I don't know how well this is documented, but if I go to tools, infer.py, remove this... so the device argument is whether or not to run it on a GPU... let's find source. Here we go, data/images. All right, webcam: anyone got any ideas in the chat on how we can use this in real time? I know that it's possible because I've seen it; we might need to dig into this one tomorrow. Classes, class-agnostic, save inference results to a project, save directory... and they do use half precision. So if I go to --source... we will need to dig into this. I wonder if there's anything in the GitHub repo: training, evaluation, resume, tutorials, train custom data. But at least that shows that it's relatively simple to get up and running. I know that real-time detection, or fine-tuning, can always be a bit of a pain, but let's take a look so we can at least prep ourselves for tomorrow. So to go and train, we are going to need to run python tools/train.py, specify the batch size, specify the config, i.e. which model we want to train, so that's okay, and then the data needs to be inside of a YAML file. So do we have a sample YAML file? If we go into YOLOv6, data... here we go. Oh, so it's the exact same structure as the v5 training. We can do this easily. Not today, because your boy needs to go out to date night, but we can definitely get this done. All right, I think we'll knock this out tomorrow, but that's it in a nutshell: at least YOLOv6 is up and running. Let's do it on another image because I don't want to leave you hanging too much. Give me some images that you want me to use. "--source 0 means webcam." Alright, cool, god I love coding with you guys. Source zero... wait, is that going to mean it's going to detect... let's try it. Invalid path: source 0. What if I try source 1? No. You guys are saying source zero...
--source... get rid of the filters, touch nothing... YOLOv6... let's go to Google: YOLOv6 webcam. How did we do it for v5? I wonder if the implementation is similar. Yeah, I think the number is... for v5, check my webcam source number. I don't have OpenCV installed, do I? We do, okay, hold on. All right, let's just quickly write up a Python script; we've got five more minutes before I've got to bail, so let's see if we can write this quickly. File, new file, Python. Yeah, so import cv2, then cap = cv2.VideoCapture() and pass through the device, while cap.isOpened(): cap.read(), and if cv2.waitKey(1) is 'q'... and then what do we need? cap.release(), cv2.destroyAllWindows(). Let's save it here, test cv2 window. We need to deactivate this environment, jump back to the drive, cd YouTube, cd 12, activate this environment, voicecloner\Scripts\activate, and then what are we doing? We're cd-ing into YOLOv6, then python... what is the script called... python test... god, why did I make it such a long file name? What am I doing here? I'm guessing I've got the wrong source. No... all right, so it's clearly not video capture device zero. What about one? Oh, you know what, we're not displaying it either, hold on, I screwed that up. cv2, cap.read()... cv2 dot... what is it, window? God, it's been a while since I've messed around with cv2, OpenCV. The number of times I go back to my own GitHub repository to work out how to do stuff is hilarious, guys. At least you're seeing what real development looks like: I can remember a bunch of stuff, and then there's always one thing where it's like, cv2 dot... imshow, god, how did I forget that one. And I forgot to do this: comma frame, cv2.imshow, and then the window name and then the frame. Just check the rest while we're here: yep, that's fine, cv2.waitKey, ord('q'), I think we should be okay. Yeah, see, I normally use video capture device zero. Getting any pop-ups? No, it's not capture device one. I don't even know if I've got the other webcam plugged in, who knows. We might need to mess around with it and get that up and running eventually, but yeah, I failed with zero and one, let's try two. I'm gonna be late for dinner, aren't I? Oh, okay, so it's two? All right, so we've got that up and running... hold on. All right, let's try video capture device two... no, see, it's not picking up that device. We're going to need to dig into how to actually get into the webcam, but we will find out, that's fine, we'll get it up and running, don't stress guys, I'll definitely dig into it.

Anyway, on that note... oh wait, I said I was going to do one more example, didn't I? Let's do one more. What's another... traffic images, let's see. These are always good for doing tests. I'm gonna save it in the YOLOv6 folder as traffic, and then let's try that, and I can just type in traffic.jpg. So you can pass through the source; we've just got to double check what we need to use for real-time detection. So if we go into runs/inference/exp, there's the traffic detection. That is surprisingly pretty good; it's missed this region up here, but I mean, pretty awesome, right, in terms of how quickly we were able to get that up and running. That is absolutely brilliant. Yeah, I figured it's not used directly... no, that is a webcam, but... all right, let's jump over to chat, but for now that is at least the first beginnings of YOLOv6 explored. So I think what we're going to need... oh no, we've unplugged that camera. It's always fun screwing around with live streams and then the camera not working anymore. Let's see... that's back working now.
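For anyone following along, here is a cleaned-up, runnable version of the quick OpenCV webcam-probe script written above (the device index and window name are arbitrary; the whole point of the script is to try different indices until one shows a picture):

    import cv2

    # index of the capture device to try; on the stream neither 0 nor 1 picked up the right camera
    DEVICE_INDEX = 0

    cap = cv2.VideoCapture(DEVICE_INDEX)

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # show the current frame in a pop-up window
        cv2.imshow("webcam", frame)
        # quit when 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()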
Oh, my camera's frozen. All right, we're back, we are back. All right, cool. Anyway guys, thanks so much for tuning in. Let's quickly go through the chat. Thank you so much for helping me debug, but I think we're going to need to delve into this a little more to get the real-time detection working. We will do it, I'm sure; I'm pretty sure I've done it with YOLOv5, and I can't imagine it's too much different, so we'll probably need to work through that. "That big water glass is nice." It is from one of my favorite breweries in Australia; it's called Hope Estate, in a place called Hunter Valley, and they make some awesome beers, it is pretty awesome. "Hey Nick, the error is likely because you're using the webcam with OBS, so it'll be better if you used an external camera." So I've actually got two webcams; if you go and watch the battle station breakdown, I've got two webcams, one to do the live streaming and another one to do computer vision on the fly. I don't know, we're gonna have to dig into it. "Treat the girlfriend to a nice dinner." Yeah, that's right, I think she's probably gonna be finishing up work soon, so we'll go and say hi. "Someone help me with the installation of pycocotools." I think the docs for that are actually pretty good, but we will get this up and running. "When to use underscore?" I'm not sure what that's in reference to. Yeah, alrighty guys, cool, we'll wrap it up. Also, just before I forget, there is a new YouTube video out. I know we were doing the AMA... let me share, this is so meta, I'm going to my own YouTube channel, but I just released a video answering all your deepest, darkest data science questions, so if you haven't checked it out already, go and check it out if you want to hear me answering some data science questions. Anyway, thanks so much for tuning in guys, I love you as always, I will see you in the next one, I'll catch you later, peace, have an awesome night, stay coding, be good.
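For anyone prepping for the custom-data training mentioned above: the training entry point is python tools/train.py with a batch size, a model config, and a data YAML, and that data file uses the same structure as YOLOv5's, so it would look roughly like this (paths, class count, and class names below are purely illustrative):

    # dataset.yaml - point these at your own images and labels
    train: ../custom_dataset/images/train
    val: ../custom_dataset/images/val

    nc: 2                       # number of classes
    names: ["helmet", "person"] # class names, in label-index order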
Info
Channel: Nicholas Renotte
Views: 6,842
Keywords: data science, machine learning, deep learning, python
Id: b-jR2dUFnCg
Length: 34min 25sec (2065 seconds)
Published: Thu Jul 14 2022