What I Do as a Computer Vision Engineer | Step by Step Guide

Captions
Hello everyone, my name is Nachiketa, and welcome back to another video. In this video I'm going to talk about what I do as a computer vision engineer. This field has been rising lately: AI and video AI are getting a lot of hype, and there is a lot of demand for computer vision engineers who are good at what they do. As a fresher, or as someone who has not worked in this field, you could have a lot of doubts about what someone actually does in a full-time job as a computer vision engineer, what the typical requirements are, and what the typical responsibilities are. So I'm going to talk about what computer vision is, what I do as a computer vision engineer, what skill sets are typically required for this job, and some basic tips that can help you get started in this field.

To start with, what is computer vision? Computer vision is a field of science that helps a camera interpret video the same way a human does. For example, if you show me a photo of a dog or a cat, I can understand what it is and what the context is. But when a computer sees this image, all it sees is pixels. It does not have any understanding of the image. Building AI algorithms and working on image processing so that cameras can perceive images the same way humans do is what computer vision is all about. There are different forms and uses of this. A self-driving car is one application of computer vision, where the car uses a camera to interpret roads the way humans do. Depending on which company you work in and what its main product is, there can be different ways of using computer vision.

As for my specific role, I work at Awiros, which is a video AI operating system and marketplace. Like you have Android and like you have the iPhone, Awiros is an operating system for computer vision applications. And like you have a Play Store or an App Store on your phone, we have the equivalent of that, which we call the Awiros App Stack. This is a collection of the video AI applications that any customer would need, and as a computer vision engineer my job is to build these applications for the end customer.

To give you an example, we have an application for camera health check. I'll stick to one application for now and show you what the process is. Let's say we get a requirement from a customer that they need a camera health check application. This basically means that if a camera is being tampered with, if someone is blocking the camera view, let's say a robber at a bank, or if someone is changing the camera view or flashing a light onto it, then the user should get a notification. That is what we are supposed to build. Based on this application, we have to work on pre-processing the data, apply different data augmentation tricks, take care of data imbalance, and then build AI models or image processing algorithms that can accomplish the given task. My end goal is a production-ready application that a user can use.

To show you how this happens in practice at my company, this page is what we call the Awiros App Orchestrator. The row at the top represents different video AI applications, and the column at the left represents different videos or camera sources. The way you run a video AI application as a user is that you select the cells corresponding to what you want to run. The cells I selected map to the camera health check application, which will run on a particular video source in this case. So a user selects these cells and clicks the start button, and in the backend the frames from the camera come to my algorithm. The algorithm processes these frames and makes sure that the camera health check application works as desired. The end result is that whenever the user runs this application, they should get a notification whenever the camera is being tampered with.

Let me give you a little bit of context. Whenever I get a problem like this, we have some data set that we work with. The data set can be acquired online from available data sets, it could be a synthetically generated data set, it could be a data set that you get from the client, or you could generate it manually yourself using a camera. What I did was take a particular camera and simulate various instances of a person tampering with it. If you have a different video AI application, there will be a different process for generating video scenarios for that particular problem. This is the stream of the camera I used. I went into the field of view and blocked the camera. As you can see, right about now I am covering the camera with my hand, and I do different kinds of tampering. Right now I'm flashing a light onto the camera. These are the different things a typical intruder would do to block the camera, like covering the camera with a cloth or moving the camera to a completely different field of view. This video served as a test video after I had made the algorithm, just to test whether it works as desired or not.

So the end result is that you get the camera stream, you run your algorithm, and it gets the desired results. These are some notifications that come up on the platform. The engineering part of that is something I'm not discussing here: how it actually happens in the backend, how I get a frame, how it is processed, how it goes back to the front end. The main point is that I get the camera stream, I process it, and there is a way of pushing the alerts back onto the UI so that the end user gets notified. For example, in this particular screenshot I can see the camera view is being blocked, and it says over here that the camera is tampered. So this is the end result.

This was just one example of an application. We have different applications for different use cases and different customers. As computer vision engineers, our job is to make the video AI application, push it into production, and make it scalable. These two parts, making a video AI app and making it scalable, require two different skill sets, which I'll talk about.

First, making the video AI application. This could be for any purpose. Let's say you are using a camera to read the number plate of a car, or you're using the camera to count the number of boxes on a conveyor belt in a manufacturing factory. The application could be anything. As an engineer, your task is first to acquire data sets, then work on different pre-processing techniques and data augmentation to make the data ready for model training, and then use different computer vision models and frameworks like TensorFlow, PyTorch, or Darknet to train deep learning models on that data. Sometimes it might not need any deep learning at all; it might be possible with just basic image processing. But the point is that you do A/B testing, which means evaluating different models and different algorithms against each other to get results.

Once that's done, the next thing is pushing it into production, which requires different skill sets from software engineering. For example, once I build an application, we have to make sure that it's compatible with different software and hardware architectures, whether it's running on CPU or GPU. We have to containerize it and monitor the usage: how much RAM it is consuming, how many GPU cores it is consuming. We also do load balancing across different servers to make sure that the customer is able to use the application with the minimum amount of processing and, in turn, the minimum cost.

Now let's talk about the requirements, the skill sets you need if you want to work as a computer vision engineer or if you're interested in the field of video AI yourself. Again, it differs from company to company based on the product they are building, but the general skill set is this. You need to have the basics of machine learning, deep learning, and computer vision. You need to be good at programming. The programming language can again differ from company to company, but it's normally either Python or C++. In my role we use both. For example, we train models in Python, but to ship them into production we need to integrate them with C++ to optimize speed. So I would say first pick one programming language, let's say Python, and once you become proficient in that you can start working on C++ as well. At most companies, if you're good at one programming language and you're good at algorithms and data structures in it, they will give you a chance, because picking up another programming language is not that hard if you're good at one.

So you need an understanding of programming languages and the basics of data structures. No one will ask you to invert a binary tree, typically, in startups, but they will expect good knowledge of the basic data structures, the inbuilt data structures of different languages, and how to access and manipulate them. Another important thing is knowledge of deep learning libraries like TensorFlow, PyTorch, and Darknet, and of OpenCV. These are the libraries you would use to process images and train deep learning models, so that knowledge is important. Start with one or two frameworks and become proficient in them. The software engineering part requires knowledge of Docker and Kubernetes, some cloud platform like AWS, and web frameworks and micro web frameworks like Flask or Django. These are some of the things you would need to push your model into production.

If you're a complete beginner starting from scratch, I would recommend going step by step through the basics of machine learning, deep learning, and computer vision. Do not skip directly to computer vision or directly to deep learning. Start with machine learning first, because the basic algorithms of machine learning, let's say linear regression or logistic regression, are the most fundamental. Without them, deep learning is going to be hard for you to understand, or you won't be able to understand it very clearly.

Next, once you have the basics cleared, work on basic computer vision projects. The most basic project is classifying handwritten digits using the MNIST data set: you get an image of a handwritten digit and you predict whether it is a seven, a four, or some other digit. That's the basic project. Then slowly move to more advanced projects. Broadly there are three types of problems: classification, where you're distinguishing whether it's a cat or a dog, or a seven or a four; object detection, where you're detecting objects in a frame; and lastly segmentation. So work on one domain and then slowly take on more and more projects.

The fourth thing is having some knowledge of pushing things into production. If you make a machine learning model, know how to integrate it with web pages using frameworks like Flask or Django, know how to containerize your models, and know how to deploy them onto platforms like AWS or GCP. These are some areas you can start targeting.

If you want to target jobs in this field, I've made a video on that previously as well. Go to the AngelList website (I have a link for that in the description), search for computer vision engineering jobs, and then reach out directly to recruiters and talk about what projects you have done and how they fit into the product that the company is building. That is a good approach, and it worked for me as well. If you're interested in the work I am doing, my company Awiros is also hiring computer vision engineers, and you can find a link for the job in the description, so you can apply if that is what you're interested in.

That was all for this video. I hope it was informative and gave you some insight into what a computer vision engineer does. If you liked this video, do like it and subscribe to this channel, and I'll see you again in the next video.
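The transcript notes that some tasks may need only basic image processing rather than deep learning. As a rough illustration of that idea applied to the camera health check, the sketch below flags a frame as blocked or flashed using average brightness and a simple texture measure. It is not the algorithm used in the video, which is never shown; the function names, thresholds, and tiny synthetic frames are all hypothetical, and a real system would read frames with a library such as OpenCV and tune the thresholds on recorded tampering footage.

```python
from statistics import mean

# Hypothetical thresholds for 8-bit grayscale frames; real values would be tuned on data.
DARK_MEAN = 40      # below this average brightness the view is probably covered
BRIGHT_MEAN = 215   # above this average brightness the sensor is probably flooded with light
LOW_DETAIL = 2.0    # below this average neighbour difference the frame has almost no texture


def brightness(frame):
    """Average pixel value of a grayscale frame given as a list of rows."""
    return mean(p for row in frame for p in row)


def detail(frame):
    """Average absolute difference between horizontally adjacent pixels."""
    diffs = [abs(a - b) for row in frame for a, b in zip(row, row[1:])]
    return mean(diffs) if diffs else 0.0


def check_frame(frame):
    """Return 'blocked', 'flashed', or 'ok' for a single frame."""
    level = brightness(frame)
    texture = detail(frame)
    if level > BRIGHT_MEAN and texture < LOW_DETAIL:
        return "flashed"
    if level < DARK_MEAN and texture < LOW_DETAIL:
        return "blocked"
    return "ok"


# Tiny synthetic 4x4 frames standing in for real camera frames.
normal = [[10, 200, 30, 180], [90, 20, 160, 40], [15, 220, 35, 190], [80, 25, 150, 45]]
covered = [[5, 6, 5, 6]] * 4
flooded = [[250, 251, 250, 251]] * 4

for name, frame in [("normal", normal), ("covered", covered), ("flooded", flooded)]:
    print(name, check_frame(frame))
```

A heuristic like this would miss the case where the camera is moved to a different field of view, since the new scene can be well lit and textured; detecting that would need a comparison against a reference frame or a learned model.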
Info
Channel: Nachiketa Hebbar
Views: 16,114
Keywords: computer vision engineer, video ai engineer, AI Jobs, how to become AI Engineer, startup jobs, what does an AI Engineer do, computer vision requirement, ai job requirement, AI in production, video ai requirement
Id: mUnPZLcyGeg
Length: 12min 26sec (746 seconds)
Published: Sun Jul 31 2022