Labeling images for semantic segmentation using Label Studio

Captions
Hi everyone, I'm back with another video on the topic of Python tips and tricks. In this one I'm going to talk about how you can properly label your images for semantic segmentation, and I'm going to focus on Label Studio as the tool for this labeling. I'm doing this video because many of you asked for exactly this: how to generate ground truth labels for semantic segmentation using your own images, because you're tired of working with public datasets and at some point you want to work with the data you have in your hands. So let's jump in. I'm going to talk about how you can install Label Studio, via Python of course, and how to get started.

A couple of notes, in case you don't already know. First, labels are generated by subject matter experts. If you are labeling roads, most of us are subject matter experts, because we know how a road should look in a satellite image. If you are trying to label biological images, you need to know what mitochondria look like and what the different organelles look like in your images. Or if you're working on geological samples, you should know how clay looks different from hard minerals versus soft minerals, and so on. So you get the point: labeling is done by a subject matter expert. Do not just share your data with anyone who is offering labeling services, because then you are going to spend a lot of time trying to explain to them exactly what the different labels are. For certain tasks, though, it may be easy for anyone to be an expert.

And labels: why are we doing this? Because they provide ground truth information for supervised machine learning. When you're doing your segmentation, for example using U-Net or Mask R-CNN or any other approach, these are all supervised machine learning algorithms that require ground truth information along with your original images, and this is exactly what we are trying to generate using these
labeling or annotation tools.

Now, one other thing I want to remind you of: the label values for a given class must be unique. If you have a tissue with mitochondria, the tissue pixels should all have a value of, for example, 0, and the mitochondria should have a value of 1 or 255, whatever it is, but every pixel corresponding to mitochondria should have that same unique value. That's for semantic segmentation. For instance segmentation the classes still have unique values, but each object has a different value. For now I'm going to focus on semantic segmentation.

Please be aware that when you save your labeled images, whether you use Label Studio or any other technique, the painted images must be saved as binary and not interpolated between regions. What do I mean by that? Let's say you have an image that looks like this, which is exactly what we're going to use as part of this exercise. This is the original input image, and you go ahead and paint your pixels: you paint all the roads in red, you paint houses in green, and you paint something else in some other color. Eventually you would like to export them, so you hit export and you get a labeled image that looks like this. This is what you're going to use as part of your machine learning or deep learning pipelines. Now, when you zoom into this little region, if the image looks like this, where instead of a hard edge you have other gray levels, these are the wrong labels: you have a value of 255 in here, but around these regions it's interpolating between your dark area and the bright one. This often happens when you save your images as PNGs and other formats, and also if the toolkit has a bug where they haven't focused on not interpolating this, using some sort of bicubic method. I mean, whatever tool you're using, they have
to make sure that, when you save these as NumPy arrays or images, this interpolation is disabled. By default, unfortunately, this will be enabled, so please make sure that this is not the case. If it is, I'll show you how to take care of that as part of this tutorial. So how should this look? It should look like this: on one side I have pixels corresponding to the road, and on the other side I have pixels corresponding to not-road. You should always have this hard edge. That's just one caveat.

With that, let's go through the process of getting Label Studio ready, then annotate a few images, and then take care of any issues that may occur with these images. I hope you'll continue watching this video, so let's jump in.

Okay, first, just to show you a couple of places where you can get your hands on these annotation tools. In this video I am going to focus on Label Studio because it's free, and I have been using it on my system for a couple of months already. I haven't explored every little bit of it, because it can do a lot. If you are from an enterprise, a place where you have multiple users, you may benefit from their paid version, because then you can have different users working on the same datasets and so on. But I'm going to show you the core of what Label Studio can do for a personal researcher or an individual user, without any collaboration capabilities. That's the goal for today. If you can pay, there is another one: Labelbox is something you may have already heard of. It has a paid version; I'm not sure if it has a free version. I have no experience with it, but I just want to highlight that this is another tool people often use, for biological images or for any of these other tasks.

So let's go back to Label Studio, and if you are not just interested in image
analysis but in some other applications, this can do pretty much everything. You see, you can do semantic segmentation labeling, you can do labeling for object detection and image classification, and you can also work with audio files, where you annotate your audio. Of course you can also use it for natural language processing, for time series, and for other examples, and they walk you through by giving you the right examples once you log in. That's what I appreciate about this software.

So let's see how we can get it onto our system and get started. It's as simple as a pip install and then launching it, but a couple of points before we get there. I added all of that information to this Python file that I'm going to share with you, so don't worry about writing anything down as you're watching this video. First things first: to get Label Studio we need to pip install it. I usually do not install into my main environment; I create a new environment, so let's go ahead and do that. You can use the Windows command prompt, but since I already have Anaconda installed, I am going to open the Anaconda Prompt and work from there. From now on, just keep an eye on this screen; I'll zoom in digitally in post-production.

Let's start with what environments I have. Some of this is probably pretty basic for you, but it's worth going through. conda env list gives us the environments we already have, and you should see label studio as an environment right there, because I was testing it out, and this is in fact the environment that I work with. So let's go ahead and create one. I'm going to copy this instead of typing the whole thing, paste it here, and give it a name; let's call it tutorial. There you go. So normally,
when you create an environment you just do conda create --name and then whatever name you want to give. I'm adding pip because as soon as I create the environment I want to be able to use pip to start installing packages, especially, obviously, Label Studio. So let's go ahead and run this. It shouldn't take that long; let's see how long it takes. It should ask me a yes-or-no question. There you go, let's continue, and I'll pause the video for a few seconds. And as soon as I said that, it's done creating the environment. In fact, it's the pip install that should take a little while, maybe a minute or so.

First of all, let's change the environment from base to the one we just created. So do conda activate, and what are we trying to activate? We called it tutorial, so let's do tutorial. Now that we are in the right environment, let's do the pip installation. It is pip install, let me copy that, there you go: pip install label-studio. This is where I'll pause the video for about a minute, and I'll continue after that.

Okay, there you go, it's finally done. Let's clear the screen. All we need to do to start Label Studio is type label-studio, and it should open a browser and go directly to this page. If not, open a browser and type this address in. For now let's see if it's going to open. For me the default is Microsoft Azure, sorry, the Edge browser. Allow access, and there you go. You'll see a couple of tests that I was doing, but in your case you should see a different screen, a welcome screen saying you have no projects, go ahead and start a new project. So I have my projects right here; let's go ahead and create a new project. This is the type of screen you should see when you first log in, and it's pretty intuitive. You can just
say, okay, my segmentation tutorial. You can add a description to keep track of your project, and once you give it a name, go ahead and import data. In fact, we should have talked about the data while it was installing. I have just these three images, and I would like to annotate them and save the annotations as PNG labels. I will also show you what happens when you save labels as NumPy, just so you know what it looks like; obviously those would be NumPy arrays. So let's load these three images. I'm going to click and drag. It should be pretty fast, because I'm not uploading them to some cloud location; they're right on my drive. This is localhost, on your system itself.

Okay, now that I have this, let's set up our labeling scheme: how do we want to label these images? It walks you through, which is what I like about Label Studio. I'm in computer vision, but if you are doing some other task, select that instead, whether you're annotating videos or time series. Let's stick with computer vision. Within computer vision these are the different ways you can label, and probably the most common formats are the first two. I am a big fan of the second one, which is what I normally use: I have a paintbrush, and you just brush over the object you are trying to label. So let's select that. It is, of course, loading a test image, and you have these two classes. First things first, let me delete both of those and add my own labels. In my case I have houses and roads, and let's also do water. These are the three classes that I use in my code later on. Why is that important? Because I've written this code to handle the data after labeling, so I don't want to miss anything here. Okay, so houses, roads, and water are the three
things that I would like to annotate. If you want, you can change the color, for example of my water right there, and I can give houses a different color; let's give it green. Then there's "allow zoom using Ctrl plus wheel". I like to zoom into images when I'm annotating, so why not enable that; it's enabled by default anyway. I think that's good. You can go back and browse the templates and everything, but this is a good start. Now let's save it and go to our images.

Let's look at the first one. This is the one I showed you earlier, so let's start annotating. Initially, when I downloaded this, I started painting and thought, I don't know what's wrong, it's not painting. If you scroll down you can see the classes, and you need to select one. So let's select roads right there. I'm zooming in using Ctrl and the wheel, and this is where you control the size of the brush. Right now the brush is that big, and if you want it smaller you can do it this way. Obviously, in your case you should be more careful, especially near the edges; I'm just showing you. Normally when I annotate, I go all the way down in size to make sure I get the edges perfect. You can go one up and make sure the edges are perfect. But for now let's be a bit sloppy, because I want to show you the philosophy behind this rather than focus on accuracy. Although the title of my video is how to do this properly, I'm showing you how not to do it properly, and then how to do it properly later. Okay, that's good enough. This may also be a road, so let's paint that. I'm done with that, so for now let's go ahead and submit this
and it initiates the process. Now I'm going to change this to houses and select a few houses right there. This is where I need to make sure the pen is not that big. Oh, that is pretty big, but fine, I just want to show you the process anyway. So here are the annotations I have for my houses. Oops, wrong way; let me go back, zoom out, and do this. Obviously, do not do this. Make sure you spend enough time annotating these images. Use polygons if you want, because that gives you much more control: you can adjust the polygon very closely, and so on. Anyway, that's enough for now; let's update it. Now let's do water. Do we have any water anywhere? I don't know; this is probably water, so let's select water and paint it. We have just one area that is water, and we are fine. I'm done with this image, and now you can go to the other images and do exactly the same. Let's leave it right here; you get the point.

What do you do after you're done? There are advanced things you can do: you can take the API from here and connect it to yours, though I think you have to pay for that. Normally my workflow is just downloading these. So once I'm done, I go to export. First let's export them as labels to PNG. There are various formats, of course: COCO format, for example, which can be a good way if you're doing object segmentation, and CSV, TSV, and JSON files. It's up to you what you want to do. I like either NumPy or PNG for semantic segmentation; that's not a bad way to start. If I go to my downloads, this should be the latest one right there, so let's get these images into our PNG folder. Okay, there you go. So we
have our labels, and when I open them you should see this is the house that we painted, this is the roads, and this is, I think, the water. So these are the labels we have right now. Let's open another image and just do a couple of annotations. For water: that is water, and that is water. I think we are done with water for now; submit. Let's take a road and do that; okay, that's our road. And why not, let's do a few houses right here: that's one, that's a big house, that's another, that's a house, that's a house. Okay, I'm done; update.

Now let's go back and export as PNG again. It downloads everything into a zip file, so you don't have to worry about handling thousands of individual files. Let's delete the ones we imported before, no, not that one, sorry, this one, there you go, labels as PNG. Let's delete the ones we had before and bring in these ones. Now you get the idea: there is task 9 annotation, and you see how it says houses, then task 9 annotation roads, task 9 annotation water, and for the next one, task 10 annotation houses, roads, and water. Our goal is to collect all the ones that say houses, all the ones for roads, and all the ones for water, and then stack them into individual NumPy arrays.

By the way, you can download NumPy right there. The export is pretty much the same; instead of .png it will be .npy. Since we started the process, let's finish it off. I think it's the latest one right here. Let's go back to our NumPy folder, delete the ones from before, and add these ones. So these are all our NumPy arrays: basically the same
information. You don't have to do both; pick one, it doesn't matter. And these are the PNGs. Both are identical, so there's no reason for me to suggest one over the other.

Okay, so far what did we do? We annotated in Label Studio and downloaded the PNGs and also the NumPy arrays. Now let's see how we can open them in Python and check whether there are any issues, and I'm bringing that up because there will be issues. I'm going to import our standard libraries, nothing tricky there. Let me move this a bit to the right. These images were from yesterday, when I was trying to come up with this code, so let's copy the name of one of these images to open as a test image, so we can study how it looks. And for NumPy, let's open pretty much the same array, the second one: task 9 annotation 6 roads 0, that one. Sorry if some of this is a bit boring, but it is necessary.

So let's import the image. I am using scikit-image io to read it, and there you go. If you don't believe me, let's plot it so you can see how the image looks. It obviously looks pretty much the same as this image right there. So that's the mask. Now what's the problem with it? Let's print the unique values. You know how to do that in NumPy: np.unique gives us all the unique values in this image. We should only have 0 and 255, or 0 and 1, however they saved it. But if we print the unique values, it prints a whole bunch of values: 0, 1, 2, 3, and so on. This is exactly because of what I showed you earlier in my presentation: when you zoom into one of these areas, the edges are interpolated, so you have like
all kinds of values. This is not how it should be; we need to convert that into binary. Initially that irritated me, so I downloaded the NumPy arrays, because usually when you download NumPy arrays they are saved as is, but apparently that's not the case here. When I open this NumPy array and show it, you see this is the image from the NumPy array, and when I print the unique values it's still showing me pretty much the same thing. That means the problem exists with both the images and the NumPy arrays. I don't know how Labelbox does it, but with Label Studio this is what I figured out.

So how do we handle that? All we need to do is take the mask image and say: any pixel value above 0, convert it to 255, or convert it to 1. In this case I assigned a value of 1, because if you want to combine all these masks into a single mask, for example roads having a value of 1, houses a value of 2, and water a pixel value of 3, then this is a good idea. But if you are keeping them separate, it's customary to have a binary image where the background is 0 and your actual objects have a value of 255. It's up to you; it depends on how you have written the rest of your code. All I'm doing here is converting any value above 0 into a value of 1 rather than 255. That's pretty much it. When I do that and look at the unique values, let's uncomment this, obviously you should see only 0 and 1.
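The thresholding step described above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the code shown in the video: the function name `binarize_mask` and the small synthetic array standing in for an exported mask are assumptions made for the example.

```python
import numpy as np


def binarize_mask(mask, class_value=1):
    """Set every pixel above 0 to class_value and keep the background at 0."""
    mask = np.asarray(mask)
    return np.where(mask > 0, class_value, 0).astype(np.uint8)


# Synthetic stand-in for an exported mask (hypothetical data): the object is
# 255, but the edge pixels carry in-between gray levels from interpolation.
exported = np.array(
    [[0, 0, 64, 255],
     [0, 32, 191, 255],
     [0, 0, 128, 255]],
    dtype=np.uint8,
)

print(np.unique(exported).tolist())  # [0, 32, 64, 128, 191, 255]

binary = binarize_mask(exported)     # any value above 0 becomes 1
print(np.unique(binary).tolist())    # [0, 1]
```

Passing `class_value=255` gives the 0 and 255 convention instead; which one is right depends on what the rest of the training code expects.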
This is how it should have been right from the beginning, but we are just doing an extra thresholding step here to convert our image into binary, that's all. Let's look at the mask. It should look pretty much the same; in fact this one looks brighter. If I go back to the previous one and you focus on this region right there, you can tell the difference. So now I have a true binary image. This is how you work with your labels.

Here I just wrote a few lines to go through each file, in case you need help with this part of the coding. I'm pretty sure this is basic and most of you know it. I'm going through the labels-as-PNG folder, looking at every image, and checking whether "houses" is in the label file name, or whether "roads" is in the file name, and if so, doing something. If "houses" is in the file name, I load the image, convert it into binary like I just mentioned, and append it to my masks list. So I have all the images captured in a list, and then I can convert them into NumPy arrays and load them into the rest of my workflow when it comes to machine learning. In this example I assigned houses a pixel value of 1, roads a pixel value of 2, and water a pixel value of 3. You can handle them all as 255 too; again, it completely depends on how you have written the rest of your deep learning code. When I run all of this, I should end up with three lists: house masks, road masks, and water masks. If I go to the house masks, I have two, because we have two annotations right there, and it's pretty much the same for all of these. From this point on, it's up to you how you would like to proceed.

Okay, so thank you very much for your attention. If you like these types of videos, please leave comments on my videos. And I'm doing this,
like I mentioned already, because you asked me for this, both via emails and via comments on the YouTube videos that I upload. Unfortunately I cannot respond to every email, because I get a few tens and sometimes 100 emails per day, and there's no way I can respond to every one of those. But I do get a sense of the themes in the questions being asked, and I try to do these types of videos for you. So thank you very much. If you feel extra generous, hit that button that says you can donate. It's for a great cause: it keeps all these videos free, and all the donations I get I'm going to give to charity anyway. If you don't want to give it to me, support other charities that you really like to support. Thank you, guys, and please hit the subscribe button.
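For the folder loop described in the walkthrough (collect every exported mask whose file name mentions a class, binarize it, and append it to that class's list), a rough sketch could look like the following. It is not the code from the video: it reads the NumPy export rather than the PNGs so that it depends only on NumPy, and the function name `collect_masks`, the `CLASS_VALUES` mapping, and the demo file names are all assumptions for illustration; real Label Studio export names will differ.

```python
import tempfile
from pathlib import Path

import numpy as np

# Pixel value per class, following the values mentioned in the walkthrough.
CLASS_VALUES = {"houses": 1, "roads": 2, "water": 3}


def collect_masks(folder, class_values=CLASS_VALUES):
    """Group exported .npy masks by the class name found in each file name."""
    masks = {name: [] for name in class_values}
    for path in sorted(Path(folder).glob("*.npy")):
        for name, value in class_values.items():
            if name in path.name.lower():
                raw = np.load(path)
                # Any pixel above 0 (including soft edges) gets the class value.
                masks[name].append(np.where(raw > 0, value, 0).astype(np.uint8))
                break
    return masks


# Demo on made-up files; the naming pattern here is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    soft = np.array([[0, 40, 255], [0, 0, 200]], dtype=np.uint8)
    for task in (9, 10):
        for name in CLASS_VALUES:
            np.save(Path(tmp) / f"task-{task}-annotation-{name}-0.npy", soft)
    masks = collect_masks(tmp)

for name, items in masks.items():
    print(name, len(items), np.unique(np.stack(items)).tolist())
# houses 2 [0, 1]
# roads 2 [0, 2]
# water 2 [0, 3]
```

Keeping one list per class leaves the choice open: stack each list into its own array, or merge the per-image masks into a single multi-class label image.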
Info
Channel: DigitalSreeni
Views: 57,475
Keywords: microscopy, python, image processing
Id: UUP_omOSKuc
Length: 27min 8sec (1628 seconds)
Published: Sat Mar 12 2022