3D Point Cloud Feature Extraction Tutorial for Interactive Python App Development

Video Statistics and Information

Captions
Hey friends, welcome to this wonderful tutorial on 3D point cloud feature extraction and interactive Python app development. We are going to do something very powerful: first, take a point cloud and extract a bunch of features, both based on principal component analysis and on relative featuring techniques. After that, we will leverage a Python library to create an interactive thresholding method that extracts part of the point cloud based on a threshold, directly in the application we are going to build. And that serves two main purposes: first, having a workflow to label 3D data, and second, having the ability to leverage 3D machine learning and create AI models. So whenever you're ready, let's get started on what I've got cooking for you.

The workflow will be very simple. We first gather the dataset, then move on to the Python environment and setup; I will go very quickly there, and if you miss some element, we cover it in other sessions. After that we check out the 3D data I/O fundamentals, specifically with PyVista. Then we move on to the experimental side of things: first we deal with pre-processing and structuring the data, speaking about k-d trees, octrees and so on. After that, we leverage this data structure to compute features, first computing principal component analysis based features and studying the impact of the neighborhood definition that we choose. Thereafter we go on to featuring based on relative methods, and the experimental part will be done. We then move on to automating and scaling: all of that will have been done at a small scale, so we do it for the full point cloud and create a loop that runs it automatically for you, so that you have features alongside your point cloud. We use those features and visualize directly in Python, and we create sliders that allow us to choose part of the feature points and parts of the
point cloud itself, which we can then prepare and export to do some kind of segmentation with a thresholding technique. So that is the goal for the full series on 3D point cloud feature extraction. A small warning though: I'm splitting this tutorial into two videos. The first part deals with the Python environment up until feature extraction and relative featuring, and the second part deals with automation, visualization, preparation and segmentation. So this is what is happening here. Whenever you're ready, let us get started.

Before jumping right into the code, we start by gathering the data. Today it's very simple for you: just go to the Google Drive folder that is linked below and get your hands on a very nice dataset that was created by the University of Twente and given to me by Sander Oude Elberink, my colleague, a professor there. It's an MLS dataset, and let me show you what it is. This is a nice point cloud, and if I zoom out you see it's a nice MLS point cloud with a bunch of trees, roads and some pole objects. Really nice, and we will want to do a lot of things with this specific point cloud; but to be able to process it, it will be nice to extract some features that we can use to detect the trees, the road, the pole-like objects, the cars, whichever element you want identified in this dataset. Once you have your dataset, just make sure you put it comfortably inside your data folder. My structure is very simple: three folders, code, data and results, at the same level. In the data folder I put the data; in the code folder I have exactly what I'm showing you right now.

Now let's move on to the first step, which is Python environment and setup. For this specific tutorial I will skip ahead a bit and not show everything linked with the Anaconda setup or other setups; if you are lacking a bit on that, please check out the other videos
that I made, for example the LiDAR vectorization one, which covers this specifically. Now let's say you have an environment set up, and you just need to install some libraries on this environment. These libraries are the following: NumPy, SciPy, and PyVista, a library for 3D visualization. And that's it: three libraries will make wonderful feature extraction for you. To install them you just use pip: pip install numpy, then pip install scipy, and pip install pyvista. Once that is done in your environment, you launch your IDE (pip install spyder) and you can play around. One little thing though: PyVista does not work super well with Jupyter notebooks, so I encourage you to try first in a local environment like this one, Spyder, or Visual Studio Code, whichever you prefer.

All right, let's now get started, too much talking. So: import numpy as np. Then I import time, to get an idea of how much time it takes to compute this or that. SciPy I will use specifically for the k-d tree, and for that I just import the KDTree module from scipy; you see I have auto-completion. And for the 3D library, it's import pyvista as pv. I did a lot of videos with Open3D and others, and PyVista is another 3D library, so it's nice to play around with it. That's it for the first step, Python environment and library setup; make sure you set this as your working directory and execute, which is exactly what I'm doing, and now it's executed.

Now let's move on to step two, which is data loading and fundamentals, specifically with PyVista. The first thing we are going to do is create a variable called pcd_pv, which will basically hold the data loaded with PyVista, so pv.
read, to which I pass the path to my dataset. I express everything relatively: the data folder, and inside it my MLS dataset, a .ply file. I just make it a string, and that is my variable. Now we could execute that, but it's better to also plot what we have. To plot, you just write pcd_pv.plot, on my variable, and here I just want eye-dome lighting, to get some sense of the normals without computing them directly. With these two lines I load it and display it. I press Shift+Enter, and you see we have a little window here that lets us handle the point cloud directly within PyVista. Really, really nice. At this stage we are very happy: you have a little indicator of where Z-up is, and here we have the ramp of RGB information that is in our dataset. I can close that now.

What is also interesting is to check out the structure of our variable pcd_pv: you see it's a PolyData type, with a specific number of cells, with bounds and so on, so it's very handy to work with. And if I do pcd_pv.points, I get the points of what is considered a PolyData object. A small note: you get an object called pyvista_ndarray, and it might be scary to see an object that is different from a classical NumPy ndarray, but it's just that for some internal functions it's better to differentiate a pyvista_ndarray from a classical array; it behaves the same down the line, so you don't need to worry about it. As you can see, you have X, Y, Z and the number of points, and that's it. That's very nice for the loading fundamentals.

Let's move on to exploring PyVista capabilities. We have the ability to store a scalar field on the variable for PyVista to visualize, and it's pretty easy; I actually really like the way they do it.
You just write the name of your variable and, in square brackets, the name you want to give to the specific feature you want to display. So here, if I put "elevation", what would that be? You just need to pass a single scalar field with the exact number of points that you have, so 500,000. I take my variable, .points, all the points and only the third column. (Sorry about the little smiley, my keyboard is in a fun mood, I guess.) Then all we need to do is render that as spheres. We can use exactly the same line as before, pv.plot: I pass my specific variable and ask it to use the specific scalar I created here as the color information in my little window. After that I just render my points as spheres, give some point size, and also show the scalar bar. I execute that, and as you can see, very nicely, you have your magnificent point cloud again, but it's different now: it's colored by elevation. Everything low is purple and everything above is yellow, and I really like that.

We could do something else, for example "random": I create a random value per point on my cloud, use "random" instead of "elevation", and re-execute, and as you can see we get a coloring that is a bit different. That is super handy whenever we need to prototype and keep a hand on the features we are computing, and this is why I really love PyVista for these types of little experiments.

Now we can move on to the third step, which is pre-processing, and specifically here I will focus on 3D data structures. We will not speak specifically about octrees; we will use a k-d tree in this little experiment. Maybe I can stop a moment to explain what a k-d tree is. Let's say we have some points, one to seven.
Classically, in our system, you could have that as a list ordered by the indices I gave here, which are pretty arbitrary; it could be anything else. Let me now define the original space, the one we are going to consider, and translate it into a tree structure, the k-d tree, which starts with the root. Like a real tree has a root, this root encompasses all of the data points, from one to seven. Now we split along the main axis; there are different ways to split, but here I use the median. So, very close to here, that is my initial split, with four points on the left side and three points on the right side, in a 2D space ranging from zero to x and from zero to y. That first split creates two of what we call nodes, which each hold their subset of the points. We can refine the process and split another time, again using the median, which gives two more nodes here and two more nodes there. We can keep splitting until a certain criterion is reached; let's say we want each node to hold one point maximum at the end. So I split another time where needed, and at the leaves, the lowest level, each point ends up in its own cell: one split separates points one and two, another separates seven and three, point four sits alone, and a last split separates five and six. And that's it. Why is this so handy? Because whenever you search, say I want to know where I
would place a point when inserting it. Say I insert a point with coordinates (2, 4): two is over there and four is up there, so just from knowing the extents of each cell, I know I descend only into this branch, then only into that one, and that's where I insert my point. The same goes for searching: you cut out so much time when searching for things like nearest neighbors, because the tree tells you where to look. It's a very important structure to master; I will not go much deeper, this is just for you to get a good understanding of what we're dealing with.

Let us now get back to our horses, which is our code. To use a k-d tree, you could definitely code one from scratch, it's not so complex, but it's much quicker to use an existing implementation. The first that comes to mind is PyVista's. The big problem with PyVista is that you can only query one point at a time. Let me explain what I mean: here I created a variable and used the find_closest_point function of PyVista. I give my candidate point, with coordinates (1, 1, 0), and say: return the 20 nearest points. Then I print it, and this is exactly what I get: the indices of those points, which is super nice, and I could use them to retrieve the points and display them. The problem is that our point cloud has 500,000 points and we can only pass one point at a time, so we would have to loop over and over, which is absolutely not efficient. The second solution, and the one I recommend if you read the documentation, is to build a k-d tree for our point cloud with SciPy. Let me show you how to do that. First we create a tree object, and I will
call it tree, because I'm very original, and I just call my KDTree function, passing the dataset on which I want to fit the k-d tree, which is pcd_pv.points. That is my tree. Now I want to query, for each individual point of my point cloud, the indices of its 20 closest points. That is exactly what tree.query does: I pass my query dataset and say I want k neighbors based on this tree. It returns two variables, the distances and the indices, and I keep them both. I reformatted it a bit, but something very important to address is timing, to get a sense of how fast this is: you just create a t0 variable with time.time(), and at the end we extract the neighbors. To get the neighbors we just pass the indices to our point cloud, then take the time again at the end and print the difference. Let me do that: computing the tree goes very quickly, and then executing the query tells us exactly how long it takes. As you can see, we computed our neighbors in around four seconds, which is not nothing, but it's okay for offline processes. At this stage, that's already marvelous: we have a structure on top of our data that we can leverage. If I look at what is inside the neighbors variable, every point has its 20 neighbors with their coordinates, so I can use that right off the bat; if I look at the length of neighbors, I have 500,000, so for each index of an original point I have the coordinates of all its neighbors, a small k-d tree search around it.

Now that this is all done, we can breathe a bit and move on to step four, which is point cloud data featuring, focusing on PCA, principal component analysis. Let me take a step back and explain.
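Before diving into PCA, the SciPy neighborhood search just described can be sketched as follows (synthetic points stand in for the 500,000-point cloud; variable names are my own):

```python
import time
import numpy as np
from scipy.spatial import KDTree

# Synthetic stand-in for pcd_pv.points
points = np.random.rand(5000, 3) * 50.0

# Build the k-d tree on all points
tree = KDTree(points)

# For every point, query its 20 nearest neighbors (itself included)
t0 = time.time()
dist, idx = tree.query(points, k=20)

# Index back into the cloud: one (20, 3) neighbor array per point
neighbors = points[idx]
print(f"neighbors computed in {time.time() - t0:.2f} s")

print(idx.shape)        # (5000, 20) neighbor indices
print(neighbors.shape)  # (5000, 20, 3) neighbor coordinates
```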
The idea of PCA is to find the axes that best fit your point distribution, minimizing the spread of the points around them. That sounds like nonsense on its own, so let me explain quickly what I mean. Here are our data points, with one outlier over there, which will skew the distribution. This is the classical axis system we have, the x-axis here and the y-axis there, and principal component analysis tries to detect the principal components of the data distribution: we would like to fit a line that minimizes the residual variance around it. This is two-dimensional, but the same applies to a 3D point cloud, or n dimensions. PCA is heavily used as a dimensionality reduction technique, but here we just want to use it as a featuring technique. The first thing we see is a line that, for the most part, minimizes the spread: it will be this line. What I mean is that if I take each point's distance to this line, and average or sum them, I want that total to be minimal. That is what PCA does, and that is the first axis. Then you also have the second axis, which is orthogonal, plotted here. The first direction carries the largest eigenvalue and the second the second-largest eigenvalue, and those are the eigenvectors v1 and v2. That is our principal component analysis, with the two eigenvectors and the eigenvalues linked to them, and this is all we are going to use to compute features that best describe what is happening with the point distribution. All of this falls out of the covariance matrix, which is what sits behind all this mumbo jumbo.

So let us go back to our coding session. We are after point cloud
data featuring, specifically PCA. The first thing I do, just for clarity, is create a variable called X holding all my points. That's not a very clever thing, because I'm not making a deep copy or anything, but I'm doing it nonetheless. After that I compute the mean; you know the drill, we use NumPy and just make sure we compute it per column, so we don't get a single number but separate x, y and z means. Then we center the data, which is very important (always normalize or center): X minus the mean. Then we compute the covariance matrix, which lets us compute the eigenvalues and eigenvectors. NumPy provides np.cov: we pass the centered data and tell it not to treat rows as variables, and that's all we need. For the eigenvalues and eigenvectors we use np.linalg. Everything I just described is done exactly here. After that, the last thing we want is to sort our eigenvectors and eigenvalues by decreasing eigenvalue, so the first one has the highest eigenvalue and the lowest sits at the end of the spectrum. To do that you write the following: a sorted index from argsort of the eigenvalues, starting from the last one, which you then use to reorder the eigenvalues and eigenvectors. That is a very smart trick; if you need time to understand it, pause and try your best to figure out this black voodoo sorcery. After that, we can print out what we have; I give you the lines that do that. I execute, and as you can see my sorting index is now in order, eigenvalues from biggest to smallest, with the matching eigenvectors. One super interesting element: the last eigenvector is actually your normal, tied to the lowest eigenvalue. Okay, we can breathe.
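Put together, the from-scratch PCA described above might look like this sketch (synthetic data, my own variable names):

```python
import numpy as np

# Synthetic stand-in for the point cloud, (N, 3), stretched so the
# principal directions are distinct
X = np.random.rand(1000, 3) * [20.0, 10.0, 2.0]

mean = np.mean(X, axis=0)             # per-column mean: x, y, z
centered = X - mean                   # center the data
cov = np.cov(centered, rowvar=False)  # 3x3 covariance matrix

eigenvalues, eigenvectors = np.linalg.eig(cov)

# Sort by decreasing eigenvalue (eigenvector columns follow along)
sorted_index = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_index]
eigenvectors = eigenvectors[:, sorted_index]

print(eigenvalues)          # l1 >= l2 >= l3
print(eigenvectors[:, 2])   # last eigenvector ~ the normal direction
```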
That is very, very nice, except we did something that is really not that smart. What is it? We computed PCA for the full point cloud, so we are extracting features describing the full point cloud; but here we need to identify local distributions, local variations, local features that best describe local entities, because a global PCA cannot tell us whether a point belongs to a tree or to the ground. We need to reduce the set of points on which we base our PCA, and that is actually not so straightforward. I will show you some methods, but not all, and it's normal to struggle a bit to identify the best strategy for finding the neighboring points used to compute these features.

So we test on one single neighborhood first. You remember we computed the neighbors, so for each point we have all its neighbors. We take a selector, which will be 1, and we define a function called pca_of_the_cloud: basically we reuse everything from above, except that I pass the selection as the input and return only the sorted eigenvalues and sorted eigenvectors. I test it, printing the result to see if it does exactly what I want. I run this little cell, and as you can see, I get eigenvalues and eigenvectors out, so it works for selection 1; if I change the selector, we see another local featuring of our point cloud. Very nice, it's working.

Now, these raw values may not be so distinctive on their own, so we'll compute other features based on PCA, namely four of them plus the normal vector: planarity, a feature that aims to describe whether an area is
mostly planar or not; then linearity, which aims to describe whether a feature is linear, like a pole; then omnivariance, to understand whether the local area varies in roughly every direction; then the normal vector; and verticality, to know whether we are in a vertical region or not. These are based on a paper, not by me directly but by Martin Weinmann, who wrote a very nice paper explaining all these features. I also wrote a paper on a voxel-based representation versus deep learning methods that shows the distribution of these specific features and their impact. They are chosen at this stage because they are very relevant, but make sure to explore the other features around them; there are a lot of useful ones, and it's good to understand what you are playing with. If you don't understand everything, that's fine too, but it's always better to understand what we're playing with.

So we define a function called pca_featuring, which takes val and vec, the eigenvalues and eigenvectors. Inside it, planarity is expressed as lambda-2 minus lambda-3, divided by lambda-1. Let me show you one of the papers directly: as you can see, those are the equations I'm using, planarity, linearity, surface variation (which I'm not using), verticality, and omnivariance. So that's what I'm computing, basically translating what we just saw together; omnivariance I compute just after the normal vector. For the normal vector, I take the third eigenvector, the one tied to the smallest eigenvalue, and for verticality I retain only the z component of that normal. That's it: I return everything in a specific order.
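A sketch of such a feature function, using the standard eigenvalue-based definitions from the literature (the function name is from the video; verticality here uses the common 1 minus |n_z| form, which is an assumption on my part):

```python
import numpy as np

def pca_featuring(val, vec):
    """Eigenvalue-based local features. val is sorted so l1 >= l2 >= l3;
    vec holds the matching eigenvectors as columns."""
    l1, l2, l3 = val
    planarity = (l2 - l3) / l1
    linearity = (l1 - l2) / l1
    omnivariance = (l1 * l2 * l3) ** (1.0 / 3.0)
    normal = vec[:, 2]                  # smallest-eigenvalue direction
    verticality = 1.0 - abs(normal[2])  # assumed definition: 1 - |n_z|
    return planarity, linearity, omnivariance, verticality, normal

# Toy example: eigenvalues of a nearly planar neighborhood
val = np.array([4.0, 3.5, 0.1])
vec = np.eye(3)
p, l, o, v, n = pca_featuring(val, vec)
print(f"planarity={p:.2f} linearity={l:.2f}")  # high planarity, low linearity
```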
The features come out in order: planarity, linearity, omnivariance, verticality, and the three components of the normal. To test it, I print one value for each of these features, and as you can see we get all of them out. Wonderful, that is a massive step already; you don't see the payoff just yet, so let's push a bit further.

Understanding the feature neighborhood definition is step five, which is what we cover right now. First, let's speak about k-nearest-neighbors search. I mentioned the need for a specific strategy: if I run the k-NN search, which I'm executing right now, everything is set up for me in around four seconds. Now, what about a radius search? It's exactly the same code as before, except that instead of query I use query_ball_point: I pass my dataset, and instead of the number of neighbors I pass the radius I want, one meter in my case. It does not return distances the way the k-NN search does; it returns the indices as a list. I execute it and wait a bit for it to complete. Okay, wonderful: it took 35 seconds to do the radius search, much more than the k-NN query, but I guess it's still okay.

Now what I'm going to do is a hybrid, knowledge-driven custom search. If you look at the code I wrote, I actually create a tree on 2D data, dropping the Z value for my tree. What that means is that for each point in my base point cloud, the query effectively gathers all the points in a vertical cylinder, going from full bottom to full top. This is really useful for understanding local maxima and local minima, taking into account that this is a top-down LiDAR dataset, with trees and so on.
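A sketch of the radius search and the 2D "cylindrical" variant (synthetic data; the one-meter radius is the value from the video):

```python
import numpy as np
from scipy.spatial import KDTree

points = np.random.rand(2000, 3) * 50.0  # stand-in point cloud

# Radius search: all neighbors within 1 m, returned as index lists
tree_3d = KDTree(points)
idx_radius = tree_3d.query_ball_point(points, r=1.0)

# Hybrid "cylindrical" search: build the tree on X, Y only, so a
# radius query gathers every point in a vertical cylinder
tree_2d = KDTree(points[:, :2])
idx_cyl = tree_2d.query_ball_point(points[:, :2], r=1.0)

print(len(idx_radius), len(idx_cyl))  # one neighbor list per point
```

Note that every 3D-ball neighbor is also a cylinder neighbor (its horizontal distance can only be smaller), which is why the cylindrical lists are at least as long.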
I will be able to detect them much more efficiently, and that's why I'm doing it. Pace yourself: when you execute this you will have a roughly two-minute wait to get the result, so please do not panic, just execute and wait for it to finish. And that's it, it finished in a bit under one minute; my hybrid tree and search are done.

Now we can do this relative featuring that I really like, which is my step six. In this case, again, I start by timing, then I take a selector, the first index as before, and my selection basically takes my point cloud at the indices of my tree search, keeping only the first element. Take a moment here if you do not immediately follow, but you will see: if I execute, my selection is this array of points, which are the neighbors of my candidate point given the selector. And if I check the search result, you see the list of all the indices: taking my first point, those are its indices, which is exactly what I want in my selection.

Having done that, we can compute the distance to the highest point of my selection and to the lowest point: it's the max of the selection's z values minus my point's z, and the low is the same thing with the minimum. If I execute and check: the high distance is 18 meters and the low is 9 centimeters, so it looks like this point is very close to the ground and far from the top of a tree, or whatever is 18 meters high.

That's it for point cloud feature extraction and relative featuring. This is where we actually split, because there is already so much information here that I believe it's good to have this as a standalone video. We stop at relative featuring; you can already use all of these features in some of the applications you may have.
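The relative features just described can be sketched as follows (reusing the cylindrical neighbor lists; data is synthetic and variable names are my own):

```python
import numpy as np
from scipy.spatial import KDTree

points = np.random.rand(2000, 3) * 50.0  # stand-in point cloud

# Cylindrical neighborhoods: 2D tree on x, y only
tree_2d = KDTree(points[:, :2])
idx = tree_2d.query_ball_point(points[:, :2], r=1.0)

# Relative features for one candidate point
selector = 0
selection = points[idx[selector]]  # neighbors of the candidate

z = points[selector, 2]
dist_high = np.max(selection[:, 2]) - z  # distance to highest neighbor
dist_low = z - np.min(selection[:, 2])   # distance to lowest neighbor

print(f"high: {dist_high:.2f} m, low: {dist_low:.2f} m")
```

A point near the ground would show a small low distance and, under a tall tree, a large high distance, which is exactly the signal these features are after.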
In the second part of this video, we deal with automation, scaling, visualization, interaction, preparation, export, reporting and segmenting with thresholding. I hope you enjoyed this tutorial; if you did, and if you want more information, courses and tutorials like this, don't hesitate to subscribe to the channel or leave a comment about what you would like me to cover next, and I will do my best to get on it. In the meantime, have a great day. Bye-bye!
Info
Channel: Florent Poux
Views: 4,041
Keywords: Point Cloud, LiDAR, 3D, Data Science, Modelling, Geospatial, Python, 3D Python, Vectorization, Segmentation, AI
Id: WKSJcG97gE4
Length: 32min 53sec (1973 seconds)
Published: Mon Jan 22 2024