Photogrammetry / NeRF / Gaussian Splatting comparison

Video Statistics and Information

Captions
In this video we're going to talk about three technologies that have become very popular lately: photogrammetry, neural radiance fields (often called NeRFs), and Gaussian splatting. These three technologies are only superficially related, but because of the sudden surge in popularity, tied mostly to NeRFs and artificial-intelligence hype, there's a lot of confusion about what each of these methods can accomplish and how they're best used. I've been using photogrammetric modeling techniques for the past 15 years, but neural radiance field and Gaussian splatting visualization for only a few months, and furthermore I'm not a computer scientist, so I couldn't do justice to the complexity behind these methods. But I'm going to try to make them easy to digest and understand.

To do this, we're going to look at the same image set processed and/or visualized with each of these methods. The data are 345 frames extracted from a video of this sandstone column called Church Rock. I'll put a link to a zip file containing the data in the video description if you want to try processing it yourself. If you have any questions, leave a comment below and I'll reply; likewise, I'd love to see any results if you process the data on your own. In this video I'm going to use Agisoft Metashape for the photogrammetry processing, Nerfstudio for the neural radiance field and the NeRF video render, and, for the Gaussian splatting, the method described in the paper "3D Gaussian Splatting for Real-Time Radiance Field Rendering" by Kerbl, Kopanas, Leimkühler, and Drettakis, which is linked below. I'll also put links to all the different software in the video description.

Let's start by taking a look at the photogrammetry mesh. The first thing you'll notice, of course, is that we're looking at a polygonal textured mesh, in other words a 3D model of Church Rock, the subject of the data set. We've got our data open in Agisoft Metashape, which as I mentioned is a photogrammetry, or structure-from-motion, software. Essentially, what you do is load in all these frames, which are the imagery extracted from the video that played earlier. The first step, which coincidentally is also the first step in processing or training a neural radiance field, is to estimate the camera positions in space. This is important for both photogrammetry and neural radiance fields: in the case of photogrammetry, we are estimating where each photograph was taken from, in relation to the others and in relation to the object. So in Metashape, if I turn on the camera locations, we see all of these blue squares, which are the estimated positions of the data. In a neural radiance field, the neural network has to have a starting point in order to likewise estimate novel views, in other words views from in between the data frames. So in both photogrammetry and a neural radiance field, you need to begin with camera poses, that is, the camera locations.

Here in Agisoft we have our viewer window, so if we switch to the sparse cloud, these are all the points matched between these images that have been projected forward to represent the surface of this object, Church Rock. So we're looking at a point cloud. I have already gone through and processed the polygonal mesh, which is this, and I've reduced it down to about 500,000 polygons. We can look at the full resolution here, and you can see there's a fair amount of detail, particularly for the quality of the data set, which was not great: as I mentioned, it's frames extracted from video rather than high-resolution photographs, which would make processing take much longer but would give us a lot better detail.
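For reference, here is a minimal sketch of what that photogrammetry workflow looks like when driven headlessly through the Metashape Python API instead of the GUI. The folder and file names are placeholders, and the exact parameter names vary a bit between Metashape versions, so treat this as an outline rather than a drop-in script.

```python
import pathlib
import Metashape

# Placeholder folder holding the frames extracted from the drone video.
frames = [str(p) for p in sorted(pathlib.Path("frames").glob("*.jpg"))]

doc = Metashape.Document()
chunk = doc.addChunk()
chunk.addPhotos(frames)

# Step 1: estimate the camera poses (the blue squares in the Metashape viewer)
# and build the sparse point cloud of tie points.
chunk.matchPhotos(downscale=1, generic_preselection=True)
chunk.alignCameras()

# Step 2: build depth maps, mesh the scene, and texture it from the source frames.
chunk.buildDepthMaps(downscale=2)
chunk.buildModel(source_data=Metashape.DepthMapsData)
chunk.buildUV()
chunk.buildTexture(texture_size=8192)

doc.save("church_rock.psx")
```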
I've also gone ahead and textured the model. That uses the source data, the photographs, to texture it: in other words, it creates a JPEG image file that is UV-mapped onto the object to make it appear photorealistic. You can see that we have this shadow in here, so in photogrammetry the lighting conditions are baked into the 3D model, or at least into the final texture. If you turn off the texture, then you simply have the 3D model without any lighting in it, so you can 3D print it, bring it into a game engine and texture it yourself, paint on it, and so on.

Now, one of the great things about photogrammetry is that it is accurate; it can be measured. In this case I'm using the drone GPS, so this 3D model has a scale: you could take measurements from it, estimate area and volume, and so on. Those are things you cannot do at the moment with neural radiance fields or Gaussian splat rendering, both of which are volumetric. Those have their own strengths, but currently they are not metrically valid; you would not use them to create measured drawings or architectural plans, for example, whereas you can use photogrammetry for applications like that.

The other thing you'll notice is that we do have a lot of context here: we can see the landscape around the subject of the data set. But we don't have any sky, and even if I go back to the tie points, the point cloud, the detail is really just concentrated on the object. That's simply the way photogrammetry works: it finds common points on a surface, and no common points can be found in, for example, photographs of the sky, which is just a single featureless expanse. We'll see how this contrasts with the radiance field representation of this scene. So with photogrammetry you tend to get a discrete object on a ground, quite literally in this case, without the broader scene context; here we're just getting the immediate surroundings. If I had wanted even more, I could have flown a much larger area and captured the mountains in the distance, but then you get a much larger model, it takes much longer to process, and the model itself is much more complex. Since all we were worried about capturing here was Church Rock, we can just focus on that, which is exactly what I did.

Now let's take a look at the neural radiance field output. I have Nerfstudio open here in the web viewer, and the neural model has already been trained: using those same inputs, those same video frames, I've trained the neural network on that data, and what we're looking at now is the resulting radiance field. At first glance this looks very similar to Agisoft Metashape's workspace: we have a 3D view where we can rotate the view around, and we also see the representations of the camera locations in very much the same orientation that we had in Metashape. That is because I actually used Metashape's camera pose estimation, the initial alignment of those photographs, as the starting point for this neural radiance field. The alternative would be a structure-from-motion solution called COLMAP; however, that can take a really long time, I didn't have it installed on my computer, and I already had the project in Metashape, so I just exported those camera poses, which is just a text file, an XML file, that describes where each of those photographs is located in space. That file can be ingested by Nerfstudio and used as the starting point for training the neural radiance field.
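As a rough sketch of that hand-off, assuming a Metashape project called church_rock.psx and a frames/ folder (both placeholder names), the camera export and the Nerfstudio conversion and training might look something like the following. The exact flags differ between Nerfstudio releases, so check ns-process-data metashape --help for your version.

```python
import subprocess
import Metashape

# Export the Metashape camera poses as the XML file mentioned above.
doc = Metashape.Document()
doc.open("church_rock.psx")
doc.chunk.exportCameras("cameras.xml", format=Metashape.CamerasFormatXML)

# Convert the frames plus the XML poses into Nerfstudio's format,
# skipping the COLMAP structure-from-motion step entirely.
subprocess.run([
    "ns-process-data", "metashape",
    "--data", "frames",
    "--xml", "cameras.xml",
    "--output-dir", "nerfstudio_data",
], check=True)

# Train the default nerfacto model on the converted dataset.
subprocess.run(["ns-train", "nerfacto", "--data", "nerfstudio_data"], check=True)
```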
So I can turn these images off, and let's zoom in on Church Rock here. This looks very much like the result of the photogrammetric processing, but there are a few big differences that are immediately apparent. The first is that we have the sky and the horizon, so the scene appears much larger than the photogrammetry output: not only do we have Church Rock, but we have the mountains in the distance, some clouds in the sky, in fact even the blue sky itself. We appear to be getting much more data here. However, if we zoom in, we can see that this is not actually a textured surface. Even though it may appear solid, it's not a mesh; it's a volumetric representation of the scene, a radiance field, and that radiance field describes the color of any part of the scene when we move the camera to a novel viewpoint.

Photogrammetry essentially allows you to rotate your view around a 3D model; the way neural radiance fields work is that they allow you to generate novel viewpoints. So if we turn the images on and I click on one of them, it zooms us to that view, but I can also move the camera to a viewpoint that was not captured in the original data, like right here, and we still get a fairly good representation of the scene. It's not perfect, because we can see the sky above, which I never photographed at all, so there's no data to be represented there. But for portions of the scene that were photographed, like this viewpoint right here, even though there was no data explicitly from this low angle, the neural radiance field is able to estimate what the scene would look like from this view; it can estimate the colors of the scene from this viewpoint even though no photograph was taken exactly here.

Something else interesting we can do with neural radiance fields is set up a new camera path. I've already done that here, but I'll show you quickly how it works. We can load a path I've already set up, and this really goes hand in hand with a NeRF's ability to generate novel views of a computed scene. Here we can see the camera path, which in some places follows roughly the two orbit flights I made with my UAV, but in other places takes a completely new path, such as going directly over Church Rock. We can adjust the resolution, the duration, the frame rate, and the camera type (you can even export an equirectangular 360 camera), and then we can render a new video. Let's take a look at that new video right now.

As you can see, neural radiance fields are great for when you want to quickly collect data of a scene, an object, or an environment. I think a great application of them is virtual production, where you may not have the time or resources on location to get the perfect shot, but you can very quickly capture the scene and then later take the time to set up the perfect camera path in the virtual environment. Furthermore, you can do things with the virtual camera that you might not be able to do with the actual drone, UAV, or gimbal you're using on location. For example, I don't have an FPV drone, so I simply flew an orbit shot of this, and then later I was able to generate this very stylized video, in terms of the camera movements at least, that emulates the look and feel of a first-person-view drone video.
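Incidentally, the camera-path render set up in the Nerfstudio viewer can also be kicked off from the command line. This is a minimal sketch that assumes the viewer's camera path was saved as camera_path.json and that the training run's config lives at the placeholder path shown below; the exact subcommand and flags depend on your Nerfstudio version.

```python
import subprocess

# Render the camera path authored in the Nerfstudio viewer to an MP4.
# Both paths are placeholders for this project's actual output locations.
subprocess.run([
    "ns-render", "camera-path",
    "--load-config", "outputs/nerfstudio_data/nerfacto/2023-10-01_000000/config.yml",
    "--camera-path-filename", "camera_path.json",
    "--output-path", "renders/church_rock_nerf.mp4",
], check=True)
```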
Now let's take a look at the last technology we're talking about, which is Gaussian splatting, or specifically real-time 3D Gaussian splatting of radiance fields. When we're talking about Gaussian splatting we're not talking about NeRFs anymore, but just a radiance field, because the point cloud has been exported and imported into, in this case, a real-time game engine, Unity. At this point it's a static point cloud: it has finished training, and it is no longer interactive in the sense of the neural network doing any further training on it. It's just an exported point cloud that has been Gaussian splatted using the method described in the paper linked in the description, and I'm displaying it with a great Unity project that I'll also link in the description. Huge credit to aras-p (Aras Pranckevičius), the former Unity developer who made that Unity project freely available on GitHub; this would not be possible without it, so definitely check it out and give him thanks if you end up using it.

What we're looking at here is essentially a real-time view, in Unity, of the Gaussian-splatted point cloud. If we zoom in on Church Rock, you can see that the splats become more and more visible: instead of being represented as points, the point cloud is represented with these splats, which are derived from the input data, which again is the frames extracted from the video. Here we're getting something like the best of both worlds: the horizon and the completeness of the scene that we get from the neural radiance field rendering, but also the ability to move around it in something like Unity, a game engine, and set up camera paths, which I have done as well. I rendered out a video as an image sequence of JPEG frames using the Unity Recorder and then stitched the frames together in Premiere. Let's take a look at that video.

As you can see, the Gaussian-splatted video looks a lot like the NeRF video; to my eye it's a little crisper, a little clearer, and you have the added benefit of working in Unity, so you can add additional elements to the scene. For example, in that video I added a panoramic sky: a geosphere with a spherical image mapped onto it, taken from a panorama I shot on location, so I was able to fill in the sky a little bit.

Let's go back to the photogrammetric model, uploaded to Sketchfab, and look at some tricks you can employ to give it some of the benefits that neural radiance fields and Gaussian splatting have, specifically the broader context of the mountains, the horizon, and the sky. One of the great things about photogrammetric models is that they can be easily shared in 3D, which is something that can't be said at the moment for NeRFs and Gaussian-splatted point clouds. Photogrammetric models, because they are a mesh with a texture, can be very lightweight: they can be 5 to 10 megabytes, all the way up to hundreds of megabytes if you want a really detailed model. They can also be shared on the web in a viewer like Sketchfab, which is what we're looking at right now. Here, what I've done is simply add that same panosphere to the background of the photogrammetric model, so you can see the sphere that has the panorama mapped to it. What this looks like in practice is basically your photogrammetry model brought into an editing program like Blender or 3ds Max: you make a sphere geometry around it, map the panorama onto that sphere, and then of course you can set up viewing limits in Sketchfab so that you can't zoom outside of the sphere, which is what we're looking at right now. So that's basically what this is: the photogrammetry model surrounded by the sphere, to give it the appearance, at least, of some broader scene.
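If you want to script that panosphere step rather than building it by hand, here is a minimal Blender Python sketch of the idea. It assumes the photogrammetry model is already imported into the scene, the sphere radius and the panorama filename are placeholders, and it leans on a UV sphere's default UV layout rather than a carefully calibrated equirectangular mapping.

```python
import bpy

# Add a large UV sphere around the (already imported) photogrammetry model.
bpy.ops.mesh.primitive_uv_sphere_add(radius=500.0, segments=64, ring_count=32)
sphere = bpy.context.active_object
sphere.name = "Panosphere"

# Flip the normals so the panorama is visible from inside the sphere.
bpy.ops.object.mode_set(mode='EDIT')
bpy.ops.mesh.select_all(action='SELECT')
bpy.ops.mesh.flip_normals()
bpy.ops.object.mode_set(mode='OBJECT')

# Map the panorama (placeholder filename) onto the sphere via a simple node material.
mat = bpy.data.materials.new("PanoramaMaterial")
mat.use_nodes = True
tex = mat.node_tree.nodes.new("ShaderNodeTexImage")
tex.image = bpy.data.images.load("//church_rock_panorama.jpg")
bsdf = mat.node_tree.nodes["Principled BSDF"]
mat.node_tree.links.new(tex.outputs["Color"], bsdf.inputs["Base Color"])
sphere.data.materials.append(mat)
```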
If you made it this far, thanks for tuning in; hopefully it was an enjoyable and educational video. I had fun making it, and I'm looking forward to producing a lot more of these neural radiance field and Gaussian splatting videos in the future, because I have a ton of photogrammetric data sets that I've collected over the past 15 years, and the great thing about these methods is that you can reuse that data and get completely new visualizations and effects. So if you're interested in this stuff, check out the other videos I have. This is a bit of a departure for me; I usually work with art and architectural history and archaeology, but I really enjoy playing around with these new technologies. If you enjoyed it, subscribe, leave a comment below, and I'll see you next time.
Info
Channel: Matthew Brennan
Views: 111,633
Id: KFOy354zf9E
Length: 23min 29sec (1409 seconds)
Published: Sun Oct 01 2023