3D Gaussian Splatting: How's This The Future of 3D AI?

Video Statistics and Information

Captions
Just a few weeks ago we saw the long-dead GPT-4 hype come back to life with the new GPT-4V, which can not only see and hear, but also speak. On top of that, OpenAI's image generator DALL·E 3 being revived out of nowhere made me start to believe that LK-99 might as well be real. So what's next, the metaverse? After Zuck the lizardman got roasted for dumping an insane amount of money on a project that looked like a remake of Wii avatars, "this is the metaverse now" feels old. Yet if you haven't seen the recent Lex Fridman video: he interviewed Zuck for the third time, but this time with a bit of a twist. The interview happened in virtual reality, and it looked just like a scene from The Matrix, demoing what is perhaps Meta's first big step towards bringing the metaverse into reality. It really looks like they are finally getting somewhere, except they probably need something to generate realistic 3D environments, like NeRF. Or... wait, what is this? 3D Gaussian splatting? 50 times faster than NeRF? State of the art? What?

3DGS, short for 3D Gaussian splatting, released only a few months ago, might have just replaced NeRF for rendering 3D scenes from 2D images. This resurfaced technique, which does not rely on neural networks, has blown the whole field away, not only with its superior optimization time but also with its real-time rendering quality. Yes, you heard that right: you get an insane amount of FPS in real time while browsing through an extremely detailed 3D render. Instant-NGP does not even come close in FPS or quality. 3DGS might just be like diffusion models, which jump-started the photorealistic image generation field by replacing GANs; both 3D Gaussians and diffusion were pre-existing concepts that were repurposed with the addition of some newer techniques. So is it really time to say goodbye to NeRFs? Well, let's discuss how they actually work first, before you start believing my unhinged AI-bro statements. For a quick and dirty explanation: NeRFs are basically ray tracers, and 3DGS is a rasterizer.

For an actual explanation: both start by using a photogrammetry technique called Structure from Motion (SfM) to determine the positions of your images based on your camera's points of view, which outputs a sparse point cloud for you to work with. Processing with NeRF can basically be broken down into three main steps. First, march camera rays through the scene and collect sample points along the way. Then, project these points into a higher-dimensional space, which encodes the information so the neural network can process it more efficiently. Finally, feed these points into the neural network, which generates an RGB and a density value for each input point; put them all together, convert them into pixels, and you can see the object. The only AI part, which is also the slowest part of NeRF, is the neural network repeatedly comparing the AI-generated RGB-plus-density output against the ground-truth point of view, so when you render the view while training you get to see this un-blurring effect, which is a common sight when you use Instant-NGP. Other methods have evolved to solve the speed issue, like the smaller, better MLP representation with the hash grid used in Instant-NGP, or completely dropping the AI component and using a sparse voxel grid to interpolate the density field, like what you see in Plenoxels. But most of the time these come with quality trade-offs, so they might not be an ideal solution.

3D Gaussian splatting, on the other hand, creates three-dimensional Gaussian volumetric splats on top of that sparse point cloud. Each splat contains information like position, covariance matrix, and opacity to model the scene, and a tile-based rasterizer renders and optimizes the 3D Gaussians at blazing speed. Instead of having an AI color the scene for you, it stores spherical harmonics within the 3D Gaussians to represent the colors. This saves tons of time and resources while being able to generate quality that is comparable to, or even better than, Mip-NeRF 360, the current state of the art in terms of quality, while taking only about 1/50th of its training time. In the official 3DGS paper the PSNR is lower than Mip-NeRF 360's, but the official results definitely look a lot cleaner than Mip's: for this bike, the thin metal rods on the wheels are so much cleaner, along with finer details in the grass and the tree branches. It is just obviously a big step up. From what I've seen on the internet, not only is the speed comparable to Instant-NGP, but the quality is miles ahead, with details as small as eyelashes or hair strands, or scenes as big as a factory with superb water reflections; 3D Gaussian splatting has got it all. A few people have even made some pretty cool effects, like Gaussian un-splatting, where the scene just falls apart; burning Gaussian splatting, where the scene burns in and out; or an effect that looks like the scene got deleted by Thanos. Very impressive, right? An important takeaway here, quoting the original NeRF author, is that 3D Gaussian splatting is not just "a NeRF where the MLP has been replaced by a set of Gaussians"; the part where 3DGS uses rasterization instead of ray tracing is a pretty important point that a lot of people miss.

Anyway, not only is the processing much faster, the whole research field decided to go speedy too. Just one and a half months after release, we got not one but two new 3D-Gaussian-based text-to-3D research projects: one is called DreamGaussian and the other is text-to-3D Gaussian splatting (GSGen). They look on par with some of the latest NeRF- or diffusion-based text-to-3D, and while the technique may look pretty transferable, it is still pretty impressive that someone has already coded, trained, and written a research paper in this short amount of time. On top of that, a dynamic Gaussian research project was published 15 days ago where you can render a 3D scene over time. I don't know if it's the same research, but someone even threw it into VR and viewed the scene in real time at 2K resolution and 250 FPS. While its quality may not be comparable to dynamic NeRFs, it is still pretty exciting to see that the field has a lot more room to grow and is not necessarily set in stone, like how diffusion is now much preferred over GANs for image generation.

There are a lot of fake AI services out there, like that AI company which raised $1.5 million and turned out to have been using humans to make 3D models instead of AI, so let me quickly point you to a few legit services that have 3DGS implemented right now. The first one is Polycam; I only came across them when 3D Gaussian splatting was announced on their app. They were the first to include it in their services, so they get to be first on the list, I guess. They also stated that the processing is free, so who knows, you can go try it out now. The second one is Luma AI, which I have mentioned previously, because they have some good tools using NeRF to generate 3D scenes, with lots of other functions like exporting to Blender. Oh yeah, I'm not sponsored by either of them, by the way. So if you want to play around with 3D Gaussian splatting without having to bang your head against the table figuring out how to build the code yourself, definitely check out their services. But if you do choose to suffer through the hellish landscape that is installing it locally, I have a tutorial, which my team at bycloudAI made, that might reduce your pain and suffering down to just a few clicks and copy-pastes. It takes around an hour of training time per scene on an RTX 3090 and requires 24GB of VRAM, so if your GPU is not strong enough, maybe Luma AI or Polycam would be a better choice.

And if this topic interests you and motivates you to start learning AI or machine learning in general, today's sponsor, Brilliant, is actually one of the best places to get started. Brilliant is an online learning platform that is basically what you get when your textbooks come alive. It provides a way for you to learn interactively with brilliant, fun lessons in math, science, and computer science. Research has shown that interactive learning is six times more effective than watching lecture videos, and I totally agree with that. Interactive lessons not only help you visualize problems much quicker but can also illustrate very difficult concepts so you comprehend them faster, which is something plain textbooks or YouTube videos cannot do for you. Back in my high school days I actually used Brilliant alongside my calculus class, because it was much easier to understand what calculus is about while being freshly introduced to a new field of math. Brilliant not only has very helpful diagrams but also interactive elements that advanced my understanding much faster than learning the fundamental theorem of calculus through a wordy mathematical definition inside a boring textbook. Not only that, Brilliant also provides a clear road map for different subjects at all knowledge levels, from basic algebra to advanced multivariable calculus, and from programming with Python to artificial neural networks. Brilliant is full of STEM classes that are usually a pain to study but are made into a much friendlier and more digestible format. So yeah, you can quickly get started on Brilliant by heading to brilliant.org/bycloud to try their ever-expanding interactive lessons for free and to also support this channel; the first 200 of y'all will also get 20% off an annual membership.

Shout-out to all my patrons: Andrew Lelas, Alex J, Chryst Ax, Marice Mulim, Dean Fifal, Daddy W, and many others. You have all made it possible for me to pay for the many cloud services I use instead of taking it out of my own pocket, so thank you guys again. And yeah, I guess that's it. I will see you all in the next one.
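To make the technical core of the transcript concrete, here is a small Python sketch of the ideas it walks through: NeRF's positional encoding and front-to-back compositing along a ray, the per-Gaussian quantities 3DGS optimizes (position, covariance built from scale and rotation, opacity, spherical-harmonic color), and the PSNR metric used in the quality comparison. All names, signatures, and simplifications here are my own illustrative assumptions, not the papers' reference implementations; in particular, only the view-independent degree-0 spherical-harmonic band is shown.

```python
import numpy as np
from dataclasses import dataclass

def positional_encoding(x, num_freqs=10):
    """NeRF-style lift of a 3D point into a higher-dimensional space
    (sin/cos at increasing frequencies) so a small MLP can represent
    fine detail. Output has 2 * 3 * num_freqs components."""
    freqs = 2.0 ** np.arange(num_freqs)            # 1, 2, 4, 8, ...
    angles = np.outer(freqs, x).ravel()            # (num_freqs * 3,)
    return np.concatenate([np.sin(angles), np.cos(angles)])

def composite(colors, densities, deltas):
    """Front-to-back compositing along one ray: each sample's color is
    weighted by its own opacity times the transmittance of everything
    in front of it. NeRF volume-renders ray samples this way; the 3DGS
    tile-based rasterizer alpha-blends depth-sorted splats similarly."""
    alphas = 1.0 - np.exp(-densities * deltas)     # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

SH_C0 = 0.28209479177387814  # degree-0 spherical-harmonic basis constant

def quat_to_rotmat(q):
    """Unit quaternion (w, x, y, z) -> 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

@dataclass
class GaussianSplat:
    """The per-splat quantities 3DGS optimizes instead of an MLP."""
    position: np.ndarray  # 3D mean, seeded from the SfM point cloud
    scale: np.ndarray     # per-axis extent, shape (3,)
    rotation: np.ndarray  # quaternion (w, x, y, z)
    opacity: float
    sh_dc: np.ndarray     # degree-0 spherical-harmonic RGB coefficients

    def covariance(self):
        # Sigma = (R S)(R S)^T stays positive semi-definite while scale
        # and rotation are optimized as separate parameters.
        M = quat_to_rotmat(self.rotation) @ np.diag(self.scale)
        return M @ M.T

    def base_color(self):
        # The degree-0 band is view-independent; higher-order bands
        # (omitted here) add view-dependent effects like reflections.
        return np.clip(SH_C0 * self.sh_dc + 0.5, 0.0, 1.0)

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a render and the
    ground-truth photo; higher means numerically closer, though it does
    not always track how clean a render looks perceptually."""
    diff = np.asarray(pred, np.float64) - np.asarray(target, np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0.0 else 10.0 * np.log10(max_val**2 / mse)
```

The shared `composite` helper is the point of the "ray tracer vs. rasterizer" contrast: NeRF produces `colors` and `densities` by querying a network at sampled ray points, while 3DGS produces them by projecting pre-sorted Gaussians onto screen tiles, so no network is ever evaluated at render time.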
Info
Channel: bycloud
Views: 60,021
Keywords: bycloud, bycloudai, 3dgs, 3d gaussian splatting, 3d gaussian splatting explained, 3d gaussian splatting ue5, 3d gaussian splatting godot, 3d gaussian splatting for real-time radiance field rendering, instant-ngp, nerf vs 3d gaussian splatting, what is 3d gaussian splatting, dream gaussian, gsgen, text to 3d, text to 3d gaussian splatting, mipnerf360, 3d reconstruction, structure from motion, Dynamic 3DGS, polycam, luma ai, 3d gaussian, 3d reconstruction from 2d images, instant ngp
Id: C708Mh7EHZM
Length: 8min 54sec (534 seconds)
Published: Sun Oct 15 2023