This AI Creates A 3D Model of You! 🚶‍♀️

Captions

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today, a variety of techniques exist that can take an image that contains humans and perform pose estimation on it. This gives us these interesting skeletons that show us the current posture of the subjects shown in these images. Having this skeleton opens up the possibility for many cool applications: for instance, it’s great for fall detection and generally many kinds of activity recognition, analyzing athletic performance, and much, much more. But that would require that we can do it for not only still images, but animations. Can we? Yes, we already can; this is a piece of footage from a previous episode that does exactly that.
But what if we wish for more? Let’s think bigger. For instance, can we reconstruct not only the pose of the model, but the entire 3D geometry of the model itself? You know, including the body shape, face, clothes, and more. That sounds like science fiction, right? Or with today’s powerful learning algorithms, maybe it is finally a possibility. Who really knows? Let’s have a look together and evaluate it with three increasingly difficult experiments.
Let’s start with experiment number one, still images. Nice! I think if I knew these people, I might have a shot at recognizing them solely from the 3D reconstruction. And not only that, but I also see some detail in the clothes: a suit can be recognized, and jeans have wrinkles. This new method uses a different geometry representation that enables higher-resolution outputs, and it immediately shows. Checkmark. It is clearly working quite well on still images.
And now, hold on to your papers for experiment number two, because it can not only deal with the front side in still images, but also reconstruct the backside of the person. Look! My goodness, but hold on for a second… that part of the data is completely unobserved. We haven’t seen the backside… so, how is that even possible?
Well, we have to shift our thinking a little. An intelligent person would be able to infer some of these details. For instance, we know that this is a suit, or that these are boots, and we know roughly what the backside of these objects should look like. This new method leans on an earlier technique by the name of image-to-image translation to estimate this data. And it truly works like magic! If you take a closer look, you see that we have less detail in the backside than in the front, but the fact that we can do this is truly a miracle.
But we can go even further. I know it is not reasonable to ask, but what about video reconstruction? Let’s have a look. Don’t expect miracles, at least not yet. There is obviously still quite a bit of flickering left, but the preliminary results are quite encouraging, and I am fairly certain that two more papers down the line, these video results will be nearly as good as the ones for still images. The key idea here is that the new method performs these reconstructions in a way that is consistent, or in other words, if there is a small change in the input model, there will also be a small change in the output model. This is the property that opens up the possibility to extend this method to videos!
So, how does it compare to previous methods? All of these competing techniques are quite recent, as they are from 2019. They appear to be missing a lot of detail, and I don’t think we would have a chance of recognizing the target subject from the reconstructions. And now, just a year and a half later, look at that incredible progress! It truly feels like we are living in a science fiction world. What a time to be alive! Thanks for watching and for your generous support, and I’ll see you next time!

Info

Channel: Two Minute Papers

Views: 194,522

Keywords: two minute papers, deep learning, ai, technology, science, machine learning, pifuhd, human digitization, 3d avatars, vr 3d avatar

Id: Jy_VZQnZqGk

Length: 5min 3sec (303 seconds)

Published: Tue Nov 03 2020