Audio2Face Introducing New AI Models

Video Statistics and Information

Captions
[Music]

We will go through the new key features in Audio2Face version 2023.1. In this version, as the main update, we introduce new types of AI models which perform better lip sync for various voices and languages. So let's have a look.

Unlike the previous version, this version has an empty viewport, but instead we have an AI Models tab on the right. For AI models we have Mark and Claire. Mark is our American male character and Claire is our Chinese female character. For each AI model we have a corresponding network. For Mark we have two versions. Version 2 is the default one, and on the right we provide a summary of this network: it provides great lip sync quality, good multi-language and voice adaptability, and great emotion range. Version 1.1 is also a good network, but generally version 2 performs better than version 1.1. We provide version 1.1 because it gives a similar motion to our previous Audio2Face release. For Claire we have only one network, and this network provides great lip sync quality, great multi-language and voice adaptability, and fair emotion range.

For audio players we have a regular player and a streaming audio player. The regular audio player supports audio files in WAV format, and it also supports audio recording and live mode. The streaming audio player works with streaming audio, for example streaming audio data from a TTS server.

For the demo, let's select Mark version 2 and the regular audio player and click the Get Started button. You will see this nice rendering with real-time subsurface scattering in the viewport, and this is our default render setup from this version. So let's go to the Audio2Face tools and test some audio files.

"The beige hue on the waters of the loch impressed all, including the French queen, before she heard that symphony again, just as young Arthur wanted."

So in this version we're providing better AI models which can perform well for various voices and languages. Let me play a few more audio files. This time I'm going to play an English female voice.

"Good morning, Professor Austin, how are you doing?"

Next I'm going to play a Chinese male voice. And finally I'm going to play a Chinese female voice. So as you can see here, our new AI model works really nicely for all these different audio files.

For the UI part, there's not much change, but in this Emotion tab we removed the neutral slider, and this all-zero vector represents the neutral state of the character. You can change the emotion of the character as before, and you can also mix different emotions using the sliders. We found this all-zero vector setup for neutral is more intuitive for controlling the emotion of the character. The rest of the UI is exactly the same as before, and you can check the previous video tutorials, or you can also check the tooltip for each menu.

This network part has some updates. In this version we are providing multiple choices for the network. So far we've been using Mark version 2, but if I select Claire version 1, you can see that the face changes to Claire's face, and now we are using the Claire version 1 network. Claire is our Chinese female character, and she has her own style when she performs this lip sync animation. So let's play some audio again: the same Chinese female audio, then the Chinese male audio. And I'm going to play the English female audio.

"Good morning, Professor Austin, how are you doing?"

And finally the English male audio.

"With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good."

And again you can see that our AI model performs really nicely for all this audio data. So from this version you can test different networks and find which one performs best for a given audio file or for your own requirements.

That's the main update for the AI models part. Before I finish the tutorial, I'm going to show you another thing quickly. This time I'm going to choose Claire and the streaming audio player and click the Get Started button. This gives you Claire's face with the streaming audio player. To test this streaming audio player, you can right-click and send the example track.

"With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good."

The new feature I want to introduce is this one: Audio2Emotion. From this version, if you check this button, we provide real-time Audio2Emotion for streaming audio data. In the previous version we couldn't do that, because Audio2Emotion didn't work with live streaming data, but in this version it works. So if I resend the audio data,

"With tenure, Suzie'd have all the more leisure for yachting, but her publications are no good."

you can see that Audio2Emotion is analyzing the streaming audio and controlling the emotion in real time.

So that's everything I wanted to introduce in this video tutorial. I hope you find the new Audio2Face app useful. Thanks for watching the tutorial. Bye.
Info
Channel: NVIDIA Omniverse
Views: 9,281
Keywords: NVIDIA, Omniverse, RTX, PixarUSD, OpenUSD, Ray Tracing, Animation, Collaboration, Photorealistic, GPU, Connector, Extensions, Robotics, Generative AI, Simulation, Digital Twins
Id: DGEIRuXP8hQ
Length: 8min 3sec (483 seconds)
Published: Fri Jul 14 2023