ElevenLabs Full Tutorial - AI Voice Cloning, Dubbing, Speech-to-Text & More!

[Music]

Voice cloning, dubbing, text to speech, speech to speech: if you're not aware of these amazing AI capabilities, no one does it better than ElevenLabs, and you are about to learn everything you could possibly want to know about the platform, both the free version and the Creator version. Affiliate link in the description. Let's go.

I do want to mention that I'm using the Creator account. The first month is 50% off, so it's $11, and after that it's going to be $22 every single month. I'm signing up so that I can show you all the various features it offers. If you're using it for free you're still going to have these features, but the reason I signed up for Creator is that I wanted to show you the additional features as well.

So let's start with Speech Synthesis. There are two options here: text to speech and speech to speech. We're going to start with text to speech, where we can convert text into lifelike speech using a voice of our choice. Under Settings, we can select which voice we'd like to use, and you can even play and listen to the voice prior to selecting it, which obviously would be a good idea. I want something that's going to whisper to me a little bit, because we're going to get a little bit spicy with what we want it to say, so I'm going to select Nicole.

Then, if you open the drop-down for voice settings, there's a choice for stability. Please remember that if you lower the stability setting here, what you get is speech that's more expressive, so it's going to sound more real; however, it can also lead to instabilities, so you'll have to tweak it, but not too much. Clarity and similarity enhancement is this bar here that I have set to 84%, and you can see that low values are recommended if background noise is present in the generated speech, which is something to keep in mind. And with style exaggeration, if you feel that the voice you're selecting is really plain, the
exaggerated setting is going to make it seem a lot more exaggerated, obviously, but higher values can lead to more instability in the generated speech. They say that setting this to 0.0 will greatly increase generation speed and that it is the default setting, but I definitely recommend going above 0%, just not too high, when it comes to style exaggeration. As for speaker boost, I typically just keep this checked. It boosts the similarity between the synthesized speech and the voice at the cost of some generation speed, which is fine by me.

I also want to point out that it's saying, "We recommend switching to the Eleven Multilingual v1 model to get the best possible quality for the voice selected." I selected Nicole, and you'll be able to hear her in a second. If you click right here, you can toggle between v1 and v2 of Multilingual, or just Eleven English v1.

So now is probably a good time to mention this: if you take a look at Eleven Multilingual v2, you'll see that there are 29 languages, versus v1, which only has what looks like eight or nine. The way this works is that you don't even have to select the language. When you type in the text, whether it's Indonesian, Polish, Swedish, or Arabic, the software automatically detects the language you're using and then creates a text-to-speech version of it. Another thing to keep in mind: if you are working with one of the languages listed under Eleven Multilingual v1, sometimes selecting v1 will give you better results than even Eleven Multilingual v2, and I'll show you that in a second.

So let me type out some text, and then we'll see how it works on Eleven Multilingual v1 and then Eleven Multilingual v2. I really hope my girlfriend doesn't mind, but I'm going to flirt with Nicole right now and ask her to tell me, "I love me some artificial intelligence nerds. Talk machine learning to me, baby," and it should create it in
a whispering female voice. So let's see what happens. Remember, I'm using Eleven Multilingual v1, and after this we'll try it with Eleven Multilingual v2. As you can see, it created it for me very, very quickly, so let's listen to it:

"I love me some artificial intelligence nerds. Talk machine learning to me, baby."

Now let's switch it to Eleven Multilingual v2:

"I love me some artificial intelligence nerds. Talk machine learning to me, baby."

So I actually did like Eleven Multilingual v1 for that one.

Now let's take a look at the speech-to-speech feature. Here we're able to create speech by combining the style and content of an audio file you upload with a voice of your choice. What does this mean? Well, over here we select the voice, the voice settings, and the model type we're going to roll with, and then we can either upload an audio file or just speak into the microphone and record audio, and it will be generated in the selected voice.

By the way, right now is a great time to explain what this Add Voice option is. To understand it, we're going to go over to Voice Library. This resource has a lot of different voices that were posted by the community, and you can sample them and then add them to Voice Lab. For instance, if I want to add Lizzy, "a refined Victorian old British female," I can sample it and then add it to Voice Lab. Once I've added it, if we go back to Speech Synthesis and click Add Voice, we can now select Lizzy, because we've added it to our voices.

In this example, though, I'm simply going to use my own voice. So I'm going to click this microphone icon here, say, "You better subscribe to this channel or I will hack your computer," and then click Generate:

"You better subscribe to this channel or I will hack your computer."

So now that we understand speech synthesis, let's take a
look at this Projects tab right here, and I'll explain exactly what it does. In this section, ElevenLabs is able to turn long-form content into audio: anything from a book to full documents and conversations. I'll also show you how to embed something on your website so that anyone who goes there can click it and have the page read to them.

But first things first, let's see exactly how to create a new project. You click Create New Project, then you choose the project type. For this example, I want to turn a page on my website into audio with Nicole, the whispering AI girl, speaking. So I'm going to select "Initialize a project from a URL" and click Next. I'm going to give it the project name "web AI," and for the URL to create the chapter from, let me do a shameless plug: I'm going to use my own website.

If you don't already know, I'm a web developer, or more of a web designer these days; I should stop calling myself a web developer. If you open the drop-down, we'll go to Artificial Intelligence, and this is the page I want Nicole to whisper to you in a very seductive way so that you take me on as a client. If you guys don't know, I consult on AI for a lot of the corporations around here; I'm located in North Jersey and New York City. Over here you can see I present a lot, so that's me right there. If you pay me enough, I actually take off my hat. That's what I look like, which might surprise those of you who have been following this channel for a long time. I also do a lot of presentations on AI, and obviously I incorporate a lot of AI tutorials into my YouTube content.

So let's simply copy this URL and paste it here. For the default voice we're going to choose Nicole, and for this example I'm going to keep Multilingual v2. Here's a really cool feature: if you open the drop-down under download settings, you can see that you can
normalize the volume to meet audiobook standards. This basically means that the program is going to adjust the volume and apply dynamic compression so that the result meets standard audiobook style guidelines. Then it lets me preview all of the text that it's going to turn into speech by Nicole. I'm going to look this over, say, "Okay, looks good," click Convert, click Convert again, and select Convert once more:

"With rapid advancements in AI technology, embracing its potential has become essential for staying competitive in today's fast-paced world. At PromoAmbitions, we understand the unique challenges faced by businesses in the healthcare..."

I think you get the gist of it. It's pretty awesome. Moving right along.

What Audio Native over here allows you to do is turn any website text content into audio with a simple snippet. What do I mean by that? Well, all we have to do is select this right here, which is the snippet, and we do have to enter the allowed URL. So you'd put in promoambitions.com/, and that allows Audio Native to operate when the URL starts with promoambitions.com.

Okay, so now let's talk about dubbing. The cool thing about dubbing a project is that you can take any video, whether you upload it or it's from YouTube, TikTok, X, Vimeo, or another URL, tell it here what the source language is, and it will dub the video in a different language. I'm going to take a source language of English and translate it into Spanish, using one of my YouTube videos. If I scroll down the Artificial Intelligence section of my website, you'll see that I've populated it with a lot of the AI videos I've made over the past couple of months to years. So what I'm going to do is choose
this video here, click Share, and copy this URL. Then we come back here, make sure that YouTube is selected, and paste our URL in.

The watermark option here just means that there's going to be an ElevenLabs watermark, and in exchange they reduce the character usage. Remember, you only get a certain number of characters on each plan, so if you want to reduce usage by 33%, you can simply check that here. Then there are even advanced settings: the number of speakers, the video resolution (I always recommend that you use the highest), and the exact time range for dubbing. I actually like this part, because maybe I don't want to dub the entire video. The time is entered as hours, minutes, and seconds, so let's say I just want to dub the second minute of the video. What I can do is come over here and type in 00:01:00 as the start and 00:02:00 as the end, and that should give me just a minute of dubbing.

Then we create this, and you can see that it's working really quickly: it's saying it's only going to take about 2 minutes to dub the entire thing. So I'll wait for it to fully complete.

Pretty gnarly, yeah? If I were smart, I would recreate my channel for a Spanish audience, recreate it again for a German audience, and post the different versions. I believe that is what some of the most popular YouTubers do, and they do quite well with it. So if you run a YouTube channel, I definitely recommend trying something like that. And if you're curious what I meant by the watermark, you can see it in the bottom right corner, where it says ElevenLabs. If I hadn't selected that option, it would have used more characters, but the video wouldn't have the logo. So if you don't wish to have the ElevenLabs logo, you can leave that box unchecked.

So now let's talk about voice cloning.
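As a side note on the dubbing time range described above: the hours, minutes, and seconds format is easy to check by hand, but a small helper makes the arithmetic explicit. This is only a sketch; the function names `parse_hms` and `dub_range_seconds` are made up for illustration and are not part of ElevenLabs.

```python
def parse_hms(timestamp: str) -> int:
    """Convert an 'HH:MM:SS' string into a total number of seconds."""
    parts = timestamp.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected HH:MM:SS, got {timestamp!r}")
    hours, minutes, seconds = (int(p) for p in parts)
    if hours < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError(f"out-of-range field in {timestamp!r}")
    return hours * 3600 + minutes * 60 + seconds


def dub_range_seconds(start: str, end: str) -> int:
    """Return the length in seconds of the clip between two timestamps."""
    start_s, end_s = parse_hms(start), parse_hms(end)
    if end_s <= start_s:
        raise ValueError("end must come after start")
    return end_s - start_s


if __name__ == "__main__":
    # The range typed in the video: one minute in to two minutes in.
    print(dub_range_seconds("00:01:00", "00:02:00"))  # prints 60
```

With the values from the video, the range works out to 60 seconds, which matches the "minute of dubbing" the speaker expects.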
You can access that in Voice Lab, and this is where you add generative or cloned voices. You'll also see all of the voices you saved from the Voice Library here in Voice Lab. So how do we go about creating a new voice? Well, you can design the voice by clicking this first option here, which lets you specify the gender, the age, and even the accent. If you want someone female who's young with a British accent, you simply specify that. You can even set how strong or weak the accent is going to be. And if you want to hear exactly how it's going to sound, you can type something in, generate, and use that voice, and the voice will be saved in this section as well.

But the really interesting one I want to show you guys is instant voice cloning. This is unfortunately used for a lot of deepfakes, but to use it for good, I'll show you exactly how to clone your voice. For instance, I want to clone my dad's voice. He was born in Moscow, so he has a thick Russian accent. I'm going to name this "Dad's voice," and then we're going to upload a file. This can be an audio or a video file, but you want the speech to be nice and clear. The whole point, even if it's a video file, is to have clear speech from the person you are trying to clone. So make sure there's not a lot of background noise, that they're speaking steadily, and that the clarity is good enough to upload.

Please also bear in mind that the file has to be 10 megabytes or smaller. If your video file is larger than that, don't worry: there's a great tool called FreeConvert. You can upload any video to it, and it will turn it into an MP3 file for you. Then you can go back in here and upload that MP3 file, which is exactly what I did. As you can see, this is an MP3 file
that is only a minute and a half long, and its size is 1.3 megabytes, whereas the video I took of my dad speaking is 419 megabytes, so it's way too large. We're going to select this one and write a description for the voice: "male voice with a thick Russian accent." I'm going to confirm that my dad actually did allow me to upload and clone his voice, and then I'm going to add his voice.

Once that's done, I'm going to click Use, and then we can type whatever we want. Here I'm going to write, "I want to tell you a story about my childhood and the history of our family. Let me start with your great-grandparents." Now I'm going to generate, and I'll tell you whether it actually sounds like my dad.

If you go to your History tab, you'll be able to see a list of everything you've generated using ElevenLabs. Here you can see there are about five or six different versions, and I adjusted some of the voice settings, which actually helped it sound more like him. What helped me was raising the style exaggeration, even though it's now in the unstable range, but it actually helped a lot. So take a listen:

"Uh, I want to tell you a story about, uh, my childhood and the history of our family. Uh, let me start with your great-grandparents."

That is near perfect. That sounds phenomenally close to my dad. I actually had him listen to it, and he was really impressed and kind of weirded out. So that just shows the power of ElevenLabs and their AI capabilities when it comes to text to speech and voice cloning.

The next thing I want to show you is Voice Library, which is just a repository of different types of voices that people in the community made and offered up for others to use. And the final thing here is professional voice cloning, where you can create a perfect digital replica of your voice. I'm probably going to do this in a
separate video. I want to follow it step by step, but I don't want to make this video too long. It's basically for creators who are trying to create a hyper-realistic model of their voice and use it for various purposes, and again, I'll probably make a separate tutorial.

It is literally 3 in the morning right now; that's how hard I work on these videos. If you guys appreciate this content, please keep me motivated to keep putting it out by liking, commenting, and subscribing. Let me know which platform or which topic in AI you'd like me to cover next, and I will see all of you in the next video.

[Music]
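The voice settings walked through in the video (stability, clarity and similarity enhancement, style exaggeration, speaker boost) can also be written down as a request body for readers who script their generations. The sketch below only builds and validates a dictionary and makes no network call. The field names (`stability`, `similarity_boost`, `style`, `use_speaker_boost`) and the model identifier follow my understanding of the public ElevenLabs API and should be checked against the current API reference. The helper name `build_tts_payload` and the default stability of 0.5 are assumptions for illustration, while 0.84 mirrors the 84% slider value shown in the video.

```python
import json


def build_tts_payload(
    text: str,
    model_id: str = "eleven_multilingual_v2",  # assumed identifier for Multilingual v2
    stability: float = 0.5,                    # assumed default, not taken from the video
    similarity_boost: float = 0.84,            # the 84% clarity/similarity slider in the video
    style: float = 0.0,                        # style exaggeration; 0.0 is described as the default
    use_speaker_boost: bool = True,            # the speaker boost checkbox
) -> dict:
    """Build a text-to-speech request body after range-checking the sliders."""
    sliders = {
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
    }
    for name, value in sliders.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return {
        "text": text,
        "model_id": model_id,
        "voice_settings": {**sliders, "use_speaker_boost": use_speaker_boost},
    }


if __name__ == "__main__":
    payload = build_tts_payload("Talk machine learning to me, baby.", style=0.3)
    print(json.dumps(payload, indent=2))
```

Keeping the range check in one place makes it harder to send a slider value outside 0 to 1, for example 84 instead of 0.84.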
Channel: PromoAmbitions
Views: 10,510
Keywords: eleven labs ai, eleven labs tutorial, eleven labs dubbing, eleven labs voice cloning, elevenlabs tutorial ai
Id: hzyx0JkiAt4
Length: 18min 4sec (1084 seconds)
Published: Sat Dec 09 2023