Celebrity Voice Cloning With Eleven Labs + GPT4 Vision (Tutorial)

Video Statistics and Information

Captions
AI David Attenborough can narrate your life in real time, and it is absolutely insane. Watch this.

"Ah, here we observe a colorful specimen of Homo sapiens clad in the traditional regalia of the tie-dye tribe, a peculiar fashion that mimics the psychedelic patterns one might see on a chameleon after it's had a bit too much fermented fruit. This individual has positioned itself in front of what appears to be some kind of communication device, a microphone used to amplify its mating call across the digital savannah. What's this? The human has begun to move. It's a slight twitch of the eyebrow. Absolutely breathtaking."

Today I'm going to show you how to do this. Let's go.

So I want to thank Charlie Holtz on Twitter. He's the one who first showed this, and it already has 3 million views on Twitter, so it's incredibly popular. We're using GPT-4 Vision to analyze the camera shots, and then we're using ElevenLabs to synthesize David Attenborough's voice. This is the GitHub repo for it. I will drop it in the description below, and it's quite simple.

So open up your terminal. Let's clone the repo first. All of these instructions I'll put in a gist, and I'll link it in the description below. So git clone and then the URL for narrator.git, and then we're going to cd into it: cd narrator. And then we're going to spin up a new conda environment: conda create -n narrator python=3.11. There, it's done. Now we're going to grab this command right here and we're going to activate the new narrator environment, so conda activate narrator, hit enter. Now we see it's activated right there.

Now you're going to need two things: you're going to need GPT-4 API access, and you're also going to need an ElevenLabs account. So here's ElevenLabs. I've already signed up. In the top right, you're going to click the little icon button right there, you're going to click Profile, and then you can grab your API key right here.

Then we're going to install the requirements, but to do this I want to make sure that we're running the right Python, so do which python. Then you're going to grab your Python environment right here, paste it, and then -m pip install -r requirements.txt. And then we're going to do export ELEVENLABS_API_KEY equals, and that's where you put your ElevenLabs API key. Now in OpenAI we're going to create a new secret key. I'm going to call it narrator YT, create secret key, and I'm going to grab it right here. I will revoke this key before publishing the video. Then we do export OPENAI_API_KEY, and then you enter your OpenAI API key, hit enter, and that actually should be it.

Let's give it a try. So you need to have two servers running. First we need to run the file: python capture.py, and that's going to capture what's in the video. So iTerm wants camera access. You go ahead and click OK, and then let's open it again, and there it is. It says "Say cheese! Saving frame." So I believe every few seconds it grabs a frame from your camera.

Next, let's open up a new terminal. Basically what it's going to do is send each of those frames to GPT-4 Vision to analyze what's in the frame, then it's going to create dialogue for David Attenborough to read, and essentially we're going to pass that to ElevenLabs, which has been pre-trained with David Attenborough's voice. So make sure you're in the same folder, and then we have to open that environment again, so conda activate narrator. Now we're going to do python narrator.py. One thing I forgot to do is actually put quotes around the OpenAI API key token, so remember to do that. And let's run python narrator.py. "David is watching."

So we actually got the words right here, but we have an error, so let's see what's going on here. OK, it says a voice for voice ID was not found, so we do need to actually train the voice. And of course David Attenborough has the rights to his own voice, so this is purely for educational purposes. I actually found this MP3 of David Attenborough's voice that I'm going to use to train his voice, and I'll link this in the description below. So we're going to download the MP3 here, and let's give it a listen.

"Inasmuch, then, as this Leviathan comes floundering down upon us from the headwaters of the eternities..."

OK, perfect. Let's head back to ElevenLabs, and we're going to add a generative or cloned voice. It looks like we are going to need a paid account of ElevenLabs. That's too bad, but of course you can design a voice for free right here with Voice Design. It's just not going to be David Attenborough. So I'm going to go ahead and subscribe just to try this out, and for the functionality we need it's only a dollar a month for the first month, so let's do it. It's cheap.

OK, now I'm subscribed. Let's click here again. We're going to do Instant Voice Cloning. I'm going to name this David A, and now we're going to upload that MP3, and that should be good to go. We're going to add voice, and let's see if it worked. Switching back to the terminal, let's use this first paragraph of text that it had already created to see if it sounds right. So I copied it. Going back to here, let's paste it in, and let's generate.

"Ah, here we observe the modern human in its natural habitat, the digital workspace, clad in a vibrant display of ceremonial fleece..."

This is so good. OK, all right, I'm going to pause it. This worked perfectly. And right here under this dropdown, right now it says Eleven Multilingual v2, but I actually want to use the turbo version because it really should be very, very fast, so Eleven Turbo v2. I'll click there, and now that's selected. Let's see how it sounds.

"Ah, here we observe the modern human in its natural habitat, the digital workspace..."

OK, so that's actually perfect. I'm going to leave it like that.

So it was kind of difficult to actually find the voice ID of the David Attenborough voice that I just cloned, but what you need to do is go to this URL right here, which I'll drop in the description below. Then you scroll down, you enter your API key right here under the get voices, and then you execute. Then within the returned results you actually have to search for the voice, and there's the ID right here, so I'm going to copy that. I opened Visual Studio Code with the narrator project open, and within narrator.py, if I look on line 27, there's this voice ID, and it is hardcoded, which is unfortunate. But what you have to do is just paste your own voice ID right in there and then click save.

Now let's run it again, so I'm going to go ahead and hit python narrator.py. "David is watching." OK, I'm running into an issue where I don't think the API is actually using my API key, and so I think the code has a little bug in it. So what we're going to do is import this little function right here, set_api_key, from elevenlabs, and then we're going to set it manually to make sure it's actually in there. Now let's try it again: python narrator.py, let's hit enter. "David is watching." And I think it worked this time.

For some reason I do not hear anything. All right, so unfortunately, for some reason I can't actually get the audio to play through the terminal, but I know it's working because if I look right here, the audio files are right here. So if I click in, here's audio.wav. I open it up, and now we can hear David Attenborough's narration.

"And here we have the incredible Homo sapiens, clad in a strikingly colorful garment that mimics the last palette of nature itself. Observe as it masters the art of stillness before an electronic device, a ritual as complex as the courtship dance of the bird of paradise but as widespread as the grazing of the buffalo. Behold the concentration, the poised demeanor. This creature is clearly on the cusp of doing something phenomenally ordinary."

If you liked this video, please consider giving a like and subscribe, and I'll see you in the next one.
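
Editor's sketch: the two snags in the walkthrough (an API key that never reaches the library, and a voice ID that is hard to find) can be checked from a short script before running the narrator. The Python below is a minimal sketch, not code from the repo. It assumes the environment variable names OPENAI_API_KEY and ELEVENLABS_API_KEY (reconstructed from garbled captions), the ElevenLabs "get voices" REST endpoint with its xi-api-key header, and a cloned voice named "David A" as in the video. The helper names are hypothetical.

```python
import json
import os
import urllib.request

# ElevenLabs REST endpoint that lists the voices available to an account.
VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def missing_keys(env, names=("OPENAI_API_KEY", "ELEVENLABS_API_KEY")):
    """Return the required variable names that are unset or empty in env.

    The default names are assumptions reconstructed from the captions.
    """
    return [name for name in names if not env.get(name)]


def find_voice_id(payload, voice_name):
    """Return the voice_id of the first voice whose name matches, else None.

    Assumes payload looks like {"voices": [{"name": ..., "voice_id": ...}]}.
    The name comparison ignores case.
    """
    for voice in payload.get("voices", []):
        if (voice.get("name") or "").lower() == voice_name.lower():
            return voice.get("voice_id")
    return None


def fetch_voices(api_key):
    """Download the account's voice list as a parsed JSON dictionary."""
    request = urllib.request.Request(VOICES_URL, headers={"xi-api-key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    voices = fetch_voices(os.environ["ELEVENLABS_API_KEY"])
    # "David A" is the name given to the cloned voice in the video.
    print(find_voice_id(voices, "David A") or "No voice with that name was found")
```

Run as a script, it stops early if a key is missing from the environment, and otherwise prints the ID to paste over the hardcoded voice ID in narrator.py.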
Info
Channel: Matthew Berman
Views: 38,606
Keywords: eleven labs, elevenlabs, voice cloning, ai voice, chatgpt, gpt4, gpt4 vision, ai david attenborough
Id: G2VAWXXk1F0
Length: 6min 57sec (417 seconds)
Published: Wed Nov 22 2023