1-Click Voice CLONING Tool (powered by Coqui and Bark)

Captions
We've got a more powerful open-source voice cloning option using Coqui and Bark. In this video we're going to learn how to run this web application, which has been very kindly put out by Sylvain Filoni as a Hugging Face Space, and I'm going to show you how to run it on Google Colab. If you don't want to run it on Google Colab, you can go directly to the Space, click this link, and start using it; I've also credited the Space at the top of the Colab notebook so it's easier for you. But if you're like me and you don't want to stand in a queue, or you just want the flexibility to run it whenever you want, you can use this tutorial to run it on Google Colab.

To start with, this uses two libraries, primarily Coqui and Bark; Coqui recently added Bark support. I'm not sure how Coqui is supposed to be pronounced, so if I'm saying it wrong, please let me know in the comment section. So they've got this Bark integration, the entire code has been put out by Sylvain, and we're simply going to take that Gradio app and run it.

Before we move forward, I'd like to show you the demo. This is the input voice that I've got, from Dr. Jordan B. Peterson: "You're trying to impress the people that are there and you're trying to get them to like you, and so you maybe..." This is the voice we're trying to replicate, or clone. I want to clone it with this text: "Hi Raja, llama llama red pajama. Today was very tiring day and I can't believe how it went, and I'm happy that it was not very bad." If you listen to the output voice, you'll notice that, one, it takes a lot of pauses; second, the voice modulation changes a lot within the voice itself; and third, it sometimes adds unwanted noises.

Despite all these things, this is very, very impressive voice cloning compared with what I've seen recently. Forget about ElevenLabs and all those paid, proprietary solutions; in terms of open-source solutions, this is one of the most impressive I've seen. The issues we highlighted are issues we're already aware of with Bark, so Bark could probably improve and fix them eventually. But this is a very impressive output. Let's listen to it: "Hi Roger, um, uh, llama, um, llama, um, red pajama, um, today was a very tiring day and I can't believe it how it went in the view save. I'm happy that it was not very bad."

As I said at the start, you'll have noticed that it takes a lot of pauses, the voice modulation changes, and suddenly there are some weird noises, because Bark is not your typical TTS system; it has a lot of things inside it. Either way, this is super impressive.

So I'm going to show you how to run this. First of all, this Google Colab notebook will be linked in the YouTube description, for free; you can click it and get started. The notebook runs on the GPU, the T4 machine. I did at least five or six runs before recording this video, and everything ran properly. At the top we've got the code credit: click this link if you want to use the web application without running any of this code.

The second thing is that I've created a fork of this myself, so we're going to do a git clone of it, and after the git clone we're going to enter that folder. After we enter the folder, we install all the requirements. If you look at the requirements.txt file, it has TTS, which is from Coqui, though Sylvain has his own version, a clone with some fixes, and then Hugging Face Transformers, torch, SciPy, and pydub. If you want to add Gradio, you can add it here; otherwise you install everything with pip install -r requirements.txt and then install Gradio here.

Technically, Google Colab will tell you to restart the runtime. Personally I didn't restart it, and it worked completely fine. If you want to restart the runtime, you can restart it, come back, and install again, but I wouldn't suggest doing it.

After you install Gradio, all you have to do is run python app.py. Once you do that, it's going to take a lot of time; as you can see, it took about 14 to 15 minutes. Installing requirements.txt and the libraries doesn't take long, but the first time you run app.py it spends a lot of time downloading models: the TTS models, the Bark model, and a bunch of others. Once app.py is up, it gives you the public URL for the Gradio application. You click the URL, go to the application, and start the first voice cloning, and then it again downloads a bunch of models, like the HuBERT custom tokenizer and some PyTorch models, which again takes time. After this is done, every later run is much faster than the first. This is your typical cold start problem, so keep it in mind: the first time takes a lot of time, and the second or third time takes less.

Another important point that a lot of people don't know about Google Colab: every time you open a Colab notebook, you have to do all of this again. Unfortunately you don't have any other choice, and if you want one, that's where paid GPU solutions come into the picture. With Google Colab there's no state you can freeze, so you have to do this again every time. Once you run it again, for example here I can run python app.py with the bang symbol, !python app.py, it will again take a couple of minutes, but it gives me a public URL that I can click, and it opens an interface like this where I can add the voice and start cloning.

Adding the voice is very simple. You've got the "Drop Audio Here" area or "Click to Upload", so you can drop in an audio clip like this or pick one from your computer. It requires at least a 20-second audio clip for a good input, and a WAV file is preferred. I've got a WAV file, the Jordan Peterson clip I played at the start. You can also record from the microphone if you want. I tested it with certain celebrity voices, and most of the time it did well. You can try out these examples, and you can also pick some of these voice characters and try them.

Overall this is an interesting solution, and the fact that Coqui and Bark have come together is quite amazing. Thanks to the Coqui team and the Bark team, and once again thanks to Sylvain for this web application demo that we managed to run on Google Colab. When you're done, stop it here, or click and select "Disconnect and delete runtime" to stop everything, so that you're not using extra GPU from Google Colab and you can use it again whenever you want. Thank you so much for watching; see you in another video.
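The Colab steps described in the captions above can be sketched in Python. This is a minimal sketch under stated assumptions, not the notebook's actual cells: the repository URL and folder name are placeholders, because the captions never state them, and the helper names (colab_setup_commands, wav_duration_seconds, is_long_enough) are made up for illustration. The commands themselves (git clone, pip install -r requirements.txt, pip install gradio, python app.py) and the 20-second WAV guideline are the ones mentioned in the video.

```python
import wave

# Placeholders only: the captions never give the fork's URL or folder name.
REPO_URL = "https://example.com/your-fork.git"  # hypothetical
REPO_DIR = "your-fork"                          # hypothetical

# The video asks for at least a 20-second clip, preferably a WAV file.
MIN_CLIP_SECONDS = 20.0


def colab_setup_commands(repo_url, repo_dir):
    """Return the shell commands from the walkthrough, in the order described."""
    return [
        f"git clone {repo_url}",            # clone the fork
        f"cd {repo_dir}",                   # enter the cloned folder
        "pip install -r requirements.txt",  # TTS, transformers, torch, etc.
        "pip install gradio",               # installed separately in the video
        "python app.py",                    # first run downloads the models
    ]


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_long_enough(path, min_seconds=MIN_CLIP_SECONDS):
    """Check a reference clip against the 20-second guideline from the video."""
    return wav_duration_seconds(path) >= min_seconds


if __name__ == "__main__":
    # Print the commands instead of running them; nothing is executed here.
    for command in colab_setup_commands(REPO_URL, REPO_DIR):
        print(command)
```

In a Colab cell each command would typically be prefixed with the bang symbol, as in the !python app.py mentioned in the captions, with the directory change done through the %cd magic so it persists between cells. The standard-library wave module only reads uncompressed WAV files, so clips in other formats would need a different library for the length check.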
Info
Channel: 1littlecoder
Views: 14,082
Keywords: ai, machine learning, artificial intelligence
Id: Mrb8IoBdINI
Length: 7min 15sec (435 seconds)
Published: Tue Sep 05 2023