Using high quality local Text to Speech in Python with Coqui TTS API

Video Statistics and Information

Captions

Welcome to this technical walkthrough video. Today's topic is one of the most frequently asked questions in the Coqui TTS community: how can I integrate Coqui text-to-speech into an existing Python-based application? That is what this video is about.

[Music] Open voice enthusiast. Open voice, open future.

For this technical walkthrough I have not prepared anything, because I would like to show you every required step, apart from having Python 3.9 or 3.10 installed on your system.

Let's start with the preparations and hop over to our terminal. Let's create a new folder, say TTS, and within that TTS folder I'm going to create a Python virtual environment and activate it with bin/activate. As you can see from the TTS prefix on the prompt, we are now inside our Python virtual environment. Next, install or update some basic modules, setuptools, wheel and pip, with the uppercase -U option for update. Now let's install the latest release of TTS by running pip install TTS.

The installation is complete, so let's check it with pip list. We can see that the TTS package is installed in its latest version, 0.10, which is the minimum version in which Coqui TTS offers the Python-based API. So please make sure you have at least version 0.10 of the package installed if you would like to use the Python-based API.

Now it's time to create a simple Python program. Let's go to Visual Studio Code, or whatever development environment you are using, and open our TTS folder. Within that folder there are just the Python virtual environment files. Let's create a hello world Python file, test it with a print of hello world, and open a terminal. Please make sure you are inside the Python virtual environment, because only there have we installed the TTS package and can use the API. Let's check again with pip list: the TTS package is available. Now we can run python3 on our hello world file. So this is not too complex.
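The version check done by eye with pip list can also be done from Python. The following is a minimal sketch, not part of the video: the helper names parse_version, meets_minimum and installed_version are made up for illustration, and the parser deliberately ignores pre-release suffixes such as "rc1". It relies only on the standard library (importlib.metadata, available from Python 3.8).

```python
import re
from importlib import metadata

# Minimum release that ships the Python API, as stated in the video.
MINIMUM_TTS_VERSION = (0, 10)


def parse_version(text):
    """Turn '0.10.2' into (0, 10, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version_text, minimum=MINIMUM_TTS_VERSION):
    """Return True if version_text is at least the given minimum version."""
    return parse_version(version_text) >= minimum


def installed_version(package="TTS"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("TTS is not installed in this environment")
    elif meets_minimum(found):
        print("TTS", found, "provides the Python API")
    else:
        print("TTS", found, "is older than 0.10, please upgrade")
```

Comparing tuples of integers rather than raw strings matters here, because as strings "0.9" would sort after "0.10".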
Not surprisingly, this is working. Next we have to import the Python module for text-to-speech: from TTS.api import TTS. That's all for the import. Now let's write tts = TTS with the model name as its argument. As you can see, we have completion functionality, so we can see all the code documentation here. We set model_name because we have to tell our Python program which TTS model we would like to use.

If you do not know which models are available, this is really simple: just run tts --list_models on the command line, and you will see the list of all vocoder models and TTS models. I'll go with my German model for this test, so let's copy its name.

Let's save this and call the function tts_to_file. It needs a text, obviously, so: this is a test. If you do not specify an output parameter saying where the final WAV audio file should be written, it will be written to the current directory as output.wav.

Now let's see what happens when we run it. Let's save the file. You can see in the navigation pane on the left that there is no WAV file at the moment. So let's run our Python hello world again. We can see that the model is being downloaded; this step is skipped once you have it on your hard disk. Now the vocoder model is being downloaded.

Okay, as you can see, I ran into an exception because of a missing dependency: I have to install espeak or espeak-ng on my local system. So let's fix that. As I'm on an Ubuntu system: apt-get install espeak-ng.

Now that espeak-ng is available on the system, let's run the command again. This time, downloading the text-to-speech and vocoder models should be skipped, as they have already been downloaded and are available on the local disk. The download has been skipped. This is our input phrase, the test sentence, and we can see a processing time and a real-time factor as a performance indicator.
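Put together, the program built in this part of the video looks roughly like the sketch below. Two things are assumptions rather than taken from the video: the model name (the video only says "my German model", so the value here is one plausible entry from tts --list_models and should be replaced with whichever name you copied), and the helper names find_espeak and synthesize_to_file, which are made up for illustration. The import of TTS.api sits inside the function so that the file can be loaded even where the TTS package is missing.

```python
import shutil

# Assumption: substitute the model name you copied from `tts --list_models`.
EXAMPLE_MODEL = "tts_models/de/thorsten/tacotron2-DDC"


def find_espeak():
    """Return the path of espeak-ng or espeak if one is on PATH, else None."""
    return shutil.which("espeak-ng") or shutil.which("espeak")


def synthesize_to_file(text, model_name=EXAMPLE_MODEL, file_path="output.wav"):
    """Synthesize text with the Coqui TTS Python API and write a WAV file."""
    # Imported lazily; requires the TTS package, version 0.10 or newer.
    from TTS.api import TTS

    # The first run downloads the TTS model and its vocoder; later runs reuse them.
    tts = TTS(model_name=model_name)
    tts.tts_to_file(text=text, file_path=file_path)
    return file_path
```

A call such as synthesize_to_file("This is a test.") then writes output.wav to the current directory. Checking find_espeak() first gives an early hint about the missing dependency that caused the exception in the video, although not every model necessarily needs it.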
The relevant part is in the navigation pane on the left: we now have our output WAV file. So let's play it. [The synthesized test sentence plays.] As you can hear, this worked like a charm, so no problem once you have espeak-ng installed.

There's another way to synthesize audio without saving to a file: you can call tts.tts, so skip the to_file part and run it this way. But this returns a NumPy array, so it requires a bit more work. I will show you where you can find more information if you would like to use this option, but it is a little bit out of scope for this video.

For that, let's go back to our browser. Here, in the Coqui TTS code base, in the TTS/TTS subfolder, you can see the API Python class. Here you can find all the information on how this works and take some ideas from it for whatever you would like to do. Feel free to take a look here to get more information.

And that's it for today's technical walkthrough on how to use the Coqui TTS API to integrate text-to-speech into your locally running Python-based application. I hope you liked the video. If so, please give it a like and subscribe to my channel if you haven't already, because this is really highly appreciated. In general, thank you all for watching my videos, for commenting, and for being part of my wonderful YouTube voice technology community. If you like, we might see each other next time. Bye.
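For the in-memory route that the video leaves out of scope, one possible way to handle the returned samples is sketched below. This is an illustration under stated assumptions, not the library's own method: it assumes the samples are floats in the range -1.0 to 1.0, that the audio is mono, and that you know the model's sample rate. The helper names samples_to_pcm16 and write_wav are made up, and only the standard library is used, so any sequence of floats (including a NumPy array) works.

```python
import struct
import wave


def samples_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overshoot
        ints.append(int(round(clipped * 32767)))
    return struct.pack("<%dh" % len(ints), *ints)


def write_wav(path, samples, sample_rate):
    """Write mono 16-bit audio to a WAV file at the given sample rate."""
    pcm = samples_to_pcm16(samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # assumption: single-channel output
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16 bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)


# Hypothetical usage with the API from the video (not executed here):
#   wav = tts.tts("This is a test.")
#   write_wav("from_memory.wav", wav, sample_rate)  # rate taken from the model config
```

Keeping the samples in memory like this is useful when the audio should be post-processed or streamed instead of being written straight to disk.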

Info

Channel: Thorsten-Voice

Views: 18,747

Keywords: text to speech, text to speech for youtube videos, deep learning, natural voice, künstliche intelligenz, open source, künstliche stimme, machine learning, voice cloning, Python, API, Local, No Cloud, text to speech software, best text to speech software for youtube videos, text to speech software with natural voices free download, text to speech software for pc, text to speech software free, text to speech software for mac, text to speech software free download

Id: MYRgWwis1Jk

Length: 7min 49sec (469 seconds)

Published: Wed Jan 04 2023