So-Vits-SVC: Local Training Tutorial (How to make your own model)

Video Statistics and Information

Captions
Hello world! In today's video we'll be going over how to train your own models on a local installation of So-Vits-SVC. I just want to start by saying: if you're not someone who's very familiar with technology and doesn't know how to troubleshoot, then this is not really the tutorial for you. You're better off waiting for an actual easy-to-install version of So-Vits-SVC, or heading over to the AI World Discord and following the instructions there, so you don't have to run So-Vits-SVC manually or train your own model.

That being said, we do love troubleshooting in this community, both in the comments below and in the Discord, but there's a proper way to troubleshoot, and I wanted to go over that real quick. A message like this, which basically just says "I typed svcg and it didn't work," doesn't help us at all: we don't know what GPU or operating system you're using, and even more importantly, we don't know what error you got or what's causing it. Those little bits of information go a long way when it comes to troubleshooting. Something I didn't mention in the previous videos: you can always run your errors through ChatGPT, and it will give you at least a rough idea of what might be happening and some ways to troubleshoot the issue.

If you've already followed my previous tutorial and have So-Vits-SVC running locally and inferencing successfully, this tutorial should be very easy for you. Everybody else, go check out those tutorials before you get here, because you'll need a properly running install of So-Vits-SVC. Just like in our last video, this is supposedly just a few lines of code that should be super simple to set up and run, but obviously we'll run into issues and need to troubleshoot. For me it was pretty straightforward; the only issue I ran into was the location of my dataset, and once I had that figured out I didn't have any problems after that. So in this video I'll be talking about how to properly prepare and set up your dataset, where the dataset needs to be located, how to set up and run the training, and what to do with the .pth file afterwards.

First up is how to prepare your dataset. The dataset is the audio you'll be using for training, and that audio needs to be free of background noise and preferably free of any effects or artifacts like reverb, stereo shaping, and delay. The audio files themselves need to be under 10 seconds, and I'll show you a quick way to both remove vocals from songs and slice them into 10-second clips. I won't be going in depth on how to use Ultimate Vocal Remover, because the datasets I'm using are my own vocals and I have access to the clean acapellas, so I don't need to strip them out of any beat; all you need to do is head to its release page and download and install Ultimate Vocal Remover.

Once you have clean vocals, head to the audio-slicer release page and download the zip file. Once it's downloaded, just run the slicer GUI executable and it will open up the audio slicer. Now all you need to do is point it to the audio files you'd like to slice and make sure you set an output folder; otherwise it will export into the input folder by default and give you two copies of the same file, which can be extremely annoying to navigate. To make sure the clips come out under 10 seconds, I found the easiest way was to lower the maximum silence length to something like 25 milliseconds; at that setting it should give you audio files that are 10 seconds or less. A good way to check that your dataset was successfully cut down to 10-second clips is to sort by length, and as you can see, these are all under 10 seconds.
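If you'd rather verify from a script than by sorting in your file explorer, here's a minimal Python sketch that flags any clip over 10 seconds; the folder name is just a hypothetical stand-in for wherever you pointed the slicer's output.

    import wave
    from pathlib import Path

    # Minimal sketch: flag any sliced clip longer than 10 seconds before training.
    # Assumes the slicer wrote standard PCM .wav files; "sliced_clips" is a
    # hypothetical folder name - point it at wherever your slices actually are.
    CLIP_DIR = Path("sliced_clips")
    MAX_SECONDS = 10.0

    for wav_path in sorted(CLIP_DIR.glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            duration = wav_file.getnframes() / wav_file.getframerate()
        if duration > MAX_SECONDS:
            print(f"Too long ({duration:.2f}s): {wav_path.name}")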
For a good dataset you want at least 100 clean samples, all under 10 seconds. More samples work too, but from what I've seen it's more about quality than quantity: 500 really good samples will work great, but 500 mid-tier samples won't work as well as maybe 50 super-high-quality ones. How long you train for, meaning the number of steps, matters as well.

Once you have your dataset ready, you need to place it into a very specific folder structure, otherwise it won't work. That structure is as follows: create a folder called dataset, and inside it create another folder called dataset_raw. Inside dataset_raw, create a new folder named after whatever the speaker's name will be; I just put petro, because it's me. Then export all of the audio files into that speaker folder. With my environment activated, I'm just going to copy this address, type cd, and paste the address. Now, if you look here, you can see that we're located inside this folder, and whatever this location is, that's where you want your dataset folder to be.

Now that my dataset is prepared and in the correct location, this is where it should be easy, but as we know, we can run into a lot of issues. The first thing to run is svc pre-resample. If you get an "invalid value" error here for the -i input directory option, there are a few ways to fix it. You can pass the input-directory option and point it at the correct path, but for me that still caused some issues later on when it came to exporting, so it was easier to just make sure the location we're sitting in is where it expects to find the dataset. All I had to do was move dataset_raw out of the initial dataset folder, and that fixed it for me, though it might be different for everybody else. With the dataset_raw folder moved, if I run svc pre-resample again, we can see it is now pre-processing 110 samples. If the sample count isn't the full amount for you, then some of your samples were either too long or not properly named, so they couldn't be picked up as wav files. To confirm the pre-processing worked, that dataset folder should now contain a new folder called 44k, then petro (or whatever your speaker name was), with all your audio files inside; that's how you'll know you successfully resampled the audio.

Next we run the svc pre-config command. If it looks like it successfully wrote the files but you can't find them, check the location I mentioned earlier; that's where they should be. If I go to configs/44k, it looks like it successfully created this config file. Next, open up the config file; if it won't open, just right-click and choose "Open with Notepad." This is where the settings for your training live. I couldn't find anywhere online, even on the GitHub, that properly listed all these settings and what they do, but I managed to figure out a good chunk of them myself. Inside this config file there are only a few things we need to look at: log_interval, eval_interval, epochs, and batch_size.
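Since the config is plain JSON, a quick way to peek at just those four keys without scrolling through Notepad is a short Python sketch like this; it assumes svc pre-config wrote the file to configs/44k/config.json and that these settings live under the "train" section, which is how the configs I've seen are laid out.

    import json
    from pathlib import Path

    # Minimal sketch: print the training settings this tutorial cares about.
    # Assumes the generated config sits at configs/44k/config.json and keeps
    # these keys under its "train" section.
    config_path = Path("configs/44k/config.json")
    config = json.loads(config_path.read_text())

    train_settings = config.get("train", {})
    for key in ("log_interval", "eval_interval", "epochs", "batch_size"):
        print(key, "=", train_settings.get(key))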
log_interval and eval_interval just control how often the loss is printed to the screen and how often your progress is saved, and they're measured in steps rather than epochs. A step is one pass through a single batch, so with a batch size of seven, one step means seven files were processed. That means the number of steps in an epoch depends entirely on your batch size and how many samples you have: for instance, with a batch size of seven and a sample set of 70 files, it would take 10 steps to process those 70 samples, and those 10 steps would equal one epoch. It's different for every sample set, because the number of steps that make up an epoch changes, but if you set epochs to one, it would go through the entire sample set once and then finish after that single epoch. If you think that's confusing and hard to explain, imagine how difficult it was to figure out with limited documentation. To reiterate: one epoch is a full cycle through all of your samples, and one step is one batch, so it varies with whatever you set the batch size to. There's no fixed number of steps per epoch; it's completely determined by your sample count, and the bigger your sample set, the more steps it takes to complete an epoch.

A good test run is around 25,000 steps or so, and the way I translate that into epochs is by dividing my number of samples, 110, by my batch size of seven, which gives roughly 15 steps per epoch. Then I divide 25,000 steps by 15 steps per epoch, so about 1,667 epochs works out to roughly 25,000 steps. I'm not sure I explained that very well, but basically you raise or lower the epochs number to determine how many steps you train, and you raise or lower eval_interval to determine how often it saves along the way. If you're too scared to mess with the numbers, just leave your batch size at seven, set eval_interval to a high step count like 25,000 so it doesn't constantly save and only saves at the end of the training, leave log_interval at its default, and then hit File, Save to save the config.
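Here's the same step/epoch arithmetic as a small Python sketch, using the example numbers from this video (110 samples, batch size 7, a target of about 25,000 steps); treat it as a ballpark estimate, since the exact steps-per-epoch figure depends on how the last partial batch is handled.

    import math

    # Minimal sketch of the step/epoch math above, using the video's example
    # numbers. The video rounds steps-per-epoch down, so this does too.
    num_samples = 110
    batch_size = 7
    target_steps = 25_000

    steps_per_epoch = num_samples // batch_size                # 15; one step = one batch
    epochs_needed = math.ceil(target_steps / steps_per_epoch)  # about 1,667

    print(f"{steps_per_epoch} steps per epoch")
    print(f"~{epochs_needed} epochs is roughly {target_steps} steps")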
With the config file in place, all you need to do now is run svc pre-hubert. You'll know it was successful if it starts downloading the HuBERT model. There are two common errors I see during this step. Often the Python utility isn't installed for some reason, and the easiest fix is to install pyutil; I'll have the command below, but it's just conda install pyutil. With that installed, it should run smoothly. If you're seeing a fatal memory-allocation error, it more than likely has to do with your dataset and some file in there causing the problem; clean out your dataset and try again and the error will most likely go away. If not, you probably don't have enough RAM or VRAM, though as long as you have a 30-series graphics card like a 3060 Ti you should be totally fine, and you can lower your batch size to ease the VRAM usage.

With all that done, just run svc train -t and the training should begin. It will download both a D and a G .pth file to train on top of, and from there it should just start training. My first time training I ran into an error, and here's a good example of how to troubleshoot using ChatGPT: I copy and paste the error into ChatGPT and tell it that I got it while training in So-Vits-SVC. With just that little bit of info, ChatGPT starts kicking out solutions, and it came to the same conclusion as me, which was insufficient memory, so reduce the batch size. It also gave three other suggestions in case that didn't work, to help figure out the real cause of the issue. If for any reason your training stops mid-session, all you need to do is run svc train -t again; as long as your files are in the same locations, it should pick up training right from the last iteration.

Just like previously, once you have your models created, all you need to do is point the interface to where the models are located: click browse and point it to the .pth file, and for the config, point it to the config file (there's also a short checkpoint-finding sketch after this transcript). If you just recently trained a model, I believe it will open with those models loaded by default. If you're still running into an extreme number of errors, or this is too difficult for you, then like I said, head to the AI World Discord, where not only will people be troubleshooting, but there are far more straightforward and simple methods than running So-Vits-SVC locally on your own machine. I truly hope this tutorial helped you, and if not, I really hope the people in the comments and in the Discord can help you as well. Thanks again for listening, much love and peace.
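As a follow-up to the model-loading step above, here's a minimal Python sketch for finding the newest checkpoint to browse to. The logs/44k location is an assumption based on the other 44k folder names this fork uses (configs/44k, dataset/44k); double-check where your own training run actually saved its files.

    from pathlib import Path

    # Minimal sketch: find the newest generator checkpoint to load in the GUI.
    # Assumes checkpoints land in logs/44k/ as G_*.pth and the config stays at
    # configs/44k/config.json; adjust both paths to match your own run.
    log_dir = Path("logs/44k")
    checkpoints = sorted(log_dir.glob("G_*.pth"), key=lambda p: p.stat().st_mtime)

    if checkpoints:
        print("Latest generator checkpoint:", checkpoints[-1])
        print("Config file:", Path("configs/44k/config.json"))
    else:
        print("No G_*.pth checkpoints found in", log_dir)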
Info
Channel: p3tro
Views: 141,967
Keywords: so-vits-svc, ai audio, ai juice wrld, ai vocal tutorial, so vits svc how to, ai audio tutorial, how to do ai vocals, fixing so vits svc, sovitssvc tutorial, so-vits-svc tutorial, how to fix so-vits-svc, sovitssvc, so vits svc, how to so vits svc, how to so-vits-svc, so-vits, svc, ai vocal local, train ai vocals, train so vits svc, so vits how to train, ai vocals how to train, train model svc, so-vits-svc training tutorial, how to train so vits svc, how to train model ai vocals
Id: MDCXJY2zAmE
Length: 10min 39sec (639 seconds)
Published: Tue Apr 11 2023