How to make an Audio Visualiser in Godot 3.2

Reddit Comments

Ah yeah that is super fun!

u/CarbsCode, Jul 13 2021, 2 upvotes
Captions
So a while back, Godot Engine released their 3.2 alpha 1, and it comes with a bunch of new features. This is the full list. Of course we don't have time to read through everything, but here I've highlighted one that says "support for generating audio procedurally and analyzing audio spectrums". Instantly I thought of audio visualizers.

Now, this isn't an original idea, because someone called Bauxitedev has already made this in Godot 3.1, but that is using C#, and you have to download the code off GitHub and apparently have to install another package using NuGet or something like that. This 3.2 update is what enables us to use features straight out of Godot to make our audio visualizer. Juan Linietsky made a post going over the audio features that they added, and it goes over how to make an audio visualizer that looks like this. But the code here is really basic and not much of it is explained, and the visualizer made with this code is simply not a good visualizer, and you'll see why in a second. A good visualizer requires a couple of procedures to change the values around a little in order to accurately represent what our human ears hear. One thing that I learned a lot from while making this project was some of these tutorials on how to make a visualizer in After Effects. There are a lot of them on YouTube, but specifically I watched one by Parallax Music. I'll leave a link in the description.

And yes, this is the perfect opportunity to shout out this music channel that I made a while back. I've been trying to learn some music production so I could possibly make EDM songs, film soundtracks, maybe even video game soundtracks. But yeah, the first song that I made was really garbage, and the second song that I made was less garbage but still really garbage. I'm about to upload a third one as well, and if you're interested in what a total beginner in music production can do, go check this channel out. Perhaps you could get inspired to learn
some music production as well, and you could make tracks for your video games. These cover arts are pretty funny as well: I just go into paint.net and draw something really abstract with the basic brushes, and then I just put it as a video and upload it. If you enjoyed this video or learned something new from it, be sure to leave a like. This did take a lot of effort and time, and if you want to see more videos like this in the future, perhaps also subscribe to the channel. So, without further ado, let's start making this.

First of all, to get Godot 3.2: it's not available on the download page. There's actually a link in this article down here. Just click on "Classical build" and here you'll see a list of files available to download.

The song we're going to use for our audio visualizer is a song from NCS called How Do You Know by Arlo, and it sounds like this. [Music] And I'll skip to the drop to show you how it sounds. Yep.

So I have a project open here with this song downloaded as a WAV file. Remember that Godot can only use OGG or WAV files; I'm pretty sure MP3s don't work. To start our scene, let's click Other Node here, add an AudioStreamPlayer, and drag our WAV file into the Stream property. By default this is way too loud to play in this tutorial, so I'll just set it to negative 20 decibels, and if I click Playing, you can hear the song play like that.

So pretty much the way this works is that all the audio played in Godot goes through an audio bus. By default all the audio goes through the Master bus, but you can also add other buses and route different elements of your game to them. What we're going to use is an effect on this Master bus. If we click Add Effect here, you can see a bunch of effects, and at the end there's one called SpectrumAnalyzer. This is the new feature in Godot 3.2, which you won't find in 3.1. So when our song plays in Godot, it goes through the Master bus and through the SpectrumAnalyzer effect. All the other effects will change up the audio in some
way, but this SpectrumAnalyzer effect doesn't change the audio. It just takes the information from the audio to allow us to make a visualizer for it.

So under our AudioStreamPlayer, let's add a new Node2D in order to draw stuff on the screen. Let's call this Visualizer and make a script for it. In our script, we want to access that SpectrumAnalyzer effect in the Master bus, so let's make an onready variable called spectrum and set it to AudioServer.get_bus_effect_instance(). The first input is the bus ID, which is which bus we're trying to get the effect from, and since it's the Master bus, let's put 0. The second input is the effect ID, which is which effect we're getting from the bus that we've chosen, and since we only have one effect, let's set that to 0 as well.

Now we're going to make a bunch of variables. First of all there's our definition, which is how many different bars we want, each representing a range of frequencies. For example, from 0 to 20,000 hertz, if we have a definition of 10, then we're going to have 10 ranges of frequencies: 0 to 2,000, 2,000 to 4,000, etc. But if we have 20, then it's 0 to 1,000, 1,000 to 2,000, etc. The higher the definition, the more bars we'll get, but the narrower the range of frequencies that each bar will represent. Let's have a definition of 20. We also want a width and height for our visualizer to define how big it is. Let's do total width equals 400 and total height equals 200.

Now we want the range of frequencies that we're analyzing. The range of frequencies that's audible to the human ear is 20 to 20,000 hertz, although a lot of times when you see, for example, Trap Nation with their visualizers, they don't look at the whole spectrum. They only look at the very low end, in order to pretty much only visualize the drums and the bass. Trap Nation only looks at everything below 200 or 150 hertz or something like that, I'm pretty sure. But for ours, let's just do the full hearing
range, so minimum frequency equals 20 and maximum frequency equals 20,000.

Now we need an array to store the loudness of each frequency band, so let's make a variable and call it histogram, since a histogram is basically a bar graph that spans over a range of values, just like an audio visualizer, and set it to an empty array. Of course we need the size of the histogram to match our definition, because if we have 20 bands we want 20 values in our array. So in the _ready function, let's go: for i in range(definition), histogram.append(0). You could just use histogram.resize() and resize its length to whatever the definition is, but that makes it so even though it has the right length, all the objects inside the histogram are null instances, and you won't be able to access anything in there. Appending a bunch of zeros makes it so the values all start at 0, which is what we want, and we're actually able to access and change them. This will only run once when the game starts, so that'll set it to the right size.

Now we're going to set the values for the histogram in the _process function, which runs every frame, and then we'll have a separate _draw function to draw out all the individual bars. The reason we're doing it in two separate functions, with an array to store the values in the middle, is so that we can smooth out the movement of our visualizer in the future, and you'll see why in a second.

So in the _process function, every frame, we want to use our spectrum effect to access the volume of each frequency band in our song. Since we're going to go from one frequency to another, let's make a variable to store what current frequency we're on, and it will start at the minimum frequency. The interval that we'll increase it by is the difference between the maximum frequency and the minimum frequency, divided by our definition. For i in range(definition), let's make a variable called mag, standing for magnitude, and we'll set it to spectrum.get_magnitude_for_frequency_range(), with the bottom of the range set to frequency and the top set to frequency plus interval.

Now, by default Godot gives us the magnitude in a very weird way. It is a Vector2 that represents how much linear energy the audio has, and I don't know what linear energy means, but we've got to convert that to decibels. So let's set mag to linear2db(mag.length()). And I guess this is a reason why duck typing is convenient, because look: for all the variables we just write var. We don't define whether it's an int, a float, or a Vector2, so we can literally just change the variable from a Vector2 to a float just like that. Anyway, once we get that, let's set histogram[i] to that value. With this bit of code we've gone through all the ranges of frequency between 20 and 20,000 and stored all the values in our array.

Now let's make a _draw function, which is Godot's function for drawing shapes and lines on the screen. We'll make a variable for the draw position and have that start at Vector2(0, 0). Then, of course, since we've got 20 bands, we will increase that by one twentieth of the total width every time: the width interval equals the total width divided by definition. Now let's make another for loop and call draw_line(). We'll start at our draw position, and the end point is our draw position plus a vertical offset, so we'll add a Vector2 onto it with 0 on the x-axis, and on the y-axis whatever the magnitude of that current band is, which is histogram[i]. But we're going to make that negative, since we want the line to go upwards if it's louder, and upwards is negative y. Now we need a color. Let's just go Color dot, and let's pick a color. Let's go Color.chocolate, that's an interesting color. We need the width, so let's set the width to 4.0, and whether it is antialiased: yeah, why not.

If we just leave it like this, it won't actually run the _draw function every frame. It'll just draw it once at the start and then that will be the end. So at the end
of the _process function, let's call update(). Oh, and we also have to set the AudioStreamPlayer to Autoplay so that if we run this, the song automatically starts playing.

And we're not seeing anything. Oh, I know why: our visualizer is positioned at the top left, and since it's drawing everything above it, we won't be able to see anything. So if we run this, we're only getting one bar for some reason. Oh, it's because I forgot to increase the draw position. Oh, and I forgot to increase the frequency as well. Man, how could I forget this? So in each for loop, after every iteration, we'll just go frequency plus-equals interval, and in the draw one, let's go draw position x plus-equals width interval.

In theory now we should get... yep, we're getting a bunch of bars. But the bars are pointing downwards for some reason, and we did set this to negative, so in theory they should be pointing upwards, unless the value is negative. Basically, audio levels in songs are always negative-something decibels. If you're making a song, you pretty much want it to stay below 0 decibels at all times, and if it goes above 0 decibels it's probably going to clip and your song will sound horrible. The other thing is, here where we set the length of the line to draw, we didn't multiply our magnitude value by any length or anything. We just straight-up used it, and as you can see, those lines were pretty long, so the magnitude is literally like negative 100 decibels or something for those lines to be that long.

So how do we take care of these really low decibel values? Here I'm going to draw a graph of different volumes over different frequencies. The frequency will be the horizontal axis, let's go from 20 to 20,000 over here, and the volume, let's go from 0 up top to, let's say, negative 100. Pretty much, if you just find a modern EDM song, the volume will go something like this, where different parts have different loudnesses, but from the loudest bits to the quietest bits it's usually from around negative
20 decibels to negative 50 decibels. So here I'll play our current song, How Do You Know, and I'm going to run this analyzer plugin in this music production software. As you can see, the graph is now from negative 18 to negative 78. Let's set the top to zero. As you can see, the top bits are at negative 24, negative 23, negative 22, negative 26, and the bottom bits are like negative 42. It goes up and down, but a good general value is around negative 50 as the bottom value. If we were to listen to Candyland by Tobu, it's a similar story: it goes from like negative 24 to negative 47 or so. Let's have a look at Spectre by Alan Walker: it is negative 23 or so to negative 48 or so. So as you can see, negative 20 to negative 50 is a good value range. Although if you've got an older song, it usually is quieter, so watch out for that. For example, with Foster the People, it actually tops out at negative forty-something. So yeah, songs have just gotten louder over the past decade, pretty much.

So as I was saying, this goes from around negative 20 to negative 50 or so. If we were to just look at the whole thing from 0 to negative 100, it's going to be really boring, because the thing is only going to change within this narrow range up here, so it's just going to be like a large block with the bottom staying the same and the top bobbing up and down a little bit. If we go from negative 20 to negative 50, that'll be the best case: the volumes will pretty much move over the full range of our graph and we'll get a nice and interesting thing.

So let's make variables for the top and bottom of our range: maximum decibels equals negative 20, minimum decibels equals negative 50. Actually, just to be safe and make sure everything stays within that range, let's make it a bit larger. Let's go from negative 16 to like negative 55. And for our magnitude, we're not just going to use a simple decibel value. We're going to
make it a value that represents where the volume is within our range of negative 16 to negative 55. So 0 will be negative 55, 1 will be negative 16, and 0.5 will be whatever is halfway between these two. So magnitude equals (magnitude minus minimum decibels) divided by (maximum decibels minus minimum decibels). Now if we set the histogram to magnitude, then ideally it's going to be between 0 and 1, so we're going to have to multiply that by a length or else it's just going to be 1 pixel tall. Let's multiply that by total height. And while we're at it, let's draw a line to represent the top of our range, so that'll represent negative 16 decibels, and its coordinates are (0, negative total height) to (total width, negative total height).

Now if we run this, we can see we have a line signifying where the top of our range is, and this is the bottom of our range. As you can see, if the volume is below the bottom of our range, it'll just be a bar pointing downwards. But what's happening? Why is most of the stuff below our bottom? Aren't songs usually meant to be between negative 20 and negative 50? That is because in the AudioStreamPlayer we set the volume to negative 20 decibels. So in our _ready function, let's just do maximum decibels plus-equals get_parent().volume_db. get_parent() is of course the AudioStreamPlayer, and that's just its volume. Do the same for the minimum decibels, and in theory now we have adjusted them correctly according to what we set the volume to.

Now as you can see, we've got a bunch of action at the top here, and only a bit of the frequency range is going down to the bottom. There's some of it going below the bottom and some of it going above the top. But if you look at a video with an audio visualizer in it, they never show what goes below the bottom. Whatever is too quiet to be in that range is just sitting at 0; it's just a straight line. So here let's go magnitude equals clamp() of magnitude, and we're going to clamp it between, not
0 and 1, because if it's 0 then it's just going to be nothing; we want it to have some sort of minimum value so that we actually see the line there and it doesn't just disappear. Let's go 0.05 and 1. Now if we play, as you can see, we get a much cleaner audio visualizer.

The next problem that we have is, if we play that again, you'll see that most of the action is happening on the left side, at the lower pitches. Now why is that? If you look at a bunch of audio visualizers, their two sides are usually pretty evenly distributed, with most of the action happening in the center. That is because we are currently representing the frequencies linearly, but we shouldn't do that, because the difference between, say, 60 hertz and 100 hertz is way bigger to our ears than the difference between 1,060 hertz and 1,100 hertz. If we look at this analyzer plugin and look at the numbers at the bottom, it goes 20, 30, 40, 60, pretty big gaps, right? And then up here all of a sudden you've got 200, 300, 400, and then up here you've got 2,000, 3,000, 4,000. So as we go from left to right on this graph, the frequency differences increase exponentially.

In fact, let's pull up a graphing calculator, desmos.com, and write y equals x squared. What does this mean? The x-axis is the information we currently have, which is basically our graph, and for every point on our graph we want to figure out what frequency that actually represents. Currently we do it linearly, so that's pretty easy. For example, if we have a range between 100 and 200 hertz, then halfway through the graph is obviously going to be 150. But we don't want to do it linearly, so we want something like this. Here the y-axis represents the frequency, and as you can see the lower frequencies do have bigger gaps. This whole area right here is used to represent from 0 to 0.2, but from 0.8 to, for example, 1, it's only this little tiny bit over here,
and the majority of the physical graph is used to represent the bottom half of the frequency range. So we can use that to get a better visualization of our audio.

What we're going to change is here. Instead of getting the magnitude between simply frequency and frequency plus interval, first we need to figure out how far this value is between our minimum frequency and our maximum frequency. For example, if it's halfway in between, then we want to get the value 0.5. Let's make a variable called, let's say, frequency range low, and set it to (frequency minus minimum frequency) divided by (maximum frequency minus minimum frequency). Make sure you put float() around both of these, or else it'll just be rounded to 0 or 1 and it won't work. Now we'll do what we did on the graph and just set that value to itself squared. For example, in our range from 20 to 20,000, if this is halfway between those two, then here we're going to get the value 0.5, and once we square it, it'll become 0.25. But then we have to convert that 0.25 back into an actual frequency value, so let's set it to lerp(), the linear interpolate function: we'll interpolate from minimum frequency to maximum frequency and use itself as the factor.

Now let's get the higher bound for our frequency range. That'll basically be the same thing, but applied to frequency plus the interval and not just frequency. What we'll do is, right after this, add the interval onto the frequency, and since we did it here, we don't need it down here. Then make another value, this time called frequency range high. Now we'll change the range for this get_magnitude function from what we have to frequency range low and frequency range high. What we should see is that the action is not as clustered at the bottom range. It's now a bit more evenly spread out and it looks better, although for now it's still pretty bunched up towards the bottom. So we're not only going to square it; let's make it to the power of 4. I've tried to
the power of 2, to the power of 3, and to the power of 4, and I just find the power of 4 to be the best. If we run it now, as you can see, it's pretty much evenly distributed.

Now let me just increase the definition to 80 and see how that looks. Well, what's happening? As you can see, the bottom bits have really chunky blocks that are all the same volume, while the top bits are much more detailed. The reason there are such big chunks in the low end is because of what we did: the lower frequencies now have much larger gaps, and Godot's spectrum analyzer only supports up to, I think, a resolution of 20 hertz. So within a range of 20 hertz, for example from 0 to 20, if you want to cut it up into even smaller pieces, for example 0 to 2, 2 to 4, then all of those pieces will have the same value no matter what, because Godot's spectrum analyzer only supports that level of detail. So that does result in chunky blocks in the low end when you have a high definition, but usually we won't have such a high definition anyway, so we don't have to worry about that. I'll set the definition back to 20.

Another thing you might notice is that the higher frequencies are still always quieter than the lower frequencies. The reason for this is that the human ear doesn't hear all frequencies equally. Generally, higher-pitched sounds will sound louder to us and lower-pitched sounds will sound quieter, and to make up for that and make everything sound like it's the same loudness, when people make music they tend to boost the bass, hence the downward slope that we're seeing. So what we can do is apply another artificial slope onto it to even it out, and to represent the volume as our ear hears it, not as its actual volume. If we were to look at, for example, Candyland with this visualizer plugin, there's actually a setting called slope, and it's currently set to 4.5. If we were to set this slope value to 0, then as you can see the spectrum is actually sloping downwards, just like
in our song. What this slope means when it's at 4.5 is that for every octave, which is a doubling of the frequency (for example, 400 is an octave above 200 and 800 is an octave above that), it'll boost the decibel value by 4.5. That's why, even though the actual volume is sloping downwards like this, when we visualize it we can see it's pretty even.

We can do something similar in our code. We'll just take our magnitude, which is now a value between 0 and 1, and add on a small value, for example 0.3, multiplied by how far our frequency is between the minimum and maximum frequency. So if we're at the minimum frequency this will do nothing, it'll just add 0, but if we're at the maximum frequency we'll add the whole 0.3 onto it. If it's anywhere in between, then we'll add correspondingly. Be sure to do this before you clamp the value, since once you clamp it everything will be at least 0.05, and adding this on afterwards would push even silent bands up, to as much as 0.35 at the top end. If we do this before the clamping, then if there's any part of the song that's too quiet, it's still too quiet.

So if we run our game now after applying this change, as you can see, there isn't that much of a downward slope any more. It's pretty even throughout. However, I find that adding a slope to our visualizer requires us to change around the maximum and minimum values, since we've basically increased the volume of everything. I find that a good range that works is actually just from 0 to negative 40.

So at this point we've pretty much finished our visualizer. We've got all the important parts, and we can visualize the audio in a way that's accurate to our human ears' hearing. Let's just add some finishing touches, such as making the bars go up and down a bit smoother. The way we do that is down here: instead of just setting the histogram values to whatever we calculate immediately, we can make them accelerate towards that. Let's make a variable called acceleration and set that to, let's say, 5 for now,
and here let's replace this with histogram[i] equals lerp(): we interpolate from itself to the magnitude that we've calculated here, and the factor will be acceleration times delta. If we do that, voilà, our bars move a lot smoother. It's not as messy and flickery as before, and it looks a lot nicer, to be honest. If you want to make it a bit snappier and not that slow, you can simply increase this acceleration value. I prefer a value of 20, but obviously you can change it around to your liking. This is what it looks like. [Music]

The last thing we want to do is make a circular audio visualizer. Let's get our current horizontal visualizer, duplicate it, and call it the round visualizer. Then in the Inspector, under Script, we want to make a new script, so go Make Unique. Now this script isn't saved yet, so let's go to the top right, click "Save the currently edited resource", and save it as the round visualizer's own .gd file. Now we can change this script without affecting our original one.

What we're going to change is in the _draw function. Instead of having this bit of code, I'm just going to paste in the code that I have, where instead of a draw position we now have an angle. The reason angle is set to PI is that Godot takes angles in radians, and I want the visualizer to start at the bottom of our circle, so that's 180 degrees down, and 180 degrees just so happens to be pi radians. The interval will be 360 degrees divided by our definition, and 360 degrees is just two pi radians. The radius is how far from the center our bars will start, and the length is how long the bars will be if they have a value of 1, because remember, the values in the histogram are between 0 and 1. In our for loop, every iteration we'll have a normal value, which just represents the direction that our bar will point in. It's just a Vector2 pointing upwards, rotated by our angle, and remember our angle will increase by our interval every time. Our start position is the
normal multiplied by our radius, and our end position is the normal multiplied by (radius plus the histogram value times the length), and then we draw it out.

So that's our final product. I'll just show the visualizers and let the song play in its entirety. [Music]

What to do from here on is honestly your choice. For example, I made a couple of changes: I changed the top one to a crimson color, I added a Godot logo, and, why not, made the bottom one blue. As you can see, the possibilities are endless. If you learned something from this video or enjoyed it, please leave a like or maybe even subscribe for future tutorials like this. But until my next video, I will see you next time.
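Editor's sketch: the video does this in GDScript, but the arithmetic behind the first step (splitting the range into equal bands and converting a linear magnitude to decibels) can be tried out in plain Python. The function names below are assumptions made for illustration; linear_to_db is only a stand-in for Godot's linear2db(), using the usual 20 * log10 definition.

```python
import math


def linear_to_db(linear):
    """Stand-in for Godot's linear2db(): 20 * log10(linear)."""
    if linear <= 0.0:
        return float("-inf")  # silence has no finite decibel value
    return 20.0 * math.log10(linear)


def linear_band_edges(min_freq, max_freq, definition):
    """Split [min_freq, max_freq] into `definition` equal-width bands."""
    interval = (max_freq - min_freq) / definition
    return [
        (min_freq + i * interval, min_freq + (i + 1) * interval)
        for i in range(definition)
    ]


if __name__ == "__main__":
    # The 0 to 20,000 Hz example from the video with a definition of 10.
    for low, high in linear_band_edges(0, 20000, 10)[:3]:
        print(low, high)
    print(linear_to_db(0.1))
```

Anything quieter than full scale (a linear value below 1.0) comes out negative, which is why the first bars in the video point downwards.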
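Editor's sketch: the normalization step from the video (map decibels onto 0..1 between a bottom and top value, shift both by the player's volume, then clamp to 0.05..1) as a small Python example. The numbers -55, -16, -20, 0.05 and 200 come from the video; the function names and the way they are grouped are assumptions.

```python
def clamp(value, low, high):
    """Same idea as GDScript's clamp()."""
    return max(low, min(high, value))


def normalize_db(db, min_db=-55.0, max_db=-16.0, volume_db=0.0):
    """Where `db` sits between the bottom (0.0) and top (1.0) of the range.

    `volume_db` shifts both ends, mirroring the video's
    "maximum decibels plus-equals the player's volume" adjustment.
    """
    low = min_db + volume_db
    high = max_db + volume_db
    return (db - low) / (high - low)


def bar_height(db, total_height=200.0, volume_db=-20.0):
    """Bar height in pixels, never shorter than 5% of the total height."""
    value = clamp(normalize_db(db, volume_db=volume_db), 0.05, 1.0)
    return value * total_height


if __name__ == "__main__":
    print(normalize_db(-35.5))  # halfway between -55 and -16
    print(bar_height(-120.0))   # far too quiet: held at the minimum height
```

Clamping to 0.05 rather than 0 is what keeps a thin line visible for silent bands.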
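Editor's sketch: the frequency remapping described in the video (take how far a frequency sits between minimum and maximum, raise that fraction to a power, then interpolate back into hertz) written as Python so the effect on the band edges can be inspected. Function names are assumptions; the power of 4 and the 20 to 20,000 Hz range are the values chosen in the video.

```python
def lerp(a, b, t):
    """Linear interpolation, like GDScript's lerp()."""
    return a + (b - a) * t


def warp_frequency(freq, min_freq=20.0, max_freq=20000.0, power=4):
    """Map a linearly spaced frequency onto a curve that favours the low end."""
    t = (freq - min_freq) / (max_freq - min_freq)  # 0..1 position in the range
    return lerp(min_freq, max_freq, t ** power)


def warped_band_edges(min_freq=20.0, max_freq=20000.0, definition=20, power=4):
    """(low, high) edges for every bar after warping."""
    interval = (max_freq - min_freq) / definition
    edges = []
    for i in range(definition):
        low = warp_frequency(min_freq + i * interval, min_freq, max_freq, power)
        high = warp_frequency(min_freq + (i + 1) * interval, min_freq, max_freq, power)
        edges.append((low, high))
    return edges


if __name__ == "__main__":
    for low, high in warped_band_edges():
        print(round(low, 1), round(high, 1))
```

Printing the edges shows the first bars covering well under one hertz each, which lines up with the chunky low-end blocks the video runs into at high definitions.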
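Editor's sketch: why the slope has to be added before the clamp, shown with a few lines of Python. The 0.3 slope and the 0.05 floor are from the video; the parameter t (how far the band is between minimum and maximum frequency, 0..1) and the function names are assumptions.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def slope_then_clamp(value, t, slope=0.3):
    """Order used in the video: add the slope first, clamp afterwards."""
    return clamp(value + slope * t, 0.05, 1.0)


def clamp_then_slope(value, t, slope=0.3):
    """Wrong order, kept only for comparison."""
    return clamp(value, 0.05, 1.0) + slope * t


if __name__ == "__main__":
    silent = -1.0  # a band far below the bottom of the decibel range
    print(slope_then_clamp(silent, 1.0))  # stays at the 0.05 floor
    print(clamp_then_slope(silent, 1.0))  # lifted to about 0.35
```

With the wrong order, a silent band at the top of the spectrum is drawn about a third of the way up even though nothing is playing there.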
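Editor's sketch: the smoothing step (interpolate each histogram value towards its new target by acceleration times delta every frame) simulated in Python. The acceleration values 5 and 20 are from the video; the 60 frames per second and the function names are assumptions for the simulation.

```python
def lerp(a, b, t):
    return a + (b - a) * t


def smooth(current, target, acceleration, delta):
    """One frame of smoothing: move part of the way towards the target."""
    return lerp(current, target, acceleration * delta)


def simulate(target, acceleration, frames, delta=1.0 / 60.0):
    """Start a bar at 0 and smooth it towards `target` for some frames."""
    value = 0.0
    for _ in range(frames):
        value = smooth(value, target, acceleration, delta)
    return value


if __name__ == "__main__":
    for acceleration in (5, 20):
        print(acceleration, [round(simulate(1.0, acceleration, n), 3) for n in (1, 5, 15, 60)])
```

One thing to keep in mind: if acceleration times delta goes above 1 (a very large acceleration or a very long frame), this formula overshoots the target instead of easing into it.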
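Editor's sketch: the geometry of the circular visualizer (start at angle pi, step by two pi divided by the number of bars, point each bar along an up vector rotated by the angle) computed in Python. The pasted GDScript is not shown in the captions, so this is a reconstruction from the description; the radius and length values of 100 and all names here are assumptions. Screen coordinates are used, so positive y is down.

```python
import math


def rotated(x, y, angle):
    """Rotate the vector (x, y) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)


def circular_bars(histogram, radius=100.0, length=100.0):
    """Return a (start, end) point pair for every bar around the circle."""
    angle = math.pi  # 180 degrees: the first bar sits at the bottom
    angle_interval = 2.0 * math.pi / len(histogram)
    bars = []
    for value in histogram:
        # Unit "up" vector (0, -1) turned to face this bar's direction.
        nx, ny = rotated(0.0, -1.0, angle)
        start = (nx * radius, ny * radius)
        reach = radius + value * length
        end = (nx * reach, ny * reach)
        bars.append((start, end))
        angle += angle_interval
    return bars


if __name__ == "__main__":
    for start, end in circular_bars([1.0, 0.5, 0.0, 0.25]):
        print([round(c, 1) for c in start], [round(c, 1) for c in end])
```

Each pair would then be handed to a line-drawing call, the same way the horizontal version draws from the draw position to the draw position plus an offset.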
Info
Channel: Gonkee [OLD]
Views: 7,023
Keywords: gonk, gonkee, gonkmakesvideos, godot, shaders, audio, spectrum, visualiser, visualizer, trap nation, ncs, effect, after effects
Id: AwgSICbGxJM
Length: 32min 47sec (1967 seconds)
Published: Fri Oct 18 2019