Let's Build an Audio Spectrum Analyzer in Python! (pt. 1) the waveform viewer.

Video Statistics and Information

Captions
What's up everyone, welcome to part 1 of a series where we build an audio spectrum analyzer slash audio visualization tool. What you see here is a live stream of audio through my microphone into Python through PyAudio, and then displayed in real time using matplotlib. In this first video I'm gonna walk you through all the steps to get up to this point, getting the audio in and displaying it, and in the next video we'll look at the FFT algorithm to compute the spectrum and display it in real time. And then maybe we'll try and do some more fancy visualizations, maybe do a little bit of signal processing to make some weird sounds, and maybe some cool visualization techniques. I'm not really sure yet because we haven't got there, but I think it'll be a good project, so let's get started.

Before we begin this project we need to see if it's gonna be possible to get our audio in through our microphone and into Python, process it, and display it in matplotlib fast enough that it's not a super clunky, slow thing. We're looking for something like at least 30 frames per second, and ideally more like 60 frames per second. And actually it's doable: you can work with matplotlib to optimize how you draw everything, so you can shorten that time and get really fast updates. I'll be showing you how to do that. For the audio part we can use a module called PyAudio. PyAudio doesn't come with Anaconda by default, but you can just run pip install pyaudio and you're good to go.

Once you've done that, let's jump over to a new notebook. Like always, we're gonna start with our imports. First we're gonna import pyaudio. We're gonna import a module called struct, and struct is gonna let us unpack that audio data into integers instead of these binary numbers. Then we're gonna need numpy as np, and then also import matplotlib.pyplot as plt. Then for the backend: one thing I've noticed is that if we use the inline backend the update is a little bit choppy, so instead what I used is a
different backend, which is just gonna display a separate window, sort of like what you see when you run Python in your text editor, like Sublime or Atom. The backend I'm going to use is matplotlib tk, so this is the one that will do just what I said, display a separate window for us.

All right, now I want to define some variables. The first one will be called CHUNK, and this is how much audio is gonna be processed at a time, how many audio samples per frame we're gonna display. What I'm gonna do is 1024, and I'm gonna do four of those, so we're gonna have 4096 samples per chunk. All right, and the next thing we're gonna do is FORMAT, which is gonna be pyaudio.paInt16. This, I think, has to do with the bytes per sample, something like that. Then for CHANNELS, since we're using the microphone there's just gonna be one channel, or mono sound. And finally the RATE, and this is gonna be the samples per second, and the most common is 44.1 kilohertz. All right, so we'll go ahead and run that.

All right, now let's get down and show you what exactly PyAudio does. I'll get this toolbar out of the way. The first thing I'm going to do is create a class instance called p, and it's gonna be equal to pyaudio.PyAudio, capital P-y, capital A, audio. This thing is our main PyAudio object. Next thing I'm going to do is create an object called stream, and this is going to be equal to p.open, and we're gonna pass it some parameters. So format is gonna be equal to FORMAT, channels is gonna be CHANNELS, rate will be equal to RATE, input equal to True, output also equal to True, and then finally we've got this thing called frames_per_buffer, equals CHUNK. All right, so that's our stream object.

Now what I want to do is show you exactly what the audio looks like when we get it returned. What I'm gonna do is create an object called data. data is just gonna be stream.read, and we're just gonna read one chunk, and then let's go ahead and display what data is. So now when I run this
you can see I get a bunch of bytes. We're not getting integers, we're getting bytes, and this means that we can't just plot this in matplotlib; we're gonna have to convert it. Let me show you how to do that using struct.

The way we convert this is I'm gonna create a new thing called data_int, because it's gonna give us integer data instead of byte data. The thing I'm going to use is called struct.unpack. The way unpack works is first we give it a string which signals how much data, or what the size of this data is, so it knows how to unpack it, and then we're gonna pass it the data itself. So we're gonna need to pass it a string, and it's gonna be two times CHUNK, and then we're gonna also have to add B, capital B, at the end, and then we're gonna pass it the data. And the reason why it needs to be two times CHUNK is because, let me just show you what the length of data is. You can see the length of it is double what our chunk is. Our chunk was 4096 and the length of the data is double that; that's why we need to do two times the chunk. So let me just get rid of that, and now let me show you what this data_int looks like. Now you can see we're getting a tuple, and it's all integer values from 0 to 255.

So yeah, now let's go ahead and plot this, just to show you what one chunk of audio data looks like. All right, to plot this, let's just create a graph object, so it'll be fig, ax equal to plt.subplots, and then what I want to do is ax.plot, and we can just pass it data_int, and let's just make it a line, and then we'll do plt.show. All right, so you can see that the TK backend will actually create a separate window for us, and this is what the one chunk of audio data looks like. One thing I noticed is it seems to be split or clipped: instead of it being smooth, there's this discontinuity between 0 and 255. So what I want to do is sort of take this part at the bottom, move it into the middle, and take the top part and move it down below, so
that way it looks correct, it looks like a waveform. There's something about how this thing is being read, maybe the PyAudio int16 that I passed it, I'm not quite sure. I've tried playing around with them and they don't seem to work, but I'm gonna show you the technique I use to sort of chop and restructure this audio.

The way I'm gonna do that manipulation is by first converting our struct data into a numpy array, so I'm just gonna do np.array and wrap our struct data in it. Now what I want to do is specify the dtype, and I'm gonna set it equal to B, and what B is is an integer from 0 to 255. The reason why I'm gonna do that is because when I add half the range, or 127, to it, those values that are gonna be greater than 255 are just gonna wrap back around, and they'll go to 0 or something less than that. So what I can do is display this, and now you can see that our audio data looks more correct. Let me just sort of zoom in on it. You can see that we have a center point, and we have the data moved from the top down to the bottom.

But the next thing I noticed is that every other sample is going to 0. It's going up to the value it should be, but then it's coming back to 0 for some reason, and this is not what we want. So now what I'm going to do is show you how to basically slice this array to get rid of those points, and it's pretty easy with numpy: we're just gonna do colon colon 2, which means take every other point in the array and throw the rest of it out. Now when I plot that, you can see we get a waveform that looks more normal. You can see that the waves are not coming back to 0 every time, or the sample is not going back to 0 every time. So this is what I think the audio should look like. Maybe I could have done it better by using a different PyAudio integer format, but this is how I did it. If someone can explain a better way to do it, then feel free to do so in the comments below.

All right, so the next thing to do is
display this audio data continuously in the matplotlib plot. The way we do this in a somewhat fast, or more optimized, way is: instead of every time creating the graph object, displaying it, and then clearing it before we start, what we're gonna do is create the line and then we're just going to update the line in a loop. This is similar to what I did with the animated matplotlib plot in my Jupyter notebooks tutorial series, but what I'll do is refer you to this article where this guy looked at ways of optimizing matplotlib to get the most refreshes per second. The one that we're gonna use is this one here. He claims to get 40 frames per second; when I tested it I was getting about 70 or 80 frames per second. He does have another version which he claims gets 500 frames per second, and I tried this a lot, but I just could not get it to work. Not sure what I'm doing wrong, but I went with the next best one here, which, if it's producing 60 to 70 frames per second, I think is good enough.

So now let's jump back over. What I'm going to do is just copy this info here, and what I'm gonna do is create a while loop, so while True. I'm going to paste this here, and then I'm going to move this stuff into the while loop, and then also I'm going to take this and move it up to the... actually no, we don't need that anymore. And then we also don't need this plt.show. And then what I'll do is replace this line data with this.

All right, so now what I want to do is create an x variable that will be used for plotting. x is gonna be equal to np.arange, and we're gonna go from zero to two times CHUNK, and then what we pass it is the step size, so our step size is gonna be two. And you can see here we're using arange instead of linspace. All right, so the next thing I want to do is create our line object. So line is gonna be line with a comma at the end, and it's gonna be equal to ax.plot, and we just need to plot some data that has the right length, and it doesn't matter
because it's gonna get updated in the loop. What I'm just going to use is some random data, and it's just gonna be chunk length long. The reason why it's gonna be chunk length long is because we're doing the slicing here, so even though our struct starts with two chunks in length, after the slicing it's gonna be half the size, so one chunk length, and that's why this is just a single chunk. So now that we've got this, I think we are ready to go.

Cool, so you can see here's our audio, but, hmm, for some reason it's going from zero to one. Oh, I think I know: that's because the axis limit was not set correctly, so when it did this first plot it was not over the right range. So what I'll do is just do ax.set_xlim, and let's make this thing go from 0 to 255, and why don't we do the... actually it should be ylim, and let's do the xlim, this is gonna be from zero to CHUNK. All right, so now let me go ahead and resize this thing.

So now you can see we've basically got a nice waveform viewer in matplotlib. You can see it's updating pretty fast. I haven't measured exactly what this is producing, but it's not choppy, it's not laggy. There's a little bit of latency that has to do with my audio interface, so it's a few milliseconds, and it is noticeable, but maybe I'll try tweaking the settings to bring that down. But why don't I show you: let's just do a tone generator, and let's see, let's just do 1000 hertz. So you can see I'm just playing through my microphone, or through my headphones. We can lower this, get a slightly lower tone. So yeah, that's it in action.

So far I think this is a good starting point. We've got the audio in, we're processing it to display it correctly, and we're getting matplotlib somewhat optimized, optimized enough for our uses. So that's gonna do it for this video. In the next one we're gonna actually use the FFT algorithm to build our spectrum analyzer, and maybe in future videos we'll try and do some cool visualization, maybe, I don't
know, try and make something like what you see in your iTunes visualizer. Probably not as cool, but we'll try our best. So yeah, if you liked the video, hit the like button, and if you want to be back for more videos, hit the subscribe button. Thanks guys, see ya. [Music]
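
The video asks whether there is a better way to turn the raw bytes into a waveform. One possible answer, sketched here and not taken from the video, is to unpack the chunk as signed 16-bit integers. With the paInt16 format each sample takes two bytes, which is why the data is twice as long as the chunk, and unpacking byte by byte interleaves the low and high halves of every sample. The sketch below uses a synthetic sine wave in place of stream.read so that it runs without a microphone. Little-endian byte order is assumed, and the names samples, data_bytes and data_np are illustrative only.

```python
import struct

import numpy as np

CHUNK = 4096   # samples per chunk, as in the video
RATE = 44100   # samples per second, as in the video

# Synthetic stand-in for stream.read(CHUNK): a 1000 Hz sine wave packed as
# little-endian signed 16-bit samples (assumption: 2 bytes per sample).
t = np.arange(CHUNK) / RATE
samples = (10000 * np.sin(2 * np.pi * 1000 * t)).astype('<i2')
data = samples.tobytes()

# Approach from the video: unpack every byte as an unsigned value (0..255).
# This gives 2 * CHUNK numbers, two per audio sample.
data_bytes = struct.unpack(str(2 * CHUNK) + 'B', data)

# Alternative: unpack CHUNK signed 16-bit integers ('h') directly, so each
# number is one whole audio sample and no re-centering or slicing is needed.
data_int = struct.unpack('<' + str(CHUNK) + 'h', data)

# The same conversion with NumPy, skipping the intermediate tuple.
data_np = np.frombuffer(data, dtype='<i2')
```

With this conversion the plotted values range over the signed 16-bit interval instead of 0 to 255, so the y limits of the plot would need to be set accordingly.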
Info
Channel: Mark Jay
Views: 171,429
Keywords: spectrum analyzer, stream audio in python, waveform viewer, pyaudio, rasberry pi, signal processing, high speed plotting, matplotlib, real time streaming, jupyter, anaconda, numpy, python3, fft, sound, electrical engineering, algorithms, machine learning, deep learning, electronics, arduino, project, scipy, filters, digital signal processing, python, streaming, real time, filtering, kalman
Id: AShHJdSIxkY
Length: 16min 12sec (972 seconds)
Published: Sat Sep 09 2017