The Spectrogram and the Gabor Transform

Captions
Welcome back. I'm really excited in this lecture to tell you about something called the Gabor transform, which is used to compute the spectrogram. This is part of our section on the Fourier transform. Essentially, we're used to thinking about the Fourier transform as taking some signal, either in space or in time, and writing it in terms of its frequency components. For example, I can take an audio signal, Fourier transform it, and pull out the individual frequency components that add up to make that audio signal.

I'm just going to draw some pictures here. Imagine I have some time series data; maybe this is an audio recording, where I have a microphone in some symphony hall and I'm listening to someone play a piano. What I can do is Fourier transform this signal, and in the frequency domain, so now this is a function of frequency with units of hertz instead of units of seconds, I can see spikes in the power spectrum corresponding to individual tones on that piano. Maybe each individual key will show up as a spike at its frequency, with power determined by how much that key was played in that song.

But this gets to the heart of the spectrogram and the Gabor transform: if I look at my signal as a function of time, I have precise information about where in time I am, but I don't know what the frequency is at that instant in time. It could be some combination of frequencies. Similarly, when I Fourier transform, I have exquisite information about exactly what frequencies were played in that song, but I don't know when. Maybe this is the C key, or a low C or something; I know exactly what frequencies were played, but I don't know when they were played in the song. That's a big problem. So in the time domain I know where my signal is in time, but I don't know anything about the
instantaneous frequencies. Here, in the frequency domain, I know everything about what frequencies were played, but I don't know where they occurred in time. This is what the Gabor transform is going to allow us to figure out.

The Gabor transform is going to allow us to compute something called the spectrogram, which essentially is a time-frequency plot of what frequencies are active at specific instants in time. So it's kind of a map, evolving in time, of what frequencies are being played. I always think about this in terms of piano or guitar music: the music has frequencies that are evolving in time, and the Gabor transform is going to allow us to pull out both the frequency content and the temporal information of that signal.

And this is really, really important: the Fourier transform really makes sense for signals that are perfectly periodic, something that is completely periodic and repeating. Audio signals are not like that. They have frequency content, but one second of audio is rarely going to be periodically repeated in the second, the third, the fourth, the fifth seconds. Someone's voice speaking, or someone playing a symphony or some song on a piano, has frequency components at each instant in time, but they're not periodic in time.

So this is what the Gabor transform is going to give us. What we do in the Gabor transform is, instead of computing the Fourier transform over the entire temporal domain, we take a little Gaussian window, some fixed-width window, and we convolve our Fourier transform with this little Gaussian window sliding across our signal. Basically, we are computing a weighted Fourier transform only in this window, and we're sliding that window across the signal in time. The result of this Gabor transform is a time-frequency plot. So we'll have some resolution in time and we'll have some
resolution in frequency. For example, if I take this first Gabor window, I basically compute the Fourier transform of this little section of my data, and I get, maybe, that these keys are being hit on the piano. Then as I slide this forward in time, maybe there's different frequency content in the next instant, these other keys are being played, and so on and so forth. As this evolves in time, you see this pattern of frequencies that are evolving in time.

So this is my spectrogram, and each vertical strip is basically a power spectrum for a little short window of time. I have this moving window sliding across, giving me a sliding power spectrum of what frequencies are being played as time evolves. This is really useful: it tells me not just what frequencies are active in my system, but when they occurred. So it's kind of the hybrid between these two pictures; we have some temporal resolution and some frequency resolution. Now, you can't have as much frequency resolution as in the full Fourier transform, or as much time resolution as in the raw signal, so there's always a trade-off, based on how wide this window is, in how much resolution you have in time or in frequency.

But this can be pretty useful. For example, if I make some noise where I start with a low frequency and then I increase the frequency, you can see that temporal evolution of the frequency going from low to high in time. That's really, really cool, and you can only get that using this kind of Gabor transform, which gives rise to the spectrogram.

Now, mathematically this is actually really simple. It's basically built on the Fourier transform, so you can use the fast Fourier transform to compute this. If you had some signal f, let's call this f of t, then your Gabor transform of f is going to be this function f hat
sub g, written $\hat{f}_g(t, \omega)$. The subscript just tells us it's a Gabor transform, and it's going to be a function of time and frequency. So unlike the Fourier transform, which is a function of frequency only, this is kind of a dual function of both. It's given by basically the Fourier transform: the integral from negative infinity to infinity of f of tau, e to the minus i omega tau, but now I'm going to multiply it by this little Gaussian Gabor window, this little Gaussian function g, evaluated at t minus tau, d tau:

$$\hat{f}_g(t, \omega) = \int_{-\infty}^{\infty} f(\tau)\, e^{-i \omega \tau}\, g(t - \tau)\, d\tau$$

So this Gabor transform of f is basically just the Fourier transform of f, but weighted by this little Gaussian window that's sliding across. I get something that's a function of frequency, because I'm Fourier transforming, but in this sliding window that's pinned to this particular time t. Now I have something that's a function of both omega and t in the spectrogram. So this is the Gabor transform: it's just a Fourier transform weighted by this little Gaussian that is sliding across in time. It gives you some resolution of what frequencies are active, but it also gives you some resolution of when those frequencies are active in time. It's this time-frequency diagram. Very, very useful.

I think this is an extremely powerful method for analyzing data. Oftentimes when you get a time series of data, maybe I put my phone on the hood of my car and I'm listening to it do its thing, you can use the spectrogram to pull out features in that data. I have colleagues who use the spectrogram to classify sounds in the ocean. They're doing measurements in the ocean on a boat, and they want to tell if someone dropped a chain on the boat, or if a whale just swam by, or if something else is happening. It usually shows itself as some kind of evolving frequency signature in time in the spectrogram.
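The Gabor transform described above, a Fourier transform weighted by a Gaussian window that slides across the signal, can be sketched numerically as follows. This is a minimal illustration and not the lecturer's code: the function name `gabor_transform`, its parameters, and the two-tone test signal are all assumptions made for the example, and the integral is approximated by an FFT of the windowed samples.

```python
import numpy as np

def gabor_transform(f, dt, window_width, n_windows):
    """Discrete sketch of f_hat_g(t, omega): FFT of f(tau) * g(t - tau).

    f            : 1-D array of signal samples
    dt           : sample spacing in seconds
    window_width : standard deviation of the Gaussian window, in seconds
    n_windows    : number of window centers t to evaluate
    (All names here are hypothetical, chosen for this illustration.)
    """
    n = len(f)
    tau = np.arange(n) * dt
    centers = np.linspace(0.0, tau[-1], n_windows)  # times t the window is pinned to
    freqs = np.fft.rfftfreq(n, d=dt)                # frequencies in hertz
    spec = np.empty((len(freqs), n_windows))
    for k, t in enumerate(centers):
        # Gaussian window g(t - tau), centered at the current time t
        g = np.exp(-((t - tau) ** 2) / (2.0 * window_width ** 2))
        # Power spectrum of the windowed signal: one vertical strip of the spectrogram
        spec[:, k] = np.abs(np.fft.rfft(f * g)) ** 2
    return centers, freqs, spec

# Assumed test signal: a 50 Hz tone for the first second, then a 200 Hz tone.
dt = 1e-3
time = np.arange(2000) * dt
signal = np.where(time < 1.0,
                  np.sin(2 * np.pi * 50 * time),
                  np.sin(2 * np.pi * 200 * time))

centers, freqs, spec = gabor_transform(signal, dt, window_width=0.05, n_windows=41)
dominant = freqs[np.argmax(spec, axis=0)]  # strongest frequency in each time strip
```

A plain Fourier transform of this signal would show both tones but not their order; here `dominant` follows the switch from the low tone to the high tone as the window center moves past one second. Widening `window_width` sharpens the frequency estimate and blurs the switch time, which is the trade-off described in the lecture.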
Oftentimes when you want to classify audio signals, you want to take your signal and not just transform to the Fourier domain; you want to transform to this spectrogram domain, using the Gabor transform. This is how Shazam can classify music. For example, if you are listening to a song and you want to know what it is, you hold your phone up with Shazam, and what it does is basically find peaks in the power spectrum and try to match this sparse template of peaks to a library of known songs.

And this is kind of interesting, I didn't know this until recently: when you listen to songs on the radio, they can be stretched out or shrunk, played maybe 10% faster or slower, and the human ear has a hard time recognizing those small speed-ups and slow-downs. Radio stations will stretch out songs or shrink them to fit a slot between the commercial breaks. That makes it a little bit harder for the Shazam algorithm to match these peaks, and they have to do something a little bit more clever: figuring out, if I slowed my song down or sped it up, how this spectrogram would scale in time and frequency. So that's an interesting problem that the Shazam algorithm has to handle.

You can also, for example, if I have some song being played, classify whether it was played on a guitar or a piano, on an electric guitar or an acoustic guitar, because these individual notes might look different. They might have a little bit of a trill, or some signature of what instrument they were played on, or even who played those notes. So I would pose to you that if you wanted to classify different musicians, say you want to see if it was Jimi Hendrix playing this song, you might be able to do that by pulling features out of this spectrogram. That's pretty interesting.

I'm going to show you how to code this up soon. We're
going to look at some examples of how to do a chirp and how to do the spectrogram of an audio signal. But first I'm going to pull up one of my favorite videos. This is actually a YouTube video from Jan Kay, so I hope it's okay to show this. I think this is a great video that they made; it's one of my favorites. This is a spectrogram of Beethoven's sonata, and you can just see it playing here. I'm going to hit play and turn my volume up. You can see the slider moving across the screen, and you can see that every note is one of these dark black signatures in the spectrogram plot. It's really neat: you can actually see when the keys are being played and how they're being played, and you can tell when it's going to get more active or less active. I think it's also interesting, and I always joke with my class, that Beethoven didn't have fifty fingers; these are harmonics that are being generated in the piano from these chords. You can see all of that rich structure in the spectrogram.

Okay, good. In the next couple of lectures I'm going to code this up. We're going to actually work with the spectrogram and look at this time-frequency diagram for a few different audio signals. Thank you.
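As a rough preview of the chirp example mentioned above, here is a minimal sketch using SciPy's `scipy.signal.chirp` and `scipy.signal.spectrogram` with a Gaussian window. It is not the code from the upcoming lectures; the sample rate, the 50 to 250 Hz sweep, and the window settings are all assumed values chosen only for illustration.

```python
import numpy as np
from scipy.signal import chirp, spectrogram

fs = 1000                                   # assumed sample rate in hertz
t = np.arange(2 * fs) / fs                  # two seconds of samples

# Linear chirp: frequency rises from 50 Hz at t = 0 to 250 Hz at t = 2 s,
# i.e. the instantaneous frequency is 50 + 100 * t.
x = chirp(t, f0=50, t1=2.0, f1=250, method="linear")

# Spectrogram with a sliding Gaussian window (standard deviation 32 samples).
# nperseg and noverlap are assumed values; a wider window gives finer
# frequency resolution and coarser time resolution.
freqs, times, Sxx = spectrogram(
    x, fs=fs, window=("gaussian", 32), nperseg=256, noverlap=192
)

# Strongest frequency in each vertical strip of the spectrogram.
ridge = freqs[np.argmax(Sxx, axis=0)]
```

Each column of `Sxx` is the power spectrum of one short window, and `ridge` traces the low-to-high sweep over time. Plotting `Sxx` against `times` and `freqs`, for example with matplotlib's `pcolormesh`, would give the time-frequency picture described in the lecture.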
Info
Channel: Steve Brunton
Views: 62,486
Keywords: Fast Fourier Transform, FFT, Fourier transform, Fourier Series, Fourier analysis, Hilbert Space, Complex analysis, Wavelets, Machine Learning, Data science, Linear algebra, Applied mathematics, Compression, Python, Matlab, spectrogram
Id: EfWnEldTyPA
Length: 13min 15sec (795 seconds)
Published: Sat May 09 2020