Denoising Data with FFT [Python]

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] welcome back so this is our warm up lecture of how to use the fast Fourier transform in Python and so in this lecture what I'm gonna do is create a really simple dataset which is just the sum of two sine waves of different frequencies we're gonna add noise to that dataset and we're gonna use the fast Fourier transform to pull out kind of the the structure and the data and D noise okay so in this example we're gonna see what the fast Fourier transform is how do you apply it to data and how you actually do that in Python okay so this is the code the first thing we're gonna do is we're gonna create our signal with two frequencies so we're gonna have a time step of 0.001 samples per second and we're gonna have a data set that goes from time 0 to time 1 which is the sum of this sine wave with frequency 50 Hertz and this sine wave with frequency 120 Hertz okay so we're gonna add up those two pure-tone sine waves and then what we're going to do is we're going to take that clean signal F and we're gonna add a bunch of random white noise to it okay and in fact this is actually quite a lot of random noise so if I add up two sine waves then kind of the maximum variation I can get as plus or minus 2 I'm adding white noise with magnitude 2.5 so this is a lot of random noise and now I'm just gonna plot both the clean signal and the noisy signal here so you can see here this is a data set in time and this is the signal in white here we have the clean two tone sine wave with just 50 Hertz plus 120 Hertz and red is the noisy data once you add that Gaussian white noise on top okay and so we're gonna play a game we're going to act like someone sent us this red data and didn't tell us anything about the low dimensional structure that this was just two sine waves added up so we're only going to get the red data and we're going to have to infer kind of what the clean D noise signal would be using this fast Fourier transform okay so the first thing we do is we actually compute the fast Fourier transform okay and in in pretty much every programming language this is super simple because it's so fundamental and you want to use it all the time they've made it as simple as possible so in PI you just run this FFT command so numpy fft fft and you give it the data f and the length of that data and how many data points there are and it returns f hats okay so just to remember we have our data F which is an a vector we're gonna run it through this FFT command and we're gonna get this vector F hat Fourier coefficients ok these are complex valued Fourier coefficients that tell you how much kind of magnitude what are the magnitude and phase of the sine and cosine components of increasing frequency you would have to add up to get this data set okay so these are complex values with a magnitude and a phase the magnitude tells you how important that particular frequency is the phase tells you if it's more cosine or sine and and what mixture okay and once you have this Fourier transform now what we're going to do is compute the power spectral density the PSD which is just f hat times the conjugate of F hat divided by N and so I like to think of this if you take if you take lambda some complex number lambda times lambda conjugate this is essentially going to give you the magnitude of lambda squared if this is a complex number I'm just going to write this out for you lambda equals a plus IB so lambda conjugate would be a minus IB and so if you multiply lambda times lambda bar you get a squared minus I squared times B squared a squared plus B squared okay so lambda times lambda bar or lambda conjugate just gives you the magnitude of lambda squared and that's what we're doing here with with F hat so we're computing the magnitude of each Fourier coefficients squared and we're going to get a vector of powers at each frequency okay and you can convince yourself that if this is a function of time then these each each frequency has units of per second these will be in units of Hertz and so you can convince yourself that if I take F hat times its conjugate this will have kind of units of power so this is the power spectral density that we're computing here then I have to compute this vector of frequencies so each entry in here corresponds to a particular frequency from low frequency to high frequency and so I have to actually build a vector of all of those frequencies in this this freak vector and now what I'm going to do is I'm going to plot my power spectrum the magnitude of F squared F hat squared versus the frequency in units of Hertz okay so I'm going to run this code and so we have our data noisy and clean and now what we see is our power spectral density our power spectrum as a function of frequency so please for those of you at home actually label your x and y axes okay this bottom plot is the power spectrum and it has the x axis is in Hertz okay so it's in frequencies and the y axis tells you how much power is in each of those frequencies in the red data okay and so right off the bat what you see here is that even though the signal is noisy the power spectrum has to super-clean Peaks one at 50 Hertz and another at 120 Hertz okay so that tells you that most of the power in this red signal even though it's pretty noisy is in 50 Hertz and 120 Hertz and then there's a bunch of noise in this noise floor that's contributing to the jitter on the data and so what that tells us is that you can actually filter this data so you could actually you could actually draw a line at some value I'm gonna pick a hundred and any Fourier coefficient that's smaller than 100 I'm just gonna zero it out and any Fourier coefficient that's larger than 100 I'm gonna keep it and then we're gonna inverse Fourier transform and try to reconstruct our clean we're gonna try to denoise the red signal and get a clean signal with just these two frequencies okay I'm gonna write that out here what we're gonna do is we're going to filter we're gonna literally take all of our values that have have F magnitude squared less than 100 and we're gonna kill all of those terms so we're gonna get F hat filtered okay these are still Fourier coefficients and then I can take the inverse fast Fourier transform it's literally the IFFT command and I'm gonna get F filtered and this is gonna be my filtered time series it's gonna have units of time okay so that's what we're gonna do we're gonna kill all these small 4u coefficients and then we're gonna inverse Fourier transform and again this is quite easy to do in Python what we're gonna do is get this indices vector by saying indices equals PSD greater than 100 it's literally gonna take that whole power spectral density vector and it's gonna check every element if it's bigger than 100 if it's smaller than 100 it'll give us 0 and if it's bigger than 100 it'll give a 1 so this indices vector will be a big vector mostly of zeros because most of these are smaller than 100 and it'll have two entries of one at the entry corresponding to 50 Hertz and 120 Hertz then what I'm going to do is I'm going to take my power spectrum and I'm gonna multiply it by that indicee vector and it's gonna zero out all of the indices that have small for a coefficient and it's only gonna keep those that have power greater than 100 okay that's going to be this F hat kind of clean here and then I'm gonna inverse Fourier transform to get my filtered signal and I'm gonna plot that for you okay so I'm gonna run this and then finally I'm gonna plot everything together okay good so remember we had this red noisy data we Fourier transformed it we realized that we could get rid of all of the small Fourier coefficients so we manually zeroed those out and we only kept these two white Peaks these filtered peaks at 50 and 120 I didn't tell the computer that it was 50 Hertz and 120 it just picked whatever values were bigger than some threshold okay and then when you inverse Fourier transform the signal you recover this white signal here is the filtered signal that corresponds to what is underneath all of this noise and the original data it's a little hard to see because the the scales are different but this is in fact actually the sum of 50 Hertz and 120 Hertz that's in this white signal up here okay so this is super useful this is just a toy example to show you if you have data you can first compute its Fourier transform which gives you an interesting and intuitive interpretation in terms of the power spectrum this power spectral density function which is just the magnitude of F squared at each frequency this has units of Hertz in the x-axis and then you can use that in this case to filter noisy data by finding whatever Peaks are above some noise floor zeroing out everything below and then inverse Fourier transforming 2d noise the signal okay so this is super useful you can start playing with this right away it's really really easy to compute the Fourier transform the power spectrum and start playing around with data okay thank you
Info
Channel: Steve Brunton
Views: 72,086
Rating: undefined out of 5
Keywords: Fourier transform, Fourier Series, FFT, Fourier analysis, Gibbs Phenomena, Hilbert Space, Complex analysis, Wavelets, Machine Learning, Data science, Linear algebra, Applied mathematics, Compression, Python
Id: s2K1JfNR7Sc
Channel Id: undefined
Length: 10min 3sec (603 seconds)
Published: Tue Apr 07 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.