Image Compression and the FFT

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] welcome back so in the next couple of lectures I'm gonna show you how to use the Fourier transform the fast Fourier transform to compress images okay so this is one of the coolest and most useful applications of the FFT in our modern world is using the FFT for compression in general you can use it for audio compression for image compression when you stream TV or movies or YouTube you're using some kind of a Fourier transform based compression algorithm okay and I'm gonna start with compression based on the fast Fourier transform and then I'm going to eventually move on to the wavelet transform which is kind of an enhanced Fourier transform that's better for compressing natural signals like audio and images okay but before I do I need to point out this is not just the standard FFT in Python or MATLAB this is actually the FFT - it's a two dimensional FFT if we're going to be doing compression on two-dimensional images so I'm going to walk you through kind of how you can think about this this two dimensional FFT okay so let's say you have some some picture I'm gonna say I don't know if this is a picture of my dog Mort that's Mort that looks like a pig okay fine that's a pig and we're going to Fourier transform this pig so the first thing is just how do you compute a two-dimensional FFT then I'll show you how to use that for compression okay so the first thing is the FFT to command FFT - and essentially what you do in an FFT - this is super simple you just FFT every row and then you FFT every column or vice versa it doesn't really matter so the first step is we're going to FFT every single row okay so this is just FFT on rows and we're gonna get some new image and I can't draw this image because it's in some weird Fourier transform and space that's going to look it's gonna look messy some some weird messy Fourier okay and then once I Fourier transform all the rows I'm gonna Fourier transform all of the columns of that new image and now I'm going to FFT all of the columns okay so we're gonna FFT columns and that is going to be how we FFT to an image so this is the FFT of an image and I'm gonna draw it kind of like this because that's how I think of it there's usually you're gonna see when you FFT an image there's gonna be this kind of plus sign and then some stuff and everywhere else and I'll explain that later okay but the basic idea is if you want to compute the FFT to the FFT of a two-dimensional image you FFT all the rows then you FFT all of the columns and you get this this representation and the way you can think about this simply is just like if I FFT --d a one dimensional signal like an audio signal I'm representing that as a sum of sines and cosines in that one-dimensional variable here I'm representing this picture of this pig as a sum of a bunch of sines and cosines in X&Y so you can think of each fourier coefficient here every single fourier coefficient here corresponds to a particular spatial wave number this corresponds to some you know cosine KX x sine I don't know J Y where this is the K and J coordinate of this thing and if you think about it I always imagine for people holding a bedsheet and kind of forcing it you know in one direction at a frequency of K and forcing it in the other direction at a frequency of J and it would create this kind of standing wave I guess this doesn't have to be cosine sine this could be cosine cosine or each of these has a phase this could be cosine KX times cosine KY or sine KX times sine J Y or whatever it doesn't matter cosines and sines but the basic idea is that you know every single point in this two dimensional Fourier transform has an X frequency and a Y frequency and that's giving you this kind of standing wave bedsheet structure and you're gonna add all of those up to get kind of the pixel intensities of this original signal and I'm gonna talk a lot more about this we're gonna I'm gonna work through some examples we're gonna see this working in MATLAB and in Python okay but the basic idea is before you transform all the rows all the columns and then you get this two-dimensional FFT in python and matlab it's kind of a built-in command to do this fun exercise for you at home would be to convince yourself that it doesn't matter what order you do this operation you could do the fft on columns first and then do the fft on the rows or vice versa and you're going to get the exact same two dimensional fourier transform so I think you should convince yourself of that okay good but now I want to show you how you would use this for compression this is this is really cool I'm going to show you kind of mathematically how you use this for compression and then we're gonna code this up in the next few lectures in Python and in MATLAB so the basic idea is that when you have your image this is in image space and maybe this is a megapixel image so it's got one million pieces of information maybe this time I'll draw a cat okay that's a cat happier than most cats I've seen okay so what we're gonna do let's say that this is kind of one megapixel image of a cat so the first thing I'm going to do is I'm going to Fourier transform this thing that's a curly F FFT we're going to FFT to this image and we're gonna get again a million kind of Fourier coefficients so now in Fourier we still have a million Fourier coefficients and I'm always gonna draw it like something like this it's not exactly what it looks like but the basic idea is that you have a lot of Fourier coefficients but we have observed and this doesn't have to be true this is not obvious why this is true this just happens to be true is that all almost all natural images things that you would see like and not not just in that natural like a mountain or a stream or a field but like this pen or what you see right here this this lightboard studio or a picture of a coffee cup or a cat images that you would see with your eyes generally when you Fourier transform them most of the Fourier coefficients are really really small this doesn't have to be true but this is true ok so most of these Fourier coefficients are very very very small negligibly small and you can truncate them and throw them away so all of these Fourier coefficients these million Fourier coefficients maybe only 10,000 of them are large and so I can zero out 99% of those Fourier coefficients I can threshold them out manually make them exactly equal to zero ok so that's what we're gonna do is we're gonna threshold and what I mean is we're only going to maybe keep 1% of the largest largest fourier coefficients so now and it's a little hard to draw this but usually what that means is now it's like only just the very very largest ones Fourier coefficients and this original image we're gonna keep like 1% or 2% or something like that and the the magic is that if you throw away really really small fourier coefficients your threshold n'ver story a transform this is the eye FFT to when you inverse Fourier transform there is very little loss in your original image ok so this is I don't know if you remember parseval's theorem but parseval's theorem says that there's kind of a relationship between the energy and fourier space and the energy in signal in pixel space and so if I'm only throwing away I'm throwing away a lot of these Fourier coefficients 99% of the Fourier coefficients but the ones I'm throwing away are so small that they don't actually change the norm or the energy of this of this this signal very much because I'm only throwing away the ones that are almost zero anyway and so Parcells theorem tells me is that if I only threshold the very very small Fourier coefficients it has negligible degradation of my original image okay and this is the basis of all image compression is the N audio compression is this idea that when you Fourier transform your signal your audio or image signal most of those Fourier transform coefficients are very very small negligibly small and because the Parcells theorem if I zero out those small Fourier coefficients and I only keep the largest ones which oftentimes is only 1% or 2% of these Fourier coefficients if I only keep those then when I inverse for you transform the image looks almost identical to the original full resolution image ok now you might be asking why is this actually give you a compression well when I take a megapixel image actually on my phone I think it's like 8 or 12 megapixels now I am storing a million pieces of information a million pixels of information and if I Fourier transformed there are still a million pieces of information here but when I threshold and I only keep one percent of these Fourier coefficients here this is where the compression comes in so when you save a JPEG image on your phone or on your computer you're not saving the million pixels you are saving only the one or two percent largest Fourier coefficients this is JPEG okay and so on your hard drive you don't have a million pieces of information you only maybe have 10 or 20,000 pieces of information corresponding to the largest Fourier coefficients and essentially the IJ or the JK locations of where those Fourier coefficients are so this is basically a list of you know K comma J comma F hat K comma J comma F hat this is just a big list on your hard drive of only the 1% or 2% largest Fourier coefficients that you need to keep okay and that's actually why the fast part of the fast Fourier transform is so important is that if you had to do an order N squared expensive Fourier transform and inverse Fourier transform this would be too expensive to decompress your image so when you have a JPEG on your on your phone and and you you click on it or you open it it immediately almost instantaneously decodes it into pixel space so that you can see it but it's going very very flexibly from this sparse compressed representation to this full megapixel representation because it can compute this Fourier transform very rapidly ok so if it wasn't for the fast Fourier transform even though we could still compress images it would take forever to compress them and it would take forever to decompress them so so the F the first F and the FFT is why we can do this this image compression so ubiquitously ok so just to summarize you can take the Fourier transform of two dimensional images just Fourier transform all rows and then all columns and then what we find is that when we Fourier transform natural images that you see in nature things that you would see with your eyes generally they are very very sparse in Fourier space so you can get away with truncating or throwing away most of those small Fourier coefficients and by only keeping maybe the top one or two percent of these Fourier modes you get a massive compression you have to store less on your harddrive you have to transfer less when you text some on a picture and because of the fast Fourier transform you can invert or decompress that representation very efficiently into the original image ok so in the next couple of lectures we're going to code this up in Python and in MATLAB and we're going to kind of get a little bit better feeling for you know why this is compressible how it's compressible and what different features why some images are more compressible than others and so that's all coming up thank you
Info
Channel: Steve Brunton
Views: 18,925
Rating: undefined out of 5
Keywords: Fast Fourier Transform, FFT, Derivatives, Fourier transform, Fourier Series, Fourier analysis, Hilbert Space, Complex analysis, Wavelets, Machine Learning, Data science, Linear algebra, Applied mathematics, Compression, Python, Matlab
Id: gGEBUdM0PVc
Channel Id: undefined
Length: 13min 1sec (781 seconds)
Published: Fri Jun 05 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.