Welcome back. I'm really excited today to tell you about the Shannon-Nyquist sampling theorem, which is one of the most important results in all of information theory. You're going to see it everywhere now that I've told you about it, and it's especially important when we think about signal processing, control systems, and in particular compressed sensing, sparsity, and recent trends in applied mathematics. So I'm going to jump in and tell you all about the Shannon-Nyquist sampling theorem, which basically tells you, if you're measuring some signal, say a signal that's oscillating and doing something, how fast you have to measure it to perfectly represent and reconstruct that signal. There are also really interesting connections to code breaking and other parts of information theory, so I find this a fascinating topic and I hope you do too.

There are two seminal papers, by Claude Shannon and Harry Nyquist. Nyquist's came a little earlier, in 1928, and Shannon's in 1949. Both of them worked at Bell Labs: Shannon was an American mathematician, Nyquist a Swedish-American engineer, and both developed a lot of their best work at Bell Labs, that famous think tank out of which so many great ideas came. Shannon in particular did a lot of wartime research during World War II on encryption and code breaking; you can think of him as America's counterpart to Alan Turing. In fact they were contemporaries, and Shannon loved Turing's work because it complemented his own so well, which I think is really interesting as well.
Good. So they wrote these two classic papers, "Communication in the Presence of Noise" and "Certain Topics in Telegraph Transmission Theory," and the central question was how much you can compress information if you want to send it over a long distance and then decompress it and recover the full, high-resolution signal at the other end. This was a really important problem of their time, driven by communication and signal processing over long distances: how much can you compress information and still expect a faithful decompression downstream on the other end? Of course, there was already plenty of practical knowledge about this; Morse code existed, and telegraph communication had some redundancy built in, but this work put the question on a firm mathematical footing. Claude Shannon in particular is often called the father of information theory, and a great deal of what we have in information theory came from him. He also took ideas from thermodynamics, entropy in particular, and brought them into information theory: information entropy, in the context of sending signals across the ocean, across the Atlantic. Nyquist was really pivotal in control theory; a lot of our classical control theory comes from Nyquist, and a lot of things are named after him. So these are two titans who came together around this idea of sampling, in the sampling theorem.

I'll also point out that Shannon was deeply interested in the theory of secrecy, and the way he put it is that communication and cryptography are inseparable: you can't study one without inherently analyzing the other. You can't talk about communication theory without the theory of encoding and cryptography. That's a really interesting perspective as well.
So now I want to walk you through what the Shannon-Nyquist sampling theorem actually says, and then how we interpret it practically today. In words, the sampling theorem says that a function containing no frequencies higher than omega, measured in hertz, is completely determined by sampling that function at a rate of two omega, again measured in hertz. Let me flip that around and say it another way: if you have a function and you want to perfectly represent it, to perfectly resolve all of its frequency content, then you have to sample that function at twice its highest frequency. We're used to thinking about functions like an audio signal: you can take its Fourier transform, and its power spectrum tells you which frequencies are active in that signal. Remember, we can compress that signal in the Fourier domain, send the compressed information, and decompress it. If you want to perfectly represent that information, and there is a highest frequency you care about, you have to sample at twice that highest frequency. That establishes what is now known as the Nyquist rate, which is two omega measured in hertz; equivalently, if omega is the highest frequency you care about, you have to sample at least as fast as every delta t = 1/(2 omega) seconds.

This, by the way, is why audio, like your mp3s, is encoded at about 44 kHz (44.1 kHz, to be precise). Humans can hear up to roughly 20 kHz, so you take a highest frequency just above anything humans would ever care about, about 22 kHz, and you double it to get perfect-fidelity reconstruction when you decompress. That's why audio signals are sampled at about 44 kHz.
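To make that concrete, here is a minimal sketch in Python (assuming NumPy; the specific frequencies, rates, and durations are my own illustrative choices, not values from the lecture). A 3 Hz tone is sampled above its Nyquist rate of 6 Hz and then reconstructed on a fine grid using the Whittaker-Shannon sinc-interpolation formula.

```python
import numpy as np

# Minimal sketch: a 3 Hz tone sampled at 10 samples/sec -- above its Nyquist
# rate of 2*3 = 6 Hz -- then reconstructed on a fine grid with the
# Whittaker-Shannon formula  x(t) = sum_k x[k] * sinc((t - k*dt) / dt).
f = 3.0                       # highest (and only) frequency in the signal, Hz
fs = 10.0                     # sampling rate, Hz (must exceed 2*f)
dt = 1.0 / fs                 # sampling interval, seconds

t_samp = np.arange(0.0, 2.0, dt)              # sample times over 2 seconds
x_samp = np.sin(2 * np.pi * f * t_samp)       # the measured samples

t_fine = np.linspace(0.0, 2.0, 2000)          # dense grid for reconstruction
x_recon = np.array([np.sum(x_samp * np.sinc((t - t_samp) / dt))
                    for t in t_fine])

x_true = np.sin(2 * np.pi * f * t_fine)
# The finite record introduces small edge effects; away from the ends the
# reconstruction closely matches the true signal.
mid = (t_fine > 0.5) & (t_fine < 1.5)
print("max error on the interior:", np.max(np.abs(x_recon - x_true)[mid]))
```

If you drop the sampling rate below 6 Hz in this sketch, the reconstruction falls apart, which is exactly the aliasing we'll draw next.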
Good. So I'm going to walk you through pictorially how this sampling works, and then we'll talk a little more about applications and where you see this every day. The idea is that you might have some signal, and this is a little hard to draw, so I'll do my best. Let's say you have some signal with a high frequency in it. By common sense, if you wanted to resolve this frequency, you might say you need at least two measurements per period, to catch the high and the low of each oscillation. So you have this signal, and to get the highs and the lows you would need to measure at least two points per period. That's just common sense.

Now let's do the thought experiment: what if you down-sample, meaning you don't measure fast enough? If I measure less frequently than I need to, say a sample now, and then now, and now, then I essentially get a set of measurements that could equally well be described by a lower-frequency sine wave. So if you don't sample at this Shannon-Nyquist rate, two times the highest frequency you care about, you're potentially going to miss information, and it can get even worse than that. Imagine I measure at exactly the highest frequency itself: instead of sampling at two omega I sample at omega, so I hit just the peak, and the peak, and the peak. Then I have no information about the oscillation at all; as far as I'm concerned, the signal might as well be a straight line. So you can see there's all this information that is lost if I don't measure at at least the Nyquist sampling rate of two omega. That two omega is critical, because otherwise you get a phenomenon called aliasing, and I'm going to write that up here. As far as your sampling is concerned, these curves are aliases of one another: they might as well be the same thing as far as the blue samples are concerned. If you sample below the Nyquist rate you're going to get aliasing, and you're going to lose the high-frequency components that you might care about.
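Here is a tiny sketch of that thought experiment in Python (assuming NumPy; the frequencies are my own illustrative choices): a 9 Hz cosine sampled at only 10 samples per second produces exactly the same sample values as a 1 Hz cosine, its alias.

```python
import numpy as np

# A 9 Hz cosine sampled at only 10 samples/sec -- below its Nyquist rate of
# 18 Hz -- gives exactly the same samples as a 1 Hz cosine, its alias at
# |9 - 10| = 1 Hz.
fs = 10.0                              # sampling rate, Hz
t = np.arange(20) / fs                 # 20 samples = 2 seconds of data

x_fast = np.cos(2 * np.pi * 9.0 * t)   # true high-frequency signal
x_slow = np.cos(2 * np.pi * 1.0 * t)   # its low-frequency alias

print(np.allclose(x_fast, x_slow))     # True: the samples are indistinguishable
```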
Now there's a really cool plot you can make that shows this idea of aliasing; sometimes it's called frequency folding. Look at the power spectral density versus frequency, and let me set this up carefully: say omega is the rate I'm actually sampling at, so I sample at omega, and that fixes what I'm able to measure. But suppose my signal has frequency content up here at 1.5 omega. If my signal has content at 1.5 omega but I only sample at omega, then my measurement at that frequency is going to look like an aliased version that's folded over in the frequency domain.

I'm going to say this again, because it's a little tricky and a little subtle. If I'm measuring at a rate omega, the Shannon-Nyquist sampling theorem says I can only resolve frequencies up to half of that sampling rate, because I need to sample at twice the highest frequency in my signal. So if I sample at omega, I can only resolve content up to half that rate. But if my true data, over here, had frequency content above that, at 1.5 omega, then because of aliasing it's going to look as if I measured content at 0.5 omega: the spectrum folds back into the band I can resolve, and that is the aliased measurement. From my measurements at rate omega, I can't tell the difference between those two signals, and when I Fourier transform I'm going to see this erroneous low-frequency behavior. That's exactly what I drew before: the true signal was this high-frequency one here, but because I didn't measure it fast enough, because I sampled below the Nyquist rate, I get this blue, erroneous signal that looks like it's at a lower frequency than it actually is. So that's the idea of aliasing, and you'll see it everywhere in signal processing.
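As a quick sanity check on that folding picture, here is a small helper (my own, not from the lecture), assuming the standard alias formula for real signals sampled at rate fs.

```python
import numpy as np

def folded_frequency(f_true, fs):
    """Apparent frequency of a real tone at f_true when sampled at rate fs.

    Standard folding formula: the alias lands at |f_true - fs * round(f_true / fs)|,
    which always sits between 0 and the Nyquist frequency fs / 2.
    """
    return abs(f_true - fs * np.round(f_true / fs))

omega = 100.0                                  # suppose I sample at omega = 100 Hz
print(folded_frequency(1.5 * omega, omega))    # content at 1.5*omega -> 50.0 Hz (= 0.5*omega)
print(folded_frequency(0.4 * omega, omega))    # content below Nyquist is unchanged -> 40.0 Hz
```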
In fact, you see this all the time. If I wore a shirt with a checked pattern, say a dress shirt with a really fine mesh pattern, this camera might actually produce aliasing: as I walk around, from pixel to pixel you'd almost see the pattern sparkle. That's a weird camera effect you get sometimes, and it's because of this idea of aliasing. You also see it here: this is a picture of a curtain, I believe at Heathrow Airport, and you can see these moire patterns that show up sometimes in optics, so this is kind of an artistic rendition of these aliasing patterns. We also saw this when we looked at the discrete Fourier transform matrix. This is literally a visualization of the 1024 x 1024 DFT matrix, but when I down-sampled it, when I simply shrank that image on my computer screen, because I used fewer than 1024 x 1024 pixels, you see some really interesting aliasing features. I don't know if you can see it, but it looks like there are almost four periodic rolls in x and three periodic rolls in y. Those are not in the high-resolution DFT matrix; they are purely due to aliasing, because I'm taking this information and not sampling it finely enough. So that's another everyday appearance of aliasing.
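If you want to reproduce that effect yourself, here is a rough sketch (my own construction, not the lecture's code, and it assumes NumPy and Matplotlib): build the DFT matrix and naively downsample its real part without any low-pass filtering; the plot shows low-frequency patterns that are not in the full-resolution matrix.

```python
import numpy as np
import matplotlib.pyplot as plt

# Build the n x n DFT matrix and visualize a naively downsampled version.
n = 1024
k = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(k, k) / n)   # DFT matrix (unnormalized)

coarse = F.real[::8, ::8]   # keep every 8th row/column -- no anti-alias filter

plt.imshow(coarse, cmap='gray')
plt.title('Downsampled real part of the DFT matrix (aliasing artifacts)')
plt.show()
```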
So that's really what I wanted to tell you about: this really interesting idea of aliasing and the Shannon-Nyquist sampling rate, which tells you how fast you have to measure a signal to get a faithful reconstruction of its frequency content. I should probably write it out to be completely clear: if you care about frequencies up to omega, you need to sample at two omega. It's really simple and really easy to remember.

Now, what's interesting, and I'm going to close the lecture on where we're at today: fast-forward 70, 80, 90 years past Shannon and Nyquist, and advances in applied mathematics, optimization, and statistics are starting to change how we think about the Shannon-Nyquist sampling theorem. Technically speaking, the results from Shannon and Nyquist are for broadband signals, meaning signals that are jam-packed with frequency content from low frequencies all the way up to the highest frequency you care about: dense, broadband signals, like signals you've already compressed down, think of signals you've zipped up. For those dense, broadband signals the Shannon-Nyquist sampling theorem absolutely applies: you cannot beat the Nyquist rate and still have perfect signal reconstruction. But if your signal is not broadband or dense, if it contains only a couple of frequencies, even high ones, then under some conditions you technically can beat the Nyquist sampling rate on average. I'm going to talk about this more in the context of compressed sensing and reconstructing audio signals, but for now I'll just walk you through, at a very high level, what we expect to see.
Here we have a signal f which is the sum of two sine waves, at 73 Hz and 531 Hz. The Shannon-Nyquist sampling theorem would say that to fully resolve this f, I have to sample at twice the highest frequency, twice 531, which is 1062 samples per second; that's how fast I'd have to sample to perfectly reconstruct this signal. Here's the power spectrum, and here's the signal. Now, that is technically only true for broadband signals, and this is not a broadband signal: it is very sparse in the frequency domain. There are only two frequency peaks, only two tones making up this function, and so you can get away with less. Let me zoom in here. If I sampled uniformly below the Shannon-Nyquist limit, I would get a terrible signal reconstruction and I would get aliasing. But if I measure randomly, at an average of 128 samples per second, well below the Shannon-Nyquist rate, but randomly rather than regularly or uniformly, then results in compressed sensing say that you can actually faithfully reconstruct the sparse vector in the Fourier domain, the sparse power spectrum, and then inverse Fourier transform to reconstruct your full signal, essentially with no aliasing. This is a really cool result from the field of compressed sensing (I don't know why it's spelled wrong on the slide; it should be "compressed sensing," also called compressive sampling). The basic idea is that if I sample randomly in time, I can get away with a much lower average sampling rate than Shannon-Nyquist predicts, provided my signal is sparse. I'm going to tell you a lot more about this, and you'll learn about it in this lecture series on compressed sensing, so there will be more for you to dig into.
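To give a flavor of what that computation looks like, here is a minimal sketch of the two-tone example in Python, assuming NumPy, SciPy, and CVXPY are available. The grid size, the DST sparsifying basis, the random seed, and the L1 solver are my own illustrative choices, not the lecture's code; the half-sample time grid is chosen so that each sine is exactly one DST coefficient.

```python
import numpy as np
from scipy.fft import idst
import cvxpy as cp   # assumed available; any sparse solver could be substituted

# Two-tone signal (73 Hz + 531 Hz) reconstructed from random sub-Nyquist samples.
n = 2048                              # time grid over 1 second, above the 1062 Hz Nyquist rate
t = (np.arange(n) + 0.5) / n          # half-sample grid: each sine is a single DST-II basis vector
f = np.sin(2 * np.pi * 73 * t) + np.sin(2 * np.pi * 531 * t)

m = 128                               # random measurements: well below 1062 samples/sec on average
rng = np.random.default_rng(1)
ridx = np.sort(rng.choice(n, size=m, replace=False))
y = f[ridx]                           # randomly timed point samples of the signal

Psi = idst(np.identity(n), norm='ortho', axis=0)   # columns are the sparsifying (DST) basis
A = Psi[ridx, :]                      # measurement matrix: the sampled rows of Psi

s = cp.Variable(n)                    # sparse spectrum to recover (2 nonzero entries expected)
prob = cp.Problem(cp.Minimize(cp.norm1(s)), [A @ s == y])
prob.solve()

f_recon = Psi @ s.value               # back to the time domain
print("relative error:", np.linalg.norm(f_recon - f) / np.linalg.norm(f))
```

Any sparse solver would work here; greedy methods such as CoSaMP or orthogonal matching pursuit are common alternatives to the convex L1 formulation.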
Just to recap: if you have a signal f and you care about frequencies up to omega, then you have to sample at two omega, or else you're going to get aliasing and this frequency-folding phenomenon. All right, thank you.