Distinguished Faculty Speaker: James Zhang - Provost

Captions
Thank you, everybody. Good afternoon. Just for time's sake, I'm going to dive right into the presentation. When I prepared this presentation, I promised to give three examples. Not until I started preparing the slides did I find out that was very ambitious, because it is truly the same application of the same method to three different research projects. So I'm going to jump right into it.

The topic I'm going to talk about today is empirical mode decomposition, the Hilbert-Huang transform, and their applications. I'm going to give a very brief introduction and talk a little bit about time-frequency analysis, then we're going to talk about the basic principles and concepts of EMD, or empirical mode decomposition, and then I'm going to give a few examples. Did anyone feel my voice got a little bit nervous there? That's going to be one of the applications we talk about a little bit later.

The traditional signal analysis methods are mainly time-domain analysis and frequency-domain analysis. Time-domain analysis is what we're used to: there's a signal, at a certain time what is the amplitude, and then you can calculate the associated characteristics, energy, power, and so on. In the frequency domain, the traditional way of analyzing a signal is to take a Fourier transform, and there you see what frequency content there is in the signal. However, in time-domain analysis you completely lose the frequency resolution. For example, you're listening to a piece of music. You look at the signal and say, okay, this is the signal of a piece of music, but is there a C note in there, is there a D note in there? You don't know, because you really don't know what frequency it is. Whereas if you take the Fourier transform and take that into the frequency domain, here you go: you see the C note, you see the D note, you see all these notes, but you
do not know exactly when those notes happened in the piece of music. That's what I meant by time-domain analysis losing the frequency resolution, and vice versa.

There are some methods that try to address this issue by looking at the signal magnitude simultaneously in both the time and frequency domains. Some of the predominant methods include the short-time Fourier transform, known as STFT, the Gabor transform, wavelets, which have been very popular, and bilinear time-frequency distributions such as the modified Wigner distribution function. But what all these methods do is use predefined functions. Take wavelets, for example: you have a mother wavelet that can be precisely described by a mathematical formula, and then you use dilation, or scaling, and slide that function through a signal to see how well the actual signal matches the predefined function.

So today I'm going to introduce a new method called empirical mode decomposition. This method was first invented by a person named Norden Huang. Dr.
Huang used to be a professor at NC State University and later worked for NASA for many years, and when he retired he was a fellow at NASA.

The basic concept of empirical mode decomposition is to identify the proper time scales that reveal the physical characteristics of a signal. In other words, you use this method to extract functions that are intrinsic to the signal itself. Those functions are called IMFs, or intrinsic mode functions, and they satisfy two basic conditions. Number one: in the whole data set, the number of extrema and the number of zero crossings must be equal or differ by at most one. Picture that in your head: the extrema are both the maxima and the minima, and if their number and the number of zero-crossing points are equal or differ by at most one, the function is oscillatory. The second one: at any point, the mean value of the envelopes defined by the local extrema is zero. It should be zero. So if you picture that in your head, this function is going to be oscillatory in nature, but the amplitude can change and so can the frequency. It's not a single frequency and it's not a constant envelope. So all of a sudden you've got this kind of FM/AM type of signal in your head, and those signals are called intrinsic mode functions, which represent the basic physical characteristics of a signal.

I probably should talk a little about the EMD process. It really is an iterative process. You extract one IMF, subtract that from the original signal, then go back and apply the same procedure to extract the next IMF, and so on. From that perspective, this method acts like a filter, or a filter bank, because the first pass extracts the highest frequency
contents, that is, not a single frequency but the highest frequency band in the signal, then the next pass extracts the next highest frequency band, and so forth. That describes the general procedure of how EMD works.

Now some definitions. When you perform this procedure, there are multiple parameters you can use as a stopping criterion, to say, okay, enough has been extracted from the original signal, stop now. There are others, but the most important one is what we call the standard deviation between two adjacent IMFs you extract. That can be expressed as such. I'm just giving the background right now, so you can ignore the mathematics, it's okay. When we talk about specific applications, things will come together.

So, just for completeness, a few definitions. First of all, the Hilbert-Huang transform. Suppose we obtained n IMFs from a signal. Then each one can be used to form an analytic function. If you recall, and most of you I'm sure do, an analytic function is defined with the function itself serving as the real part, and the imaginary part is the Hilbert transform of the original signal, which is the original signal convolved with 1/(πt). The original signal can then be reconstructed, or expressed, as the summation of the real parts of all the analytic functions. Actually we went around and around: in essence, a signal can be reconstructed as the summation of all the IMFs. Then we can define the marginal spectrum as such, as well as the instantaneous energy as such. I'm not going to get into the details here.

I'll give you an example of how EMD works. Let's say we manually construct a function as such: x(t) equals a summation of sine and cosine functions. This first
plot here is the original data, the summation of all these functions. As you can see, you have five different frequencies, from 80 hertz all the way down to 3 hertz, with varying amplitudes. After applying the EMD method, IMFs one, two, three, four, and five represent those five components respectively. I'm trying to look for something, but that's okay. If you look at the last one, for example, and count the peaks, there are three peaks, and that's the 3 hertz signal. Notice the time scale is one second. And if you look at the second one, one, two, three, four, five, six, seven, eight, nine, ten, you see ten peaks. That's your 10 hertz signal. Also notice the amplitude: the 10 hertz cosine signal is about 0.4 on the scale, versus the first one, the 80 hertz signal, with a magnitude of one. So you get the basic idea of how to extract the signals that are intrinsic and represent the physical characteristics of a signal.

Associated with that, we can plot in the time-frequency domain. This is your time, this is your frequency, and the color represents the magnitude, so you are representing a signal's magnitude in the frequency and time domains simultaneously. For example, if you look at this blue line here, that blue is somewhere around here on the scale, so that's your 0.3 magnitude, and it also tells you it is at about 3 hertz and lasts for the whole duration of the signal. Then of course the marginal spectrum can be represented like that, and so can the energy spectrum.

The first example I'm going to give you is what we call voice stress analysis. Some of the characteristics of voice stress analysis: it's not intrusive, so no physical connection; it can be used clandestinely; it's inexpensive; and it can be applied as a lie detector, or to prioritize response. Sometimes, I mean,
emergency responders, when they receive a call, first of all ask: is this a true emergency or not, and which one is more urgent? Through the stress in the voice you make that determination and decide what to do next.

VSA is the detection of fluctuations in physiological microtremors. When you talk, the muscle moves, and there is a microtremor in the vocal cords as well. Under normal conditions that frequency is in the range of 8 to 12 hertz, but when a person is under stress and speaks with stress in the voice, the frequency shifts and the magnitude of that microtremor changes. Therefore, by extracting that information from a person's voice, you have an opportunity to detect whether or not there's stress in that person's voice.

Here's an example. We used a data set that is commonly used for voice stress analysis, and this is actually an FFT, a fast Fourier transform, of the voice samples. Neutral is depicted in blue, medium stress is depicted in magenta, or a kind of maroon-purple color, and high stress is depicted in green. If you look at the blue signal, the microtremor is right around 10 to 12 hertz or so, and when the person is under stress those microtremors get shifted. I'll give you a few more examples. Here's what we looked at: neutral, there's no stress, so look at the magnitude and the frequency, versus medium stress, where you see the amplitude change and so does the frequency, and likewise for high stress. I'm watching the time, so I don't have time to get into all the details, but if you're interested please feel free to contact me. We have publications that we'll be happy to share
with you.

We experimented with three sets of data. First, simulated stress. We used information from the database: different people with different speaking styles and different accents, for example male with a general US accent, male with a Boston accent, a New York accent, and so on. It's all in the database, and that database was first constructed by MIT Lincoln Lab. If you look at the results, it's not very clear on the slide, but there's the neutral, and you see the frequencies did not shift from neutral for the other speaking styles, let's say speaking softly. I'm not under stress, I'm just speaking a little softly, or loudly, or just fast. You will see the frequencies stay in the same range. However, when you see angry, or a moderate-stress or high-stress condition, you'll see the frequencies shooting up, which means shifting away from where the reference is.

Then we tested some actual stressed voices. For actual stressed voice, the researchers, the database builders, put the subjects on a roller coaster to simulate free fall and let people scream. The two roller coasters used were Scream Machine and Free Fall; those are the names of the roller coasters. In a female sample, at neutral, with no stress, the frequency was at 12.7276 hertz, but under stress, especially under the Scream Machine condition, the voice goes to a really high frequency, which makes sense. And if you look at the male sample, it's very similar.

We also did some deceptive scenarios, and this was actually performed at my previous institution with the help of one of my former graduate students. What we did
is we created a scenario and gave the test subjects very specific instructions. A person comes in and is told: okay, there's a plastic box in one room, go ahead and take the $50 in the box as well as the cigar, hide them on you, and then go to the interview room to answer the questions. Other test subjects were a control group. We instructed them: don't do anything, just go to the interview room. We were trying to see whether the deceptive subjects would present stress in their voices when they answered the questions. For the interview process we used what we call the modified zone of comparison, or MZOC, which is basically interleaved questions. Let's say you have a set of fifteen to twenty questions. One question is irrelevant: how are you doing? Then the second question would be relevant: did you steal that fifty dollars? Then the next question has nothing to do with the scenario, and so on. That's one of the ways of constructing the questions.

Question number ten is right here. If you look at the non-deceptive subject, it's pretty relaxed, versus here you see the frequency shoot up, and that one is a deceptive subject. Deceptive subject number two was even more obvious: I didn't do it, but yeah, I have it right in my pocket. So that's the first example.

The second example I'm going to give you of the application of EMD and HHT is involuntary hand tremor detection. I think there's probably a lot of work on that here; I saw some pictures that are very similar to what I'm going to show you, but yours is much more sophisticated, I have to say. Involuntary hand tremor could be physiological. For example, we can ask somebody to stretch out an arm and hold a ten-pound
dumbbell for a minute, and then test the muscle, and you'll see it's trembling. Or it could be pathological, say Parkinson's disease. So you need to detect it and provide treatment appropriately.

This was a test bed that we constructed doing the research with a graduate student. We used an arm, and the hand itself was actually from the General Dynamics ballistics lab, made with a specific material that really simulates a human hand. We implemented two actuators to move the hand and the arm separately or simultaneously. I'll give you a demonstration so you can roughly see how it moves. This is the arm control. And of course we've got all the electronic gadgets there to capture all the information. Here is the simulated hand control. You see it moves this way, because the inspiration is, if you look at people when they're trembling, especially with Parkinson's disease, the tremor is sort of like that. What we're trying to do is detect that magnitude and apply a counterforce, in what we call a digital glove for a person to wear, to stop it. And this is combined, which creates a complex signal. The motions look very simple, but the combination can be very complex.

Here is some of the information. For example, we used three 3-axis accelerometers to detect the motion, basically to detect the acceleration and translate that into motion. This is the original data we acquired through the accelerometers, with a six hertz signal applied to the forearm. And here is the Fourier transform of that, to show that at the three directions, or three spots, you're monitoring, you do have the six hertz signal and its higher-order harmonics. Now I want you to
also look at what we actually did using the extracted IMFs. The capture time is five seconds, and believe me, I'm not going to count each peak for you, but there are 30 peaks in there, representing a six hertz signal. The difference is that, in each direction, the IMF also tells you the magnitude of that movement, the amplitude, so you can apply a counterforce or treatment appropriately to stop the tremor. And here's the Hilbert-Huang spectrum. If you look at the y-axis, from zero, there's the number five, and you see most of the signal concentrates right around the six hertz range. Of course we also collected a few complex drive frequencies, and the results are similar. That gives you the combination of the frequencies at each spot with different magnitudes, so you have all the components, and again, for treatment you can select the appropriate means.

The third example I'm going to give you is network traffic anomaly detection. Internet security is a pretty hot topic nowadays, especially when all of a sudden you get attacked by a hacker and the whole network is down. A lot of the time, internet traffic monitoring is based on the number of connections to a node. You have a server, and all of a sudden you have thousands, tens of thousands, of connections to it requesting service, and then the server stops working. I'm actually very proud of this work. Based on EMD, we came up with three very novel measurements, or parameters, to detect those attacks. For the attack data we used the standard KDD data, which again was collected by the MIT lab. And the KDD
data, most people who come up with internet traffic detection methods use this data set.

The first method we came up with is called weighted self-similarity, based on the first IMF. It is based on the Hurst parameter, but the difference here is that we used an adaptive way of calculating the Hurst parameters over chunks of data, doing the analysis and comparison in real time. We calculate the Hurst value, we build the Hurst value matrix, and then eventually we calculate the weighted self-similarity. These terms were completely defined by ourselves in doing the research.

Here are a couple of examples. This is normal traffic. We extracted the first IMF, that's the second picture, and we followed the formula to calculate the weighted self-similarity. This graph here is a reference, or template. When we extract in real time, we calculate the self-similarity and compare it with the template, and if it's well within our threshold we conclude the network is normal, there are no attacks right now. I probably should mention that the attacks we were specifically addressing were DoS, denial-of-service, attacks. Here's another example. If you look at the original data, remember the y-axis represents the number of connections to a node. Normally it's okay, then all of a sudden you see a surge of connections to a certain node, and our detection method detected, yes, here are attacks. The third one is purely attacks. And last but not least, now the attacks are over. I'm just giving you a snapshot of what's going on here. As for the test results, as you can see, we tested a lot of data, and for DoS attacks we got
about 62 percent accuracy, in real time, remember. But we found out it did not work very well for Neptune-type attacks, so if we take those out it gives you about 78 percent accuracy, which is pretty good when you do real time.

The second one is what we call the Pearson's distance, based on the marginal spectrum. Remember, we have the EMD, and we take that integral, that's your marginal spectrum, and we define the distance and use that as a parameter for detection. Again, some of the results are here. Here's an attack. For this second method, the detection rate is actually pretty high, about 90 percent, real-time detection. And again, if you look at the size of the windows we are detecting over, it's not trivial.

The last one is what we call the rate of change of the energy density level. We take the marginal energy spectrum, look at the correlations, and then look at the rate of change of the correlation. Again there are two parts. We look at the delta of the energy change rate, and here's attack-free and with attacks. The magenta line shows the slope, and the slope represents how fast the energy level changes. All of a sudden, on the green one, you see the rate of change in energy spike, saying, okay, here come the attacks. For this method we got about a 76 percent detection rate, which is not too shabby.

So those are the three examples I'd like to give you, and I think I controlled my time pretty well. But before I conclude, I really would like to express my thanks to my former graduate students. These three students, I'm very proud of them, and this is just a representative sample of my former graduate students. Niaga was originally from Nigeria. He got accepted by Purdue, and then decided he needed to
go back to his home country and work for the government, so he did. Brandt received his PhD from Purdue and, last I knew, is working for Caterpillar in controls, which makes sense, right: you do the hand tremor control and stuff like that. And Jiaying just graduated two years ago and is currently a PhD student at Georgia Tech, still continuing with network security type work. With that, I'll conclude my presentation, and I do have about ten minutes for your questions. Thank you.
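
The first IMF condition stated in the talk (the number of extrema and the number of zero crossings are equal or differ by at most one) can be illustrated with a minimal sketch. This is not the speaker's code. The function names, the 1000 Hz sampling rate, and the small phase offset are assumptions made for illustration, and the second condition (zero mean of the envelopes) is not checked here.

```python
import math

def count_extrema_and_zero_crossings(samples):
    """Return (strict local extrema, sign changes) for a sampled signal."""
    extrema = 0
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        # A strict local maximum or a strict local minimum counts as one extremum.
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            extrema += 1
    # A zero crossing is a sign change between two neighbouring samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    return extrema, crossings

def meets_first_imf_condition(samples):
    """True when the extrema and zero-crossing counts differ by at most one."""
    extrema, crossings = count_extrema_and_zero_crossings(samples)
    return abs(extrema - crossings) <= 1

def sampled_sine(freq_hz, offset=0.0, rate_hz=1000, duration_s=1.0, phase=0.1):
    """Hypothetical test signal: a sine with an optional constant offset."""
    n = int(rate_hz * duration_s)
    return [offset + math.sin(2 * math.pi * freq_hz * k / rate_hz + phase)
            for k in range(n)]

tone = sampled_sine(3.0)                 # 3 Hz over one second, as in the talk's example
shifted = sampled_sine(3.0, offset=2.0)  # same tone riding on a constant offset

print(count_extrema_and_zero_crossings(tone))   # (6, 6): three maxima plus three minima
print(meets_first_imf_condition(tone))          # True
print(meets_first_imf_condition(shifted))       # False: six extrema, no zero crossings
```

The three maxima of the 3 Hz tone match the "three peaks in one second" reading from the talk, and the offset tone shows why a signal that never crosses zero cannot be an IMF as it stands.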
Info
Channel: Kettering University
Views: 9,962
Rating: 4.9333334 out of 5
Id: 8ckX1lNDQsE
Length: 31min 52sec (1912 seconds)
Published: Mon Jun 01 2015