Mathematics Seminar: Physical Layer Security in Wireless Networks | H. Vincent Poor

Captions
Hello everyone, my name is Mehul Motani. I'd like to welcome you all to the inaugural talk of the seminar series Information Theory in Singapore, ITIS, and also welcome to Singapore. Well, virtual Singapore: many of you are here in Singapore, but some are not, so welcome to virtual Singapore. I wish I could welcome you to Singapore in person, but these are strange times, so this will have to do. Behind me is the view from my office. Nice, right? Well, I wish it was the view from my office; this is the Singapore skyline. I hope this pandemic passes soon and you can all visit Singapore and see the beautiful country for yourself. I say this because when I looked at the registration list there were a bunch of people from outside of Singapore, so I'm really excited about that.

The Information Theory in Singapore seminar is organized by a team that includes Han Mao from NTU, Tsaikwe and Tang [sp?] from SUTD, and me from NUS, and our aim is to promote, advocate, and spread the joy of information theory and coding theory within Singapore and now around the world. We have five talks lined up for this year, 2020, and we'll be putting out the 2021 schedule soon, so be on the lookout for that, and please spread the word. Get people to register so we can remind them when the talks are happening.

So today is the first talk in our seminar series. Before we get going, let me address a few logistical matters. Please keep your microphone and video muted for the duration of the talk. If you have questions, just type them into the group chat and we'll address them in the Q&A at the end of the talk.

All right, now it is my great pleasure to introduce Professor Vincent Poor as our speaker. Professor Vincent Poor is the Michael Henry Strater University Professor of Electrical Engineering at Princeton University, where he was the dean of engineering for many, many years. His research interests include information theory, machine learning, and network science and
their applications in wireless networks, energy systems, and related fields. He has received many awards and honors; the list is too long to go through, so let me highlight a few. He's a member of the U.S. National Academy of Engineering and the U.S. National Academy of Sciences. He's also a foreign member of the Chinese Academy of Sciences and the Royal Society. He's an IEEE Fellow, and he has served as editor-in-chief of the IEEE Transactions on Information Theory and as president of the Information Theory Society. Recognition of his work includes the 2017 IEEE Alexander Graham Bell Medal and honorary doctorates from universities in Asia, Europe, and North America. I'm honored to have my colleague and friend Vince Poor share his thoughts with us today on physical layer security in wireless networks. Let us warmly welcome Professor Poor. Thank you.

OK, thank you very much for that kind introduction. Welcome to my kitchen. This is my kitchen; this is where I live these days. I'm honored to be the first speaker in this series. I was in Singapore, I guess it was about a year ago. Is that right, Mehul? (Mehul: Yeah, that's right.) And I had a really wonderful time there, and I wish I could be there physically, but maybe that'll happen next year. Anyway, I'm really happy to be here speaking to you tonight, or this morning for you. As Mehul said, I'm going to talk about physical layer security. I wanted to talk about something with information-theoretic content, given the purpose of this series. So let me see if I can share my screen here. I hope you can see that. Can you see that, Mehul? (Mehul: Yes, we can see it.) OK, great.

So I'm going to talk about physical layer security in wireless networks, and again I'm subtitling it "an information-theoretic perspective" because there are a lot of aspects of physical layer security that don't really have much to do with information theory. I want to thank a few people: Arsenia, or Ersi, Chorti in
France, and her student Mahdi Herfeh, and a former student of hers at Essex, Miro Mitev, who helped with some of the slides I'm going to show you. Mahdi and Ersi and I have a chapter that's about to appear in a book on physical layer security, which covers a lot of things about physical layer security, not only the information-theoretic aspects I'm going to talk about today. I'll just call that out because people who may be interested in some of the other issues can look in this chapter when it appears, which should be pretty soon.

So anyway, here's a view, maybe an oversimplified view, of security today in wireless networks. Of course, networks are organized in layers, and today security measures are primarily applied at the higher layers, mainly at the application or transport layer. And it works pretty well; I think your cell phone's pretty secure. So you might ask the question: why do we even need physical layer security when we already have a lot of really good solutions for security at the higher layers?

Well, there are some very good things about standard security, and I'm going to just talk about cryptosystems now, although there are a lot of other aspects of security. For one thing, there are no known feasible attacks. I mean, the number of random bits is large enough that the computational complexity of trying to solve for the key in the known cryptosystems is infeasible with today's technology. It's been widely employed and tested, and authentication is trustworthy. So we kind of trust our security today; we do banking and so forth without really worrying too much about it.

On the other hand, just from a broader perspective, it's built on unproven assumptions, namely the difficulty, or the perceived difficulty, of certain mathematical problems,
and also it assumes the physical layer is already there; that is, you have a connection at the physical layer over which you can implement cryptography. It's computationally intensive, so for some applications where the terminals are of low complexity, that might be a problem. And it requires infrastructure, which in some emerging types of networks may not be something we can take for granted.

Now, physical layer security has some advantages. First of all, it's quantum secure: it's based on fundamental physical layer principles, at least what I'm going to talk about today is. The key can be generated on the fly from common randomness in the channel, so you don't need key distribution. And it's really lightweight; that is, you really only need the kind of complexity you need for error control coding to do certain kinds of physical layer security, at least in terms of data confidentiality.

Now, there are some negative things. One is that a lot of what's known about physical layer security based on information-theoretic security makes some assumptions about the eavesdropper's channel, which may or may not be practical. Also, a lot of it is asymptotic theory, so finite block length theory is not as well developed, although I'm going to talk a little bit about some of that here. And it's still in its infancy; we haven't really seen a lot of implementation of physical layer security in practical systems.

But a prime example of some of the shortcomings of more traditional security methods is the Internet of Things, where potentially tens of billions of devices, in massively deployed networks, are going to be out there communicating, sometimes without much infrastructure, with a sort of grant-free type of access, and sometimes with fairly low-complexity terminals like sensors and so forth. So here's an example where the
current type of higher-layer provision of security could break down. So this is a place where physical layer security might provide solutions that we just can't provide in other ways. We might.

And for IoT, security is a major concern. There's been a lot of publicity about IoT security; you can read about it in the press all the time, about hacking of various kinds of IoT devices. And since these devices are going to be used for physical security and for things like health care and so forth, there's a serious concern that the security of IoT will not stand up to the societal importance of some of these applications.

One example of what can go wrong is in this paper from USENIX a couple of years ago, where we looked at what would happen if a botnet could get loose and control a lot of high-wattage IoT devices. So if you have a Nest in your house controlling your air conditioner, refrigerator, and so forth, what might happen if a botnet the size of, say, Mirai, the Mirai botnet, which is a botnet that brought down the internet, got control of something like water heaters? It turns out that with a Mirai-sized botnet controlling water heaters, you can change the demand for electricity instantly by three gigawatts, which is on the same scale as the largest currently deployed nuclear plant. So just using the vulnerability of IoT, it would be possible, at least in principle, to inflict damage on the electricity grid equivalent to flipping off and on the largest nuclear plant out there. So you can cause cascading failures or tripping of generators and so forth, all the kinds of things that that kind of change in load would instigate. So there's a lot that can go wrong, and this is a serious issue.

So what can physical layer
security offer? Well, there are multiple levels of security, of course: node authentication, message integrity, message confidentiality, and so forth, and there are different aspects of physical layer security that address these various functions. Now, some of them are not information-theoretic. There are information-theoretic issues in all of them, but many of them are not information-theoretic. But one place where information theory has a lot to say is in the issue of message confidentiality, and so that's what I'm going to focus on in my talk today.

So here's an overview of what I'm going to talk about. First of all, the theme is a role for information theory in this area; this is an information theory seminar. I'm going to focus on two things: message confidentiality in data transmission, and privacy in sensing systems. So this is focused on IoT-type applications. And then at the end, depending on how much time I have, I might talk about a few other issues where information theory has something to add: authentication, security in mobile ad hoc networks, and attacks on sensor networks. We'll see how much time we have.

So let's talk about message confidentiality. First of all, let me repeat, in slightly different words, what I said at the beginning. Security in wireless networks has traditionally been a higher-layer issue, and that's particularly true of message confidentiality, where the main technique is encryption. So there you deal with all the issues of key management and so forth. Now, as I said, if you have a massive number of devices, especially with no or light infrastructure, this is going to be a problem, and also if you have low-cost terminals. And increasingly of importance is low latency, and if you do encryption, you add latency. So low-cost, low-latency networks, URLLC-type things, with massive numbers of devices are going to be a problem for conventional approaches to
message confidentiality. So again, physical layer security provides security by exploiting things other than the difficulty of computations; namely, it exploits imperfections in the physical channel, and in the wireless channel we're talking about noise, fading, and so forth. The idea of physical layer security, at least for message confidentiality, is to replace something you're already doing, that is, error control coding for reliability, with joint coding for reliability and security. So the complexity of this kind of thing is more or less like the complexity of error control coding rather than the complexity of encryption.

(I'm going to kill that if I can.) Let me say a little bit about the genesis of using information theory to try to understand security, and that goes back, like almost everything else in information theory, to Shannon. Shannon, in his 1949 paper, so not his most famous paper, looked at the issue of secrecy in cipher systems. He looked at a three-terminal network where you have Alice, a source that wants to transmit a message to a legitimate receiver, Bob, in the presence of an eavesdropper, Eve, who, just like Bob, can perfectly see the output of Alice's transmitter. So there's no noise in this model, but Alice and Bob share a secret key. Alice is going to use the secret key to encrypt her message, or to encipher her message, and Bob knows that key, so Bob can decipher the message. Eve does not know the key. So Shannon was interested in the conditions on the key that need to hold so that these messages can be sent in perfect secrecy.

And Shannon in this context had good news and bad news. The good news was that with such a model you can transmit a message in perfect secrecy. But the bad news is that in order to do that,
the key has to have as much entropy as the source, or the message, that you're trying to transmit. So that's kind of bad news, at least for us, because if we're in a wireless network, that means that in order to get perfect secrecy in the encipherment system we need to have another channel, or some way that we can transmit as much key as we have message. (I'm trying to get rid of this. Maybe if somebody could admit this person, that would get rid of that. Here we go.) So that's bad news, because for a wireless network that's just not going to work.

In fact, Shannon was interested in a one-time pad system like this, but it was a radio telephone system during World War II between Roosevelt in the United States and Churchill in London. They did have a one-time pad system: they were able to put the key on a recording, which could be carried physically to London, so they had a back channel over which they could carry the key. But in a wireless network we just can't do this. In fact, very few, if any, modern cryptosystems use one-time pads. Most of them use a key with a much lower entropy than the message that it's going to encrypt, and then that key is blown up into a longer key using something like public key, where the security is not because of its entropy; it's because of the difficulty of reversing the computations that are used to create the key. So this is Shannon's take on this problem.

Now, if we zoom forward into the '70s, another way of looking at information-theoretic security is due to Aaron Wyner. Wyner had the same three-terminal model, Alice, Bob, and Eve, but he recognized that in real communication channels there's noise. So there's a noisy channel between Alice and Bob, and another noisy channel between Alice and Eve, not the same one. So Wyner took away the secret key.
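
Shannon's perfect-secrecy setting described above can be illustrated with a one-time pad over bytes. The following is a minimal sketch, not something shown in the talk: the message text, the function name `otp_xor`, and the use of Python's `secrets` module to draw the key are all assumptions made for illustration. It shows the cost Shannon identified, namely that the shared key has to be as long as the message it protects.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of identical length (one-time pad).

    The same function enciphers and deciphers, because XOR is its own inverse.
    """
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Hypothetical message; any byte string works.
message = b"meet at the usual place"

# Alice and Bob must share this key in advance over some other channel.
# It carries as many random bits as the message has bits.
key = secrets.token_bytes(len(message))

ciphertext = otp_xor(message, key)    # what Eve observes
recovered = otp_xor(ciphertext, key)  # what Bob computes with the shared key

print(len(message), len(key), recovered == message)
```

Reusing the same key for a second message breaks the guarantee, so the amount of key material has to grow with the amount of traffic, which is the practical obstacle the talk points to for wireless networks.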
So now Alice and Bob don't share any secret at all; they just have a different noisy channel from Eve. And Wyner was interested in the question of whether you could transmit reliably to Bob while keeping the message in complete secrecy from Eve. So he looked at the trade-off between the reliable rate to Bob versus the equivocation at Eve. The equivocation is just the conditional entropy of the message given Eve's measurements. If that equivocation equals the entropy of the message, then Eve doesn't get anything. Or another way to look at it: if that equivocation equals the rate to Bob, maybe the rate is less than the full message rate, but whatever is being transmitted is kept secret from Eve. So Wyner defined the secrecy capacity of this so-called wiretap channel to be the maximum reliable rate to Bob such that the reliable rate equals the equivocation at Eve.

And Wyner had good news and bad news too. His good news was that the secrecy capacity can be positive in such a setting, but it's positive if and only if Z, that is, Eve's measurement, is somehow degraded relative to Y, Bob's measurement. So for example, you might think about Eve having a lower signal-to-noise ratio in a Gaussian channel.

Now, Wyner wasn't thinking about wireless, because, first of all, the name of his channel is the wiretap channel. He worked at Bell Labs, that was a telephone company, and this was in the '70s, before the widespread use of wireless in modern communications like we have today. But that use of wireless has created a resurgence of interest in these ideas in our day, particularly because of some of the reasons I mentioned connected to IoT and other networks of that type.

So in general, the message we get from Shannon and Wyner is that, in order to transmit in secret, the legitimate receiver has to have some kind of advantage over the eavesdropper. It has to be either a shared secret with
the transmitter, like in Shannon's model, or a better channel, like in Wyner's model. So that's the insight we get from those two studies. Now, how does that play out in wireless channels?

Well, wireless channels have some good physical properties, namely diffusion and superposition, that provide opportunities for this. Diffusion and superposition cause fading, which, as we know, is good for us in terms of reliability and diversity, but it also provides natural degradedness over time. So the eavesdropper can be degraded by fading, and if we can take advantage of that, then we're getting some free secret transmissions without using a secret key. Also, interference, which is a part of the superposition property of radio propagation, allows active countermeasures to eavesdropping; that is, we can jam an eavesdropper, and that degrades it. And also we can use spatial diversity, that is, multiple-antenna systems, relays, and so forth, to help us create secrecy degrees of freedom. So all these radio properties that we take advantage of for other reasons are also useful for degrading the eavesdropper.

Another property of radio channels is that they're random. That comes, of course, from diffusion and superposition and the sort of random nature of reverberation and so forth. But they're also reciprocal, which means two terminals that are communicating with one another see the same channel gains in each direction. So that's a source of common randomness that two terminals, like Bob and Alice, can use for key generation. Of course, an eavesdropper has a different channel, so the eavesdropper can't really tap into that common randomness.

So the first three of these phenomena have been extensively explored to develop secrecy capacity regions for all the fundamental channel models that we usually use to understand wireless networks, and I'll say a little bit about the
third one, or rather the fourth one, in a minute, but let me talk about that now. So here are the main models: a broadcast channel, a single transmitter and multiple receivers; a multiple access channel, multiple transmitters and a single receiver; interference channels, multiple transmitters and multiple receivers; relay channels, MIMO channels, and so forth. And all these channels have been examined through this lens of information-theoretic security.

There are some distinctions. For example, in the multiple access channel we don't usually worry about cross-communication between the transmitters, but in this case those are the eavesdroppers. If you have two transmitting terminals transmitting to a common receiver, so this is an uplink problem, you might think of the other transmitters in the multiple access channel as being eavesdroppers on their counterparts. So you have to think about a slightly more complicated multiple access channel, but that can be used to examine the information-theoretic security of this channel. I'm not going to talk about all of these; I'll talk about the first one briefly. But there's a paper with Rafael Schaefer in the Proceedings of the National Academy of Sciences that covers a lot of what was known, at least up until 2017, in these areas.

So let's look, as an example, at the Gaussian broadcast channel. This is actually the so-called broadcast channel with confidential messages. Here we have one transmitter and two receivers, and there are two messages: one is a broadcast message for both receivers, and one is a confidential message. So M2 is a confidential message that should be transmitted to Bob 1, that is, the top receiver, and kept secret from the bottom receiver. So with regard to message M1, the lower receiver is a legitimate receiver, but with regard to message M2 it's an eavesdropper. So for M1 this is a broadcast channel; for M2 this is a
wiretap channel. You might think about this as a model for a content distribution system, where some content, like the common message, is basic content that everyone in the system receives, and some content, like M2, is premium content, which is intended only for those who paid for it. So naturally, in this system you don't want those who haven't paid for it to be able to read that message.

What I've shown here, for the case where the signal-to-noise ratio at the top receiver is 10 dB and the signal-to-noise ratio at the bottom receiver is 5 dB, is the capacity region. Along the horizontal axis is the common message rate, and on the vertical axis is the confidential message rate. If we don't impose confidentiality on that message, all the rate pairs below the red line are feasible, or achievable, and that's the capacity region of that broadcast channel. On the other hand, if we impose perfect secrecy on the message M2, the blue dashed line shows the outer boundary of the secrecy capacity region. So you can see that what happens is the rate for the secret message drops, and that additional capacity between the dashed line and the red line is capacity that's used to confuse the eavesdropper. So additional codewords are used that don't really carry information but rather are used to confuse the eavesdropper.

And if we look at what happens versus the signal-to-noise ratio of the secondary receiver, it looks something like this. Here we fix the primary receiver, the top receiver, to have 10 dB of SNR, and for the other receiver we look at what happens as we vary the SNR. In particular, as we decrease the SNR, you can see that the range of common rates shrinks and the range of secret rates increases. That's just because, in transmitting the common message, the secondary receiver is the bottleneck, so if we
lower its SNR, then we're naturally going to lower the rates at which we can transmit that message. On the other hand, as we lower the SNR at the secondary receiver, it's easier to transmit the secret message, so you can see that happening here. Now, note that the signal-to-noise ratio at the secondary receiver is always less, in this case, than that of the primary receiver, so we're always in the degraded situation. Wyner tells us that if the secondary receiver had a signal-to-noise ratio of 10 dB, then the secrecy capacity region would collapse.

Now, we get a slightly different story if we go to the fading Gaussian broadcast channel. This is the same channel, except now we assume the noise level at both receivers is the same, but there's fading between Alice and the two receivers. The fading is Rayleigh: it's Rayleigh with unit parameter going to Bob, Bob 1 I should say, and it's Rayleigh with parameter sigma-2 going to the other receiver. Now, if you remember your communications, you know that a smaller sigma-2 means more intense fading, so that's like degrading the secondary receiver more. And here we see the same kind of thing: as sigma-2 gets smaller, the range of common rates diminishes and the range of secret rates goes up. But a distinction here from the Gaussian broadcast channel is that the secrecy capacity region never collapses. You can see that the red line is the case where the two channels are statistically identical, so there's no degradation or anything here, but in fact the secret message can still be transmitted with positive rate, in fact a pretty healthy rate. What's happening there is basically that, even though statistically the two channels are identical, there are times when the secondary receiver is going to be degraded just because of the fading fluctuations and so forth. So during those times secret messages can be
transmitted, and when the first receiver is degraded, you can use that time to transmit the common message. So that's how you can take advantage of fading here to get a positive secrecy rate even when you have a non-degraded eavesdropper.

So now I want to talk a little bit more about the wiretap channel, that was the broadcast channel, and talk about another issue. Just going back to Wyner's channel, this is just a different picture of it. Here's the basic setup: we have a message, now I'm calling it W, which is going to be encoded into n channel uses. It's going to go into the channel, and then there are two outputs of the channel: there's Bob's output and Eve's output, so you get n output symbols for each. Both are going to try to recreate the source, and we're interested in having Bob recreate it perfectly and Eve not be able to recreate it at all. So we can look at epsilon, the error at Bob's output, which we want to go to zero, and the leakage at Eve's output, which we also want to go to zero. Leakage will be a measure of equivocation or something like that.

So the way we get Wyner's result is we let the block length go to infinity. This is a classical asymptotic information-theoretic formulation: let the probability of error go to zero and the information leakage go to zero, and then we get the secrecy capacity. Under some conditions, not all conditions, that capacity has a very nice formula: it's just the maximum over the input distribution of the difference in mutual information between Alice's channel to Bob and Alice's channel to Eve. So this is not a completely general formula, but it's somewhat general. That's basically Wyner's result, maybe oversimplifying a little bit. But a limitation of this result is that it's asymptotic; the block length goes to
infinity, which is a problem. So how can we deal with that? Well, let me just remind you of something else in information theory, and that is finite block length information theory. The most classical problem in information theory, the most basic one, is the picture at the top here. We have a source W that has M possible values. It's encoded by an encoder into n channel uses, those channel symbols go into the channel, they create output symbols, those output symbols go into the decoder, and the decoder tries to reconstruct the source with as much fidelity as possible. In such a model we're interested in an (n, M, epsilon) code, for a source with M possible values, so log-2 M bits, which we'd like to be reconstructed with fidelity no worse than epsilon using n channel uses. We're interested in a fundamental quantity, which is the maximum M, M-star, that we can get given n and epsilon. So that's the capacity of the channel, if you will.

Now, we know that if we take the log of M-star, divide by n, and let n go to infinity and epsilon go to zero, we're going to get the channel capacity. That's Shannon's basic result, or one of Shannon's basic results. But what if we're in a situation like IoT or some other kind of latency-constrained application, where we can't let n go to infinity and we're not going to get epsilon going to zero either? In that case, what can we do?

Well, in Yury Polyanskiy's thesis ten years ago, he developed a formula for that. Basically, the log of M-star has a leading term, n times the capacity, which you know it does from Shannon, but it also has a correction term. The second-order term is a root-n term, which is multiplied by the inverse of the unit Gaussian tail evaluated at epsilon, times another quantity called the dispersion, V. So what is the dispersion? Well, the
capacity, as you know, is the expected value of the information density evaluated at random variables X-star and Y-star, where X-star has the optimal input distribution and Y-star has the corresponding output distribution. So the first moment of that random variable is the capacity, and the variance of that random variable is the dispersion. That is, the next term in the expansion of the capacity, if you will, which is a one-over-root-n term if you divide everything through by n, is the variance of the same random variable that the leading term is the expectation of.

So this is a very good formula. We can look at it for a particular case: an additive white Gaussian noise channel with an SNR of 0 dB and epsilon, the bound on the error probability, of 10 to the minus 3. This particular channel is chosen so that the capacity is one half. What we show here, versus n, is the log of M-star over n, so it's the maximum rate. We don't have an exact expression for it, but we have an upper bound for it, a converse, and we have a lower bound for it, which is an achievability result, and it has to be squeezed between those two. And the approximation that you're seeing there is a modification of that formula I just showed you. So you can see, first of all, that we have a really good handle on what the actual capacity is; there's not much room between the converse and achievability bounds. But you also see that for small packet sizes, like you have in IoT, down in here, there's quite a bit of gap to capacity when you go to finite block length. So that's kind of a message for general communications.

And what does that say about the wiretap channel? Well, more recently, with Wei Yang and Rafael Schaefer, we looked at the wiretap channel in the same context. So here's Wyner's wiretap channel, the same model I showed before, except now
we're not going to let n go to infinity and we're not going to let epsilon go to zero and we're not going to let delta go to zero so now we have a maximum secrecy rate which depends on n epsilon and delta so just like in the case of the normal transmission we have a maximum rate depending on n and epsilon and then we can look at different types of channels we don't have a general formula for this but we can look at different types of channels so this is the so-called semi-deterministic wiretap channel with the binary symmetric eavesdropper channel so here the legitimate channel is deterministic the eavesdropper channel is a binary symmetric channel and the rate as a function of block length fidelity and leakage has a very similar form to the polyanskiy formula there's a slight difference here v is not the usual dispersion it's a dispersion related to the secrecy capacity and we also have delta and epsilon here that we wouldn't have ordinarily okay in the other case but here for this channel you can see a very similar picture the secrecy capacity here the asymptotic secrecy capacity of wyner is one half but again we have an achievability bound and a converse and again you can see that for small packet lengths there's quite a gap between the actual maximum secrecy rate and the secrecy capacity okay this was by the way at the i t symposium in 2017 but you can also look at the gaussian wiretap channel here's a similar one and oh there's a typo here bob's snr is 3 db eve's snr is -3 db so this is a degraded channel epsilon and delta are 10 to the minus 3.
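A minimal sketch of the finite block length picture described above, assuming the standard normal approximation for the AWGN channel, log2 M*(n, eps)/n ≈ C - sqrt(V/n) Q^{-1}(eps) + log2(n)/(2n), with the usual AWGN capacity and dispersion expressions. The function names are illustrative and do not come from the talk. The wiretap achievability and converse bounds shown on the slides are not reproduced here; only the asymptotic Gaussian secrecy capacity for the stated SNRs (3 dB for Bob, -3 dB for Eve) is computed.

```python
import math
from statistics import NormalDist


def db_to_linear(db):
    """Convert a value in dB to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def awgn_capacity(snr):
    """Capacity of the real AWGN channel in bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)


def awgn_dispersion(snr):
    """Dispersion of the real AWGN channel in bits squared per channel use."""
    return snr * (snr + 2.0) / (2.0 * (snr + 1.0) ** 2) * math.log2(math.e) ** 2


def q_inv(eps):
    """Inverse of the Gaussian tail function Q."""
    return NormalDist().inv_cdf(1.0 - eps)


def normal_approx_rate(n, snr, eps):
    """Normal approximation to the maximum rate at block length n and error probability eps."""
    c = awgn_capacity(snr)
    v = awgn_dispersion(snr)
    return c - math.sqrt(v / n) * q_inv(eps) + math.log2(n) / (2.0 * n)


def gaussian_wiretap_secrecy_capacity(snr_bob, snr_eve):
    """Asymptotic secrecy capacity of the degraded Gaussian wiretap channel."""
    return max(0.0, awgn_capacity(snr_bob) - awgn_capacity(snr_eve))


if __name__ == "__main__":
    snr = db_to_linear(0.0)  # 0 dB, so the capacity is one half
    for n in (100, 200, 500, 1000, 2000):
        print(n, round(normal_approx_rate(n, snr, 1e-3), 4))
    print(gaussian_wiretap_secrecy_capacity(db_to_linear(3.0), db_to_linear(-3.0)))
```

Running the loop shows the approximate rate climbing toward one half as n grows, which is the gap to capacity at short packet lengths that the plot illustrates.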
i didn't write that on here and now we can see a similar result although the gap between the achievability and the converse bounds is much bigger but somewhere in here is the actual maximum secrecy rate so again no matter where you are in there there's a pretty big gap to the secrecy capacity all right so let me mention just very briefly i'm not going to talk much about key generation but key generation as i said can be derived from channel reciprocity that is alice and bob can both estimate the channel gain for example and that's a shared secret there are other ways to do that you can use public discussion this goes back to ueli maurer back in the 90s also you can look at relays to help you distill a key and so forth i'm not going to talk about that here i'll just point out a book chapter and i'm going to come back to references later but i'll just point out the book chapter that talks about these various issues and that chapter i mentioned with ursi and mahdi also has some of this material in it okay so i've got 15 minutes i want to shift gears and use maybe 10 of those to talk about privacy so first of all so far i've been talking about secrecy that is transmitting a message from one transmitter to a legitimate receiver and keeping it secret from an eavesdropper okay so that's secrecy but that's different from privacy in privacy you don't really have an eavesdropper or there may be one there but that's taken care of by security issues in privacy you're worried about the actual legitimate receiver getting information beyond the information that you'd like him to receive okay so the eavesdropper here is not a malicious actor necessarily it's somebody that you trust and you want to share information with but you don't want to share everything now you don't want to apply secrecy here because denial of access that is secrecy would make the data source
useless right so we have to think about this another way than when we were just trying to provide secrecy and we can do that through the so-called privacy utility trade-off so we can think about iot systems they're generating a lot of data okay so they're data sources and that data is being generated because it's useful right and the utility of that data depends on its accessibility so at one extreme you'd like to make the data as accessible as possible to get the maximum amount of utility on the other hand accessibility of the data endangers its privacy so at the other extreme you might want to keep the data completely secret in order to maximally preserve privacy so there's a fundamental trade-off between these two opposing goals of making the data accessible and giving it utility and making it inaccessible and making it private so that's a fundamental trade-off that we can characterize using information theory and so an example would be to measure utility in terms of distortion so as an inverse measure of utility and to measure privacy in terms of equivocation so the more distorted the data is the less useful it is or the less distorted it is the more useful it is and the greater the equivocation at bob the more private the data is okay and we can actually analyze that it's like a rate distortion problem with an equivocation constraint so for various models we can derive the region of potential equivocation utility points and that gives us a utility privacy trade-off region and it has an outer boundary which is an efficient frontier if you will that tells us what the trade-off is optimally so it doesn't tell us where to operate it just says well if we want to have a certain amount of privacy then that's how much utility we can expect or if we need a certain amount of utility we're going to have to give up a certain amount of privacy there's
no way around it okay so that's an information theoretic problem that comes out of this idea of privacy so just as an example i will talk about smart metering so smart electric meters are used for price aware usage load balancing and so forth there are smart meters everywhere these days and they're useful that's why they're there but you know it's well known that smart metering leaks information about in-home activity you can tell what's going on in the house to a certain extent by looking at the smart meter trace so there's a utility privacy trade-off in smart metering and it can be studied in this context in particular a lot of the privacy is given up by these transients so if you turn on the toaster you turn on the oven the tv or the coffee pot that's a definite activity so somebody knows someone's in the house or knows where you are in the house and so forth so those intermittencies are somehow the most privacy revealing so we can think about modeling the trace as a hidden markov model and we're trying to protect the hidden intermittency state that's what we're trying to protect from being released and it turns out that if we formulate this problem this way we can see that the privacy utility trade-off leads to a spectral reverse water filling solution that is we look at the spectrum of the trace and we draw a water level phi and all the components whose energy or power is below phi are suppressed and those whose power is above phi we allow through and the idea here is that we're suppressing those transients which have a low average power and we're allowing things like the air conditioner and the refrigerator and so forth where most of the power is being consumed to go through in terms of measuring usage and then if you move the
water level up and down you basically trace out the efficient frontier that is lower water level is greater utility and lower privacy higher water level is less utility and more privacy another way to approach this problem is through using control if we have a battery then we can control the battery storage in order to minimize information flow out of the meter to the utility provider and then we have a trade-off between information leakage and wasted energy because by doing this we may waste some energy we're not optimizing energy usage we're optimizing privacy and this turns out to be a markov decision process it's very complicated to solve analytically but we can solve it numerically and under certain simplifying assumptions we can trace out the trade-off region all these dots are various points in the region and we can get the lower boundary of that which is the efficient frontier which tells us the optimal trade-off in this case now another problem in this general area is that of competitive privacy and this maybe applies to certain sensing applications and things of that sort where we have multiple interacting but competing agents that have coupled measurements and each of these agents wants to estimate some parameters it has its own state but since they have coupled measurements they can help each other by sharing data but on the other hand if they share data they risk compromising privacy so each one of these agents has a privacy utility trade-off but it's in a competitive environment so for situations like that we might ask how should such agents interact and just to give you a couple of examples i got into this problem through the electricity grid where you have multiple companies sharing the same distribution grid or transmission grid depending on which level you look at and so they have to manage that
grid they want it to be optimized load matches demand but they're private companies so if they share data they can optimize state estimation of the grid but then they are at risk of maybe giving up some private information so this is a competitive privacy situation another situation might be in a field of battle where you have allies who are really just allies because they have a common enemy and they don't trust each other but they want to defeat that enemy so they want to share information but they don't want to share too much because they want to help each other but they're not completely trusting each other so they don't want to share everything that they have so that's another problem that might come up like this and finally in a sensor network say doing underwater oil exploration you might have two different companies in the same general area looking for oil and each of them could benefit from knowing the other's measurements but on the other hand they don't necessarily want to share because they're competitors so again this is an issue of competitive privacy where each agent has a benefit from sharing but also has a risk of giving up some private information by sharing so we can look at this problem information theoretically we can set up a linear measurement model where each agent measures a linear combination of all the agents' states in noise so the utility for a given agent is the mean squared error for its own state so it wants that to be as small as possible the privacy is again an information leakage to other agents mutual information type leakage or equivocation and if we set it up this way we get basically another classical information theory problem it's just the wyner-ziv problem that is the optimum distributed source coding problem so that tells us
immediately how these agents should exchange information but what it doesn't say is how much information they should exchange it just says what to do when you decide to exchange information it doesn't say where to operate on the trade-off between privacy and utility so here because it's a competitive problem it's a game theoretic problem and this leads to a lot of interesting problems as well i'm not going to get into it but you can see the value of pricing multiplayer games cooperation and so forth in terms of giving this problem meaningful solutions all right i'm out of time so i'm just going to mention these other issues quite briefly so among other primitives we're interested in one is authentication and it's also something that can be looked at information theoretically and i'll just mention that there are ways of developing information theoretic bounds on impersonation and substitution attacks also attacks on mobile ad hoc networks where you have mobile nodes and all the messages are being transmitted by peer-to-peer forwarding
well if some of the nodes are malicious how does that affect the overall secrecy capacity of a network like this okay and the scaling laws for that can be derived in this paper that i mention here and then finally we can look at attacks on sensor networks man in the middle and spoofing attacks also from somewhat of an information theoretic lens where we analyze the cramer-rao lower bound it's not exactly an information theoretic quantity but it's related in the presence of these kinds of attacks so i'm not going to talk more about that because i'm about out of time i'm just going to skip through that and then i'll just wrap up with a couple of comments and then i'll take questions so the basic message here today is that information theory can help us understand some fundamental limits of security and privacy in wireless networks of course everything i've talked about here is theory it's information theory these are theoretical constructs just like other aspects of information theory although sometimes they point to potential practical solutions like in the smart meter example in order to use this in real networks we need a lot of things like more finite block length analysis scaling laws for large networks all i talked about were just small two terminal networks two three terminal networks practical coding schemes there's been a lot of work in this area so there are some things but again a lot more needs to be done and then as i mentioned at the beginning other security primitives signatures certificates and so forth and then one point i'd like to make here is that one of the issues that also comes up with physical layer security is that it's not really ironclad that is it's based on certain assumptions about the eavesdropper or maybe certain conditions for the eavesdropper which may not hold all the time so you know if you're going to do banking you might
want to use encryption but if you have like a sensing system or a content distribution system you may not worry so much about having bulletproof security so for example in a content distribution system all you really care about is disrupting the content enough so that an eavesdropper or you know a non-premium user if you will is not going to want to watch content that's being disrupted all the time by channel conditions okay using a wiretap code so we can think about a quantity like quality of security as a parameter so we have bulletproof security that's one kind of security and for that we need to go all out and use the heaviest encryption we can but if we don't want to pay for that or if we can't do it we might want to use best effort security and then we can just set a level of quality of security just like we do quality of service today in networks so i think when we think about physical layer security we should be thinking in these terms rather than thinking about trying to replace necessarily the kinds of things that we use when we want to go do our banking and that sort of thing which really can't be best effort they have to be bulletproof okay so let me just wrap up by showing these references i mentioned there's a book at the top cambridge university press a paper in pnas i mentioned early on some other papers here that i mentioned through the course of the talk i'll just mention one other thing that i didn't mention which is the fifth one down here with semih yagli and alex dytso you know when you talk about internet of things one of the things you're thinking about is machine learning because data collection is largely the main purpose of iot devices and control so federated learning of course is one of the main learning paradigms for that kind of setting and of course privacy leakage is one of
the concerns with federated learning even though it's supposedly privacy preserving it's really not because privacy can leak through the model or through the model updates when they're transmitted so one thing you can use information theory here for is to develop bounds on that leakage so there's a spawc paper that appeared earlier this year that develops that i think with that i'm done and i'm happy to take questions if there are any if we have time thank you all right thank you very much vince that was a great talk at this point we do have time for questions so are there any questions if you have just type them into the group chat okay please feel free to ask you know any question you want vince will give you some great answers i'm going to ask mehul to answer any hard questions no problem okay i'm going to start off with the first question okay vince can you elaborate a bit more on the difference between privacy and secrecy yeah so i mean of course you can make things private from eavesdroppers by making it secret right so in some ways privacy and secrecy are the same but there's also another aspect of privacy that is when you're sharing information with a trusted party you may not want to share everything right so for example on facebook or a social network you're one of their customers you're in their network you know presumably you've allowed them to you know see your transmission you're not encrypting things to prevent facebook from seeing what's on facebook but you don't want to share everything right so they're like a trusted party but not completely right so privacy is like you have some things you're willing to put out there and some things you want to keep to yourself and so with security you want to make everything opaque whereas with privacy you want to share some things but maybe keep other things private so with privacy it's more
like the receiver is not really malevolent it's just not someone that you want to share everything with i don't know if that helps at all but yeah so in some sense you think secrecy includes privacy right right yeah i mean you might think of secrecy as being an extreme of privacy right okay all right great okay thanks okay while waiting for people to type out their questions it really could be the case that your talk was so crystal clear like maybe everybody is just enlightened so let me ask one more question when you talked about information theoretic results for physical layer security problems so essentially to prove these results the techniques are asymptotic and not constructive so what i want to know is can you maybe just share or maybe give some pointers are there practical coding schemes for these problems yeah well i mean there are yeah so just like other information theory you know the proofs use random codes right so they're not practical but just like with ordinary data transmission without secrecy there are codes that approach capacity right ldpc or polar codes and so forth same thing here you can use ldpc codes for example or you can use polar codes people have used these kinds of codes maybe they don't achieve capacity but they can approach capacity but the fundamental idea is that you use some code words okay to confuse the eavesdropper so it's not too hard to conceive of codes but you know there are issues of you know how good are they and so forth and so on but yeah you might think about you know you have some code words that are purely there to confuse the eavesdropper and so you can do that with any code you don't have to do it with a random code yeah to prove things you want to use a
random code for obvious reasons but for constructing a code you want to use something you can decode easily with low complexity and so forth so there are ldpc codes for wiretap channels and so forth you know but there are limitations and you know just like in any system i think there needs to be a lot more work done on that problem before you know you'll have one in your you know your nest yeah all right cool so let me just make sure everyone understands if you have questions just type them into the group chat otherwise vince and i are going to continue having a really nice chat so while waiting for people i have one more question can i keep going yeah absolutely so you know this is kind of a broad audience right and you know our goal is to spread information theory ideas coding theory ideas to a wide audience so let me share something that maybe first came to my mind when i started thinking about these problems so typically when you think about security what comes to mind is cryptography right like it's not information theory it's cryptography oh i can just encrypt stuff i don't know just maybe for a wider audience are you able to share how cryptography is related to some of the things that you spoke about which also has the word security but also talks about coding and information and stuff like how are these two related so typically in a communication system cryptography is applied on top of the system so you encrypt the message before you transmit it through the system so someone can intercept a message but they can't decrypt it so that's how cryptography works at least in terms of data transmission and so you have to have a key or some kind of way of computing a key and the way that
works i mean if you were to have enough key it would be absolutely impossible to break the encryption and that's right that's just a one-time pad right yeah but that's too expensive you can't really have as much key as you'd like so the way encryption systems work is they have a small key 256 bits not so small but relative to the message that's being encrypted it's pretty small because you're sending out megabits or gigabits right so the reason that's secure is because those 256 bits are used in a clever way so that in order to decode it you have to be able to solve a certain mathematical problem that's very very hard to solve and so the computers today can't solve those problems so they can't decrypt those messages but in principle it's not proven that those problems are hard it's just thought that they're hard and so far that's held up but in principle if you had another kind of computer like a quantum computer you might be able to solve those problems and then the security would just disappear it wouldn't be secure anymore so physical layer security takes advantage of the physics not mathematics it's more physics based than mathematics based it takes advantage of the fact that the radio channel has certain undeniable physical properties that you can't undo right right and so that's really what this is about but of course we can also use mathematics to prove certain things about it but it all relies on models so that's where the flaw in this is that it relies on models so there needs to be more development of practical systems and so forth so that those models can be verified in practice so i think that's what i would say the difference is one relies on the difficulty of computational problems and one relies on physical modeling typically they'll plug together right yeah yeah exactly all right great
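The "enough key" case mentioned in this answer is the one-time pad. A minimal sketch, assuming the key is uniformly random, exactly as long as the message, and never reused; the function names and the sample message are hypothetical and only for illustration.

```python
import secrets


def otp_keygen(length: int) -> bytes:
    """Draw a fresh uniformly random key of the given length in bytes."""
    return secrets.token_bytes(length)


def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with the key; the same call both encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meter reading 42"
key = otp_keygen(len(message))       # as many key bytes as message bytes
ciphertext = otp_xor(message, key)   # encrypt
recovered = otp_xor(ciphertext, key) # decrypt with the same key
```

The cost the answer points to is visible in `otp_keygen(len(message))`: the key grows with the traffic, which is why practical systems reuse a short key and rely on computational hardness instead.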
all right so i was looking through the participant list and i recognized some names so i'm going to arrow them to try to ask a question all right one of my phd students a check thing is here so check if you have a question please type it out and then there's some other really smart people much smarter than me who are here and they work in this area so john i can see you if you're actually online you want to pipe in you want to ask a question so while we're waiting for people to ask their questions like i said we don't need to have questions vince you and i can just continue having our chat so one last question for you which is so you mentioned that you are interested in this because of energy systems right like that's how you got interested in the problem can i check in your experience talking to the energy companies do you find that they're open to these ideas or have you seen any actual real world implementation or move towards that like to try to build physical layer security in well first of all i got interested in the privacy problem through energy systems not necessarily the security problem i got interested in physical layer security because of iot type ideas you know but of course an example of iot is a smart grid right so i mean it does apply there i don't know about energy companies but the department of energy is interested so they're definitely interested in the privacy issues that i just discussed having to do with data collection at the edge of the electricity grid being used for say machine learning to try to optimize the grid i'm working now with some other people on a department of energy study where that's one of the big issues privacy preservation at the edge okay so i don't know whether of course the idea of course doe
is interested because eventually they would like for the utilities to be interested in that and we have partners who are you know national labs that work on these problems so i mean the idea of course is ultimately to transfer to practice so yeah i think there is an interest in that i mean the whole point is that you know the traditional electric grid is very centralized so there is no data from the edge and there's no decision making at the edge but with you know renewables and storage and so forth it's much less centralized now and there's a lot more decision making at the edge right and that's two-way communication part of which is privacy revealing i mean you know it reveals information about the end user so not all of a sudden but you know now there's a lot more interest in these kinds of issues they weren't as worried about it when all the control and all the information flow was downstream but now there's upstream and there's consumers involved so it's much more important okay vince there's a question here from one of my colleagues jonathan scarlett he's an information theorist who works at the intersection of information theory and machine learning and he has a question i know he's good yeah all right great i don't know that we've ever met but i know his work okay all right great so john asks in a way practice has caught up to shannon's communication theory to what extent is that true for these problems and how do you think that will go in the future that's a great question john so well i don't think practice is anywhere near caught up with the theory here i think this is more like shannon's information theory back in the 50s when people were developing codes and everything but they weren't anywhere near capacity or anything like that i think that's where we are with this there's a long way to go before we can feel
comfortable that we've milked this as much as we can so i think we've got a ways to go yeah okay thank you john thank you vince so does anyone else have any questions i can also arrow my team han mao or tsaikwe or tang any questions from the viewpoint of coding well it's okay you know it's late over here man okay great okay so thank you everyone and thank you vince for the awesome talk and please keep in mind vince we're going to keep you in the loop we'll send you the reminders about the other talks we have four more talks lined up for the rest of 2020 and we're going to restart this again in 2021 so vince we'll keep you in the loop and hopefully you can join us the 5 p.m talks which would be 4 a.m for you are probably a bit hard but at least the 9 a.m talks because what we are doing to let everybody know since the speakers are worldwide we picked two times 9 a.m singapore time and 5 p.m singapore time right to make it as convenient as possible for the speaker and also for people in singapore and southeast asia okay so please be on the lookout for emails from us and next wednesday han mao who's on next sorry next tuesday at 9 a.m do you want to just say who that is all right so the next speaker is venkatesan guruswami from cmu carnegie mellon university yeah you know since we talked about catching up to shannon's communication theory well he will be talking about polar codes where the title is arikan meets shannon yeah so about how polar codes are trying to catch up with shannon's capacity yeah thank you he's a theoretical coding theory person so yes a computer scientist essentially yeah so please come for that and remember every tuesday until the end of the year we have a talk and then we'll probably start sometime in
mid-february after chinese new year so that's the plan okay so everyone if you don't mind let's give vince a virtual round of applause i think if you go to the room you can actually do a hand clap so if you go under more in the participants window you can see hand clap so i've done my hand clap and once again thank you very much vince yeah thank you thank you mehul great to see you and thank you for inviting me
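A minimal sketch of the water level rule from the smart-metering part of the talk, assuming the classical reverse water-filling allocation for independent Gaussian components: a component whose power is at or below the water level phi is suppressed (zero rate, distortion equal to its own power), and a component above phi is let through with distortion phi. The function name and the sample powers are hypothetical; the hidden Markov model and the exact formulation used in the talk are not reproduced.

```python
import math


def reverse_water_filling(powers, phi):
    """Return a (distortion, rate_in_bits) pair for each spectral component."""
    allocation = []
    for s in powers:
        if s > phi:
            # Component above the water level: described with distortion phi.
            allocation.append((phi, 0.5 * math.log2(s / phi)))
        else:
            # Component at or below the water level: fully suppressed.
            allocation.append((s, 0.0))
    return allocation


# Sweeping phi traces out a trade-off: a lower phi lets more through
# (more utility, less privacy), a higher phi suppresses more.
spectrum = [4.0, 1.0, 0.25]
for phi in (0.1, 1.0, 5.0):
    alloc = reverse_water_filling(spectrum, phi)
    total_distortion = sum(d for d, _ in alloc)
    total_rate = sum(r for _, r in alloc)
    print(phi, total_distortion, total_rate)
```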
Info
Channel: NTUspms
Views: 1,002
Id: WoiO5AO_lHo
Length: 75min 57sec (4557 seconds)
Published: Thu Nov 26 2020