Application of AI in Cyber Security by Dr. Mohit Sewak | OdinTalk | OdinSchool

Captions

As you know, the topic of today's session is applications of AI in cyber security, with our speaker Dr. Mohit Sewak, who is a principal applied AI researcher at Microsoft. So greetings to our speaker as well, and good evening to you. I think you can go ahead with the hosting now.

Sure, thank you very much. A very good evening to everybody, and welcome to another session of OdinTalk. If you're able to hear me clearly, could I have a yes in the chat, please? Okay, just confirm if you can hear me clearly. All right, great. We have with us Dr. Mohit. Dr. Mohit, can I request you to switch on your camera, please? Great. Today's talk is going to be really interesting, and I'm looking forward to it, because we are going to bring together two areas that have been really popular these days because of their applications: one is AI and the other is cyber security. Dr. Mohit brings these two together and talks about how artificial intelligence can be applied in cyber security. Dr. Mohit, I request you to talk briefly about your journey in AI and how cyber security came to be your area of research, and then continue with your talk today. Over to you.

Great. Let me introduce myself and my journey into AI. My PhD has been in applied AI, very specific to cyber security, and it has been in advanced AI, in areas like deep reinforcement learning, deep learning, computer vision, and NLP; these have been part and parcel of the advanced AI program for me. Currently I'm a principal applied AI researcher for Microsoft Security, which sits at a point where security and compliance are like one vertical for Microsoft, which is called CIF. I started with a lot of applications of large language models on the NLP and compliance side of things, and also for security on different platforms, starting from Windows to cross-platform, which is Android and Linux. I have been working on the applications
of AI for IoT or machine IoT, and before that I have good experience working with vision-related models for e-commerce. So this has been my journey in AI. Security is very special to me. I started with security applications for SIEM-based systems, that is, security information and event management, in which you get a lot of real-time events, and the question is how you use real-time AI to manage those events and detect signals from them. Then came its applications in insider risk management, and then I went into a lot of different areas of security. Currently the topic of my talk is also what I am working on, which is the area of malware detection. We'll be covering a lot of those areas in this talk from an overview perspective, and then I'll deep dive into one of these areas. Thank you. Let me know if you can see my screen also.

Yeah, we can see your screen.

Okay, so this will be the agenda for my talk today. I intend to keep the talk very customized to the audience in this particular session, so I might not be able to cover the entire area that I wanted to, depending on the feedback that I am getting, and I'm very happy to slow down or increase my pace depending on how you want me to go. Before I actually move on, I will cover the cyber security defense landscape, and then I'll try to juxtapose my talk against it. As you can see from the title, even within security the talk has a focus on malware analysis and detection, and from a platform perspective this is Windows malware. So this is like the tip of the iceberg for cyber security, but I will give you the entire cyber security landscape for the different areas. The picture that you see over here corresponds not to a particular vendor but to a generic industry landscape, as different aggregators combine and show it. On the top we have managed detection and response. This is an uber-level enterprise thing; some enterprises will not have it, depending on their size, but I'll
just rewind by one more minute so that you can hear what was broken here. So this is the picture that we see. On the top we have managed detection and response (MDR). This is a component that you'll see mostly in larger enterprises; smaller enterprises may have the components below it in different parts and parcels, but this is the complete set of pieces for some of the larger enterprises. Within the MDR part we have two different parts: one is the intrusion prevention system, which we call IPS, and the other is endpoint detection and response, which is EDR. Within the intrusion prevention system we have the intrusion detection system, and once it detects an intrusion, different actions can be taken to prevent it, so the flow of signals goes from bottom to top. The intrusion prevention system's actions could be dropping malicious packets, blocking malicious sources, resetting connections, or alerting the SIEM-based system; SIEM is the real-time security event system for any of the networks or infrastructure.

Once this intrusion detection system is installed, you will also have different types of intrusion detection system. It could be a perimeter intrusion detection system; here perimeter means the network perimeter of your in-house network versus your cloud or other networks that it is connected to. Then there is a VM intrusion detection system, so if you have any VMs within your cloud or within your servers at your place, those will require some protection. Then there is a host intrusion detection system, so, for example, my host machine may also have an intrusion detection system. But the larger component that you'll find in most of the research or ML/DL surveys, where the AI happens, is the network intrusion detection system, or NIDS. Sometimes NIDS and IDS are used interchangeably, so if someone says only IDS, without respect to a particular part, it is mostly NIDS. In NIDS the application of AI is in
some form of anomaly detection on the networks. Then there are non-AI systems also, something like a signature detection system, in which the packets are captured and the packet signatures are checked, and you come to know whether the packet is malicious or not; or from the network pattern also, the network pattern signature is checked and you see whether the network traffic itself is malicious or not. In this area the ML/DL systems mostly happen around the network intrusion detection system; it could be an anomaly-based system, or it could be advanced ML/AI systems like deep reinforcement learning, personal learning, and these sorts of applications. We are not touching this area in this particular talk.

So we come to the right-hand side. On the right-hand side we start with the endpoint detection and response system. EDR is, we can say, a global system for different subsystems; it is responsible for investigating security incidents and restoring endpoints per se. Within it we have something called the endpoint protection platform (EPP); this is where most of your malware detection systems and other systems combine. One big part over here is the malware detection system, but there are other parts as well. For example, if some data is being leaked out of a system, handling that is also sometimes part of the EPP, or larger organizations may have a different system for what is called data compliance and data governance. In that case, although the system might be different, it might in turn send signals to this endpoint protection platform, or it could be the other way around, with the endpoint protection platform giving signals to the data protection systems. Then, using data loss prevention and endpoint protection, a lot of different insider threat protection systems are also powered, so this is also connected to this particular area. Within data loss prevention you also obviously require user privilege control and data and disk encryption, so that if there
is some confidential data coming in, it should be automatically encrypted.

Now I'll come to the part where this talk's focus is, which is around the malware detection system. What you earlier used to call an antivirus program on your system, the more generic form of it is called a malware detection system, because it deals not just with viruses, which are a very specific type of malware, but with a lot of other malware. So we have anti-rootkit, anti-spyware, anti-adware; this is all classification from first-generation malware systems, which we'll deep dive into, and we'll try to understand what the first generation and the second generation are. Then we have something called an advanced threat protection system. A malware detection system may work mostly on signature matching or other such techniques; if AI-based systems, or sometimes advanced AI systems, are starting to be used, those would fall under advanced threat protection. Advanced threat protection takes two forms. One is advanced threat detection: you first have to identify that there is a threat, and then we also have to define what an advanced threat is. I just named the generations of malware; most of the second-generation malware, which is more difficult to detect using conventional mechanisms, would come under advanced threat detection, and there are some other advanced types of malware, like ransomware, which will also come under advanced threat detection. So one aspect is the advancement of the threat: how sophisticated or complex it is, or how hard it is to detect. The second is the mode in which it comes and how it targets a particular machine. Sometimes there are very high-value machines, and someone wants to bring them down or do some malicious activity on them; they may use a zero-day vulnerability of that particular platform, which is called a zero-day attack, or they may have multiple malware attacking that particular machine
simultaneously, in which case we call them swarm attacks. These are also within the paradigm of advanced threat protection. In this particular talk we'll be focusing on this area, which is advanced threat detection, and we'll be talking about polymorphic and metamorphic malware, and to some extent we'll also be talking about advanced attack mode protection, like the zero-day attack. I'll take a pause and see if there are any chat comments or questions, and depending on that we'll go to the next section of the slides, where we'll be talking about the background and the preliminaries of Windows malware. Okay, so far I do not see any questions or comments, so I'll just go ahead. I hope my screen is still visible.

So let's define this problem of malware. Malware is actually a conjunction of two different terms, malicious and software. If there is software that has entered your machine without your authorization, you are not aware of it, and it is trying to do an undesirable activity, that is what you will call malicious software. Obviously there will be much software that is ill designed: it's supposed to do something, but it's hogging your memory and your CPU. But such software is running because of your authorization and you are aware of it; although there are undesirable activities, it is also supposed to do some desirable activities. In this case it is not malware, it's badly designed software. We are talking about very specific malware, or malicious software, which has entered your machine without your authorization.

The history of malware goes back to 1971, when an engineer named Robert Thomas at BBN designed a piece of software that he named Creeper. That software was designed just to showcase his knowledge about software systems and what it takes to design this type of system, but soon it started being used as malware, and since then an arms race has developed between malware developers and defenders. In fact, the very first antivirus
program (at that point in time it was called an antivirus program, not an anti-malware program), the one that came as a defense against that particular malware, Creeper, was named Reaper. Some time back the founder of the McAfee antivirus system, John McAfee, said that the problem of malware is a very temporary problem, and as per him it should have been solved in two years. But now we understand that this problem is actually a really great problem. It's an arms race between two parties: the malware developers will keep evolving and keep developing better malware that is more difficult to detect, and as malware defenders it's our duty to upgrade our systems and software so that we can detect it.

It used to be about signature matching. If you download a particular file, you compute a checksum of it just to see whether you have downloaded the complete file; the checksum is unique to that particular file. But of course even in these unique systems, which are called hashing systems, there are different types of collisions, so there is a system called the SHA-based system, wherein collisions are minimal and which can also be computed more efficiently. That is what we call the signature of a particular file. The signature could be derived using different techniques; SHA is just one of them, and there are different techniques if it isn't SHA, but through any of these mechanisms, if you can get a unique value for a particular file, that can be called its signature. If you keep a repository of the signatures of all the malware that you have seen in the past and keep matching it against the particular file that you are seeing in real time, that sort of system can simply be called a signature matching system. Most of the earlier systems that we have seen used to be signature based. Now the pros of signature matching are, of course, that once you provide a particular file, it's very easy to get a signature of
that file, and it's also easy to match an exact signature. But of course there are downsides also, because we have so many files, and if you want a signature matching system that works uniquely for those billions or trillions of files that you have already seen, you would require a very large memory space just to store and keep them.

Now we come to the evolution of malware. Although we may think that malware is a small problem, we see that the number of malware is actually increasing year over year and month over month. Recently we have seen that the rate of malware evolution has crossed a hundred million a year, and in some years we get even 130, 150, or 170 million new malware. The total number of unique malware that we are sitting on is over 1 billion. And when I say unique malware, there will be a lot of different copies of that malware that have different signatures; we are not even talking about them. This is actually a bigger problem, and once we go inside and see how these things are made and how they can be differentiated across different devices to have a different signature, we'll understand how great this problem is. We also see that this problem is getting more and more pronounced because of how we are moving to a work-from-home economy and a digital lifestyle, and we now have multiples of devices, right from enterprise devices to consumer devices, mobiles, and laptops, and each of these is a different platform and might offer different opportunities for malware developers to perpetrate an attack. Earlier, with mobile devices coming into the picture and people using more and more mobile devices, most of the people who started using a mobile device for the first time were not as security literate as some of us might be, and there was a hypothesis that it is the mobile device that is more vulnerable to any of those attacks and that is where we will require protection. A lot of
the workforce moved from server-based device protection to mobile device protection, but we still see that Windows malware or server-based malware is much more prevalent than any mobile malware. The number one mobile platform today, from an adoption perspective and a number-of-devices perspective, is Android. If we compare Android malware to Windows malware in terms of the volume of known malware, we see it is currently around three percent of the Windows malware. In fact, even at its peak, when mobile devices were new, new people were coming in, and it was very lucrative to make malware, it was only at around six percent. That is why this talk is focused more on Windows malware.

Now, there are two problems that we see on the signature side of things. One is how many malware you can capture through signatures, and the second is how often those signatures can be updated, which is more of a technical or engineering problem, wherein you get updates for your anti-malware or antivirus program and those updates contain many new signatures of the malware that has come in recently. That is not a problem per se. The bigger problem comes once the malware starts evolving, which means that I have a base malware today and with just slight changes I can keep producing new malware. The problem with signature-based systems is that they do an exact signature match, so even if there are some changes merely in the packaging of a particular malware, the match fails. If I zip a file, unzip it, and zip it again, it is pretty likely that the signature of the new zip package will change. This is a very crude example per se; we do not do malware detection at this level, we do it at the file level, but this is just to showcase how even slight unpackaging and repackaging can change the signature of a particular file, which means that even for a malware, this sort of
packaging, unpackaging, and repackaging is just one operation; you can do multiple types of operations in which, without changing the malware, you can keep changing the signature of that particular malware. And with the signature changing, these anti-malware systems, which are based on signature matching, cannot work. That is where the role of machine learning comes in.

There are two sides of machine learning that I'll be talking about. One is what I will be calling conventional machine learning or classical machine learning. I know you are all from a data science background, and you might be learning about a lot of different data science applications and algorithms. The types that I talked about, reinforcement learning, deep learning, computer vision, NLP, or applications of deep learning models for computer vision and NLP, are what I will be calling advanced. Anything that is not deep and advanced learning, anything before that, like your random forests, decision trees, even ANNs, which are artificial neural networks that are non-deep, will come under the conventional side of things as I use the term in this particular talk. The first evolution was that we started using conventional ML to detect this particular malware, and conventional ML-based systems have had a lot of success, but because of the things that we were just talking about, we see that even those conventional ML-based systems are not sufficient these days.

So, to come to the list of problems that we face today even after the evolution to conventional ML, these are the four points that are the motivation for this particular talk going in. First, the changes that we see in the industry, where malware developers themselves have become organized. Malware development is no longer a matter of a particular person's hobby, or a particular smaller group's initiative, either protecting their own state or acting on behalf of their
state, or for other particular reasons; it is now a full-fledged industry. Like you have in a corporate, where one department is thinking about marketing, a second department is thinking about mergers and integration, and so on and so forth, and finance, a similar thing is now happening in the world of malware. So take some sophisticated malware like ransomware: now there will be some groups that do initial access broker development. An initial access broker is a type of malware that is injected into your system or a target system and opens a backdoor for other teams, who penetrate into your system, take control of it, and try to do some other malicious activities, so that finally a ransomware is dropped. So there will be different teams working in conjunction and collaboration with each other to compromise a network or a particular machine. The malware development and attack ecosystem has come up as an industry, and we need more resources to deal with such complex and intricate structures.

The second thing that we talked about is that signature-based techniques are still mainstream; most detection, like 80 to 90 percent you could say, still happens through signature-based systems. But they fail because of all the techniques that we talked about, either packaging and repackaging or more sophisticated techniques, which are called obfuscation. We'll be talking a bit more about obfuscation in the later slides; it is a technique in which, at the file level, you change something such that the file is not recognized as that particular malware. That something could be a change at the code level: you can leave some blank spaces in the code and recompile it, or you can add some comments in the code. Or it could be at the lower level: we write the high-level code, after that comes the compiled code, and the compiled code goes into your kernel, and at that level it's called an opcode, or operation
codes. So at the opcode level you can introduce something and then recompile it back. All of these are different ways of doing obfuscation, and with any of these very minor obfuscations the signature also changes. That is where the analogy that I used for packaging carries over to malware. So all of the signature-based techniques can be evaded with this. That is the second problem.

The third problem, which makes this very complex, is the advanced mode of attack. It is no longer the case that you have one random malware and you assume that once this malware lands on a particular machine, that machine will get infected, and somehow either I will gain something out of it or, for the fun of it, that machine goes down. It doesn't happen this way now. You know, a lot of countries are fighting against each other, and their first line of defense nowadays is no longer conventional weapons; it is cyber systems that are being compromised, and one state actor can have another state actor's defense mechanism brought down with cyber attacks. In these types of sophisticated attacks it is not just one malware that is used, so it's not only complex malware but also a complex mode of attack. Next in the talk we'll be talking about how Windows malware has evolved, and we'll be talking about polymorphic malware. This is the core challenge that we are facing: Windows malware has become very, very complex for even conventional ML to detect. Not just signature matching, even conventional ML is not able to detect those types of advanced malware.

The fourth problem that we see is that when we used to design systems with conventional or classical ML, it took a lot of expertise to design even the features, not just the model. Even before the model, the features that went into the model required a lot of experimentation, so that you could get a very good model which has a very
good detection rate and is also very efficient to work with. Now, with the speed of evolution that this malware development industry is going through, these sorts of systems are very expensive and very slow to adapt. That is also where a lot of the motivation for this particular talk comes from.

So in this particular talk we will be exploring different types of advanced AI technologies and how we use them for different applications of malware defense. We'll be talking about different types of technology, starting from RNNs, LSTMs, deep neural networks, reinforcement learning, deep reinforcement learning, and deep clustering, and their applications. That is one aspect of it. But if you have taken some sessions on deep learning, or if you have tried to deploy any deep learning models, you would understand that there are a lot of practical implications of doing a deep learning task in production or doing inference with deep learning networks. One of these is that they are computationally very expensive. If you have to infer across billions or trillions of files, they are going to be very expensive, because not only will you need that many more machines, but these machines will also require GPUs, which are much more expensive than CPU-based machines. The second part is that for training you also require much more data than you require for classical ML. Sometimes it can be 1,000 times more data if you're lucky, but most of the time it is 10,000 or a million times more data for any advanced application. And the third, which is very specific to security, is that the data that we use in security is very scarce, which means that we do not have enough labeled data to create these types of systems. For example, compare this to a large language model: you can crawl the web and do the training of the models from the web-based data, or it may require a crowdsourcing-based mechanism to even label those
particular data, and you can also have supervised learning mechanisms to train your model. But in the case of security it requires a lot of effort and expertise to label each and every data point, and even with expert knowledge it's very difficult for anyone to just look at a particular file and say whether it's malware or not. You would have to do a lot of operations: detonate that particular file and see in the sandbox what it does, or go inside and read the code or some pseudocode of it to understand it. So it requires a lot of effort to label each and every file that will go into the training of these models. These are the challenges that we'll be trying to solve in some of the works that we will see. I'm not able to see the screen, but if there are any chat comments or questions that you have, I'll just take a pause for the moderator to let me know, and then I'll go ahead with the talk.

So I'll go back to the presentation. Before I move ahead, I'll just touch upon the very basic ideas of the different areas of deep learning that we'll be using in this particular talk. This is in no way trying to do justice to any of these areas, but if you want to just connect the dots and get a very high-level understanding of this area, these are just one or two slides. Of course, I know you might go deeper into this area in some of your own courses, or we can think about something along these lines. The first thing that we'll be covering is what deep learning is. According to Geoffrey Hinton, who is considered the father of deep learning, these are computational models which are built from multiple layers of abstraction. The key aspect of deep learning as compared to conventional machine learning is that most of the feature transformation happens within the model, so you do not require experts to tell you which feature is more important than the other. As for what goes inside the model, you can have your data and let the model
itself decide which feature is more important. And of course the third aspect is that you can also have very complex non-linear features, which are not very intuitive for humans to understand; a lot of nonlinearity can go into defining a particular output function from the input. There are primarily two different types of deep neural networks: one goes into the area of sequential learning and the second into non-sequential learning. The one that goes into non-sequential learning is the DNN, and the one that requires a recurrent sort of architecture, where the sequence is a sequence of data and not a sequence in the architecture, is called a recurrent neural network. So we'll be talking about two different kinds of sequence. One is the sequencing of the architecture of the network: if it's in a straight line, where one layer only feeds into the next layer, that is a feed-forward network, and we'll be calling these feed-forward networks in this particular case. Then there is another sequence, which is the sequence of data: how you use it, whether it's a complete sequence or just one record; that is what we'll be using RNNs for. And then there are domain-specific architectures. For example, in computer vision one type of architecture is very common, called convolutional neural networks, which is a 2D-space architecture in which you do not have a vector but a tensor of two dimensions, and that tensor can go into different layers. We'll be talking about all of these later on, but we'll be starting with the DNN; if you can understand this, at least one part of the area will be covered.

The first deep learning model that we'll be talking about is called a multi-layer perceptron based deep neural network. In this particular model you have this as the input layer, then you have n hidden layers; this could be two, three, four, five, or six, and depending on the application you can keep increasing
this number. Most of the magic, where the deep learning network automatically creates the best features that are required, happens in these layers. Then finally you have an output layer, and this output layer will give you the prediction. If the prediction is a binary prediction, yes or no, it will have only one unit; if it is a multi-class prediction, it will have as many units as the number of classes. For example, in this particular case, if we are just interested in knowing whether a particular file is malware or not malware, this output layer will have only one node, which will give one if it's malware or zero if it is not. On the other hand, if we are interested in knowing which type of malware it is, and we have different types of malware defined, then it can have that many nodes in the last layer.

Now we will be talking about three different concepts which we'll be using throughout this particular talk: the activation function, the loss function, and the optimizer. Let's come back. If you see these arrows, each of these arrows has a particular weight, which signifies how much weight you should give to a signal going from an input to a particular node. This particular node may be connected to all of these input nodes, in which case it will have a weight for each of these connections. Suppose the value of this particular input is A1 and this weight is W1; then at this particular node, A1 multiplied by W1 is coming from this input node, and similarly for the values coming from all of the others. This goes into a matrix computation sort of mechanism, in which the input to this particular node, after combining all of these input nodes, will be something like A1·W1 + A2·W2 + A3·W3 and so on and so forth, and there will be some biases that are attached to this as well. If this is the input, then the function that converts this input to what comes out of this node,
which is the output from this particular node, is called an activation function, in simple words. The second part that we saw was the loss function. The loss function is very similar to your classical machine learning loss function: while this network is being trained, what is the particular function, or the particular loss, that it is trying to minimize? In the case of classification, which is the example that we gave and the most common one in malware detection, it is mostly a cross-entropy sort of loss, but there are other losses that it can go to. In the case of regression-based systems it is mostly a linear loss or some other loss in that category. The third part is the optimization function, or optimization algorithm, which determines the most optimal way to train this particular network. The whole objective that we have is to train the weights of all of these components. What is the most optimal way, in terms of both computational cost and other parameters like overfitting, to train this particular network? That is, how should I be updating the weights at each of these nodes with every iteration of training, so that I train it in the fastest and most optimal manner and get the best accuracies and best efficiencies out of it?

The next architecture that we will be talking about, and will use in this particular talk, is called an autoencoder. Autoencoders are used to transform the input features. We said that deep learning can do this automatically: even if we give the deep network your raw features, to some extent it will automatically create those features. But to make the task easier for the deep network, and if you want to have different classification algorithms working in parallel, we would actually separate out that layer of feature transformation a bit earlier in the setup and have some sort of feature-creation step. In classical ML you might have heard about principal component
analysis or SVD; these are the things that are used for feature transformation beforehand. In deep learning we use a terminology called embeddings. For algorithms like CNN and RNN, those are very specific to those particular types of tasks, but a general type of feature transformation that can go into multiple things is the autoencoder. The question it asks is: can we have a lower-dimensional representation of the data that works almost as well as the higher-dimensional raw data that we have? Suppose this is my raw input feature layer. For the purpose of this application, assume these are the features that we have got from a particular file that you want to analyze, and these features could be, say, 10,000 features. You do not want to work at the level of 10,000 features, because then you require a very large neural network. Also, at this level these are sparse features, which means that for each feature you will have a yes or no, whether this feature exists in the file or not, or you will have just one particular value of this feature. So these are sparse features, whereas what we want to use in the final model is this particular layer, which is called a bottleneck layer or a feature layer. This will have a much smaller number of neurons: if we started with 10,000 neurons over here, maybe this layer will have only 64 or 128 neurons, and it will have a much denser representation of the data. The good thing about deep learning is that it works very well with dense representations of data, but it does not work so well with a very large number of dimensions. So with this architecture we, number one, reduce the number of dimensions in the data and, number two, make it a denser representation. How this is done is that, instead of keeping just this part of the network, which we are finally going to use as an input, we mirror-image this part of the network onto the output side. So if we have kept, say, 10 different encoder layers before the bottleneck layer, we will copy the same
number of encoder layers into the decoder, so we will also have 10 different decoder layers, in mirrored order. If these layers were coming down as 10,000, then 8,000, then 6,000 neurons, we will mirror that in reverse order, so it will be 6,000, then 8,000, then 10,000. Then finally you have the output layer, which is a mirror of the input layer: if my input layer had 10,000 neurons, the output layer will have exactly 10,000 neurons. I will train this network in such a way that, if I bring it down from, say, 10,000 to 64 or 128 neurons and then expand the information from those 128 or 64 neurons back to 10,000 neurons, I look at the loss between my output and my input, and the whole objective of my training is that this loss is minimized. Think about it conceptually: if I can upscale my information from, say, 128 neurons to 10,000 neurons and it captures almost the same amount of information, it means my 128 neurons are a near-perfect dense representation, in a very small dimensional space, of what was in my original 10,000 neurons. This is the whole idea of this type of autoencoder: it tries to compress the data without having to bear much loss in the data. Once this network is trained, we remove this part of the network, which is the decoder layers and the output layer, and the output from the bottleneck layer, which is this particular layer, will start going into your deep neural network or other neural network. For example, this input layer earlier had, suppose, 10,000 neurons, which were coming from the raw files. If we use this type of architecture, from 10,000 neurons you can bring it down to 128 neurons or any small number of neurons, and it is this that we will be using as the input. So the final network that comes after it will have much
fewer neurons, because it is now getting information from lower-dimensional inputs, and the whole size of my network will go down. Because the size of my network has gone down, I require that much less computation, and in fact that much less data, to optimally train this particular network. So this is the benefit of this type of network, or feature transformation.

Then we will be talking about a different type of deep neural network, in which the information does not flow in as a single non-sequential input: the data we will be talking about is not at one particular point in time but comes in sequences, and those sequences are what actually go into these networks. I will not go much into the depth of this particular architecture, as it would take me some time, but we can have a separate session that goes deep inside these LSTMs and RNNs. At the base we have the recurrent neural network, which is now known as a family. No one uses the original RNN architecture per se, but there are different architectures that are based upon the RNN. Some of them are the LSTM, which is long short-term memory; then we have something called a GRU, gated recurrent unit; and another is called the echo state network, which is very popular in security. We will be talking about them without going too deep into the architectural side. The idea, as compared to the previous DNN that we saw, is that in a DNN, irrespective of how the data originated, you have to bring it to a particular non-sequential state, and then that will be consumed, whereas in this particular architecture the sequence of the data is maintained. Let me give an example from a language perspective. These words that I said have a sequence, and if you see these words in the same sequence in which I am speaking, you can to some extent predict whether it is the right sequence or a wrong
sequence from an English-language perspective. Maybe from the sequence you can also predict what the next word is going to be, or you can at least have some idea of what topic I am talking about. But if I jumble up this sequence and just present you the dictionary of the words, and I tell you the frequency of the different words that I spoke, then maybe just by seeing that frequency map you can come to know roughly what topic I am talking about, but you will never be able to understand the exact language that I spoke or predict the next words or next topics, because that sense of sequence is lost. These types of RNNs preserve the sense of sequence, such that you can do more complex things around it. In the context that I explained, I used the example of spoken languages. In terms of malware, or actual code, the language is the one the higher-level code is written in, or the compiled binaries that we talk about can also be treated as a sequence. The example that we will be giving in this particular talk will be on operation codes, or opcodes. These opcodes come after the compilation layer, when the code is tuned to a particular kernel. Each kernel will have some set of instructions, and when you pass a sequence of those instructions to the processor, it contains what are called opcodes. So the sequences that we are talking about are sequences of these opcodes, which we will be trying to analyze using this technology.

Then we will be talking a lot about reinforcement learning and, to be more specific, deep reinforcement learning. There is also a book that I have authored on deep reinforcement learning, and there are some papers that go deep inside it, so I will not go very deep here. But for the purpose of distinction, there are two different types: one is called value-based reinforcement learning, and the second is called policy-based reinforcement learning. For complete beginners who do not know about
reinforcement learning, let me just give you a brief background. Essentially, earlier there were two types of machine learning: one is called supervised machine learning and the second is called unsupervised machine learning. Supervised machine learning is when you are trying to predict a particular thing. For example, in the example that I gave, if I am predicting whether a particular file is a malicious file or a non-malicious file, that is called a prediction, and it comes under the area of supervised machine learning. Even if I am saying which type of malware it is, if I have to choose the type, that is also supervised machine learning. On the other hand, there are the types of applications that I spoke about where we are just doing a feature transformation, or doing something like the clustering that we will be talking about, where we do not know the actual label of the file and the systems are not trained on the label of the file, but the clusters are trying to give me more information about the general distributions or the general patterns themselves; or they are being used not for humans but for some other algorithms, to give them more information about a particular file. Those things come under the unsupervised machine learning category. There is a third category of machine learning, which is not much talked about, called reinforcement learning. Reinforcement learning is sometimes classified under approximation, or you can say optimization-based, techniques, and sometimes it is treated as a technique of its own because of its closer proximity to the AI side of things. In reinforcement learning you try to train an agent such that it can learn an optimal policy. When I say policy, that has a very specific meaning in terms of machine learning: a policy is the set of actions that you will take depending upon the scenario that you are presented with. Suppose you are driving a car, and in this particular case assume it is an autonomous car.
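
The definition of a policy given above, a choice of action that depends on the scenario you are presented with, can be sketched as a simple lookup in Python. This is only an illustration under assumptions: the scenario names, the action names, the `POLICY` table and the `act` helper are all hypothetical and do not come from the talk, and a real reinforcement learning agent would learn this mapping from experience rather than have it written by hand.

```python
# Hypothetical scenarios (states) and actions for a toy driving agent.
# None of these names come from the talk; they only illustrate the idea
# that a policy maps "what I observe" to "what I do".
POLICY = {
    "car_ahead_slowing": "slow_down",
    "car_ahead_stopped": "stop",
    "road_clear": "maintain_speed",
    "turn_ahead": "steer",
}


def act(scenario: str) -> str:
    """Return the action the policy prescribes for the given scenario."""
    # Assumption: fall back to the most cautious action when the
    # scenario is not covered by the table.
    return POLICY.get(scenario, "stop")


if __name__ == "__main__":
    for scenario in ["road_clear", "car_ahead_stopped", "fog"]:
        print(scenario, "->", act(scenario))
```

A hand-written table like this only works for a handful of scenarios; the point of training an agent is to obtain such a mapping when the scenarios are too many or too complex to enumerate.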
Everything that goes into learning how to drive the car under different scenarios is called a policy. When I have to apply the brakes, when I can increase the speed, when I can decrease the speed, when I have to turn the steering left or right, when I have to maintain the speed, when to switch on the indicators: all of these are different actions that you can take, and when you take a particular action depends upon the scenario. If there is a car in front of me and that car has slowed down, I should also slow down; when that car has stopped, I should also stop; when a turn comes, I start turning the steering. All of these are different actions within the policy. So a policy is the different set of actions and when we should be taking each particular action. This is a very simplified, high-level example that I am giving you, in no way accurate, but just to give you a sense of what a policy is. There are two ways that you can reach this optimal policy, and these two ways define what type of reinforcement learning you are using. The first one is value-based: what is the utility value of an outcome? For example, in the autonomous car example, suppose someone has come in front of me. I start giving points to my scenarios. My most desired scenario here would be that the car should stop immediately. There could be other actions as well: you slow down, you keep continuing at your pace, you take a left, you take a right. Based upon these outcomes I start giving some utility points. In this particular example I am just giving arbitrary utility points, but these are more scientifically designed in a real application. Suppose in this case I say that if you stop the car immediately when someone has come in front of you all of a sudden, I will give you the maximum number of points, say 100 points,
whereas if you slow down I will give you, say, 20 points; if you take a left turn or a right turn to evade that particular person I will give you four points; if you do not do anything, maybe I will give you negative points; and if you hit that person I will give you minus ten thousand. This way I am defining the value of each particular state. In this case, reaching a particular scenario, based upon the actions that you have taken, is called reaching a particular state. I do not know what actions you are taking, but the state that you have reached, which in this case is the stop state, has the maximum value. If I start giving points based upon the final state that the agent has reached, irrespective of how it has reached it, then this is called value-based reinforcement learning, in which I do not know the utility of a particular action but I know the utility, or the value, of the particular state that I want to get into. On the other hand, there is something called policy-based reinforcement learning. A policy, as I described, is actually the trajectory of state-action sequences that an agent will follow, based upon what it wants to apply in a particular sequence of things. The whole idea of training a deep reinforcement learning agent, or any reinforcement learning agent, was to learn an optimal policy, so policy-based reinforcement learning is a more direct application of reinforcement learning. But as you may have understood from the definition, even defining a policy is very difficult, whereas it is easier for me to give values to the end states, both from a human perspective and mathematically. I will not go deep into the mathematics; that is there in my book, and we can take separate sessions for it. But to define a policy mathematically is very difficult, and to make it converge is even more difficult. So this type of reinforcement learning, although it is direct, is difficult to train, but it can deal with a lot of
different complex scenarios, and the types of scenarios that we will be talking about are too complex to be dealt with by value-based reinforcement learning. So the type of reinforcement learning in the examples that we are going to see next is invariably policy-based, whereas in most of the applications of reinforcement learning that you would have heard about, say AI being used for computer games or other games, even the game of Mario, Atari, and other things, most of these use value-based reinforcement learning. In fact, to be more precise, if the data is coming from images, suppose a real-time agent is playing a game of Mario or Atari and the input is coming in the form of video frames, then they will be feeding the input into a type of deep network, which is a convolutional network, and in that case it becomes deep reinforcement learning instead of classical reinforcement learning. But we will not be dealing with value-based reinforcement learning; we will deal with policy-based reinforcement learning, because our use case really demands it. Yes, it will be more complex.

Then another advanced area of deep learning that we will be talking about is deep clustering. This is a still-evolving area; it is not yet very well defined or widely acted upon, and it is very new in the research domain. To speak about it, you first have to understand clustering. I think most of you will know about clustering from a data science background: clustering is a type of unsupervised machine learning in which you try to cluster, or group, the data points based upon certain patterns in the data. Now, in supervised machine learning, or in classification, you are also doing a grouping of the data. For example, I talked about whether a file is malware or non-malware; there are two groups, one is group one and one is group zero. But the only
difference between supervised learning, or classification in this particular case, and clustering is that in supervised learning you know what these groups are, and you consciously want to get each data point into the right group based upon your prior understanding of the groups. Because of the labels that you have in your data, you can reward your algorithm based upon the right prediction. Whereas in clustering you do not know what the groups are, but you want to understand the data, and to understand the data you want to see, logically from the attributes that are given, all the columns in the data, what is the right combination that will automatically group the data into a specific number of groups, if you know how many groups you are interested in, or sometimes into however many groups intrinsically exist in the data per se. This analysis can be used for your own understanding. Say I am doing a market analysis and I have some demographic data that I want to understand. For example, for the students of OdinSchool, they may want to see what the different types of students are, such that they can define different training programs for each group of students; they can use clustering to determine those groups of students. Or sometimes they already have targeted training programs and can only have, say, four different groups. In that case they can decide that we have only four clusters and see who falls into which cluster. Then, based upon understanding those clusters, going inside them and seeing what the commonalities are in each of those clusters, you can define that this is the right training program for this type of student. But you do not force-classify into the things that you already know. Clustering has been used very widely for the different applications that I talked about and in different applications of AI, even in cyber security, but what we are talking about here is deep clustering.
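
The fixed-number-of-groups idea described above can be sketched with plain k-means, a common classical clustering algorithm. This is a minimal pure-Python sketch under stated assumptions, not the deep clustering method of the talk: the `kmeans` helper, its `iters` and `seed` parameters, the two feature columns (years of experience, weekly study hours) and the student data are all hypothetical and chosen only for illustration.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on a list of numeric tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k data points picked at random
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical students: (years of experience, weekly study hours).
    students = [(0, 5), (1, 6), (0, 20), (1, 22), (8, 5), (9, 4), (8, 21), (9, 20)]
    centroids, labels = kmeans(students, k=4)
    for student, label in zip(students, labels):
        print(student, "-> cluster", label)
```

No labels are supplied anywhere: the groups come only from the attribute values, and the number of clusters is the one thing fixed in advance. The result depends on the random starting centroids, which is one reason classical clustering can be sensitive in practice.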
There are a lot of different problems in classical clustering. One problem, again, is the load it can take and how scalable it is. But the major problems are similar to the things that we talked about in the case of deep neural networks. Number one, you require an expert who understands the data and the outcomes even to make the features that will go inside. In most security-related applications, or in big-data scenarios where you have enough data, even outside of security, you do not know what the right feature transformations are going to be or which features should actually go in. The second problem is that clustering is very sensitive to the type of algorithm and to the outliers that come in the data. I will not go deep into the different types of clustering techniques, but I think the most common names in clustering that you might know are k-means and hierarchical clustering. In many applications you will see a lot of other types of clustering coming up, for example noise-based clustering, spectral clustering, and [unclear] clustering. Each has its own drawbacks. Given the scalability issues and noise issues that we have, particularly in areas where the data is not so clean, a new field of clustering that is coming up is deep clustering, and we will be talking about this and about a lot of different applications that we have built in this particular area. Now I am going to the next section of my talk, but before I go there, I will just check if there are any questions that I can take, or any suggestions on the speed. Is the speed fine? Can I go on at the same speed, or do you want me to go deeper, slower, faster?

Any questions from the audience so far? If you have any questions, please feel free to type them in the chat box or the Q&A box. For any of these concepts that were discussed, if you want a simpler explanation or you want to understand them in more
detail, please feel free to ask.

So, as I said earlier, malware has different levels of complexity. Some time back we used to classify malware just by its basic action, or the objective that the malware tries to achieve. For example, if a malware is just trying to open a back door for someone else to take control of your keyboard, and that someone, using remote access to your keyboard, will steal your passwords or type some commands and so on, those would be called backdoors, rootkits, and so on. So most of these first-generation classifications were based upon what action the malware is trying to take and how it is trying to harm you. But now we recognize that this classification might not be optimal, and from a defense perspective it should be based upon how complex the malware is. Most of the earlier-generation malware would be direct malware, which means a program inserted into some other software that we might have got, or coming through email attachments or other sources, whereas second-generation malware is more complex in terms of identification. Second-generation malware comes in four different forms. One is encrypted malware. The idea is very simple: I take a file and I encrypt it. For example, suppose you give me a document that has the letters A B C D in this particular sequence. Now I say I will not call A "A" anymore, but I will call it number one, and I will not call B "B" but I will call it number two, and so on. So I change this document, which said A B C D, to one two three four. At the basic level this is called ciphertext, which is text represented in some other form of text. A more complex form of hiding this information would be encryption, which does a similar sort of thing but with a more complex algorithm, such that you can also reverse
engineer it. A lot of the information that you see being transferred over the internet, at least at the level of secure channels, is encrypted information. You might have seen that some website addresses start with http colon double slash, while some of them start with https. That S stands for secure: these websites use encryption to transfer the information, such that even if someone tries to tap the information somewhere between your origin and your destination, someone who has access to your network or something else, even if they get the information they will not be able to use it, because in the tunnel between origin and destination the data is encrypted. A similar thing happens with malware also. With every attack the malware can encrypt itself, and once the malware executes, it decrypts itself. Since the earlier systems were using signature matching, the signature of an encrypted malware, despite the fact that it was the same malware, would be entirely different from the signature of the original. So malware developers started using this sort of encryption of malware to evade signature-matching detection. But the problem with this is that you can actually have the signature of the encrypted malware also. If I am using a particular encryption key, I know that this encryption key will change the malware to this particular state, so for that state also I get the signature and put it into my signature database, and so the encrypted malware can also be detected by signature matching. So the malware developers became smarter and started using something called oligomorphic malware. It does not have one signature: it can use different decryptors to encrypt the malware into different forms and then decrypt it from those different forms. So for one particular malware you do not have one particular signature, but you can use different decryptors to have different encrypted
versions of the same malware. That is when the malware defenders also started analyzing this pattern, and instead of trying to detect the malware they started detecting the decryptors. Since the decryptor was not changing, irrespective of which malware it was trying to decrypt, things became simpler: you could have 20 different malware, and instead of identifying those 20 different malware you identify the decryptor, because as a malware developer, if I develop a decryptor, I would like to use it for different malware. So instead of detecting the malware, I started detecting the decryptor; I would just block that particular decryptor, and then all of that malware would be blocked. That is when the two most advanced forms of malware, polymorphic malware and metamorphic malware, were designed by the malware developers. In polymorphic malware, the decryptors themselves are changed, using techniques of changing the code or the memory allocation, such that the decryptor can no longer be identified. With every attack there will be some chunks of code, or junk code, inserted so that it is obfuscated into a new form. So now the fight comes back again to identifying the malware, and we have to identify each and every malware in its different polymorphic forms. Signature matching cannot be done; you have to use a lot of different machine learning to identify it. That is where the most complex form of malware, which is called metamorphic malware, started being developed by the malware developers. In this, the same technique that was being used on the decryptor in the case of polymorphic malware started being applied to the malware itself, and for each malware you could have n number of metamorphic copies, by changing or obfuscating the malware. At this level it becomes so complicated that even most of the
machine-learning-based techniques that we have are not able to identify the different variants of a metamorphic malware, even if they come from the same malware file that a particular malware detection system has seen earlier. Most of this talk will be focused on identifying these types of advanced malware, and on why we cannot use the type of classical machine learning that we have been using earlier and why we need deep learning as a form of advanced machine learning. This slide is the history of how different types of machine learning started coming into the picture to detect malware. I will not go deep into that; maybe if you plan something like a workshop sort of session on using different types of machine learning for malware detection, you can get some insight from it. I will skip this slide for now and come directly to the evolution of deep-learning-based systems, and that will help us talk through the first section.

Deep learning started being used in malware detection from 2015 onwards, where we saw MLP and DNN-based systems being used for malware detection, and very high detection rates, like 95 percent recall, started being shown at 0.1 percent FPR. For those who do not know this terminology: if you have n malware, out of those n malware how many are you able to detect? That is called recall. Those n malware could be coming to you along with, say, a thousand benign files; the share of the n malware that I catch among all of those files is the recall of my particular model. In this case we are biased towards the positive side. I am not saying that we are fine with blocking a benign file, which again goes into false positives, but we cannot let any of the malware pass through, and that is why the specific terminology of recall is important from a security perspective. The other terminology is FPR, or the false alarm rate,
which is also called the false positive rate. For this other terminology that I described: suppose there are, say, 1,100 files. In most cases your malware will be very few and your benign files will be very many. If I tune the model such that it is able to identify each and every malware very precisely, what will happen is that sometimes the model will also make the mistake of calling a benign file malware. That is what we call a false alarm, and the rate, that is, how many of the benign files are flagged, will be out of one if you just divide by the total number of benign files; if you express it as a percentage, out of 100, how many are false alarms, that is called the false positive rate. If there is no percent sign, it is out of one; if it is a percentage, it just translates to out of a hundred. So in this case 0.1 percent means that out of a thousand benign files, one is wrongly flagged. Then, in the same year, similar studies started happening on the recurrent architectures of deep learning, which is the RNN in this particular case, and a very specific variant of the RNN that I described as the echo state network, or ESN. You would not hear about this particular variant in many other domains, but it was very specific to what we talk about in security. There also it was seen that the detection rate was very high. I am not comparing the 95 percent above with the 98.3 percent here, because the data was completely different, so you cannot do an apples-to-apples comparison. But these were the two prominent pieces of research that led the field of cyber security to start using deep learning over classical machine learning. What both of these actually showed was that deep learning, at least for the type of dataset that they used, was performing much better than any of the classical machine learning, both in terms of recall and in terms of false alarm rates. Then, from 2016 onwards, we also started seeing the evolution of the CNN, which is the type of deep learning algorithm that we saw for computer vision, and in this particular case it also
started proving its worth in terms of malware detection. In 2017, the work that was done in 2015 with RNNs was repeated, but other forms of recurrent neural networks were also used, mainly the algorithms that were more popular in the NLP domain at that point in time: LSTMs, which were more on the performance side, and GRUs, a slightly different architecture that was more efficient in use and would sometimes give equivalent or even better performance. These had become very popular in NLP, so they started being applied in cyber security, just to evaluate whether the ESN was the best, as suggested by that earlier research, or whether these popular algorithms were also good. They found that, similar to the NLP domain, the LSTM is actually more performant in cyber security as well, compared to the ESN, which is not so popular. Then, from 2017 onwards, we also started seeing work on different sequence-based data and comparisons of different architectures on malware. That is where they established the following. Compare a DNN, which is a non-sequential architecture that works on non-recurrent data, with an LSTM, which works on recurrent data, or sequences of data. In most NLP-based domains, the whole motivation for moving to LSTM and RNN-based architectures, as compared to DNN-based architectures, was that the RNN-based architectures were more performant, because they preserve the sequence of words as we speak, and it is the sequence of words that gives language its meaning. If you are able to preserve the sequence, then you can get better predictions of the words you are going to speak next, or classify the sentiment of the words, or whatever sort of prediction, classification, or generation you are thinking about. But this was a real example where people found, specifically in the case of security, that DNN-based architectures, which do not actually
maintain the sequence of these tokens, or words as you may call them, work better than the ones that actually maintain the sequence. That was a unique anomaly found in cyber security, and that is why I put this particular paper here.

Then in 2018 there were other examples of deep learning, where work started at as low as the opcode level. Most of the others used very high-level features. There are different types of analysis in malware detection: one is called traffic analysis, one is called dynamic analysis, and one is static analysis. Static analysis is where you get the features directly from the file, without making it run as a file, without executing it, and without seeing what its behavior is. If you actually execute the file, or find some other way to get the behavior of the file and what it is trying to do, that falls into the category of dynamic analysis. Most of the things that I will be talking about are static analysis. In static analysis, if you have to go deep, you have to go to the deeper structure of the files, maybe in the form of the binary, or maybe in the form of the opcodes that we talked about. So from 2018 onwards these things started being used at a very complex level. What makes this different from other applications is that at this level we are talking about binaries; no human can read them and understand what is happening and what is malicious.

Now we come to our first part, where we directly apply some of these neural networks, in this particular case the deep learning network that we talked about, for the purpose of detecting malware on Windows. I'll again take a pause and see if there are any questions, and then we'll complete this particular section and call it a day for this session. We'll complete the other parts of the applications maybe in some other session. So I'm just seeing if there are any questions.

Well, there are one or two questions that you may want to take at this point. How are second
generation malware created? Is AI used, and if yes, which part of AI is used?

Okay, that's a very interesting question. To answer the first part of the question, how second generation malware is created and whether AI is used: in fact, in this talk itself, if we go to the third section, there is a lot of AI that could create it. Of course the idea of this talk is not to arm you with things that could be dangerous for the whole society, so we have specific protections that we have kept in our AI; if you removed those protections you could actually end up with real AI-based malware.

But how could second generation malware be created today? Number one, we talked about encrypted malware. There are different types of encryption algorithms; you take a particular malware and you encrypt it with those different algorithms. Even within one algorithm there are possibilities. Take the one called RSA, which is very popular in all the file exchanges that you do over the internet. If you have ever got a public key and a private key, most of the time they would be coming to you from an RSA-based system. You will have one key which is the public key; you can keep taking different public keys and keep encrypting the file, and it will give you different copies of that particular malware. It is as simple as that.

If you want to create metamorphic malware, there are also virus generation toolkits; if you just Google this particular name, "virus generation toolkit", they will give you some sort of packaged tool. I'm not calling all of it second generation malware, but some of the malware generation kits can have the capability of creating second generation malware, mostly of the encrypted and oligomorphic types. Metamorphic and polymorphic malware took more human expertise to define. But a lot of the things that go into obfuscation, the types of techniques I'll talk about, such as inserting junk
code or memory allocation instructions, that is what goes into the creation of those things, and those could be replicated in software-based systems. At that level it would mostly be about scalability: you can still create it, but you will have to replicate the design of those systems manually. AI-based systems are now coming up, and in this talk we'll be talking about some of these techniques. There is no public literature, and obviously no one would publish literature if they were actually using AI to develop malware, but it would be very naive to believe that there is no AI-based system. If we can do it, someone else can do it as well. So in the third part we will also be talking about AI-based systems to generate malware-like systems. This will be a sort of simulation; you will not end up with real malware, given the protections that we have kept. Then we'll be talking about the defenses against AI-based malware: if someone is actually creating AI-based malware, how do we defend against it?

Very good question, thank you. We have another question, about encryption in HTTPS.

Encryption turns one text into another text that looks different visually but has the same meaning if you know how to decrypt it. There are different types of algorithms being used on the internet for encryption; RSA is one of them. For a secure system there will be keys on the servers, keys that the source and the destination both know about, and those keys are used to encrypt the message that is sent and to decrypt it at the receiving end. If it is an HTTPS system, the login information, the username and the password that you have entered, would actually be encrypted at your end before it reaches the server. Suppose it is the Google system that you are trying to log into: before it hits the Google server, all of this would be encrypted, so your username and password travel through an encrypted protocol
system, and at the destination they are decrypted to check whether your username and password match.

We have another question, about bank servers: how do the various banks secure their services?

I can talk about Microsoft systems, how Microsoft defends the different channels and the clouds that we have. I will not be the right person to talk about specific bank systems. But of course a bank is a financial system, so you need a lot of different security architectures. In the initial slides I talked about a generic security architecture, which included EDR, EPP and other such systems, so I'm assuming they would have something like that. Depending upon how big the bank is and how much it invests in its security posture, they would have adopted one or several parts of the architecture that I showed.

Okay. In Microsoft, what kind of technology do you use for malware detection? What is your approach, what is the process?

That is more information than I can reveal in this particular talk. But if you go to security.microsoft, or just type "Microsoft security" into Bing, you will see a lot of the research that we have. If you go to some of the Microsoft journals for security, or if you search by my name, you will come to know about a lot of research and public information that I think should go some way towards answering your question. But beyond the public information that is available in these resources, I am not free to talk about the different technologies.

So we'll take one last question. Probably we'll have to set up another talk to cover the other areas and the other applications. One last question, and I think this might take a little bit of time: could you elaborate on ESN? Which ML algorithm is ESN, and why is it used quite often in cyber security?

[Music] It is a neural network which is called an echo state network. Since I have not gone into the depth of
the RNN itself, it will be difficult for me to differentiate an ESN from an LSTM or GRU network. But take, say, an LSTM network. The whole problem with the RNN, the reason it could not scale, was the vanishing gradient problem. As information passes through different steps of the sequence and then through different layers, a lot of it could not be retained very well in the contextual memory of the RNN, because of the complex matrix computation that was happening at that level. That is where LSTM was proposed. An LSTM has input gates, output gates and forget gates; it is an architecture for managing the contextual memory of a cell, so that the cell holds information as context and can make predictions beyond the immediate input. Now the difference between LSTM, GRU and ESN is linked to how that memory management is done. I'll not go deep into this here, but if in some of your classes your instructors go deeper into the architectures of RNN, this would be a question for that particular class.

Okay, all right. So we have five minutes left. Just a generic question to end the talk, to summarize whatever we've been discussing over the last 90 minutes: where do you see this field heading in terms of the conjunction of AI and cyber security? Will we reach a level where AI is able to detect cyber threats and solve them before they can cause any damage? Is that the objective of bringing these technologies together?

That's a very tough question, where it is going. But as I described with malware detection and malware development: as long as resources are being employed towards malware creation, you will have to put as much resource into malware detection and management. Now the problems start becoming
non-linearly complex, because [Music] malware developers can keep changing the obfuscation they use, and you cannot let your defenses down. Now, we talked about AI being used in malware development, but there are also other areas of attack. We covered a lot of different defense postures, and in each of the places where we talked about AI being used for the defense, AI is also used, or could be used, for the attack. The defense becomes more complex than the attack scenario, because you do not know where the attack is going to come from. So that is the same case here also, and that story will continue. It depends upon how much resources come into the development on each side.

The other type of problem that I see in the future for applications of AI in defense, number one, is explainability. Currently, signature-based techniques are very explainable. Even with classical machine learning, if you ask how it is working, many people will be able to explain to you that these are the important features, and based upon these features the model has made this particular decision. Since most of these features come from the domain, in domain expert language, you can actually explain what is happening and why your malware model classifies a file the way it does. But the whole idea of deep learning is that the systems create the features themselves, in a very complex mathematical model, which is not only difficult for humans to create by themselves but also difficult for humans to understand. That gets into the domain which is called explainable AI. In other domains it may not be so much required, but where security is involved, security being a conservative field, you have to be very good at explaining what the model is doing. We have seen these aspects coming up in the area of safety as regards self-driving cars. Explainable AI is needed because of the fact that the models are
complex, and the more complex the model, the more difficult it is to explain. That is another area of AI in cyber security which I see in the future of the research that is going on in this space.

Then there is another area, which is technically called MARL. We talked about a lot of things around deep reinforcement learning, where an agent designs a particular policy to do a particular thing, in this particular case normalization. But there are other techniques in which agents can actually work in collaboration. Suppose I take a swarm, a drone-based attack on a particular target. It works very differently: there will not be one particular drone, there will be multiple drones working in coordination. Some will be confusing different systems, doing random attacks on some other systems so that the defense systems are diverted over there, and then there will be the final set of drones coming in and attacking the main radar. These are called coordinated MARL systems, multi-agent reinforcement learning systems. I see similar things happening in cyber security also, where there will be coordinated swarms of malware talking to each other and doing things similar to what an army of very intelligent humans would do. So either way, there are a lot of things that I see. I actually have a complete paper on it, if you search by my name. And at the end of the fourth section I am keeping something that talks about a lot of different advanced applications on which I see research is either currently focusing or should focus, to keep everybody secure.

Okay, so we have come to the end of the session, but we'll come back to go through the rest of the applications. Now, for all the students present in the room today: learning data science is one thing. You learn SQL and Python, you
learn Power BI, and you learn all of these things. But the reason why we do talks like these every week is for you to understand, when you finally get that job and actually go into the industry, what the different ways and the different domains are in which you take all these learnings and apply them. It's not just about learning Python and SQL. That is the reason why, through these different Odin talks, we will be talking about how data science and AI are applied in cyber security, how they are applied in the banking segment, how they are applied in pharmaceuticals, and across different domains.

Some of you are just starting out in this course; this is probably your first, second or third week. You might find it difficult to understand some of the things that were discussed here, but this is a very good exposure, an awareness opportunity for you. Once you finish a few more weeks in the program, come back to this session. All these sessions will be available for you, recorded, on the LMS. When you understand more of these terminologies as you learn more, come back to this talk and watch it again; then you'll get a lot more clarity about all these aspects.

So data science and AI have applications across different domains, and these Odin talks are an attempt to help you understand all these applications. There is so much more to discuss on this topic, on the applications of AI in cyber security. Both of these are huge areas, and combining them is really difficult, especially encapsulating all of it in 90 minutes. But we will have a series of these sessions where you'll understand each of these elements in detail. For those of you who come from a cyber security background, or who aspire to work in this area, this can be one of your starting points. So that's it for now;
we will come back again with more Odin talks, more awareness sessions and more domain-specific topics in the coming weeks. Dr. Mohit, thank you so much for spending time with us, thank you so much for sharing all that information, and we look forward to having more sessions with you. Thank you so much.

Thank you, pleasure being here.

Yes, thank you, ma'am, thank you so much. It was a great session, and many thanks to Dr. Mohit. And that's it, guys. We'll meet again in the next Odin talk session, so feel free to reach out to us if you have any doubts. Okay, thank you. [Music]
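
The false alarm arithmetic described earlier in the talk (a rate "out of one" versus a percentage "out of 100") can be illustrated with a small sketch. The split of 1,100 files into 100 malware and 1,000 benign, and the counts below, are hypothetical numbers chosen only for illustration; they are not from the talk. The sketch also assumes the standard definition of the false positive rate, which divides false alarms by the number of benign files, and that is an interpretation of what the speaker meant.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from evaluating a malware detector on labelled files."""
    true_positives: int   # malware flagged as malware
    false_negatives: int  # malware missed (called benign)
    false_positives: int  # benign files flagged as malware (false alarms)
    true_negatives: int   # benign files correctly passed


def detection_rate(c: ConfusionCounts) -> float:
    """Recall: fraction of actual malware that was caught (out of one)."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of benign files wrongly flagged as malware (out of one)."""
    return c.false_positives / (c.false_positives + c.true_negatives)


def as_percent(rate: float) -> float:
    """Translate a rate out of one into a rate out of 100."""
    return rate * 100.0


# Hypothetical evaluation: 100 malware + 1,000 benign = 1,100 files.
counts = ConfusionCounts(true_positives=98, false_negatives=2,
                         false_positives=1, true_negatives=999)

fpr = false_positive_rate(counts)
print(f"detection rate: {as_percent(detection_rate(counts)):.1f}%")
print(f"false positive rate: {fpr:.3f} (out of one) = {as_percent(fpr):.1f}%")
```

With these assumed counts, one false alarm among 1,000 benign files gives a false positive rate of 0.001 out of one, which is the same thing as 0.1 percent.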

Info

Channel: OdinSchool

Views: 97

Id: TsqUnB8TIx0

Length: 93min 9sec (5589 seconds)

Published: Mon Aug 14 2023