Sadia Afroz | Recent Advances in Adversarial AI for Malware

Captions
[Music] Hi everyone, my name is Sadia Afroz. I'm a researcher at the International Computer Science Institute in Berkeley. In the last couple of months I have been collaborating with Avast on adversarial machine learning attacks on malware detectors, and I liked working with them so much that I'm joining them to spend more time there. So this talk is about what we have learned so far while trying to attack malware detectors.

Nicolas gave a great talk about how adversarial attacks work. Just to recap, a typical adversarial attack works like this: you take a machine learning model that has very high accuracy on some task, for example a bird classifier that can take a bird picture and tell you that this is a bird. But if you modify the picture very slightly by adding some special noise to it, the picture still looks like a bird to us, yet the classifier will make a very high-confidence mistake and think that this is a horse. This is a very surprising attack, because without the modification the classifier works very well. And it could be a devastating attack that affects all of our lives: suppose you are in your self-driving car, it sees a stop sign, and the classifier in your car considers it a speed limit sign. This is not a far-fetched attack, by the way; my colleagues from UC Berkeley already showed that you can modify a stop sign by adding black and white stickers to it and a classifier will think that it is a speed limit sign. Researchers have shown this attack in many other domains. In the audio domain, you can add a little bit of noise to "how are you" and it will be recognized as "open the door". There are also attacks showing that you can wear a special pair of sunglasses with special patterns on them and a face classifier will think that you are Milla Jovovich.
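As an illustration of the kind of noise-based perturbation described above, here is a minimal sketch in the style of a fast-gradient-sign attack. The talk does not name a specific algorithm; the `fgsm_perturb` function, the gradient source, and the epsilon value are illustrative assumptions only.

```python
import numpy as np

def fgsm_perturb(image, gradient, epsilon=0.01):
    """Nudge every pixel a tiny step in the direction that increases
    the classifier's loss, producing noise that is hard for humans to see."""
    noise = epsilon * np.sign(gradient)              # small, structured perturbation
    adversarial = np.clip(image + noise, 0.0, 1.0)   # keep pixels in a valid range
    return adversarial

# Usage (hypothetical): the gradient would come from the target model,
# e.g. via automatic differentiation in a deep-learning framework.
# adv_image = fgsm_perturb(bird_image, loss_gradient_wrt_input(bird_image))
```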
Now, when we hear about these attacks they seem very futuristic and theoretical, you know, academics toying with their machine learning models and tweaking different pictures, but actually these attacks are already happening in the real world. For example, you have probably noticed that spammers tweak their spam messages a little bit to evade spam filters; that is a real-world adversarial attack. There are YouTube videos that add special patterns to evade YouTube's automated content filtering. In the malware domain, we have very dedicated adversaries who are always trying to change their malware to evade malware detectors. There is ransomware that puts on benign characteristics, for example getting signed by trusted authorities, to bypass malware detectors.

So how do adversarial attacks work in the malware domain? The attacks are very similar in concept to the image domain: here we have to change some code by adding some other code to it. But even though the attacks are similar in concept, there are some fundamental differences between the two domains. First of all, adding and deleting a few pixels does not change an image beyond recognition, but adding and deleting a little bit of code from a malware might make it not a malware anymore. Most importantly, evaluating an adversarial example for images is very intuitive for us. If a classifier cannot find a duck in the picture on the right side, none of us would be surprised, because we cannot see a duck there either. But if I tell you that the left side is an original malware's code, can you tell me whether the right side is still malware code or not? No, because evaluating malware takes a lot of time and effort, and in some cases it could be impossible, because we need to recreate an environment in which the malware can work.

Maybe these are some of the reasons why we don't see much research on adversarial attacks against malware detectors. Adversarial machine learning is a very rapidly growing field, and since 2014 there have been more than 1,200 papers on adversarial learning, but only 36 papers focus on malware, even though we know that in the real world real adversaries are trying to evade these malware detectors.

So let's look in more detail at how adversarial attacks work. If you have unlimited access to a malware detector, you can modify your malware, query the detector, and continue that loop until the classifier thinks it is benign. If you don't have access to the malware detector, then, as Nicolas showed, transferability works for machine learning models: you can make your malware evade a substitute classifier, or an ensemble of classifiers, and use the example that evaded all of those classifiers to query the real malware detector you want to evade.

So how can we modify a malware? You can modify a malware in any way that does not change its functionality. You can add random padding to it. You can add benign files to it; one popular benign file to add to a malware is the Microsoft software license terms, because many benign files already contain it. You can also add malicious functionality to a benign file; there is research showing that this attack is more useful than the other way around. You can also encrypt and compress a file multiple times using a packer to evade signature-based classification.
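To make the query loop and the functionality-preserving modifications concrete, here is a minimal sketch of a black-box evasion loop. It assumes a `detector(sample) -> bool` oracle, a benign blob to append (e.g. license text), and a query budget; all names and parameters are hypothetical, not from the talk.

```python
import random

def append_padding(sample: bytes) -> bytes:
    """Append random overlay bytes; for many file formats this leaves execution unchanged."""
    return sample + bytes(random.getrandbits(8) for _ in range(1024))

def append_benign_content(sample: bytes, benign_blob: bytes) -> bytes:
    """Append content taken from benign files (e.g. text commonly found in goodware)."""
    return sample + benign_blob

def evade(sample: bytes, detector, benign_blob: bytes, max_queries: int = 100):
    """Query-based evasion: keep applying functionality-preserving edits until the
    detector labels the sample benign, or the query budget runs out."""
    transforms = [append_padding, lambda s: append_benign_content(s, benign_blob)]
    current = sample
    for _ in range(max_queries):
        if not detector(current):        # detector says "benign": candidate adversarial example
            return current
        current = random.choice(transforms)(current)
    return None                          # evasion failed within the budget
```

As the talk stresses, a sample returned by such a loop is only a candidate adversarial example: it still has to be executed to confirm it remains malicious.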
Now, after we've made these changes, how would we know whether this is a valid adversarial attack? The current way to figure this out is this: you ask a malware detector, before the modification, "is this a malware?", then you apply the modification and ask the same question again. If the detector thinks it is a malware before the modification but does not think it is a malware after the modification, we consider the attack successful. This is a great way to evaluate your adversarial examples if the input is an image, because for images we can verify valid adversarial attacks just by looking at them. But it is not enough for malware, because for an adversarial malware example the main question we want to ask is: is this modified malware still malicious? And unfortunately the answer of the malware detector alone does not help us answer this question.

So let's see two examples of cases where the modified malware is not malicious anymore but still evades classification. Consider a malware that connects to a command-and-control server and does something malicious after it receives a command. If you ask the malware detector whether this is a malware, it has never seen the file before, so it will do static and dynamic analysis on the file, create a signature, and conclude that it is a malware. After it has seen this file, since there is already a signature for it, the detector will simply match the signature and conclude that it is a malware; once the signature exists, the malware detector will always match the signature instead of doing any kind of complicated analysis. Now suppose five years have passed and the C&C server is dead. Is this still a malware? It depends on who you ask. If we ask the same malware detector, the signature of the file hasn't changed, so it will still consider it a malware, even though it is not doing anything malicious anymore. Now let's run our adversarial attack on this piece of malware. The signature has changed, so the malware detector has no signature to match; it will analyze the file and consider it benign, and it will be correct, because this file is not a malware anymore. This is a very likely thing to happen, by the way, because most malware research uses five-to-ten-year-old malware samples, and many of them might be inactive by the time we try to run adversarial attacks on them. That is understandable: if you are an academic, active malware samples and labels are very hard to come by. But using old malware that might be inactive has its own consequences.

The second reason why a modified malware might evade detection without being malicious is corrupted malware. Some malware is packed in such a way that changing even one bit destroys the entire file. If you ask a malware detector whether such a corrupted file is a malware, it will do its analysis, find nothing malicious, because the file does not even open to do anything, and consider it benign.

So now the question is: what kind of attacks should we focus on while doing these adversarial attacks on malware detectors? Adversarial attacks are a great way to debug machine learning based malware detectors, but we need to focus on attacks that are real adversarial attacks and that can help us understand the shortcomings of our malware detectors. One way to find valid adversarial examples is to start with an active malware that evades detection. And how would we find active malware that evaded detection? We need to perform the attack on actual files and analyze them. However, current research on adversarial attacks against malware detectors does not help us answer this question. Remember, at the beginning of my talk I told you about the 36 papers that perform attacks on malware detectors: about 25 percent of them did their attack on malware files, the rest did the attack on the feature vector, about five percent of them tried to execute the file by clicking on it, but zero percent of the papers checked whether the malware was actually still harmful. In some cases, just clicking on the file and watching it run is not enough to figure out whether it is still malicious.

So how can we do better? First of all, for adversarial attacks on malware detectors we need to start with active malware. Finding active malware is very hard, but we can start with malware that is recent instead of very old, and we need to analyze it in a virtual machine with open-source dynamic analysis to confirm that it is still active. These steps are much more time-consuming, but they can be automated, and at least this way we will make sure that the adversarial attacks we are doing are valid adversarial attacks. The second thing we need to do is execute our attack and again analyze the result in a virtual machine with open-source dynamic analysis.
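Here is a minimal sketch of the validation check described above, assuming the same hypothetical `detector` oracle and a `run_in_sandbox` function standing in for an open-source dynamic-analysis sandbox; the report fields are placeholders, not a real sandbox API.

```python
def is_valid_adversarial_example(original: bytes, modified: bytes,
                                 detector, run_in_sandbox) -> bool:
    """A modified sample only counts as a valid adversarial example if
    (1) the original was detected, (2) the modification evades detection, and
    (3) the modified sample still behaves maliciously when executed."""
    if not detector(original):
        return False                      # nothing to evade in the first place
    if detector(modified):
        return False                      # the modification did not evade detection
    report = run_in_sandbox(modified)     # dynamic analysis in a virtual machine
    return report.executed and report.shows_malicious_behaviour
```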
Depending on the goal of your research, checking against a real-world AV might also be a useful thing to do. Sometimes the goal of an adversarial attack on a malware machine learning system is just to find specific problems with a specific machine learning algorithm; in that case checking against real-world AVs is probably not the best thing to do. But in other cases, if you want to know what kinds of attacks we want to protect our users against, real-world AVs can help answer that question, and this way we make sure we are not doing research on problems that commercial systems have already solved. Now, if we want to analyze all of our adversarial examples with real-world AVs, we have to send the adversarial examples to those AVs, and that might pollute their training data and make them worse at defending against real attacks. So doing this safely is still a challenging problem.

Now that I have told you that current attacks do not help us answer the question of which attacks are useful, remember that adversarial attacks are already happening in the real world. Are we prepared for these attacks? There are a lot of defenses proposed against adversarial attacks; I will talk about two of them.

The first defense is adversarial training. Recall how adversarial attacks work: you take a malware and modify it enough times that it evades the malware detector and the detector considers it benign. Adversarial training tells you that after you have this evading sample, you retrain your classifier with that sample. Adversarial training can help against easily found adversarial examples, but the problem is that it can also decrease your accuracy. The graph here shows how adversarial examples can increase the robustness of the classifier but decrease its accuracy. This experiment was done on a digit recognition dataset. The x-axis shows the number of adversarial examples we added, the blue curve shows the accuracy of the classifier, and the green curve shows its robustness; by robustness I mean how many features you need to change to evade detection. Without any adversarial examples the classifier is very accurate, 97% accurate, but it is also very brittle: you need to change only about 10 pixels to change the classifier's decision. After you add about 0.1% adversarial examples, the classifier's robustness increases, but its accuracy decreases. This happens because, as we add more and more adversarial examples, the classifier gets tuned to classify only those adversarial examples and diverges from the naturally occurring examples. To defend against this, we need to make sure that adversarial training does not diverge too much from the original examples.
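A minimal sketch of the adversarial training loop just described, assuming a scikit-learn-style classifier and a user-supplied `generate_adversarial` routine; the 0.1% cap mirrors the trade-off point mentioned above, and everything else here is an illustrative assumption.

```python
import numpy as np

def adversarially_train(clf, X_train, y_train, generate_adversarial, max_fraction=0.001):
    """Retrain the classifier on its own evading samples, but cap the fraction of
    adversarial examples so the model does not diverge from natural examples."""
    X_adv = generate_adversarial(clf, X_train, y_train)        # evading feature vectors
    budget = int(max_fraction * len(X_train)) or 1             # e.g. ~0.1% of the training set
    X_adv = X_adv[:budget]
    X_aug = np.vstack([X_train, X_adv])
    y_aug = np.concatenate([y_train, np.ones(len(X_adv))])     # adversarial malware keeps label 1
    clf.fit(X_aug, y_aug)                                      # retrain on the augmented set
    return clf
```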
The second defense I'll talk about is building classifiers that are robust by design. Consider a logistic regression based malware detector. It has some features from benign files and some features from malicious files, and whenever it gets a file, it analyzes the file and gives it a score based on the number of malicious features and the number of benign features the file has. Suppose in this particular case the file gets a score of 3.2; it will be considered a malware because the score is positive. Now think for a second: if you were an adversary, how would you evade this classifier? The easiest way is to add a lot of benign features to the file. Since the classifier is always weighing the malicious features against the benign features, and we have increased the number of benign features a lot, it is going to consider this a benign file.

So how would we defend against this kind of attack? One way is to use monotonic classifiers. A monotonic classifier ensures that an attacker cannot change the classification score by adding a lot of benign features. The graph here shows a typical classification score: the x-axis shows the feature values and the y-axis shows the classifier score, and you can see that as we add more features the score can increase but it can also decrease. A monotonic classifier ensures that the score can never decrease as you add more features; the red curve shows the monotonic classifier, and as you can see, even if we add more features, the score is never going to decrease. We proposed this classifier in 2017. Now, this seems like a very simple defense, but adding features to evade real-world malware detectors is an attack that real-world adversaries are already carrying out. Many of you might remember that researchers showed this attack against an AI-based detector, Cylance, and their attack was very similar: they added a lot of benign features to their malware.

So how can we turn the logistic regression classifier I showed into a monotonic classifier? We can do it by considering only the malicious features. In that case, even if you add a lot of benign features, the classifier will not consider the file benign, because we are not considering any of the benign features at all. This is not a theoretical defense: after we published this paper in 2017, Microsoft Defender ATP adopted this monotonic classifier, and they are using it to defend against real-world ransomware. This is, by the way, only one way to implement monotonicity; there are many others. In the paper we also talked about using only monotonic features, that is, features that can only be added but never deleted, for example writing files: if a malware has to create certain files and write certain registry keys, it cannot remove those functionalities without removing the maliciousness of the file. Monotonic classifiers by themselves have lower accuracy than regular classifiers, and that is understandable, because to ensure monotonicity we are constraining the classifier, making sure it has certain properties. Here you can see that the monotonic classifier, shown in the blue curve, has lower accuracy than the regular classifier, shown in the red curve. One way to solve the problem of lower accuracy is to use an ensemble of classifiers, so that if any of the classifiers considers a file a malware, we declare it a malware.
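One simple way to get the monotonic behaviour described above is to score only on malicious-indicator features with non-negative weights, so adding benign content can never push the score down, and then OR the monotonic model with a regular classifier to recover accuracy. The feature layout, weights, and method names in this sketch are illustrative assumptions, not the implementation from the paper.

```python
import numpy as np

class MonotonicLinearDetector:
    """Linear score over malicious-indicator features only, with non-negative weights,
    so appending benign content can never lower the score."""
    def __init__(self, weights, bias=0.0):
        self.weights = np.maximum(np.asarray(weights, dtype=float), 0.0)  # enforce non-negativity
        self.bias = bias

    def score(self, malicious_feature_counts):
        # Benign features are deliberately ignored; only malicious indicators contribute.
        return float(np.dot(self.weights, malicious_feature_counts) + self.bias)

    def is_malware(self, malicious_feature_counts, threshold=0.0):
        return self.score(malicious_feature_counts) > threshold

def ensemble_is_malware(sample_features, detectors):
    """Ensemble rule from the talk: if any detector flags the sample, call it malware.
    Assumes each detector exposes an is_malware(features) method."""
    return any(d.is_malware(sample_features) for d in detectors)
```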
To conclude, adversarial attacks are a great way to debug machine learning models, which can seem like an opaque black box when we use them. But while we are running adversarial attacks on real-world malware detectors, we need to make sure the attacks are valid adversarial attacks, and for malware detection, while validating an attack, we need to ask what kinds of attacks are really harmful to users. Fortunately, or unfortunately, in the malware domain we have real adversaries carrying out adversarial attacks, so studying real-world adversaries can help us identify which attacks are useful. That was my talk, thank you. [Applause]

[Host] Thank you, and I have questions for you. Do you have a rough estimate of the percentage of malware in the wild that uses adversarial evasion?

[Sadia] No, I don't, but I'm sure a lot of them are doing it, and that's one of the questions we are trying to figure out.

[Host] There is also one comment from Professor Cavallaro: a super biased opinion, but there is a poster upstairs touching the topic of adversarial ML in the problem space. But let's go to the next question: does the research in the area check whether the evasion attacks also work against pattern-matching, non-ML detectors?

[Sadia] Yeah, I think it would, because you just need to figure out what pattern you are matching and then make sure you are using different patterns.

[Host] And could you use old labeled malware and simulate their C&C servers?

[Sadia] It's probably possible to use old malware, but you need to know what the C&C server was doing to trigger the malicious activities in the old malware, and in some cases that could be difficult if we don't know what activity it was supposed to do.

[Host] Thank you, I think we have answered everything. Thank you very much, Sadia. [Applause]
Info
Channel: CyberSec&AI Connected
Views: 2,154
Rating: 4.818182 out of 5
Id: pyoTR8LTsWM
Length: 22min 29sec (1349 seconds)
Published: Fri Nov 08 2019