Meetup: Detecting Money Laundering Networks Using Machine Learning

Captions
So, one of the things Muna has mentioned: we are actively working in the area of money laundering as well. That's one of the areas I have worked in in the past, from cybersecurity to any kind of FinTech fraudulent activity that encompasses anomalous behavior. I like to classify this as a narrow area, because there are different kinds of money laundering, which could include backing of terrorism, human trafficking, and even, for example, tax evasion. Any of these things can come under money laundering in terms of how the activity goes about. So this is a sliver of the problem that we're working on. We have developed a solution, and that is something I'll talk about and showcase as well. What essentially happens when you're trying to figure things out about money laundering is that you have a very high false positive rate in the alerting systems that exist right now.

Having said that, I just want to quickly pause and see how many people are working in FinTech, trying to apply ML in FinTech. Okay, fair enough. Great. Some people halfway there; maybe you're still developing a model. Fair enough.

Okay, so one of the things I want to talk about when it comes to money laundering, or rather the solution itself: what we saw when we were working with this problem is that there was a huge number of false positives in the system. Currently there exist different kinds of rule systems which generate a lot of alerts, and a lot of these alerts are not good in terms of quality, meaning you have anywhere from about 70 percent to, in some cases, about 99 percent false positives. Some financial institutions don't necessarily tune the rules well, which is why you tend to have these really large false positive rates. So when we started engaging with this, we thought it was a one-off problem that we'd be
working on. Sorry, I think... oh, great. Okay. Quick point as well: if anyone has a burning question, I'm happy to stop right there, and you can go ahead and ask me. It's totally fine; let's keep it easy.

Having said that, we thought this was a one-off problem that we were solving by applying what we know, but it seems this is a systemic problem across many financial institutions, where rule-based systems don't necessarily give them high-quality alerts. So we built a solution, and we've deployed it across quite a few financial institutions around the world. What we've designed, very simply put, is an end-to-end machine-learning-based false positive reduction system. It takes the existing alerts that you have from the alert generation system and classifies them, using supplementary information to do the classification. I'll even show you how it works.

Having said that, is anybody still new to the concept of money laundering? Does somebody not know what money laundering is? Okay. Simply put, the idea of money laundering is that you have money that comes out of illegitimate sources, and you need to get the money you procured from these illegitimate sources into the financial system so that you can use it for legitimate purposes. Essentially, that's what it is. People go through a lot of interesting things to do it. I'm sure a lot of people have seen the show, I think Breaking Bad is one, and they very interestingly show you how to do it. It doesn't necessarily happen the same way, but anyway.

Having said that, in a typical setup the process is broken down into three parts. The first is placement, then it's followed by layering, and the third
is integration.

Placement is the first step in your money laundering process. This is when you are bringing money that you've obtained from illegitimate sources into the system. It could be money that you've got from, let's say, selling drugs or trafficking human beings around the world. You could be funding interesting behavior around the world as well: terrorism, sleeper cells, any of these things. It could also, very simply, be money that you're not being transparent about to, let's say, the government.

So that is the money you're bringing in, and you need to place it into the system. Placement is basically where you deposit the cash into the financial system. You might deposit it as one large payment, or you hand it over to, let's say, hundreds of people in your village and tell them, "Hey, come over and deposit it into one account." That form of placement is called smurfing: you give small chunks of money to hundreds of people and then ask them to put it back into the system. So there are different ways in which you can place money into the system. That's essentially your first step.

The next one is layering. This is where you try to lose your trail. What you're trying to do is move the money across multiple accounts, into and out of different instruments, just to make sure that the government or the financial institution that is tracking you cannot track you anymore
based on how you move the money. That is one of the reasons why a lot of money tends to end up in places like Cyprus, Malta, the Maldives for that matter, and a lot of islands around the world which have very convenient laws about not disclosing where the money is coming from or where it is going. For that matter, even Switzerland has the same thing, and so does Liechtenstein, and I think one more place which is popular as well. [Audience suggestion, unclear] Oh yes, of course. Thank you. Fantastic.

So once that's done, your money moves around these islands and these places, and eventually the financial institutions that are trying to track it tend to lose the trail. Then, by some interesting means, it comes back: one of your cousins, a millionaire, suddenly passed away, and he had written a will that left you the money. That just comes back into the system, and you essentially get millions of dollars. That's one of the ways of doing it; there are many. The money comes back into the system, it becomes legitimate, and then you can go ahead and do whatever you want with it.

Most financial institutions are trying to identify money laundering at this point. There are systems that try to detect whether layering is happening, or whether some kind of end-point integration is happening, but most of the work is financial institutions trying to find placement, which is the fundamental part on which almost all rule-based systems work. And these rule-based systems... I think that's an excellent point: Bitcoin is something that they have still not worked out, so Bitcoin is still very easy to work with when it comes to money laundering. You and I could do it
as well, just saying.

So essentially we have different rule-based systems: things like FICO, Fiserv, SAS, Actimize. These are all rule-based systems, and there are fantastic rules in them. They try to catch different behaviors, and they are usually very successful at it. But one of the problems is that they look for upticks, downticks, switches, cutoff points. They don't necessarily look for a lot of stateful behavior; it's usually stateless behavior, which is one of the reasons they tend to fall short. The other thing that happens is that a lot of financial institutions run these rules in default mode. You are supposed to customize the rules for your purpose, but a lot of financial institutions don't do that; they just run in default mode, which means they end up getting a lot of false positives.

Adding to that, one of the reasons why the quality of these alerts is still bad is that the process of working with them is still manual. There is still a lot of human intervention: investigators sit through these alerts trying to find out what is good and what is bad. There are lots of financial institutions that are much closer to this than to this. The cost of these systems is in the millions of dollars, but they still end up with about 99 percent false positives, which means they have a very small margin of true positives to work with. That means they have to go through a lot of garbage, which becomes a big problem for them. And as I was saying, the rule-based systems are kind of slow, so it takes some time for them to understand what
the behavior is, because one of the things we have seen is that different geographical locations tend to have endemic behavior in how money laundering happens. The way we see money laundering in Asia doesn't necessarily happen in Europe; the way we see it in North America doesn't necessarily happen in Asia. One simple example: in Asia you tend to see high volumes and small amounts, which is the most general behavior. In Europe it's the other way around: you see high value, low volume. In America you can kind of split it apart: in high-value customers you tend to see the behavior you see in Europe, and in low-value customers the behavior you see in Asia, so it's a mix of both. These are all reasons why the behavior can be a bit endemic, and that's not necessarily captured in the rules module, which means you have to tune it and make it work for you.

The rules also have a good number of gaps, which means there could be different ways in which money laundering is happening that the rules don't capture, which also tends to be a problem. And at some level it requires experts who understand these rules to come in and help you fix them, which makes it very limited in how you can work with it. So that's a problem as well.

Having said that, this is the current workflow. This is how any financial institution that's using a rule-based system works. An alert comes in, and you have an investigator who evaluates the alerts. They use different sources of information. They use LexisNexis to try to find out if you have any criminal records or any of
those kinds of information. There are lots of other systems that give you KYC information. Different countries have different sources of information that they can use. One of the cool things about the Netherlands, I think, is that banks can share KYC information, but there is a very limited scope to what they can share, so it's very interesting. You can share what volume of transfers this person does, how many times this person transfers, those kinds of things. So this becomes a really beautiful thing that you can share, but it doesn't exist anywhere else, by the way, so you are left to figure things out based on whatever aggregated information you have.

In terms of the analytical data that the investigator looks at: LexisNexis is one; account databases, which is your KYC information; transactional, card, collateral; any of these things that are usually valuable. Once this is done, the investigator is able to classify an alert as suspicious or not, and this essentially becomes your ground truth. This is probably the most valuable thing that will help you build a model.

So the investigator goes through this whole process of evaluating the true positives and the false positives, but you still have this huge number of alerts to go through. There are financial institutions that have, I think, about 300 to 350 investigators just to go through this data, with something like 3,000 alerts a day available for them to approve, which again becomes a big problem for them. So the workload is one of the big deals. Yes?

[Music]

[Audience member] You keep talking about the false positives. What about
the false negatives? If it's a real money launderer and this misses it? Or are these rules rigorous enough that it will definitely be caught?

[Speaker] That's a fantastic question. The solution that we have works post-alert, which means that if you have missed something, which is the false negative, then it is gone. You have lost that information, at least at this point in time.

So essentially, the solution design: there is some amount of consistency that we've built into the whole process, and it reduces the false positives. The way we have designed it, it is strategically placed between the AML system and the investigator, so that it doesn't require any modifications to the rules. It gets its inputs from the output of the rule system, and it classifies an alert as a false positive or a true positive, as an investigator would, but using an ML approach. The modeling approach that we take uses a good bunch of features that are very specifically designed to find anomalous behavior in financial transactions.

So what kind of advantages does this give you? Speed is one of the biggest. Usually the time it takes for an alert to be processed is anywhere between 45 and 90 days, sometimes even close to six months, but in this case it reduces the time to a few seconds. It also reduces human inaccuracies. In one of the studies we did, we looked at how investigators approve alerts across different temperature conditions and different times of the day. We tend to see investigators have a faster rate of approval as they approach the evening
and a much slower rate of approval in the morning, which essentially tells you that they either have a target to meet or they just want to go home. That's the interesting thing we saw. For temperature, looking at different days on which the temperature changes, what we also saw was that on cold days things are usually slow and on warm days things are usually very fast, probably because they want to go out; they don't necessarily want to be in on a warm day. And it also reduces a lot of the person-hours that you would put in.

[Audience question, not captured]

[Speaker] Yes, so on a daily basis they deal with anywhere between forty and sixty alerts. I would even say anything beyond fifty is on the higher side. And one of the things that happens with modeling and with the features we've built is that it tends to fill in the gaps: it helps us identify behaviors that we didn't necessarily see before. Yes, please?

[Audience member] Since everything is rule-based, what does the investigator know that the rules cannot quantify?

[Speaker] Yeah, fair enough. Let me go back to this one. You see a lot of false positives being generated by the system. For example, I could have sent my brother money beyond, let's say, ten thousand dollars, and one of the things in the United States is that if your transaction is above ten thousand dollars, it has to be flagged as a potential money laundering situation. That's a false positive, because I probably had a legitimate reason to send it to my brother, unless I was money laundering. Now the investigator has to go through these systems. He'll probably call up, he'll probably check some other way, or sometimes he'll look through the system for information that will tell him
what it is. For example, he has the ability to look through a memo and see what is going on here, maybe a transfer to a sibling or something. That helps the investigator, but the rule system doesn't necessarily know that.

The other thing is that we've built the solution as a supervised problem, which means we are looking for labels that already exist from an investigator's approval. There are different ways in which we derive the labels as well. We looked at investigators across different levels, first level, second level, third level, to see if we can aggregate this information in a certain manner to build the labels. And of course we also look at historic alerts, because that's what we build the model on. Essentially there are three sources of data that we need: the AML alerts, the transaction information, and KYC. Everything else that you give is icing on the cake. So that's essentially how we go about it. Having said that, I'll quickly switch...

[Audience member, name unclear] What you explained, what you do, is that for whatever procedures are there, you're training an ML engine and then using it. But you don't go into the fundamentals: what is money, what is money laundering, how do governments get involved? And they do get involved, good countries and bad countries; I can give you a big list myself. And then there's where the big money goes. Not the small money, not $10,000 or $50,000; these are millions, hundreds of millions of dollars, given by American banks to bad banks. It routinely happens. A lot of money is printed, four or five trillion, year after year, by the USA, by China (for China no one knows what the number is), Japan, and so on. You are not even touching all of that. You're just taking what's already there, training, and using that machinery. I'm not trying to belittle what you are doing.

[Speaker] No, no, of course not. I don't even think about that; that's totally okay. We essentially work within the system, meaning every piece of information the financial system has is what we work with. We are addressing the problem only from a financial system's perspective. We are solving a problem that they have, not the problem that they themselves are not able to comprehend how to bring into the system. Which is why I opened with the idea that this is a very narrow problem that we are solving. The solution can identify things for which we have data. If there is no data, then there is very little that we can do. Fair enough. [Brief exchange, partly unclear] So, thanks. Any other questions? I'm just going to quickly jump to a demo. Any other questions? Yes, please.

[Audience member] The reason, and you're right, is that banks don't want to catch money laundering, actually; they're willing to look the other way, right? It's great: money is being put into their system, and that's what they want. I mean, not that they're evil or whatever, but
they will, and for a long time did, look the other way. In the last maybe five or ten years, regulations have come down on these banks, so from a fines perspective and from a reputation perspective it's become a much bigger deal to catch these things. And right now, and I know because we are trying to solve the same problem, it's about being able to say, "Hey, I did everything I could," and then the regulators will leave you alone. So that's a big part of this. But yeah, you're right. It's not like fraud, where banks want to catch it because of the fraud losses; it's something that harms their reputation or that they're getting fined for.

[Speaker] Yeah, and here they are accountable, not necessarily because it's a loss for themselves, but because of the stricter regulations. I think there are bigger fines coming in now, which means they automatically have to be more authoritative in how they manage the whole system. Yes, thank you. Yes, please?

[Audience question, not captured]

[Speaker] So, is that a question? I wouldn't call it filtering. I would just call it alert generation, because filtering requires some kind of curation of the data. This is merely: what I find suspicious, or something that's merely interesting, could also be found suspicious by the system, and the system just pops it out because it can. What the model does is look at the data: it's looking at your transactional history, your card history, your KYC background, and your historical alerts as well, and trying to find out whether this is a valid alert or not. You could say it's like a very mini brain on top of this. I wouldn't want to say it's a big brain, given this narrow problem that we solve, but it's like a smarter
system that sits on top of it. Thank you.

So, yes. Can you guys in the back see this clearly? How many of you are familiar with Driverless AI? Just a quick question. Okay. So Driverless AI is actually a flagship product which automates the modeling process, and feature engineering as well. In this modeling process we use Driverless AI to build the models; to build the solution for AML, we have used Driverless AI as the modeling engine to do the whole process for us. Give me one second. Let me see if I can... Okay, so I did a very interesting thing: I don't have my internet here. Give me one second, please.

[Audience question, not captured]

[Speaker] Yes, so the one that you're seeing here is a supervised approach, which is why I'm saying there are many different problems that we solve in the ML space. One of the problems is reduction of false positives, and that's what you're seeing in this case. Yes, that's exactly the point I was making: we use investigator data to help us create labels for these alerts. There are historical alerts which have already been labeled by different levels of investigators, and we use that to build a model. The model is trained based on the alerts, not on the rules. No, no, you don't label transactions. There are alerts that get generated because of a transaction or a set of transactions, and then that alert goes through the process of... let me go back here for a second; this might help. Yes. So a transaction or a set of transactions happens, and then an alert gets generated. Take structuring, for example: in that case you need a set of transactions to detect that structuring is happening. Once that alert gets generated by
the alerting system, you have an investigator who looks at all kinds of information that he or she can gather about you to decide whether this is structuring or not. So there is a manual process involved: someone makes a decision, and once they do, they classify the alert as suspicious or not suspicious. We take that as the label to build the model on.

[Audience member] Do you have to see a minimum number of transactions, for example, or anything like that, before you can say...?

[Speaker] What I'm trying to tell you is that we don't necessarily see a minimum-data problem in what we do, because we are trying to identify behaviors from many different vantage points, which means we artificially enrich the data merely through the different vantage points from which we look at it. That gives us a lot more valuable information. But we do not do oversampling. That is one of the things we do not do, because it tends to affect the whole modeling process in a very negative way. So oversampling is something that we very dogmatically avoid.

[Music]

Okay. People in the back, how bad is it? I know it's bad; I just want to know how bad. Okay, let's do it this way. Does this help? Okay.

So what I'm showing you right now is a flattened dataset. This dataset has your AML alerts, it has your KYC information, and it has your transactional features, aggregated or sampled, built into one flat table. I'll give you a few examples of how we look at the data. By the way, this is synthetic data. We have not taken it from any financial institution, so please don't freak out; we do not have any real information, although the distribution is very similar to a lot of datasets that we've seen.
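[Editorial sketch] The flat table described above, one row per alert carrying alert fields, KYC fields, and transaction aggregates, with the investigator's decision as the label, could look roughly like the following. This is a minimal illustration under stated assumptions, not the schema or feature module shown in the demo: every field name (alert_id, typology, target, n_deposits, and so on) and every value is hypothetical and synthetic.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, synthetic inputs. "target" stands in for the investigator's
# final decision on the alert (1 = suspicious, 0 = not suspicious).
alerts = [
    {"alert_id": "A1", "account": "ACC-1", "alert_date": date(2019, 3, 10),
     "typology": "structuring", "target": 1},
    {"alert_id": "A2", "account": "ACC-2", "alert_date": date(2019, 3, 10),
     "typology": "fast_cash_withdrawal", "target": 0},
]

kyc = {
    "ACC-1": {"line_of_business": "retail", "customer_age": 42},
    "ACC-2": {"line_of_business": "retail", "customer_age": 17},
}

transactions = [
    {"account": "ACC-1", "date": date(2019, 3, 8), "amount": 4900.0, "kind": "deposit"},
    {"account": "ACC-1", "date": date(2019, 3, 9), "amount": 4800.0, "kind": "deposit"},
    {"account": "ACC-1", "date": date(2019, 3, 10), "amount": 4950.0, "kind": "deposit"},
    {"account": "ACC-1", "date": date(2019, 3, 12), "amount": 3000.0, "kind": "deposit"},
    {"account": "ACC-2", "date": date(2019, 3, 9), "amount": 2500.0, "kind": "withdrawal"},
    {"account": "ACC-2", "date": date(2019, 3, 10), "amount": 100.0, "kind": "deposit"},
]


def flatten(alerts, kyc, transactions):
    """Return one row per alert: alert fields + KYC fields + transaction aggregates."""
    by_account = defaultdict(list)
    for tx in transactions:
        by_account[tx["account"]].append(tx)

    rows = []
    for alert in alerts:
        # Only transactions dated on or before the alert may feed its features.
        history = [tx for tx in by_account[alert["account"]]
                   if tx["date"] <= alert["alert_date"]]
        deposits = [tx["amount"] for tx in history if tx["kind"] == "deposit"]

        row = dict(alert)                           # alert fields, including the label
        row.update(kyc.get(alert["account"], {}))   # KYC fields for the account
        row["n_transactions"] = len(history)
        row["n_deposits"] = len(deposits)
        row["total_deposited"] = sum(deposits)
        row["max_deposit"] = max(deposits, default=0.0)
        rows.append(row)
    return rows


flat = flatten(alerts, kyc, transactions)
for row in flat:
    print(row["alert_id"], row["typology"], row["n_deposits"],
          row["total_deposited"], row["target"])
```

Restricting each alert's history to transactions dated on or before the alert is a design choice made for this sketch, so that features do not use information from after the alert was raised.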
So what you're looking at is someone's account number, what day it is, and the line of business. By the way, the moment you see line of business, for a lot of people it probably rings a bell: we are looking at retail data in this case. It's retail banking, not necessarily wholesale or private banking. Then you're looking at the typology. The typology is the family of alert that an alert belongs to. For example, someone mentioned structuring; it could also be tax evasion or fast cash withdrawals. Any of these fall under typology. Under these you have rules, sub-rules, and so on, which tell you whether this triggered because this person moved $5,000 or $10,000. Sometimes someone under 18 moving twenty-five hundred dollars gets triggered. These are all sub-rules that fall under typology.

Having said that, there are two key things you have to look at when you're trying to build a model on this AML data: something called target and something called case. I'll go to case first, and then we'll come to target. Case covers the first two levels of evaluation of an alert, where the investigators in a financial institution are saying, "Hey, this seems like a case. I wouldn't yet send this to the government for further investigation, but this seems like a case." That's when investigators mark an alert as a case; they just change the flag. Now, coming to your question: once, let's say, a senior investigator looks at it and says, "Hey, this seems like something that we can send to the government," what you send the government is called a suspicious activity report, or
SAR, as I'm guessing you know. That's what is identified by target: if something is set as a target, that means it's eventually suspicious.

Then you're looking at things like whether it is a withdrawal or a deposit, and you have a lot of KYC information: your account prefixes, your transaction banking information, transaction code, your amount, your time of transaction, and whether this is a manual transaction, a transfer, or a branch transaction, which comes from who actually handled the transaction. Then you also have certain KYC information: the number of ATM withdrawals of a customer, the number of ACH credits the customer had, the number of credit purchases the customer did, year to date or month to date. So we take all this information, aggregate it, and build a flat dataset.

This dataset could also contain your card transactions, if you have that information, and it could contain information about collateral and loans. I kid you not, one of the most amazing rules that exists in AML is fast repayment of loans. I'm sure you guys know about it: an uncle you never knew just came over and paid your loan. That's actually money laundering in a certain way, so you have these kinds of alerts. You can also look at your collateral, your loans, how creditworthy you are or not, and go from there.

Having said this, what we'll do now is try to build a simple model with it. We have already dropped in the module that builds the features, so I'll briefly show you. What we have in this instance of Driverless AI is a sample module which
builds features for you purely for AML purposes. Because we're using Driverless AI, you need to let Driverless AI know which column you're going to build the model on, your labels for that matter. Here we choose target, because we want to see what goes into SARs, or suspicious activity reports. Driverless AI also has some tuning knobs: how accurate you want the model to be, how much time you want the building process to take, how interpretable the model needs to be, and so on. These are knobs that you can use to tune. One of the things we insist on when we build this model is the F1 score, which means that it minimizes false negatives as much as possible. And one of the reasons... yes, please?

[Audience question, not captured]

[Speaker] Yes, this is a binary classification. Yes, in this case. No, not in this case. Yes.

So what we do is take the F1 as our pinning value. The F1 helps you minimize the false negatives, and the reason that is important is that every financial institution we've worked with tends to have an MLRO, a money laundering risk officer, and this officer tends to tell you, "Hey, I do not have the appetite to lose 1 percent of my false negatives," or 5 percent, and you have to use that as your pinning figure. That should be the target, not the false positives and not the true positives. Of course those should also be good, but this is the very first thing you'll have to tune the model on. Once you do this, let's pray.

[Audience member] I think, with false negatives, you do that because some human is going to check it again later, correct?

[Speaker] Could you repeat the question, please?

[Audience member] If you minimize the false negatives, it means you are going to accept a lot of potential cases, because then they are
going to be filtered again by some humans?

Not necessarily, because we've implemented the outcome of this model in two ways. Some financial institutions say: hey, I want an investigator at the end of this pipeline. In that case what you say is right, there is a human being who is looking at the outcome. In other cases we have financial institutions that say: whatever comes out of this as true positives, we'll just file them as suspicious activities. So there are both ways of doing this.

Let me try and zoom this out. What's essentially happening right now is that we are building a model, and the model we're building is based off this data set, which had alerts in it. Because we have pinned ourselves to the false negative score, we are trying to get as good a value as possible, which you can kind of see here. So we're trying to minimize the false negatives as much as possible, and that's essentially what's happening out here. You can see the features that have actually gone into the model. We have many different features: some come out of the alert activity, some out of the transactional activity, and some come from your KYC information. These features add value and build your model.

So this is essentially the whole modeling process that we're going through, and this solution works with Driverless AI as a back end. Let me just switch to the slides for a second while this model is running and tell you how the whole pipeline works. Can the people in the back see this? Is it okay? No, maybe not. Okay, let me show the whole thing; maybe that'll help.

So what we have here: the solution has, of course, a set of pipelines within itself, and it's got its own database. When you're putting it into a financial institution
you have daily uploads of data coming in: transaction table uploads, alert table uploads, and KYC information uploads. Essentially there's a first roll-up set of features, where we are trying to build the whole data set into some form of flat data set so that we can get Driverless AI to ingest it. Once this is done, there is a complicated set of joins, and the reason is that you have to find the alert specific to a day, or a day before, and so on, and then join it back to its transactions and the KYC information. Once this is done, Driverless AI ingests it and you can build a model, and this model gets deployed. Very simply put, we follow the same flow in terms of productionization as well: the same ingestion happens, feature creation happens, you have a flattened data set, but now the pipeline is focused more on the scorer. Driverless AI helps you make that switch really easily, because it's a one-click deployment of a model into the system.

Having said that, this is a brief write-up about how we deploy it. As I was saying earlier, this AML solution doesn't modify any of the existing rule-based solution. It runs in parallel to whatever rule-based solution you have in your financial institution. There are quite a few reasons for that. One of them is that these systems are super expensive, and they follow amazing regulatory guidelines to see that they get the rules right and all those things right. So ML just can't change this in a day. I'm sure you guys probably know, it takes some amount of time to get that whole process through, which is why we tend to work with it instead of just
against it. Much of the work in this case, the feature transformations and the model, gets built using Driverless AI, and we ingest the alerts that are generated from these systems. There is also automated documentation which comes out of these models. So when you send these alerts, you are able to identify which features were actually valuable in terms of how this was classified as an alert or not. There's also automatic generation that happens in Driverless AI itself, which helps the pipeline when you're trying to file your SARs.

Having said that, this is a very simple structure of how the whole thing fits in your organization. You have your data that gets fed in, based off an ETL process; you have the AML solution that's deployed; and then you have either a manual review, which is the investigator that you were talking about, or an automatic case filing system that it goes through.

Let's switch back to the... okay. In this case we have built the model, and you can see that the module also tells you which features were actually valuable when the model was being built. So our false negative rate is about two, and we've been able to classify the remaining true positives and true negatives. Simply put, we've had about 40 that are false positives, and that's essentially because we were focusing more on an F1 score to build our model rather than just trying to maximize our classifications. Yes?

So in this case what we're using is something called a GBM, which is a gradient boosting machine model. It's a tree-based model, but it's quite different compared to a random forest model. Essentially what's happening here is that every successive build
is actually built based off the errors that you have in the previous step. So that's essentially how it is in terms of GBM, simply put.

The second question I have is related to what he had sort of asked. You said you deliberately avoid oversampling, but you clearly have a highly unbalanced data set with very, very few true positives or false positives. So how do you balance the data before training? Because otherwise it's not really going to work.

So, as I said, we do not oversample. If it is much, much smaller than the amount that we can actually handle as a model, we move the problem entirely to a different system, which is an anomaly detection system, and it's not a supervised problem anymore. So we move it to an unsupervised problem if it's too imbalanced. That's essentially because of the way these alerts are generated and the way transactions exist: there is a lot of time-based dependency in these transactions, which means that oversampling these atomic transactions or atomic alerts is not going to help us classify something. It will just give us better numbers in terms of classification. But if you were to build a relationship about, let's say, how many times I did a transaction today at Starbucks, and that is important for us, it won't be captured when you're oversampling your data. That is one of the reasons why we avoid oversampling data. If that's the case, we move the problem to an anomaly detection problem, which is an unsupervised approach. Yes?

So I will quickly pause here... actually, I have one more slide, sorry. Certain things to remember before you go about doing these kinds of things: your false negatives are probably one of the most important things you have to focus on before you build these models. Any amount of reduction in
terms of false positives is perfectly fine. With a very strict guideline, where the risk officer has told us 0.1% false negatives is all they can handle, we have seen about a 16% reduction. With close to a 5% false negative rate allowed to us, we have seen close to an 87% reduction in false positives. So there's a huge range depending on the false negative amount that you can pin on, which is why how many false positives you can reduce is secondary. Try and focus much more on what your risk officer is actually telling you; based off that, your model will be much better.

Having talked about the false negatives, the other thing that is also very important: you've got to make sure that your data is of good quality. That's one of the most important things. If you do not have enough KYC information, if you do not have enough transactional information, then there's no point in building these alerts, because it's not going to be valuable enough for you. So, having said that, I'll open it up for some questions, if anybody has any. Thanks.

I'd just like to understand: the type one error is the false... how do you say that, type one error?

So in this case we basically break it down into false negatives and false positives.

So what you're saying is that what is more important is...

Yeah, focus on your false negatives, to keep them as low as possible. If you have a good enough model, your false positives and true positives will get classified.

So it's more risky to have a missed opportunity?

Yes.

Okay, so you said the existing rule-based system is very expensive, right? Millions?

Yes, yes.

So why don't they improve it? I mean, if you're paying that much money... if I buy a system and it generates ninety percent wrong data, I want improvement, right?

Fair enough.

So can you use this system and feed it back to the rules, to improve their rules, so
based on your learning here?

So, I would say around four years ago, most of the modeling that we were doing, we were actually feeding back into the rules. The financial institutions were much more risk-averse, which means that they were using it to tune their existing systems. But it's a slow process. One of the reasons is that we, who are trying to push ML in these areas, have to understand that most of these systems are vetted really well by regulatory authorities. The alert and rule generation systems follow a lot of rules decided by FinCEN, which is the agency out here, and by the central banks across Europe, Asia and all these places. So, having said that, replacing any rule-based system with ML is going to be a time-consuming process. These guys have invested a lot of money and people in it, so to get them to understand that this moves quickly is going to take a bit of time.

You can, yes. I mean, if you want to save that 15 million that you're paying, you can replace it. But a lot of them used to tune with it earlier, and now they use it in parallel, or they use it for automatic classification. It's a slow process, a step-by-step process. Probably the next step would be where we see complete ML domination in terms of this. Yes, please?

So in this case, one of the things that we look for is fast money transactions, which can involve your card transactions, but credit card fraud itself is very different from this use case. I think there was... yes?

Have you guys found, while trying to sell the solution to institutions (I don't know how much success you've had so far), that the regulators here are not ready to accept
machine learning as a legitimate way to detect money laundering? Because, I mean, we've definitely come across that too. That's why financial institutions do choose to buy Actimize or FICO: the regulators are very comfortable with them. It's very easy to say why it made a decision, because it's literally a rule: it was X amount of transactions in X amount of time of this amount, or whatever. But with machine learning, unless it's explainable...

Yeah, so we don't necessarily see that as a principal problem, and one of the reasons is the explainability. The documentation that comes out of Driverless AI, the explainable part, is looking at a lot of features, and it gives you the option of simplifying your features. Something you can do in Driverless AI is say: I do not want a complicated model, like a stacked model ensemble or any of these things; give me a very simple model. At that point it will just give you a very straightforward model, with features that are statistically well accepted, which I think is what you were hinting at when you talked about what regulators accept. That essentially solves the problem for us. But there are a lot of financial institutions who say: yes, with the regulators we'll work with these kinds of models and these kinds of features, but for us, show me what's the next big thing. That's where we separate the space, and we say: for anything that's getting kicked out, use the old school; for anything you want to explore, use the new age. Not "new school", I'm sure. Yes, please?

So what kind of results do you see for the automated machine learning system, in terms of performance, compared to the rule-based system?

When you say performance, could you clarify? Is it system performance, is it...

In terms of, I guess, F1 score. Is it much better, similar, or not as good?

Oh, it's much better compared to what we see,
but you do have to understand that in this use case we're working post alerting systems, which means that things that were missed by the alerting system are already lost for us as well, because we're using their output as ground truth for supervising.

Also, a second question: when you have the target and you also have cases, what is a data point that is neither a target nor a case? Is that just a flagged event that was only flagged by a small number...

Yes. That's actually a fantastic question, because we try to exploit those kinds of alerts a lot more. Usually what happens in the investigator realm is that they just ignore it, saying this is useless, nothing is going to come out of it. But for us, a repeated alert, where every six months or every month there is one coming up but the investigator is saying no, this is nothing, is very interesting in terms of behavior. So we've been able to squeeze those kinds of alerts for a lot more features. I would say we've been successful, but it's not very obvious; you can squeeze out some features from that as well.

Thanks. My question is: is there any way to twist the model so that it is unsupervised?

When you say twist, do you mean I just remove the labels and start building the model unsupervised, or are you looking at it in some other way?

Yeah, any opportunities like that.

Okay. There are a lot of situations where... I think someone actually mentioned it, oh yes, the gentleman behind you was mentioning not having much data in terms of labels, and what do you do then. At that point, what we do is try to look for similarity between unsupervised and supervised models, to see if the same kinds of features come in. We use a lot of clustering techniques. We try to avoid anything that is a reduction in dimensions, because if you try and do
dimensionality reduction, then essentially it's a lossy problem, meaning you're losing a lot of information. So before we come to any of those points, we try to see if we can recreate the supervised model using unsupervised approaches. It's not as straightforward as it seems, but that's essentially how I can word it: we try to see if the model built in the unsupervised way can classify the same way as in the supervised space. That helps us solve certain problems where we don't necessarily have labels, or where investigators are not fast enough to give us the kind of alerts that they can. This problem ties back to what I think the gentleman was asking about fraud: investigators cannot solve fraud as quickly as possible. There are fraud alerts that come through as well, but there are certain rule-based systems which tend to classify them. We use that approach in fraud. Okay, questions?

So how much training data does the system need to see before it reaches a certain level of effectiveness?

Good question. In the worst case we have seen close to, I would say, 6.5 million transactions a day. In the best case, and this is probably the pot-of-gold kind of thing for people who do data science, we've seen 1,500 trillion transactions a month. So that's the best case.

Fifteen hundred trillion transactions a month... 1.5 trillion transactions a month, my god.

Yeah, sorry. So once we enter the zone where we need to balance, we change the problem into an unsupervised problem; we do not keep it supervised anymore.

So you mean the data is balanced, because the alerts themselves...

No, no. We don't necessarily balance the data itself. We don't oversample to balance the number of alerts that we have. If we have a really low yield, then we move the problem to an unsupervised problem. Does that help?

Yeah. I think maybe the F1 score itself... okay, so you don't have to balance the data, you
optimize for F1?

Yes.

Okay. Sorry, maybe you said this at the beginning; I arrived late. Given that the number of cases you analyze is so high, I'm curious how many predictors you have in your model.

So there are certain restrictions that sometimes we have to play with. In a worst case we've had 129. In a best case we have hit close to, I would say, about 1,200.

Just to complete that: in general, what ratio of these are numerical compared to categorical? I mean 50/50, 20/80? What would be the ratio between numerical and categorical?

In terms of features? I would probably say 95 to 97% numerical. We also use hierarchical features, if that helps. Yes, I think there are two more in the back.

At the beginning you mentioned that you're doing packet sniffing. Do you do that through wired or wireless networks?

Both, wired and wireless.

I'm almost a hundred percent sure you mentioned this in the beginning, so I apologize: you're using the financial institution's specific historical data, their unique data? You're not using consortium data, no consortium data model?

No.

You guys don't do that because you found that it obviously doesn't work? Because what you see, like you said, is that even a specific financial institution within the same country will look different.

So we don't necessarily use any form of consortium data. But the way we've done it is we've looked at different behaviors across different geographical, endemic zones for that matter, and we have used things that are helpful on the other side: features and behaviors that can transfer between zones.

And you're sifting through these alerts on a batch basis, I assume, right? Not in real time, because there's no point?

Yeah. I mean, with these alerts, I think banks have enough time to actually catch a
person who's doing money laundering. But we give them the option of deploying it in real time in case they want it, so you can run the same thing in real time as well. Because of how banking systems work, most of them deploy it in batches, which means it runs in the morning or early morning, something like that. But there are some financial institutions that are not necessarily working in retail, which means they have a different need; for them this works in real time.

Thank you.

Any other questions, or are we good? Yes?

What is the maximum individual transaction amount that you deal with in the model you mentioned? Because they seem to be regular transactions, not in the hundreds of millions, 250 million or higher.

So we've actually seen amounts that I would probably just stop at saying are in the millions.

Not routinely?

No, no, quite a few.

So you deal with Standard Chartered banks and that kind of thing?

We see different kinds; correspondent banking happens as well. What you saw in the data set is retail, which means it's people like us, not necessarily institutions doing transactions or things like that. Their transactions tend to be much different. You would see those behaviors in wholesale, institutional, cross-border banking, and at that scale you would actually see larger amounts.

A question about government: do you do anything with government banking systems, what they do, the investigative branch, et cetera, in different countries?

No, we don't.

Oh, you don't have such a case. Thank you.

[Applause]
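Editorial sketch, not part of the talk: the idea of pinning the model to the risk officer's false-negative budget, and only then counting how many false positives disappear, can be illustrated roughly as below. This is a minimal sketch under stated assumptions. The talk uses the F1 scorer inside Driverless AI; the code here instead applies a plain threshold to scores from any model, and the function names `pick_threshold` and `confusion`, the scores, and the labels are all hypothetical, made up for illustration. It is not the Driverless AI API.

```python
# Toy illustration only: scores, labels, and function names are invented,
# and nothing here calls Driverless AI.

def pick_threshold(scores, labels, max_fnr):
    """Highest cut-off that keeps the false-negative rate within max_fnr.

    An alert is kept (for review or filing) when score >= threshold.
    labels: 1 = alert ended in a SAR, 0 = alert was closed as a false positive.
    """
    positives = sorted(s for s, y in zip(scores, labels) if y == 1)
    if not positives:
        raise ValueError("need at least one positive label")
    # How many true SAR alerts the risk officer's budget lets us miss.
    allowed_misses = min(int(max_fnr * len(positives)), len(positives) - 1)
    # Only positives scoring strictly below the cut-off are missed, so cutting
    # at this element of the sorted list misses at most `allowed_misses`.
    return positives[allowed_misses]


def confusion(scores, labels, threshold):
    """Return (tp, fp, tn, fn) for the rule 'keep the alert if score >= threshold'."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        kept = s >= threshold
        if kept and y == 1:
            tp += 1
        elif kept:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Ten alerts from a hypothetical rule system, scored by some model.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.35, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
# Without a model every alert is investigated, so every non-SAR alert is a false positive.
baseline_fp = labels.count(0)

for budget in (0.0, 0.25):
    threshold = pick_threshold(scores, labels, budget)
    tp, fp, tn, fn = confusion(scores, labels, threshold)
    fnr = fn / (tp + fn)
    fp_reduction = 1 - fp / baseline_fp
    print(f"budget={budget:.2f} threshold={threshold:.2f} "
          f"FNR={fnr:.2f} FP reduction={fp_reduction:.0%}")
```

On this toy data a zero false-negative budget removes 4 of the 6 false positives, and loosening the budget to 25% removes 5 of 6, which mirrors the point in the talk that the achievable false-positive reduction depends on the false-negative rate you are allowed to pin on.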
Info
Channel: H2O.ai
Views: 7,259
Rating: 4.9402986 out of 5
Id: Cl-dCqHiYfg
Length: 65min 28sec (3928 seconds)
Published: Tue Oct 08 2019