Cybercrime in The Deep Web

So, welcome everybody. Kind of happy to be here again at Black Hat to give a talk, and this time I'm here with Vincenzo, a colleague of mine. We actually worked together on this research, and the title of our talk, as you can read there, is Cybercrime in the Deep Web. We started working in this domain pretty much three years ago. At the time I was down in Chile for a conference, and a friend of mine from Argentina showed me how you could use Tor, the Tor client, to get into the Deep Web. At the time I didn't even know what the Deep Web was. He showed me that you could buy assassination services in Thailand, drugs, so I was pretty excited about that. So I said: OK, cool, why don't we develop a tool to basically crawl the Deep Web, to collect as much data as possible? So we started collecting the data, and it took a while, because we collected more than two years of data. During this talk we're going to present the tool that we developed to automatically crawl the Deep Web and also analyze the content.

And, yeah, quickly about ourselves. Nowadays we both work for Trend Micro research, and we both come from academia. Me personally, before that I did my PhD in system security, and I worked extensively with the guys that are behind Anubis and Wepawet, so the system security lab, as it's called. We pretty much work on any field of system security, from web to malware. Last year I was at Black Hat presenting research on AIS, which is a tracking system for ships. So we are basically trying to position Trend Micro not only as an AV vendor but more as a 360-degree vendor in security, so any topic related to system security is good to go. I think Vincenzo can introduce himself, I don't have to do it. He will take it from here and do the first part of the talk, and I will follow up a little later, discussing in a bit more detail the types of malware which use the Deep Web, and why.

Thank
you, Marco. So, unlike Marco, my background is more on networking. I got a PhD; my PhD topic was basically peer-to-peer networks and how to make them talk to each other and so on, so it was a fairly good fit for this kind of research. After that I joined Marco at Trend Micro and we started working back to back on these kinds of projects, amongst which also this Deep Web research. I have a rather stronger development background, which made me ideal to develop the system together with him. The idea here was, again: when we became aware that there's a world out there of resources that are stealthy and hard to track, which might be ideal for bad guys, we decided that we wanted something to both do some research on top of it and give our colleagues, who have more of an analyst profile and do investigations, some tools to track malicious actors around the Deep Web as best as we can.

So the road map of the talk is going to be the following. I'm going to give you a quick introduction of what we mean by Deep Web. I'm going to give you a quick overview of our Deep Web Analyzer, the system that we developed to crawl the data, analyze it and organize it in a way that is easy to navigate. I'm going to give you some examples of what we found in terms of interesting resources related to illegal trading, so all that crazy stuff that you read in the press from time to time, we'll see it here; and some examples of the kind of data analysis that we can do and how we can exploit it to find, again, interesting actors. Then Marco is going to focus on the malware part, so show you some malware that exploits the Deep Web, and how, and how we can track it, and then draw some conclusions.

So first of all, I'd like to charm the crowd by being extremely pedantic; I've been told that that works. Some definitions first. It is very important to
make this distinction, because if you look at early research that mentions the term Deep Web, probably around the beginning of the 2000s, what it referred to as Deep Web was in fact content not indexed by search engines. That's where you get those papers that claim that the Deep Web is like the submerged part of an iceberg, of which the surface web is just the tip, and so on, and the estimates that we have something like 30 million terabytes of data hidden in the Deep Web. Which makes sense if you adhere strictly to that definition, a definition which includes your home banking page, or anything that has a login page, or pretty much anything that has a robots.txt file configured so that search engines don't index it.

We have another definition, the one that interests us more, which is darknets: private overlay networks dedicated to hosting resources like sites, images and so on in a stealthy way. Deriving from that we have the dark web, which is the actual World Wide Web hosted on top of darknets.

This distinction is important because, for example, in the press you get quotes like this. This is a documentary which came out this year, from Alex Winter. Curiously enough, if you've seen Bill & Ted's Excellent Adventure, one was Keanu Reeves, the other one was Alex Winter, so it was that guy there. He made this documentary about the whole Silk Road trial, documenting everything, and if you look at the trailer (you can do it right after the talk) it starts with this quote: "The Deep Web is vast, thousands of times larger than the surface web", said with a very deep and scary voice, scary music and so on. Which, again, as I mentioned, might actually apply: if you consider the Deep Web as your home banking account, then of course it's vast and there are a lot of resources there. But when you look at what's actually interesting for cyber
criminals, the size might not be that big, and the iceberg metaphor doesn't really apply there. So then you would say: but then why do you call your research Deep Web research? You just made the distinction. That is because what we consider interesting for malicious actors is in fact not just the dark web, which includes Tor, I2P, Freenet and so on; we also try to look at other resources that might be of interest to cybercriminals because they are less regulated, like rogue TLDs, which have a separate infrastructure, and so on. I'll go through these very briefly, just to give an idea of what we're talking about.

So, Tor. Raise your hand: who has never used Tor in their life, never heard about Tor? I see one guy there, another guy. OK, so we don't need to spend too much time on this; I think everyone knows it. It started as a way to allow people to go on the internet anonymously, overcome censorship and so on. Later in the development they introduced hidden services, which not only allowed people to go to, say, Google or a censored website and circumvent firewalls, but actually allowed people to host websites in an anonymous way, so that the users are untraceable and the servers are untraceable too. Onion routing: your request jumps from hop to hop, encrypted multiple times, and each hop on the path can only decrypt one layer at a time, so nobody ever knows the complete path. That's it in brief.

I2P is the less-known brother of Tor, I'd say. Unlike Tor, it doesn't technically allow you to browse the surface web anonymously; it's dedicated to hosting hidden services. You can recognize I2P hidden services because they have a .i2p top-level domain; for Tor those would be the .onion
domains, in case someone doesn't know. It has a slightly different routing system: in order to reach someone, you have a distributed hash table, something like a NoSQL database, that contains an inbound tunnel you can contact to reach somebody, and that somebody can answer back through his outbound tunnel, which gets routed to your inbound tunnel. All nice, it guarantees anonymity, multiple levels of encryption and so on.

Freenet: this is kind of the weirdest of the three. It is the oldest one of the three, in fact it's where I2P comes from, and it doesn't really allow you to host a proper website. Whereas in I2P you can deploy an Apache stack and host a proper website, Freenet is mostly for resource lookup. You can post a resource on Freenet, like an image or a static HTML page; it gets identified by a hash key and you can retrieve it by using the hash key. You have a neighborhood made of the nodes you connect to; you issue a request for something, the request goes from node to node until it's found somewhere, and then you get your response back. So you can't really do dynamic web, but if you need to host, let's say, picture galleries or static files, that's just perfect.

In terms of Namecoin and Emercoin: this is not really stealthy website hosting, it's mostly a way to register domains bypassing ICANN's regulations and so on. These are systems that allow you to register domains using blockchain-like technologies. Imagine cryptocurrencies like Bitcoin and Litecoin: rather than storing in the blockchain actual currency transactions, like "I give ten bitcoins to that wallet", you register domain name information there, so "that guy just registered this domain name under .bit", which is the top-level domain for Namecoin. Again, you need specific software to connect to it, but it allows you
to host domains outside of ICANN's jurisdiction, for example. Same thing for rogue TLDs. This is the oldest of all the systems that use this trick; it was fairly popular in the 90s, when you had plenty of companies that would offer you your own top-level domains on their own DNS servers, provided that you connect to their DNS. Some of them still survive today: Name.Space, I think, is still doing something; OpenNIC; the Cesidian Root, which is pretty much a whole weird world of its own. It's a system developed by a guy who invented his own country, religion, medicine, degrees, calendar, time and internet. He has deployed probably 20 to 30 DNS servers all over the world that respond for that DNS. But that's another story.

So, with this in mind, we figured: let's see exactly what's out there, let's collect some information, see how we can index it in a proper way so that it becomes usable, and see what we find. We spent a year or so developing our Deep Web Analyzer, and this is the overview. It's made of several modules. The first one is of course data collection: we try to find URLs everywhere we can, or in as many places as we can. We index them in a way that allows them to be searched in very smart ways, and also to pull statistics; we'll see how these statistics can become very useful when you want to spot malware. On those URLs we do what we call scouting, which is pretty much page crawling: finding everything about the URL. If it responds, the HTTP headers; if it doesn't respond, why: do we have a problem with the resolution, is it giving us an HTTP error, or something else? We get the page content, take a screenshot, parse the text. We do an active port scan on some of the addresses to find things other than just HTTP, and so on. And we do that through our gateway: we have a universal gateway that allows us to go everywhere in the
Deep Web through just one port, and combines all the different software. With the information we get we do some enrichment: namely, we translate the pages, compute word clouds and so on. And then of course we can access the information, either through statistics software such as Kibana 4 (we store everything in Elasticsearch, so Kibana 4 is the ideal companion to pull out nice graphs and trends and see exactly how much data we're collecting), or through a custom portal we developed, which allows us to do what we call the qualitative analytics, meaning actually go and look at the data we have: what's the screenshot, what's the data there, what are the interesting links and so on.

In terms of data sources, this is pretty much what we try to gather. We get user data, anonymized user data in particular: from our users we know what HTTP connections they're making. We don't know who is making the HTTP connection, but we can see the connection and where it is going, so we parse them to find all the connections going to .onion domains, .i2p domains, .bit domains and so on. We scrape sites like Pastebin, for example, and all the similar ones, Slexy, Pastie and so on, to find pastes with interesting addresses. We have a Twitter feed: right now it's the 1% feed, but we're working to increase it to the full firehose, to see whether in tweets we can find .onion, .i2p, .bit, and maybe in the future recognize botnet traffic and so on. We monitor some subreddits, like r/Tor, r/onions, r/DarkNetMarkets and so on. There are also some URL listing sites, like Darknet Links, I think, or onionlinks.com: just sites that post lists and lists of links that you can get there. Then Tor gateways: there are sites like tor2web.org or onion.cab which are dedicated to letting you access resources on the Tor network without having to install the
client. The disadvantage there is of course that you're not anonymous anymore, because that website will see you connecting, and they keep statistics of the most popular Deep Web sites accessed through them. Those statistics are public, so we can pull them each day and see if there are new, upcoming domains that are interesting. I2P host files: this is something similar to the server.met file that you had when some of you were using eMule to do piracy stuff or something, or absolutely not. It's a text file that contains some information to help resolve I2P domains. Officially an I2P domain is a big long hash followed by .b32.i2p; that's the actual address that is stored in the system. But you can register readable domains, such as git.i2p or something, which get resolved to their actual hashes, and in order to resolve them you can download these text files, which give you this lookup information. Of course, by downloading these files on a daily basis we can see if new domains get registered. And then of course we've got the feedback loop: when we scout new pages, we get all the links and feed them back into the system to collect statistics, index them, and so on and so forth.

So this is the shared gateway that I mentioned, our general Deep Web gateway. It allows us to have a system point to just one port; or if we have people doing investigations who want to access the Deep Web, they just go to the one port we've deployed there. What it does, through Squid, is that depending on the URL being accessed it can redirect to either Polipo and Tor, where we've got 64 instances running, or to an I2P instance if it's an I2P domain, or to Freenet, which we have running, if it's something resembling a Freenet URL. And then we have a custom DNS resolver that resolves normal DNS, rogue
TLDs, Namecoin and everything. So: connect to that port, get everything you need. It makes things so much simpler.

In terms of scouting, as I mentioned, this is all the data that we try to get. We use a headless browser that gives us back full HAR logs, meaning all the HTTP transactions that happen in the page. What we have is a virtual browser window that connects to the address, and all the other connections that follow, either HTTP redirections or other connections, are logged there. We store them, and that can help us, for example, find drive-by downloads or whatnot. We get the full page after executing JavaScript and everything, so we get the final DOM. With that we take a screenshot of the page; we take the title, the text, all the metadata like keywords and meta tags, resources, all sorts of stuff. For some pages we store the raw HTML, because it can always come in useful. We extract the links, as I mentioned, and we try to parse email addresses and Bitcoin wallets. That's the kind of information.

This is what we're actually using. It's a nice piece of code called Splash, from Scrapinghub, which, as I mentioned, offers a nice headless browser that you can deploy on your machine. It can either act as a proxy or respond to REST requests: you can either browse the web using it as a proxy, and it will store screenshots or data along the way, or you can use it as a server, connect and tell it "OK, go to this page through this proxy and get everything you can". To access it we modified some Scrapy libraries and created some middlewares, and we also integrated it with some shared queues so that we can launch multiple processes and make the whole thing more efficient. And, as I mentioned, we extract all the links and everything. The other interesting part about this software is that you can also script it through Lua and
have it interact with pages, and that it supports no-script and Adblock scripts, so you can even use it for safe browsing, basically.

In terms of enrichment: as I mentioned, we collect the data and do some additional operations on top of it. Namely, for all the links that we find we go to our web reputation system and try to get the classification; for the surface web links, at least, we get the classification and categorization. So: is the link malicious, is it safe, and if it's malicious, why is it malicious? Is it a disease vector? Is it a link to a proxy avoidance service, for example, which doesn't really make it interesting, but we've got it there. We do translation: we try to normalize all the text to English, because that's what everybody in the team speaks, so at that point we can at least get an idea of the page even if it's in the most remote of languages. We do language detection as well, of course, while we do the translation, which allows us to pull some statistics on the most popular languages there.

And we do a significant word cloud, which means that we try to pull a word cloud out of the text, but in a more intelligent way, I'd say. From the text scraped from the HTML we create our tokens, meaning words and occurrences: the word and how many times it appears in the page. We filter them and keep only the nouns, so "table", "chair", "stage", "screen", just the nouns; we throw away all verbs, adjectives and so on. With the nouns we compute what's called a semantic distance matrix. We use WordNet, through a Python library, which contains basically a graph of all the words in the English language, connected to each other according to their meaning, their taxonomy. So, for example, words like "dog" and "cat" will be fairly close in that graph because they're both animals; "baseball" and "basketball" fairly close because they're both sports; "dog" and "baseball" pretty far apart from one another. So we're
using that graph: you can compute, in terms of hops, how distant two words are from one another. We compute the distances for all the word pairs that we have and use them to do hierarchical clustering, so we create clusters of words with similar meanings, take the first of the words in alphabetical order as a label, and then use that information to compute an actual word cloud that tells us something more than just adjectives.

Here's an example. I scouted a Russian forum, it's called Russian see. It looks interesting; I have no freaking idea what it's saying, because unfortunately I don't speak Russian. But if I check the word cloud, at least from the top 20 words it looks like it's about "registration", "rules", "forum", "trading", "resources", "Silk Road" of course, "register" and so on. So it gives us a glance right away at what a page can be about and whether it's worth looking further into or not.

This is, more or less, an estimate of the data that we collected. In two years we had something like 40, 40.5 million events, where an event is each time we find a URL: we spot a URL, that's one event; we spot it again, that's a second event. In total the URLs that we collected lie around 600, 610 thousand, for a grand total of more or less 20,000 domains, or 20,500 domains. Keep in mind, though, that it's very volatile data. You get something like Operation Onymous and in one day all your data becomes crap, because there's a massive takedown and most of the sites go down. All the data you collected works as historical data, but the sites are not there anymore. I think right now, of all the 20,000 domains, we're probably around 9,000 actually online, last time I checked.

So I'm going to show you a couple of videos that give you an idea of how our portal works. You will forgive me if
I don't do the live connection, but it's a double tunnel connection and I really don't trust wireless networks around conferences. So this is an example of a URL summary page that we have in the system. With the statistics I just showed you, here is a breakdown where all our URLs are grouped by hostname; clicking on the plus, as you'll see, we can get the paths, and clicking further we can get the query strings. In green you can see sites that contain pages that we managed to scout, so sites with pages that responded to our crawler. In yellow are pages that we tried to crawl but that either didn't answer, didn't resolve the name, or gave us an HTTP error; those pages are not to be discarded, because they can be interesting too. In red we find pages that we scouted and that contained links that we rated malicious, and those should be interesting on their own.

We can do searches and filters. For example, here I'm filtering to find just the I2P-related pages; here you get an example of those long hashes with .b32. I can filter to find just the malicious pages, and what I find here, for example, is scan4you.i2p, which happens to be a counter-AV service, something like VirusTotal for bad guys, where you pay to submit your piece of malware and get it checked against all AV engines, much like VirusTotal but without divulging the information. So if you're developing some new malware you can get it checked there and have the information kept private. Again, the word cloud here gives you the idea: "antivirus", "check", "list", "virus" and so on. We get the page screenshot right there, and under it the word clusters. The word cloud is made of 20 words; in the table below you see all the words that got picked, so you can see the kind of clusters that you get: list,
database, blacklist, spyware, software, guard, defender and so on. After that we get the links; yeah, as soon as I move back, we've got the links there, all the ones present in the page. The link that got the page marked red is the "ex secret" .com one, which was rated as Internet security, disease vector, and marked as dangerous. That is because it's actually a site selling crypters, software that allows you to obfuscate your binary. And then we've got the tags, resources and so on.

Another demo. This comes from the scouting page, which is the page where we can do searches either on the page content or, for example, on the URLs, and actually look at the data itself without caring how the URLs are organized. In this case we show you how to do a search on the URLs to find, for example, open directories. There's a particular kind of open directory, namely the one generated by wget, which presents a URL with two query parameters: a C, which means sorting criteria, and an O, which is the sort order. In our case we store and index the URLs by all their different components, which allows us to search for those specific query parameters and retrieve pretty much everything in our database that is an open directory. Which is fairly interesting, because then you can go and find whether there are drop zones, leaks of all sorts. So here, for example, we open one: the screenshot shows that it's a file list, as we expected. The word cloud here doesn't mean much, being a word cloud made of file names, which would be mostly gibberish.

So, back to the presentation. Let's see some examples of what we found there, the pages and interesting things that we got to spot. One of the most interesting, of course: guns. This is a store claiming to be in the UK; you can basically find
stores everywhere, pretty much in every part of the world. I honestly have no idea how convenient 1.4 bitcoins for a Glock 19 is; to me it's convenient mostly because, you know, it's not registered. And again, you wouldn't go there to buy guns because of the cheap price; you would go there because you need no license or anything.

In terms of drugs, I don't even need to say it. This is what everybody sees in the press, from marijuana to all sorts of drugs; you can see here crystal meth, coke and everything. This is the main driver of the economy in the Deep Web, no doubt about it.

Passports, fake IDs. This site claims that it will provide you with a Dutch passport for 600 euros, which is fairly interesting. They've got many interesting nationalities, the US one being the most expensive. You can get it with a driving license and ID card as well, and they even take care of duplicates in case you are so clumsy as to lose your fake passport.

Counterfeit money: here's an example with euro bills, 25 by 50; for bitcoins you can get a fair amount of 50-euro banknotes. Or at least that's what they claim. About that: we can track the claims of the website, but in order to check whether the website is legitimate you need to either do an investigation or an undercover operation. I can't really go and try to buy stuff to see if it works, because that's called financing criminal activities in most countries. So take these as what they claim to be. They look like respectable shady businesses, whatever that might mean, but I haven't checked personally.

Credit cards: again, you can buy them in bunches, with a maximum balance. Of course, being either stolen or cloned, they might not work. You get maximum balances and withdrawal limits; the higher they are, the more you pay for
them. This is very interesting for social engineers: you can buy packs of PayPal accounts, so a hundred PayPal accounts of which 80% are guaranteed as working, for 0.3 bitcoins. If you have to do scams and everything, this is what you want to look for.

What else? Doxxing: some fairly interesting pages giving out personal information about high-profile figures. In this case there's an example with Bill and Hillary Clinton. This is the actual picture, with the address depicted there; we left it in, since it was 1600 Pennsylvania Avenue. We figured it wasn't really worth censoring, as it's the address of the White House, so it's not really that secret, I'd say.

Assassins' pages. Again, self-declared assassins, people who claim that for the convenient price of 7,500 euros they might kill someone aged 20 plus, in three continents and so on. I haven't checked, so far. Rumor says that they would mostly be scams; there's no way of verifying it. But if you had the chance of putting up a website that is untraceable, where your identity is covered, and of getting payments in a currency that isn't traceable itself, and let's say you target people who might not go to the police to complain about their 7,000 euros being stolen by a supposed hit man, well, you see how this is a really good plan to get money from fools.

On the topic of getting money from fools, meet your new Kickstarter: crowdfunding evil. This was a website called Deadpool. What they would do is keep a list of high profiles: Barack Hussein Obama; Justin Bieber, just because he's a high profile; Hugh Grant, not the actor, the other one; apparently the actor is pretty much beloved by most. They would put a bounty on them, say two million dollars, I think it was, for Obama, and say: OK, let's start a Kickstarter campaign here, so you give us
the money, and when we reach two million dollars, we promise, we're going to hire an assassin and get it done. Sure. I still see Obama around, so it looks like they didn't reach the two million dollars. Of course everything goes in bitcoins, clearly.

Moving on to some other stuff, on the data analysis topic. This is a breakdown of the URLs we collected in terms of what kinds of protocols we find in the URLs. Other than HTTP and HTTPS, which are pretty much 90% of what we've got, what's interesting is that we see plenty of IRC servers, some IRCS, and XMPP, which is basically the Jabber protocol. That's fairly interesting because these could be used either as a rendezvous point for malicious actors or as the infrastructure for botnets, C&Cs and so on. If we do an active port scan only on Tor hosts, these are the actual numbers that we find lately: 49 IRC, 31 IRCS and 800 SSH, which is fairly interesting on its own. This is an example of the IRC servers that we could find: CyberGuerrilla and Anonymous, with a channel dedicated to solidarity with jailed members of Anonymous.

In terms of languages, of course, English is the most common language we find there, Russian being a strong second. If we take those off, we see French and German following, then some Italian, Portuguese and Dutch, pretty much on par with Finnish and Japanese. It's not that many, I've got to say. Just for the record, this is done by counting the number of domains that show that language, not the number of pages. This is an example of what we find on the French side: there is a guy in a forum trying to sell some weapons. I think... yeah, it's a Colt .45 from 1911, apparently, so it's not even that shady, I'd say.

In terms of pages embedding suspicious links, this is what we can find when we look at the pages, again,
with links rated malicious. As an example, here you get an offering from a guy trying to sell some Predator Pain and [unintelligible], so some malware basically, keyloggers in particular.

In terms of email identification, as mentioned, we parse emails there, and we can run some nice statistics on top of it to find what addresses are used out there and so on. And here, interestingly, it's not really too interesting: the top one, zzz, is one of the main developers of I2P, so it's really not that shady. You've got some QQ addresses here, and there is something like "toxic poison", or a "Bank of America" one, which I gotta say is not the actual Bank of America, it's just a guy using a username. You have no idea how Bank of America can get pissed if you name them there.

So if we actually check what the other email addresses are that we find in the pages containing the "Bank of America" Mail2Tor address (again, not the bank, just a user using the name), we can find other interesting addresses here, such as "no code", for example. And I can go dig in the pages and find a posting in a [unintelligible] forum where said "no code" was selling some credit cards, some approved credit cards, and to which, later on, Mr.
Bank of America (no, not the bank) was responding, again with this email appearing right here, and a TorChat address in there, in case anyone is interested.

In terms of Bitcoin, we also try to scrape everything that looks like a Bitcoin address. We found about 1,200 Bitcoin wallets, more or less, or at least candidates. Again, we run regexes on there to find the candidates, and we try to be as precise as possible. Unfortunately, as Marco will show you, there are ways on certain pages to obfuscate Bitcoin addresses, so that while you see them in the page, in the source of the page they're in fact scrambled around.

In terms of services, we've got some money laundering services for bitcoins, again tumblers. Basically you send them money, they keep a fee, they tumble it around, make some fake transactions, maybe just jump to another currency and go from Bitcoin to Litecoin and then back, so that your money becomes hard to trace. On the other side, there are Bitcoin multipliers that claim, and again claim, that you send them one bitcoin and they give back a hundred or so. Again, that's what they claim. This one in particular really wanted to at least show its legitimacy, and it's actually showing you the successful transactions that you can trace on the blockchain to see that in fact the bitcoin got multiplied. Then they've got the list of happy customers.

Let's move now to malware, and I'll let Marco continue from here.

Okay, thanks. Hello. Okay, cool. In the rest of the talk we are focusing on malware. The cool thing is that back in the 90s, malware was mostly spread via floppy disk and was just a piece of code running on your computer. Nowadays it is very network dependent, so without the network malware is basically dead. For example, at infection time a bunch of malware now propagates using drive-by download over the web, so basically you need the web to infect a machine, as much as at propagation time, when
they move from one machine to another, or when the author of the malware wants to update its code, or when you have a dropper which downloads a second-stage malware: you need an infrastructure, so the network, to deliver the second-stage malware, for example. And also when it comes to the command and control server, so the infrastructure used basically to run a botnet, you need a network, you need a server hosting the command and control.

So why are malware authors interested in the Deep Web? Mainly because they want to make their malware infrastructure, say a botnet, very resilient against law enforcement operations, for example takedown or sinkholing. And this is perfect, because the Deep Web is very difficult to sinkhole: there is no exposed IP address, so it is very difficult to locate the server. The same goes for takedown. Another goal is to conceal the payment page. As we'll see later in the talk, there is a class of malware, ransomware for example, that at a certain point needs to cash out, so the possibility to hide those pages in the Deep Web is interesting.

Brown, five years ago at DEF CON 18, or four years ago, discussed theoretically the possibility of using the Deep Web for malware authors, but there was no proof. What we are going to show today is actually some example malware families which extensively use the Deep Web for different reasons.

The first one is Skynet, which is a malware with distributed denial-of-service capabilities, Bitcoin mining and banking capabilities. It was discovered by Claudio when he was working at Rapid7, together with some guys from G Data. The idea is that the malware comes with a Deep Web client which allows the malware to log into the Deep Web. What you can see here is that the malware itself uses this PHP script as a landing page to store the credentials that are stolen from the infected machine. So what you can see down here is
that using our system we can query for this path, and we can easily see that the first two domains reported here are the two command and control servers used by this Skynet malware, with the number of connections going to each command and control server. We can also plot the evolution of the command and control servers over time. We can see that back in May there was just one command and control server being used by this particular malware, and later on, in September, the author introduced a second command and control server operating concurrently with the first one, which is the one in green. Then another command and control server was introduced in late October.

Similar to this is the Dyre banking Trojan. This is a BHO that basically does man-in-the-middle on online banking pages. It works, when you get compromised, by hijacking the session at client side. What happens is that when the user logs into his online banking, the session which has been hijacked connects back to the attacker, kind of a reverse approach, and in this way the attacker has direct access to the online banking account of the user. This particular code uses a DGA, which is a domain generation algorithm, to generate different domains, and at IP level a fast-flux network to hide the IP infrastructure. But this is not enough, because nowadays there are a bunch of companies and vendors that have different systems to detect fast flux and DGA algorithms. So what they do is also use I2P; it's the only malware out there which uses I2P to host the command and control infrastructure. Last year, on the 17th of December, there was just this single domain being reported through VirusTotal as using I2P. Here we disclose a couple of other domains used by this Trojan, and we can see the infections over time. Nowadays the malware is actually still functional, and there was a peak over the last couple of months. So we show that malware authors are more and more, you
know, using the Deep Web as a way of setting up their infrastructure.

Another banking Trojan which is pretty interesting to mention is Vawtrak. I don't know if you attended the talk before, on steganography. What this malware does is basically hide the IP address of the command and control server in an icon file. If you download the icon file, it appears just as an icon, but the IP address is steganographically hidden in the icon file. On top of that, the web servers hosting the icon files are hosted on the Tor network, which means that the malware here uses a combination of steganography and the Tor network to make the command and control server more resilient to possible takedown.

So how can we detect this stuff? What we did is, basically, by analyzing the code, we found out that the malware author was using as a web server a specific version of OpenResty. So this condition, together with the condition that the path contains favicon.ico as the file name, and the fact that when you do a GET to the address, instead of getting a 200, which is the normal return code for HTTP, you get a 403, which means forbidden: these three conditions together we can add to our system, and the system automatically tells us whenever there are some URLs that have the three conditions matching. With a query like that I got a list of 23 command and control servers related to this piece of code, which are plotted here. This goes back to February, so we started seeing the infection in February, and more and more command and control servers were introduced in the network. As you can see, the trend is actually going up. It's not myself painting this image, but the system which gives it to me.

So now we'll talk about ransomware. Ransomware is something that we are tracking a lot because it is an emerging threat and
affects a lot of people around the globe. It's interesting to see that, in addition to the possibility of misusing the Deep Web to host command and control servers, ransomware is really interested in using the Deep Web because the Deep Web itself, for its characteristics, provides a hidden and robust framework to cash out and for illicit money transfer.

What does it mean? I guess you all know how ransomware works; if not, briefly, what happens is that your computer gets locked, all the files that you use get encrypted, and you get asked for a ransom to pay in order to get your files and your computer back. Usually what happens is that you pay and eventually they are given back to you.

The interesting thing is that when you want to track a cybercriminal, the weakest point in the chain is the human. The time when they turn the technology into real money is the time when you can get to know who is behind it, so you can do attribution. So cashing out is usually the critical part for the bad guys, because there, as I said, you can identify them. So one of their goals is to hide the whole framework for cashing out as much as possible.

What they do here is host the payment page on the Deep Web, such that it is very difficult to track the server which is hosting the page. When your computer gets infected, you have to click on a link, and the link points you to this page. In this case, I can barely see it myself, so let me see here: it's a hundred and twenty, no, it is 1,200 Australian dollars. You have to buy Bitcoin for more than a thousand dollars, and you have to pay to the wallet which is down here. So the page itself is hosted on the Deep Web, and the payment is done through Bitcoin, which makes it even more difficult. And the other interesting thing is that sometimes
what the criminals do is ask for a payment in Bitcoin and then go from Bitcoin to Litecoin, or from Litecoin to another currency, so it's very difficult to track them when they transfer from one coin to the other. The fact that they use a combination of different cryptocurrencies makes them very difficult to track.

Another interesting thing, which goes back to what Vincenzo was saying before, is that the address here is shown as a Bitcoin address, but in the code of the page it is actually shuffled, it is obfuscated, so it's very difficult to crawl this page automatically. Actually we are not able to do it this way. We found it with a system I'm going to describe later on, but we didn't find the page by searching for the Bitcoin address, because you cannot do it with a regex, with a regular expression.

What happens is that when a machine gets infected, the malware generates a user code and a user pass, a kind of one-time user code and user pass, and it uses them to log the infected machine into the payment page. This means that, again, if you are a law enforcement agency, or if you want to crawl the Deep Web, and you don't have this token, you cannot access the payment page. So that's one technique they use, a token to block the payment page from non-infected machines, which is pretty interesting. The second thing it does is provide the page in the language specific to the guy who has been infected. If you are in the Netherlands your page will be in Dutch; if you are in China it will be like this one.

So what we can do in our system is type this kind of query into the system and search for all the payment pages, and we can have an estimation of the number of infected users and where they come from. As you can see here, we have users from different countries, and most infections are for native U.S.
speakers or British, followed by Turkish, French, Chinese and Italian. This is done automatically, because we have a module, as mentioned, for language detection, so we do the language detection on the payment pages.

Okay, let me conclude with this example, as I'm taking too long here, so we can have some time for questions. The last one is NionSpy. This is a piece of code that does pretty much everything: it steals confidential information such as keystrokes, passwords or private documents, and it has a bunch of functionality that is very interesting for people who do kind of espionage, for example [unintelligible] attacks and so on, because it can record video and audio.

The idea here is that we look at whenever there is a kind of emerging trend, a kind of surging popularity, in the number of values associated with the same query string parameter. Suppose there is a new infection. What usually happens when a machine gets infected is that it communicates to the command and control server about the infection, and it does so using a GET. As you can see, let me go directly here, which is easier: in the query string, for example, this parameter comes with a blob, a JSON blob, reporting the infection. So if there are a bunch of infections in a very short amount of time, you see a spike like this one: a spike of one parameter with a bunch of new values. Using our system, we automatically get an alert whenever there is such a case. If you look here, what it can tell you is that this parameter, called [unintelligible], experienced a quick surge in popularity over the last couple of weeks, with more than 70,000 different values associated with the same parameter. So we
said, okay, wow, what's going on here? We looked at the parameter and we saw that the value itself was kind of weird, because it was a URL-encoded binary blob, and in the end it turned out that this was basically the data leaked from the infected machine. The malware uses a GET to communicate the data instead of a POST, so we can see it right in the URL string. It communicates the data, and it also communicates a new infection, together with the operating system, the machine language and other information. And we can plot it here: we can plot the traffic over the two parameters, and we can see that the number of new infections is going down, while it was higher earlier: we had one day in the middle of May with over a hundred new victims on a single day. It's the same with the traffic. So that's some of the analysis we can do with our system, and it's something that we are actually doing.

So basically the take-home here for today is that together [unintelligible] we built this system for doing the collection and analysis of the Deep Web, which is something unique that has not been done before, and we're actually using it to detect different types of cybercriminal activities, such as trading of illicit goods, as Vincenzo showed. We are tracking underground forums and marketplaces to understand what's going on. In the future we want to do some automated analysis and correlation of different emails and different blog posts, to understand which authors are talking to whom, from which blog posts. We want to correlate the Deep Web with the surface web to try to de-anonymize users in the Deep Web. And we are focusing on malware there, to trace new campaigns and new infections.

So that's pretty much what we have done, and I think we have a couple of minutes for questions here, so don't be shy: it's your turn now to contribute to the talk. Thanks a
lot for listening. I hope you enjoyed what we had today for you.
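
The surge detection on query-string parameters described in the talk can be illustrated with a small sketch. This is not the speakers' system, only a minimal example under stated assumptions: the input is a list of (date, URL) observations from a crawler, a "surge" means that a parameter has many more distinct values in the latest window than in the window before it, and the function names, thresholds and `example.onion` URLs are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from urllib.parse import urlsplit, parse_qsl


def distinct_values(observations, start, end):
    """Map each query-string parameter name to the set of distinct values
    seen in URLs observed in the half-open date range [start, end)."""
    values = defaultdict(set)
    for seen_on, url in observations:
        if start <= seen_on < end:
            query = urlsplit(url).query
            for name, value in parse_qsl(query, keep_blank_values=True):
                values[name].add(value)
    return values


def surging_parameters(observations, today, window_days=14,
                       min_distinct=3, ratio=3.0):
    """Return {parameter: distinct value count} for parameters whose number
    of distinct values in the latest window is at least `min_distinct` and
    at least `ratio` times the count in the preceding window.
    Both thresholds are illustrative assumptions, not values from the talk."""
    window = timedelta(days=window_days)
    recent = distinct_values(observations, today - window,
                             today + timedelta(days=1))
    previous = distinct_values(observations, today - 2 * window,
                               today - window)
    surging = {}
    for name, values in recent.items():
        before = len(previous.get(name, ()))
        if len(values) >= min_distinct and len(values) >= ratio * max(before, 1):
            surging[name] = len(values)
    return surging


# Hypothetical crawl observations: (date seen, URL).
observations = [
    (date(2015, 5, 1), "http://example.onion/index.php?page=1"),
    (date(2015, 5, 2), "http://example.onion/index.php?page=2"),
    (date(2015, 5, 14), "http://example.onion/gate.php?blob=%01%02"),
    (date(2015, 5, 15), "http://example.onion/gate.php?blob=%03%04"),
    (date(2015, 5, 15), "http://example.onion/gate.php?blob=%05%06"),
    (date(2015, 5, 16), "http://example.onion/index.php?page=1"),
]

print(surging_parameters(observations, today=date(2015, 5, 16), window_days=7))
# prints {'blob': 3}
```

Counting distinct values rather than raw hits is the point of the sketch: a parameter that carries per-victim data produces a new value for every infection, while an ordinary navigation parameter keeps repeating the same few values.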
Info
Channel: Black Hat
Views: 50,719
Keywords: InfoSec, BlackHat, Black Hat Europe 2015, Black Hat, Information Security
Id: OcuzaOLs7dM
Length: 55min 7sec (3307 seconds)
Published: Sat Mar 05 2016