DNS Evidence: You Don't Know What You're Missing

Captions
Hello everyone, and welcome to today's SANS webcast, "DNS Evidence: You Don't Know What You're Missing." My name is Carol, with the SANS Institute, and I will be moderating today's webcast. Today's featured speaker is Philip Hagen, DFIR strategist with Red Canary, SANS certified instructor, and course lead for FOR572. If during the webcast you have any questions for Phil, please enter them into the questions window located in the GoToWebinar interface; we will be answering them during the Q&A session at the end of the presentation. And with that, I'd like to hand the webcast over to Phil.

Hello, and welcome everyone. Thank you very much, Carol. I'm very excited to be here this afternoon, morning, evening, whatever time of day it happens to be where you are. I took a look through the attendee window, and it looks like we've got quite a lot of folks from pretty much the world over, which is pretty amazing to me. Before I jump in on the content itself, I'm very fortunate to have a very good friend and colleague, Ryan Johnson, with us. Ryan, as you can see at the bottom of this slide, is going to be teaching FOR572 in a number of upcoming locations, so it's always nice when we get to add a few friends to the mix. Thanks for joining us, Ryan; glad to have you here.

So we're going to start out today talking about DNS evidence in general. This is something that I've been using for a very long time, and I know a number of other folks in the DFIR world have as well, but it's really starting to come into its own. We're going to talk about some specific use cases, methods for collection, things like that. I also like to stress, whenever I get going with one of these webcasts, that we're not pulling these ideas out of the ether, so to speak. These are all tried-and-true methodologies that we use in cases day in and day out. Whether you're in a traditional forensic shop doing case-based evidence examinations, or you're on an IR team doing interactive incident response throughout your environment, which the marketers call "hunting," with great big air quotes around it, this is real-world material, and we'll give you some practical means of using it back in your environment.

By way of introduction, Carol did a great job with the initial intro. I've been in information security for a very long time, and it's something I will continue to do for quite some time. My background is in computer science; in the US Department of Defense I was an Air Force communications officer, and after getting out I moved over to work in the consulting and contracting space, which I did for about thirteen years. As Carol mentioned, I'm also with Red Canary, a threat detection startup; not directly related to what we're seeing in this material, but definitely a tangential security startup, seeing a lot of great traction, with a fantastic team. With SANS, I was very fortunate and very honored to be asked to be the course lead for Forensics 572, and I teach it many weeks a year, so it's definitely a good time and I really enjoy it. I'll also ask Ryan to say a couple of words and give his background.

Sure; really briefly, thanks Phil. I've been doing forensics and information security since about 2003. I started out doing dead-box forensics for police departments, working on cases from fraud to murder, drugs, and guns. When I left there, I went to a lovely location in Iraq to work with the military doing media exploitation on devices collected from combat operations. From there I moved into consulting, where I got the opportunity to work on a lot more IR; that's my primary focus at this point in time. I was also able to do some teaching for the State Department's Anti-Terrorism Assistance Program, in their cyber division. From a SANS perspective, I've been teaching 572 since last year, and I try to do that as many times as my schedule will allow.

Excellent. We're definitely fortunate to have Ryan here; he's got some good perspective we'll be adding in as we get going. As Carol mentioned, if you do have questions, please drop those into the question interface. I'll try to get to them during the presentation if they come in and we can slot them in, and we'll absolutely take a run through them toward the end of the presentation as well.

Well, let's go ahead and start off with: why DNS in the first place? One of the reasons I have always gone to DNS is that it is very fundamental; it is used by pretty much every other protocol. So even if you don't know every one of the several protocols that might be in use on your network, there's a really good chance that, at least for the TCP and UDP carrier protocols, you're talking about something that uses DNS. Very few pieces of software use hard-coded IP addresses, except in very certain edge cases, most of which, in my experience anyway, are pretty suspicious in the first place. So when you're looking for one protocol to rule them all, so to speak, DNS becomes a great way to do that: go to one location, go to one protocol, parse it well, and you're going to get really good value.

Now, as you're probably used to, when I look up sans.org or cnn.com or any other hostname, it's going to return a series of IP addresses. Makes pretty good sense, but there are a lot of other record types as well, and a lot of those actually give us very good insight into how certain services are established and how certain authorizations are put in place, whether you're talking about domain validation for things like Google Analytics, SSH fingerprint records, or raw text records that could be used for good or for various other purposes. If you're looking at this traffic, you again get that kind of pulse: one place to go for a lot of different protocols.

In addition, it's an old, old standby, and it's very well documented. Although it's somewhat convoluted in terms of how it actually lays down traffic on the wire, there are a lot of great, well-established parsers out there, so you don't have to do a whole lot of reverse engineering and exploration; you can get down to analysis pretty quickly on the network forensic side. As I said, we use it for that single analytic pulse, but it's also really easy to see normal and abnormal. If you've got an idea of what the baseline of normalcy is, and I'm going to show you some tricks you can use to establish that within your environment, tactically or strategically, it becomes very easy to see which queries are abnormal, which are out of cycle, which don't belong in the picture. That's one of the things I really like about it: because it's well documented, I can parse it with any number of tools, and it's basically going to light up like a Christmas tree if somebody is speaking out of turn. Now I'll ask Ryan to talk a little bit on passive DNS specifically for a minute.

Sure. When we start talking about passive DNS in an incident response process, what we're looking for is a normalized data set that contains historical DNS activity for our environment. We want to be able to go back and say, okay, here was a request for a specific hostname; what did that resolve to? There are a lot of reasons we want to do that in incident response, or "hunting," as we say with air quotes. One way we could deal with this, albeit impractical, is to just collect all the DNS records for every site across the internet. Of course, I'm being a little facetious there, because that's not very practical; since I grew up in Canada, I'd like to use the metric system in describing this, and we're talking about a bajillion or more rapidly changing records. It's just not something that will work. So what we do in practice is focus on the DNS activity, the queries and responses, happening in our specific environment. After all, I really don't care about sites and hosts out on the internet that are not actually being accessed from my specific environment.

There are a few different ways we can go about collecting this information. We can do things like pcap: take tools that are able to record traffic as it crosses the network, and store it for later analysis. We can collect this information on each of our endpoints and forward it in some way to a central repository for later analysis. And lastly, we can use the tools that already exist in our environment; most of the IDSes and proxy systems we would typically find would have the capability to collect this information in a searchable format. The key here, obviously, is that we want to be capturing the requests and the responses, along with some other information that we'll see in the next couple of slides. The best part about this whole thing is that it's really, really easy to collect, and the cost associated with the collection, the disk space, is low, so we have an amazing return on investment for investigative purposes. It's huge.

Yeah, absolutely. And before we go on to
the next slide, Ryan, I just wanted to circle back on something you touched on which is really important: in this case, we're talking about capturing both the queries and the responses. A question actually came in earlier about doing query logging on a particular implementation of a DNS server, and the reason we don't really talk about that so much is that, historically, query logging was just that: it captured only the queries. That's not something you want to throw away if you have it available, but if I've got the responses, that's going to give me even richer information.

I think that's a good segue into a real-world example from a practical perspective. If I do a DNS lookup of reddit.com, and I just did this at the bash shell on my OS X machine, you can see I get back a list of 11 different IP addresses. As far as why this happens and how it actually occurs, that gets a little into DNS fundamentals, but the bottom line is that we've got 11 different IP addresses being used to provide service for the reddit.com hostname. This is commonly done for load balancing, for DNS round robin, and for a number of geographically distributed content delivery networks. It's really common: run a DNS lookup against Google or CNN or any other major platform out there and you're going to get a bunch of replies back.

The challenge for us becomes: how do I pivot on this information? For this host lookup, which IP address did the client use to make a web connection? If all I'm looking at is the query, I know that it looked up reddit, but how am I going to pivot into a NetFlow log? How am I going to pivot into full packet capture? At that point I won't actually know which of these IP addresses it is. Making things even more complicated, these IP addresses are returned in effectively random order; there's no deterministic way of identifying which answer the client will use, and that's a very important distinction to make.

So what we like to do is take a look at the passive DNS records. This is an example coming out of a utility that's simply called PassiveDNS; I'll give you a link to that a little later on. These are standardized records, and you can see, where I'm circling in red, that it uses a set of separator characters, which in this case happens to be the double pipe. You can change that; this is just the default. I ran this on a collection against my home network here, and it was exactly the same lookup we showed you on the last slide: the DNS lookup for reddit.com. You can see that in this case my syslog includes a variety of data points. Now, these data points are standardized; that's the key I really want to stress. Because they're standardized, normalized records, they're really easy to parse. You can feed this into your SIEM, or drop it into the ELK (Elasticsearch, Logstash, Kibana) VM we use during the class, which I distribute outside of class as well; there are tons of different places you can use this. But the fields themselves are very standardized.

Let me clear my pen so that's out of the way, and I'll walk through one of these records and show you exactly what all these fields are. The very first one: anybody who's been through the class, or has been hanging around the forensic world long enough, knows that that is a timestamp. This is the number of seconds, and in this case subseconds, that have elapsed since January 1st, 1970 at midnight UTC, which is just the way a lot of UNIX systems keep time. In this case it's a 10-digit integer in the whole-number portion before the decimal point, and the fact that it starts with a 1-4 indicates that it's a fairly recent timestamp; that's also something we can pull out of the source code to confirm.

The next value in this log is the client's IP address. I really want to make an important point here: this is going to be the IP address as it's pulled off of layer 3 of the packet that this PassiveDNS utility sees. I should probably also state that the PassiveDNS utility is best run on an endpoint as a daemon, or it can be run against a tap or mirrored port where you observe that traffic, or you can feed it pcaps, as I'll show you in a little bit. But it is going to see the layer 3 address; that's what gets logged. So if you're NATed, you're going to see the post-NAT address; there's nothing that's going to pull out the true endpoint IP if it has been translated in any way. So we get the client IP address; certainly that's helpful. We also get the server's IP: which server provided the answer. In this case, again on my home network, the address ending in 75.1 happens to be my default gateway, a device that I've got on-prem, which also acts as my caching DNS server.

Next is the class. "IN" means it's an Internet-class query; there are a bunch of other classes that are very rare, and I'm not going to get into them here, but it's nice to know that if you did have anything that was non-Internet class, it would be logged as well. Now it gets a little more interesting: in the next field we get the actual name, exactly the name that was requested. We did a lookup previously and received information back; the www.reddit.com query went out in one packet, but you'll see that we've got multiple entries here; I'll get to that in just a minute. Each record also includes the record type: in this case we made a request for an address record, which the DNS RFC explicitly calls an A record. And then we get the response back. Remember, we had those eleven responses that came back for that one query; each different answer gets its own line. If you look over on the left-hand side of all these results, you'll see they all have the same timestamps; that's because they all came back in the same response packet. So realize that you're getting multiple entries for those multiple responses.

The last couple of values are also interesting from a forensic point of view. First, the TTL: this is how many seconds any intermediate caching server, such as my internal gateway here on the perimeter of the house, should keep this value in place. In this case it's 297 seconds, just under five minutes. The last field tells us the record count; I mentioned that there were eleven records, and again, that's reflected here, so we know we're looking for eleven records. This is extremely useful information, because not only do I get the fact that the query was made, which is what you used to get if you enabled query logging on BIND or Microsoft DNS or similar platforms, I'm getting every single response back, which is hugely valuable to me as an investigator, because a lot of times the visibility I have is limited to layer 3 and layer 4, as when you're talking about something like NetFlow. I'll give you a couple of use cases for this coming up in just a minute.

Let's talk briefly about how we're actually going to leverage this against our other types of evidence. We've got one example here, Ryan's got another one coming up in just a minute, and then we'll talk through some collection methods. What I've got here is a couple of records out of NetFlow; this is NetFlow I'm collecting on the perimeter of my house. Don't worry too much about what the exact
command line is; that goes beyond the scope of today's presentation. The thing to keep in mind is that NetFlow has no content; a flow is just a statistical aggregation about a connection that was made in the past. We're generally limited to seeing things like the source and destination IP addresses, the layer 4 protocol, the source and destination ports, and some other metadata like timing information.

You'll see here I've got two different records displayed. The first one is between an internal 192.168 IP address and an external IP address, using port 80. That external IP address is something I may want a little more information about: is it an evil command-and-control node? Is it a content delivery network? Is it sans.org? I could do a reverse lookup on it, but that's potentially an operational security risk, so it's not necessarily something I'm going to go to first; that's going to be a last resort. If I've got passive DNS available, I can do a lot better. You'll see in this case that I go into my passive DNS logs and simply grep for that external IP address, and I get back a hostname under philips.com. I may not know exactly what this is, but even if I don't know what the endpoints are, I at least know that it is communicating with philips.com, a major electronics company. Maybe it's a television; maybe it's some other device that Philips has here in my house. Well, I did a little more digging, and this is actually the Philips Hue light bulb bridge; I've got a couple of internet-addressable, internet-controlled light bulbs, just because it's an occupational hazard when you teach network forensics that you have a whole lot of weird devices on your home network, to see what they look like. So I can actually characterize the nature of that connection: I don't need the content, and I don't need Host header fields from a web proxy to characterize that port 80 traffic. I've still got some very useful visibility.

Let me go to the second example. This 108.168 IP address was my system communicating while I was connected to the house through a VPN, and we'll see here that a number of packets were exchanged but not a lot of data. That's actually a hallmark pattern of activity for a potential command-and-control check-in: lots of packets that are very small may be something worth looking into. Before I spend the time to dig in and figure out what all these communications were, let's take a look at the hostname. Again we grep that IP address out of the passive DNS logs, and this time I've got something on the disqus.com domain. A little open source research shows that disqus.com is a service provider that allows you to insert comments on webpages. Now, I like to think humanity is a pretty good thing, so I never read the comments on any website, period; it's just something I do to maintain my faith in the good of humanity. But in this case it was something going on in the background, so I'm able to characterize it. Is it malicious? Probably not, but I've at least got that visibility.

Before we go on to the next one, I see that a question came in from Christopher, specifically on means of capture. We're going to talk about that in just a minute in terms of some different ways of capturing this, but you specifically asked whether Security Onion and ELSA can do this. There are probably 752 different ways you could capture this, so yes, you can absolutely do it with Security Onion; you can do it with Bro; you can do it with tshark; I use ELK with Logstash, as I mentioned. There are a lot of valuable places you can get it; we're just going to focus on a couple of them for the sake of the webcast. And now I'll throw it to Ryan for another example.

Sure, thanks. In this example we're getting our toes into the water around what I refer to as the first boogeyman of network forensics, and that is encrypted traffic. In this particular NetFlow record we see the source IP 23.246.5.135 communicating across port 443 to one of our local internal IP addresses, and we see there are a lot of packets and a lot of data. As we talk about in class, NetFlow is an abstraction; as Phil already mentioned, it essentially gives us metadata about the traffic that's going across the network. So looking at this, we can't tell what's going on inside the traffic; there's no visibility into the content with NetFlow, and we'd face the same issue with encrypted traffic. We hear quite frequently during class, "oh, it's encrypted traffic, just ignore it, there's no benefit, we can't see inside of it." I emphatically disagree, and we will disagree at length over that. Leaving aside the fact that it may not actually be encrypted traffic going across port 443, we can use the same tactics Phil used in the previous slide to try to get an understanding of what this is. This is a lot of data coming from an internet host down to our local network, so we want to have a look and see if we can discern anything. If we grep our passive DNS logs for that IP address, the 23.246 one, we come back with a single record whose second-level domain is a Netflix video delivery domain. Gathering open source intel on this traffic, we see that it is affiliated with Netflix, and this is actually streaming content coming down. We may not be able to see what's inside of it, and in this case it really is encrypted traffic, but this certainly gives us the ability to apply the sniff test, to get a feeling for whether or not this is something to be concerned about. One of the most important things we talk about, and something that has served me very well in doing incident response, is being able to understand what looks, as I call it, "funky" and what's not funky, and this gives us that sort of transparency.

Yeah, absolutely, and that brings up another good point. The reason we decided to use Netflix as this example is that Netflix, along with a lot of other flexible, elastic service providers, will spin up hundreds or thousands of systems in a very short period of time to handle surges in demand, and then spin them back down. So, for example, you might see connections out to an IP address that happens to just be an Amazon EC2 node, with no way to tell exactly what it is; a reverse lookup of the IP address is not necessarily going to get you back to anything related to the service provider, it's just going to tell you who the hosting provider is. Being able to find out that this client connected to that remote 23.246 IP address by way of this hostname is another good example of why that transparency is important.

We've also touched a couple of times now on the use of DNS for intelligence and baselining. I'll be really honest here: you look at the term "hunting," with air quotes, as we've said a few times so far, and it's really just a mature IR program. It means you are actively out there
looking for compromises that you didn't necessarily know existed. Some people call this the assumption-of-breach model; some people call it hunt teams. Whatever you want to call it, we're really talking about what metrics you already have in your environment that you can start tracking in order to establish these baselines of normalcy, so you can quickly and easily identify the deviations. I think DNS is a fantastic example of this. The little bit of shell gymnastics I've got listed on the right-hand side is just a way to go through all of my passive DNS entries, look at the top- and second-level domains for each query, and give me a histogram: a sorted list of the most common second-level domains seen in my environment.

So here's a challenge for the environment where you're sitting or working today: if you can't right now tell me the 5,000 or 10,000 most commonly used second-level domains in your environment, then you absolutely need to go figure out a way to start collecting that after this webcast is over, because that's going to help you understand when something changes. And when it changes, you've got a reason to spring into action; that becomes the reason you start investigating. The very first time you see a second-level domain, you'd better wonder why, and you'd better have the team, process, and technology available to figure out why it showed up new, because there's a very good chance it could be used for malicious purposes. Even more suspicious is a domain that was registered just yesterday showing up in your environment today; that's something you need to know, and you can't do that unless you have a list of domains to start with. So when we talk about hunting, or assumption of breach, or these proactive, continuous IR engagements, we're talking about a late stage in the maturity process. It's something your mature IR team is going to do, but you've got to start collecting now, because if you're not collecting a baseline, you're going to be in trouble.

The thing is, this bit of shell gymnastics on the right-hand side is something I actually ran on one of my servers. I run a small fleet of VPSes, mainly just to get some visibility into real-world traffic and patterns out there on the internet. And the first thing I saw was: holy cow, this is weird; I've got a significant number of in-addr.arpa lookups. If you're not familiar with it, no problem: in-addr.arpa is what's used for reverse IP lookups. Any time I do a lookup of an IP address and get an answer back, the resolver is actually converting that, before it goes on the wire, into an in-addr.arpa lookup. So when we see a lot of those, it means the host is doing a lot of lookups of IP addresses. My first thought was: why is this happening? It looked unusual to me because, I'll be honest, I hadn't looked in a while; it's something I collect in the background, and I hadn't had the need to really dig in on this evidence just yet. I looked further and realized that one of the ways I'm doing spam prevention on that server is through greylisting and IP reputation checks. It does this by looking up the IP address, which ends up being an in-addr.arpa query, against a variety of online reputation databases. At that point I said: okay, this is not only normal, it's good; it's preferred. Now I've got a new, updated idea of normal for my baseline. Tracing back why it surprised me, it turns out I had enabled a new reputation check a couple of days prior to when I pulled this information.
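The "shell gymnastics" referred to here aren't legible in the captions, so the following is one plausible reconstruction rather than the exact slide command. It assumes passive DNS logs in the double-pipe PassiveDNS format shown earlier, with the query name in the fifth field; the sample records and the /tmp/pdns.log path are invented stand-ins.

```shell
# Build a tiny stand-in passive DNS log (all values are fabricated).
cat > /tmp/pdns.log <<'EOF'
1425741919||192.0.2.10||192.0.2.1||IN||www.example.com.||A||203.0.113.1||60||1
1425741920||192.0.2.10||192.0.2.1||IN||mail.example.com.||A||203.0.113.2||60||1
1425741921||192.0.2.11||192.0.2.1||IN||cdn.example.net.||A||203.0.113.3||60||1
EOF

# Pull the query name (field 5), reduce each name to its second-level
# domain, then produce a frequency-sorted histogram, most common first.
awk -F'\\|\\|' '{print $5}' /tmp/pdns.log \
  | awk -F. '{n = ($NF == "") ? NF - 1 : NF; print $(n-1) "." $n}' \
  | sort | uniq -c | sort -rn
```

On the sample log this prints a count of 2 for example.com and 1 for example.net. Note that the second-level reduction here is naive: names under country-code registries such as .co.uk would need additional handling.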
so it was still showing up as unusual, not something I'd seen before. But once again, seeing what that normal behavior is, seeing what the typical layout and ratio between all of these different domain names is going to be, is a great basis for hunting, profiling, and baselining in your environment. Well, we're going to pivot real quick into how to capture. Now, as I mentioned before getting to this slide, we're specifically not talking about query logs; we're talking about true passive DNS monitoring. You can do this a variety of ways: full packet capture, dedicated utilities, third-party services, and we're going to touch on each of those. But I want to clarify again: we're not talking about enabling query logging in BIND, or turning it on through the registry hacks in Microsoft DNS, or anything like that, because none of those are going to give you the responses, and responses are really the paydirt we're after. Well, it wasn't too long ago, going back on the slide here, that our main means of collecting this evidence was full packet capture. If you want to see what your DNS traffic is, you've got to run a command like the one on screen here, using a capture utility like tcpdump, and we've got a couple of what I'd consider practical collection options. I'm using a snapshot length of 0, which means capture the entire packet; of course, I'm assuming I'm legally permitted to do so. The uppercase -G tells tcpdump to rotate the output files, in this case after every one day; that's the number of seconds in a day, 86,400. And instead of displaying the content to the screen, I'm going to write it out to dns.pcap. After a day goes by and this rotation kicks in, we're going to see that dns.pcap gains siblings: dns.pcap1, dns.pcap2, dns.pcap3, etc. Now, at the very end here I'm using what's called a Berkeley Packet Filter specification, or BPF. The BPF in this case says I'm only going to put traffic into this output file that is using port 53. Notice we're including both TCP and UDP, because if you didn't realize it, DNS traffic, although it typically rides UDP, can just as well ride TCP. So if you've got a DNS visibility solution that does not include TCP, you're definitely going to miss the boat; there's going to be some scary stuff happening on TCP 53 that you want to be aware of, and I've definitely seen it: hackers in real-world compromise situations have evaded detection through the use of TCP 53. Well, this is all good and well, but if you know anything about full packet capture and parsing pcaps, or trying to bring them up in Wireshark or some other utility, it's going to require some post-processing, because I'm not going to get what I showed you previously, which is that nice, standard, normalized output log containing all those various fields we found to be so interesting. So, post-processing: not bad if we have to do it, we can standardize it, but it does mean we've got an extra step. I would actually recommend doing this proactively; I think that's the direction we'd want to go, and we'll show you how you can do that in just a minute. Any questions here? All right, we're in good shape. So the post-processing can come in a couple of different forms. The first example we've got up at the top here uses what I consider to be my favorite packet-handling utility, which is tshark. If you're not familiar with it, tshark is Wireshark that just runs in a shell. I like to say that if it doesn't run in a shell, you don't need it anyway, and I do stand by that; I think all the real forensicating is done in the shell. What we've got here is a tshark command that reads that pcap file in, and what we're doing is writing out
individual fields. Now, you can check the man page later on to figure out what all these parameters are; the bottom line is, instead of outputting a whole bunch of information about each individual packet, I'm just pulling out some specific values. I'm saying I only want to look at the response packets, and for each response that comes in, I want to print a few fields. The first field is the time value; it's not as machine-parsable as what we saw with passive DNS, which we'll get back to later on, but it is present, so it's helpful, and it tells us the UTC time when the packet was observed. Next we've got the IP addresses, destination first and then source, and the reason we include them in that order is that the destination is the one receiving the response. Again, we've got this -Y up at the top, and that capital-Y parameter gives a display filter selecting just the response packets coming back; that's why you've got destination first and then source. So here's the system that received the reply, and here's the system that sent that response in the first place. Then in this case we're looking specifically at the response name, which is going to be the query that was made, and then we see the dns.a field. Now, this is where it gets a little kludgy, so just bear with me: in this case I've explicitly stated to show just the DNS A records, instead of saying show me the CNAME (the alias), show me the MX (the mail server), show me the NS (the authoritative name server), show me the TXT records, or any of these other things that may have come by, all of which would have been included in the previous logs I showed you. In this case I'd have to run one tshark command for every single query type. Not exactly manageable; it's scriptable, something I could easily do if I had to, but I would prefer to go as directly as possible to something that's really easy for me to handle, really easy to parse, and in that case we're going to dip over to the PassiveDNS utility. I've got the URL listed down here; to make it easy, it's for572.com/passivedns, and that's going to forward you off to a GitHub page for a user named gamelinux; it's just the handle that he or she goes by. I love this utility because it's very small, very lightweight, and super easy to compile; I think it takes me about three seconds to compile the whole darn thing on my systems, and it provides the output that I showed you previously. It will, as I've shown in this case, read from a pcap file, so those pcap files we created with tcpdump before, or it can read from live network interfaces. Here in my home, I mentioned the flow collection and things like that; I'm also collecting passive DNS in real time and feeding it directly into my ELK VM, or I can archive up a couple hundred million of these responses and query across them very, very quickly. So I really recommend that's the direction you head, and I think that whether you're using this PassiveDNS tool or, as was mentioned previously, something like Security Onion (I touched on Bro), or whether you're pulling this off of a proxy server like Ryan had mentioned, if the proxy server has the ability to pull passive DNS info, or maybe you're even running this PassiveDNS utility on a tap or on an endpoint, there are many, many possibilities, the deployment of which goes beyond the scope of this webcast. But you get the idea: a standardized, normalized output is exactly what we're looking for. Let's see, before I go any further, I saw that a question had come in, and I want to hit it here: would you want to capture packet size as well, to look for anomalous traffic like hidden channels? You know what, absolutely; that is definitely a visibility gap you would have with passive DNS through this utility
right here, because it's only going to show you these fields. But again, I've got to come back to the fact that I probably have something like a 500-million-to-one ratio between normal DNS traffic and suspicious DNS traffic. I'm going to want at least some kind of lead first, and if it is an oversized packet, I'm probably more interested in the hostname itself, which is then going to drive me to do focused collection. There's always that "what if"; that's a tough one to answer, but I'd say it's usually going to be normal DNS traffic, and this is something we can use very effectively just through these standard records. So when it comes to collecting, we also have another option, which is commercial services. Now of course, SANS is not commercial training, we're very vendor-agnostic in everything we provide, but I threw a couple of options up here that I think are at least worth looking into, because I know a lot of folks who work at these organizations, I've used some of them in practice, and I've got a really good feeling about where this industry is going as a whole. All that being said, I could make this slide look like an F1 racer's fire suit or a NASCAR driver's fire suit with logos; we're not talking about these three in particular as the only ones out there, there are dozens, so definitely do your homework before you decide on a commercial provider. But the benefit you get is that a lot of them provide their information via APIs, meaning I can programmatically ask them: at some point in the past, what IP address did you have for the sans.org hostname? They'll go back into their databases, across the entire data collection, sometimes gathered from all of their clients, and give you an answer, which is really helpful because I can go back in time without needing to have had a collector out there proactively. That said, of course, we certainly profess the need for continuous monitoring and continuous visibility to support continuous IR actions, so we'd certainly want you to also consider a service that allows you to place your own sensor in the environment you control, which can then archive those records in a queryable form off-site on their platforms as well. Some of the intelligence these folks add is valuable too; the one I'll say I have specific experience with, with regard to the work I do at Red Canary, is the Farsight feed called Newly Observed Domains. We've had fantastic results with it: it basically amounted to, any time we see an incident involving a domain that's less than a certain age, a certain number of days, there's something like a 95% chance, in our overall visibility stack, that it is at least suspicious if not overtly malicious. So that gives us really good visibility, and you get an idea of the type of intelligence you can get, especially when you're talking about someone like OpenDNS or Farsight, any of these large providers that are going to have literally billions of records. I was talking with Andrew Hay, and I can't remember the exact number of queries he saw when he used to work at OpenDNS, but it was something in the billions per day that they would observe, which is a pretty amazing concept; you think you've got big data issues, this certainly takes that to a whole new level. And then the last thing I'll hit before we move on to the next slide, in terms of use cases: a question had just come in asking whether PassiveDNS scales to large networks, specifically 10 gigabit plus. You know, I haven't tested it at that level, so I'm not going to say definitively yes or no, but that probably is going to be stretching its limits. That being said, if you're running PassiveDNS, the specific utility I talked about, against captured data you've written to pcap, it may
be able to keep up; it may not manage in real time, but you'd certainly be able to pull that information out. When we're talking about a typical forensic environment, complete, comprehensive parsing is treasured far more than immediacy; we'd rather have accurate information to ensure our findings are correct. But if you are looking at doing 10 gigabit plus, you're probably going to be looking at a somewhat bigger solution. There are platforms that handle that, and they work fine given the right hardware; it's just not something I've had the opportunity to test out. So let's give a quick use case, and I know Ryan's got one coming up as well; I think we're doing really well on time. One of the things you may come into contact with is this notion of a fast flux architecture, and if you've run into it before, you're probably already wincing in pain, because you know what an absolute pain in the butt this is to investigate and try to clarify. If you haven't seen it before, fast flux is really nothing more than a clever way of saying malicious load balancing, in the pursuit of concealing the bad guys' platforms. What we've got over on the left I've labeled bulletproof web hosting, which basically means hosting that the bad guys pay for, bought from providers that aren't really going to take them offline if they do bad stuff; it's kind of like hosting resellers that don't care about the acceptable use policy. Well, even though they're not going to get booted offline, and of course they're probably not scared of going to jail in the first place, they still want to keep that bulletproof web hosting as long as they can; it costs a lot, in the various currencies you might use in some of these countries where cybercrime runs rampant, in some of the former Eastern Bloc nations and, to be honest, globally. So what they're going to do is lease a botnet. These compromised hosts I've got down the middle here are part of a botnet; the bad guy is going to lease these, probably paying with a stolen credit card or a fraction of a Bitcoin or something else, and they're basically leasing time in order to forward traffic. It's just a little tiny piece of code that runs on these compromised hosts to forward traffic, so it's really acting as a NAT device. What I want you to do is consider what happens on the query side. Our little client down here at the very bottom, your laptop, makes a query for evil.org, and when that query comes back, it says: evil.org exists at this 101.202 IP address. So the client says, sure, no problem, forms a request, and goes out to 101.202. Well, of course, this little piece of evil running on that compromised host is going to NAT the connection back to the bulletproof web hosting, receive a response, and broker it back to the client. Now I want you to think about what you and I will see in this situation. Our visibility is where I'm striking this red line here: all you and I see is a single query and a single response that appears to go out to the 101.202 IP address. Now maybe we identify this; you and I look at it and say, aha, that's bad, let's block it, let's go to the firewall team, get rid of it, let's block that traffic. So we block the IP address. Well, what's actually going to happen, and this is all automatic, is that the bad guy changes the IP address responsible for evil.org over to this 75.19 address, which means that the next time our little victimized client at the bottom needs to communicate outward and establish command-and-control connectivity, it does a DNS lookup for evil.org, receives the response, but gets the 75.19 address now. At this point the client reaches out to 75.19, and if you and I have visibility on the second red line I'm striking across, the query-response pair, then we're going to think we've found a new command-and-control server. What's happening, of course, is that it's still being brokered back to the bulletproof web hosting; the same server is still responding, the same server is still doing everything. It's just that you and I think these are two different systems. Well, you get the picture: this is going to scale linearly, and although I only show three compromised hosts in the botnet, that's really going to be hundreds if not more, and it's going to change very, very fast. So the challenge for us is: man, if I could only get an idea of everything this evil.org resolved to, I could actually go back through my evidence and find where all of the different connectivity was. Well, sure enough, as you might expect, passive DNS records are going to help us out there. If I'm looking at the passive DNS records associated with the activity I just explained on this notional fast flux architecture (don't say that five times fast on a webcast, everybody), they're going to call it out in a hurry. You can clearly see, yes, we've got evil.org reflected in all three of these entries, but I've also got all of the IP addresses that were used for all of those connections showing up right here in all their glory. We can now go back and look at something like NetFlow to find out where they connected; maybe it went through an HTTP proxy, so we may be able to look for evidence of those artifacts in HTTP if it was reaching out via web connectivity. We've got time values, and we've got the ability to scope in the clients, so we've got an idea of how many clients, which ones, and where they might be or were engaging in this type of fast flux architecture. Altogether, that gives you really good visibility. Now, a question just came in, a question from Bill; I wanted to acknowledge
that I've got that from you, Bill; I want to answer that one a little bit later, so just give me a minute, we'll absolutely come back to it, because I want to give Ryan a chance to hit this next example. Off to you, sir. Thank you. So an additional use case we have, when it comes to command and control, is what we term phased C2. This is a situation, and we've seen this quite frequently, where an attacker sets up a command-and-control server, or a fleet of servers, and then decides to go off and operate on another attack campaign; while that original campaign is not being actively used, they'll change their DNS records to either point to a non-routable address, like a 10-dot address or something of that sort, or they'll actually set it to the loopback, which is what we see in the passive DNS example on the screen. So if we were looking at a local system's DNS cache, or a typically configured DNS server's cache, all we would likely see is the last entry, which says that evil.org resolved back to 127.0.0.1. That's odd enough on its own; unless we're responsible for it, we don't typically see external DNS requests resolving back to loopback. But that alone doesn't get us very far. Looking at the rest of this, we see a TTL of 18; typically, and again this comes with a fair amount of "it depends," we see TTLs in the 300 range, the five-minute range. So resolving to loopback, seeing a low TTL, seeing a single record returned for evil.org: those are all things that are suspicious. Now, if we have this DNS information historically, we can look back and see that not too long ago evil.org actually resolved to that 75.19 IP address, and that, in context, would be or should be something that sounds the alarm that you've got something weird going on, and it's definitely something we'd want to look into. My typical next step would be to find out who else tried to resolve evil.org, what the IP addresses were, and then look through every record I could find to figure out who else was also communicating with those IP addresses. So we've got a fair amount of visibility this can provide, just by having the historic information online. Absolutely, and that's a great point too: having it historically. How long can you keep this? It's pretty cheap; obviously we're not talking about bargain-basement storage from Fry's necessarily, but you can keep this stuff for a very long time, because it compresses extremely well and isn't really that large to begin with. So, a very, very good use case. Before I move to the next slide, I did see a question come in. Zack, you asked for the link again; I went ahead and pasted it into the chat. If you need it verbally, it's for572.com/passivedns, and that will get you where you need to go, but I've put it into the chat window for you as well. So those are a couple of use cases. This really just scratches the surface; this is something that is finally, I'm pleased to say, coming into its own in terms of being used in proactive and even after-the-fact IR engagements, whether you want to call that hunting or just robust IR. In anything we're doing in the investigative sphere, we're always looking to improve our game, and this is a great way to do it. I've been looking at this for years, and I love seeing a lot of these new utilities, this kind of disruption in the market, all these great new platforms and solutions coming out. But as I said, this is one tiny little sliver of the puzzle, a very important one I think. We've got a number of courses coming up, and we talked
about everything from NetFlow to full packet capture to log aggregation; we go over the ELK VM, which you can use for both NetFlow and log aggregation itself. We really have a lot of great material, and I'll also say that of all our classes, this one is about 50% hands-on; our workbook is 300 pages of hands-on work with real evidence, actual skill development. We don't just talk about it, we put it into action, and you're going to leave this class with real practical knowledge to start applying the day you get back to work. You could absolutely make my day: I had a student a couple of classes ago who, I think on the second or third day of class, said, "I've got a case," VPN'd back to the office, and started using this stuff right away; we've actually had folks using it before the class was even over, which is just great. As you see on the right-hand side here, we've got a certification from GIAC as well; that's something you can pick up to get the acronym after your name and show that you've got the chops in this particular arena of network forensics. But we've got a lot of other class runnings as well: I've got mine listed here, and Ryan's got some coming up. If you've got a chance to get to Washington, DC, that's actually going to be the best time of year to go to DC, to be honest; the weather's going to be great, but the schools aren't out yet, so it's not going to be crawling with tourists. It's definitely a good one to check out; plus it's a summit, so you've got the chance to sit in on some fantastic talks around the training itself. Definitely one I'd recommend, but there are probably about fifteen to twenty more events this year, so we're definitely going to be someplace near you. We've got events going on in Europe, we've got events in APAC, Australia, Singapore, you name it, as well as across the U.S.
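To make the earlier fast flux use case concrete, here is a small, hypothetical sketch that groups passive DNS answers by query name and flags names that resolved to many distinct addresses with short TTLs. The "||"-delimited field order is assumed to follow the PassiveDNS tool's log layout, and the thresholds are purely illustrative, not anything shown in the webcast:

```python
from collections import defaultdict

def flag_fast_flux(lines, min_answers=3, max_ttl=300):
    """Flag query names whose A records show fast-flux-like churn:
    at least `min_answers` distinct answers, all with TTLs <= `max_ttl`.
    Assumed record layout (one per line, '||'-delimited):
    timestamp||client||server||class||query||rrtype||answer||ttl||count"""
    answers = defaultdict(set)   # query name -> distinct A-record answers
    ttls = defaultdict(list)     # query name -> observed TTLs
    for line in lines:
        fields = line.strip().split("||")
        if len(fields) < 8:
            continue             # skip malformed records
        _, _, _, _, query, rrtype, answer, ttl = fields[:8]
        if rrtype != "A":
            continue
        answers[query].add(answer)
        ttls[query].append(int(ttl))
    return sorted(
        q for q, ips in answers.items()
        if len(ips) >= min_answers and max(ttls[q]) <= max_ttl
    )

# Notional records echoing the evil.org walkthrough (addresses invented):
records = [
    "1464800000.1||10.0.0.5||10.0.0.53||IN||evil.org.||A||101.0.0.202||60||1",
    "1464800300.2||10.0.0.5||10.0.0.53||IN||evil.org.||A||75.0.0.19||60||1",
    "1464800600.3||10.0.0.5||10.0.0.53||IN||evil.org.||A||203.0.113.8||60||1",
    "1464800000.4||10.0.0.5||10.0.0.53||IN||www.sans.org.||A||66.35.59.202||3600||9",
]
print(flag_fast_flux(records))   # -> ['evil.org.']
```

The same grouping also answers the investigator's pivot question from the use case: for each flagged name, `answers[query]` is exactly the set of IP addresses to chase through NetFlow or proxy logs.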
So a couple of questions have come in, and I want to take a chance to answer them. I realize we're about six minutes from the end, which is fine. Oop, you know what, somebody just caught something; thank you, Zack. I'm going to change this on the fly, because editing slides while you're in the middle of a webcast is a great idea. Zack, you get 17 imaginary points; I don't know if they're good at your local retailer. It's for572.com/course, my apologies for that, so definitely check out the full list of what we've got going on. But I want to answer some of these questions; if you need to duck out, we're going to keep recording until all the questions are answered, so you can always come back and hear them later, and I'll stick around for a couple of minutes after. I always book some extra time, because this kind of stuff generates great questions. The first question I'll mention is a tcpdump question: should tcpdump be running on the servers, on the DNS servers, and if your servers are Windows, can Windows keep up with it? You know, I'd actually say a better deployment mechanism would be to put a tap in front of your DNS server and then take the monitor port of the tap and feed it into your passive DNS collection solution. Now, I run PassiveDNS itself on my endpoints that don't share a DNS infrastructure, so I know that will work to a reasonable level; but on the Windows side, even if you're just talking about your own server, a lot of admins are going to get pretty twitchy about running yet another utility on it. So that's one of the ways I mitigate the risk that comes along with that; whether or not it's a real risk, it's perceived. I'll usually put a tap in front of the DNS server and do it that way, but even still, if you're collecting live with PassiveDNS, you don't even need the storage to do the full packet capture necessarily, and I think that's really the most bang for the buck. Did you have any other thoughts on that one, Ryan? The only thing I would say is, when you're talking about capturing this via pcap, the Berkeley Packet Filters are your friend here. If you wanted to put tcpdump in place for the purpose of capturing this information, restricting it down to port 53, both TCP and UDP, you'd make that capture as small as humanly possible; so it's certainly an option and a reasonable way to go. Yeah, absolutely. So, let's see, a couple of other questions have come in from Dallas; I'll hit these in order for you, Dallas. On the question about NXDOMAIN entries: with PassiveDNS, and with a number of these other utilities as well, you can explicitly ask that it log NXDOMAINs, and you're very right, that's a great way to potentially spot things going on, because if you've got a system with, say, 20% of its requests coming back as NXDOMAIN, maybe you're looking at a domain generation algorithm; there are a lot of possible angles on why that might be bad. So I'd say, evaluate your utilities before you operationalize them, of course, and in doing so I think NXDOMAINs are a great record type to make sure you can capture. And then, where's my cursor, Dallas, you had also asked about recommended free tools to parse Windows DNS logs. You know, I would just say roll your own, but with a big caveat: remember, Windows DNS logs are query logs, not query-and-response logs, so you're not going to get the full complement of passive DNS data. I'd say it's much better to go with something truly passive instead of relying on, and hoping, that Windows is going to log that for you. But in terms of actually doing the parsing, I've done that for both BIND and Windows before; we have BIND query log parsing built into the ELK VM that I've mentioned we use for class, and there are a couple of other parsers out there as well. Your mileage may vary; I'd just say you'd be much better off going to a true passive DNS solution. More questions; yeah, here's a good one, and Ryan, you've got some thoughts on this, especially the second half. Kevin's asking for thoughts on using SiLK, and whether there's any recommendation for open-source intel. On the first part, which I can answer directly: SiLK, if you're not familiar with it, is a very large-scale NetFlow collection architecture. It is open source, it is free; I know a lot of ISPs and folks inside the DoD and the US government like to use it, and it's very well designed software. But it's not necessarily going to get you the visibility we've talked about here; it's reliant on flow alone. So if you have the time to invest in deploying it and building and maintaining the architecture, I think it's very worthwhile, and I think it can rival a lot of commercial NetFlow solutions, but it isn't going to give you passive DNS visibility. And then on the open-source intel side, I'd actually throw this to Ryan. My first thought is always to go check out what has most recently been known as Emerging Threats; I find their IP lists to be pretty good, but I don't know if you had any thoughts on intel? Sure, I think Emerging Threats is great; you're going to get a lot of false positives out of it, but the way that I do this, and really my process, is to reduce my scope as much as humanly possible. So I'll take massive quantities of passive DNS data and compare them to something like the Alexa top million websites and get a data reduction, so that my scope is a little cleaner, a
little finer, and then I use that as a methodology to figure out what needs to be investigated. And then Farsight is a great thing to look at for newly registered domains. So I think, yeah, that's the process I'd go with; there are a lot of varieties of open-source intel, Emerging Threats being probably one of the best. Yeah, so delaford also posted a comment recommending checking out DShield; that's another good one, thanks for that. On that thought process, I know the SANS ISC, the Internet Storm Center, has a number as well. And, not to make it all about classes, but one of our newest classes in the forensics curriculum is FOR578; it's all about using and creating threat intel. I haven't had the opportunity to take that class yet, but I'm really excited about it, because I know there's a lot of very useful material coming out of it. Thank you very much, Ryan. So let's see, a question came in about the ELK VM. Yeah, I should have thrown that in the slides; I tried to paste the link in, but it didn't actually work for me. The ELK VM is downloadable, it is free, and it's something we do distribute outside of class. I'll send out a tweet with it as soon as I'm offline here and my fan calms down, but there is a link in the chat: it's for572.com/logstash-readme. My chats have been a little unhappy, but yeah, logstash-readme is the one that's going to get you where you need to go, and like I said, I'll put that into a tweet. It's something I recommend you check out; as a little sneak preview, in the next probably two to three weeks we're going to have a brand new version of that VM, so that doesn't mean you shouldn't check out what we have now, but do stay in touch and stay aware of when the new VM is released. And then the last question I have in here, unless anybody asks others, is: is authoritative DNS any better? Well, it depends on whether you're talking about authoritative DNS or something like DNSSEC. Authoritative DNS is just the NS record, the DNS server explicitly permitted to answer on behalf of a certain subdomain or domain; so, for example, the NS records that are authoritative for the sans.org domain. It's just part of the DNS infrastructure. Now, in terms of DNSSEC, I haven't seen DNSSEC catch on; it would be authenticated, which is going to improve the ability to trust DNS, because right now you're trusting a plaintext protocol over UDP, which is about 17 different kinds of odd and concerning. So authenticated DNS of any kind is going to be a little more of what I'd consider trustworthy; I wouldn't say secure, but a little more trustworthy. But that's a whole different area, and I don't think we're going to see the tide turn on that for quite some time. So, Manuel, to your question: it is nominally better, but I couldn't really quantify exactly how much better. Any other questions at all? I get a little indicator if people are typing, and I don't think I'm seeing any. Well, before we sign off, Ryan, is there anything else you wanted to add? We talked about the summit and the Boston and nearby events coming up. Well, I know we're a little past time, but if anybody is coming to or in the DC area May 19th through the 24th, I'd love to see you at the summit. Likewise, I'll be headed back to Boston in early August for one of my favorite events, the SANS Boston event, so if you're in the area, by all means come on by and say hello; that would be great. Excellent. Well, thanks everybody for sticking around those couple of extra minutes. Our contact information is up here; please feel free to drop us a note if you have additional questions. The recording for this will be posted and available to you in probably, I believe, a day or so, maybe later today; it just depends on the backlog they have at the shop. I appreciate your time, and thanks very much, SANS, for putting this on. I'll hand it back to Carol to play us off. All right, thank you so much, Phil and Ryan, for your great presentation, which helps bring this content to the SANS community and our audience; we greatly appreciate you listening in. For a schedule of all upcoming and archived SANS webcasts, including this one, you can visit sans.org/webcasts. Until next time, take care, and we hope to have you back again for the next SANS webcast. Thanks very much, everybody; appreciate it.
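As a postscript to the NXDOMAIN question from the Q&A above: the "20% of requests coming back as NXDOMAIN might be a DGA" heuristic is easy to sketch. This is a hypothetical illustration; the (client, rcode) event pairs and the threshold are invented for the example, not a real log schema:

```python
from collections import Counter

def nxdomain_ratios(events, min_queries=20):
    """events: iterable of (client_ip, rcode) pairs, e.g. ('10.0.0.5', 'NXDOMAIN').
    Returns per-client failure ratios for clients with enough queries to judge."""
    total, failed = Counter(), Counter()
    for client, rcode in events:
        total[client] += 1
        if rcode == "NXDOMAIN":
            failed[client] += 1
    return {c: failed[c] / n for c, n in total.items() if n >= min_queries}

# Notional traffic: one client failing 15 of 25 lookups, one clean client.
events = ([("10.0.0.5", "NXDOMAIN")] * 15 + [("10.0.0.5", "NOERROR")] * 10
          + [("10.0.0.9", "NOERROR")] * 30)
ratios = nxdomain_ratios(events)
print({c: round(r, 2) for c, r in ratios.items()})   # -> {'10.0.0.5': 0.6, '10.0.0.9': 0.0}
```

A client sitting at a 60% failure ratio, as in the sketch, is exactly the kind of outlier the speakers suggest investigating for domain generation algorithm activity; as they note, this only works if your collection tool is configured to log NXDOMAIN responses at all.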
Info
Channel: SANS Digital Forensics and Incident Response
Views: 8,976
Keywords: Network Forensics, Digital Forensics, Computer Forensics, Incident Response, DNS Forensics
Id: mZrNLZAdTTA
Length: 66min 24sec (3984 seconds)
Published: Wed Jun 01 2016