Building Security Dashboards on ELK Stack/Elasticsearch to supercharge your SIEM

Captions
Okay, so I have almost 80 slides here and 45 minutes; that's about 30 seconds per slide, so I'm going to move pretty quickly. If we have questions I'll try to field them, and I encourage that, but we're going to try to keep moving. How many people here have heard of Elasticsearch? Okay, wow, impressive. And how many are using it for security purposes? Okay, a few.

So, basically starting from scratch: if you're faced with having to build a log analyzer or an indicator-of-compromise capability, how do you approach that problem in a large organization? We're going to walk through some of the precursors, and then get into the exercise itself; in this case we've selected Elasticsearch, so we'll get into what you do, how you deploy it, how you start getting logs in, and how you start getting reports out. We're going to cover the need for security visibility, where to start, selecting the platform, deployment of Elasticsearch, working with data, and building reports.

The need: how do I enable my security team with the right information to quickly identify potential threats, correlate, categorize, and act? That really is the essence of what you're trying to do in a security organization, and the right information is key: being able to quickly identify what you're looking for in the haystack. In fact I equate it not so much with looking for a needle in a haystack as looking for a needle in a stack of needles; that's more realistic. And correlation and categorization matter because data is dirty; a single data point doesn't tell you anything (ask a statistician); you need a number of data points to correlate and give you context for what you're trying to do.

So what do we have to work with? Most intelligent, managed devices generate log data, as we've been talking about. We're a managed service provider, and when we take on a client we have to make sure the core infrastructure is managed, or intelligent, and by that I mean it has the ability to generate logs, give us information, and let us interact with it. These are all the sources of log information: firewalls, switches, Windows event logs, web server logs, SQL, operating systems; you can read the list. There's a plethora of things in your environment generating information that can be useful for your security program.

The challenge is that most people buy these fancy machines but don't use that data. Either they're not pulling it in and analyzing it, or they are pulling it in and storing it but never looking at it; they just have a lot of data they're paying for, sitting on disk. Log data can be a treasure trove of information and give visibility into almost anything that could be service-affecting. So the solution: listen to your environment, use the log information your systems are constantly generating, and selectively make it available to your team in a meaningful way. There was a great question earlier about exactly that, how to get meaningful information in front of your user community or your audience, and that is important.

General challenges and opportunities: there's too much information, not enough of it is the right information, it's out of context, and it's unstructured, dirty data. The tools to manage, store, and analyze it are expensive and complex (we'll talk about why). Data needs context and correlation, and you need somebody who can bring all of this together and implement these tools, which is an elevated skill set.
And then you have to do capacity planning; do you have the physical storage if you do it yourself? More challenges and opportunities: IT environments are complex, and ever-faster systems are generating log data in real time, sometimes tens of thousands or even hundreds of thousands of events per second. That's the base throughput and the data you have to store. When people implement these projects and start turning on all the logging, that ratchets up, so you have to plan for that high end. There's a strong business case for collecting log data: consider the need for visibility, and proactive management requires early warning; sample reports can help make the business case. Am I building a logging system just for storing logs? Am I using it to do log analysis? Am I building a full-fledged SIEM? You need to know what your end goal is, because those are different projects.

Making the business case: when you're talking to the executive staff, a good comparison is your ERP system. Oracle is here (thank you for sponsoring and hosting, that's great) and they make a great ERP system. ERP usually does decision support: I'm ingesting financial information and it helps me decide how many widgets I can ship or where my assets are, all that information in one place, and then I query that database, that OLAP cube. The same approach can be used for security information, and it's a similar ROI dynamic.

So what data should be collected? You can collect data from anything intelligent on the network, but we're going to focus on firewall syslog data and Windows event log data, because if I have to pick two categories of data in any environment, those are the two I want; I think they give the most bang for the buck in terms of what's going on in that environment. Once you get that dialed in and working, you can expand: switch traffic, traffic from your phone system, your security system; it's easy to build on top of the initial platform once it's deployed.

Okay, your systems are talking; are you listening? On the left we have a next-generation firewall and what it logs. I pulled this from Fortinet (I'm a Fortinet guy, I make no excuses), but similar things can be pulled from a Cisco, SonicWall, or WatchGuard, as long as it's categorized as a next-generation firewall; I talked about that a couple of months ago. Basically the firewall is doing something like deep packet inspection: it's looking at your traffic, so it's not just seeing who's talking to whom, it's getting some visibility into what's inside the packet. Is it an email? It's doing antivirus signature filtering, and if it hits a signature it trips an alert. That's all information you can get from your firewall if you're listening.

On the right we've got the Microsoft Windows operating system, and it logs too; these are the types of events a Windows server is capturing. These are very useful things to know: how many packets are being blocked (Windows servers have OS-based firewalls built into them as well), how many PowerShell scripts are running, how many new services are being created. A new service shouldn't be getting created all the time; that's an exceptional event you can report on. Maybe it's a normal thing, but again it ties into cross-referencing: is that an expected event, or did a virus get in there and create a service?
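As a concrete illustration of that last point (a hedged sketch, not something shown in the talk), you can ask a server for its recent service-install events straight from PowerShell; event ID 7045 in the System log is written whenever a new service is installed:

    # List recent "a service was installed" events (System log, event ID 7045).
    # Run locally on the server being checked.
    Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 7045 } -MaxEvents 25 |
        Select-Object TimeCreated, Message

Asking that same question centrally, against shipped event data, is exactly what the Elasticsearch build-out later in the talk is for.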
Okay. Assessing the project can be intimidating, and planning is key. I'm a big fan of organized IT environments; if I take over an environment, the first thing I'm going to do is get it documented and structured. If we have a hodgepodge of equipment, vendors, and configurations, we have to get that dialed in first, otherwise we can't make any changes. Educate yourself about the available data (we'll take a deeper dive into that), and don't overlook ancillary systems; as we talked about, you can get data from your PBX, your physical security devices, what have you, but again we're going to focus on Windows and firewall event logs here.

Estimate the amount of data being generated; that's key. A full-fledged log analyzer can store terabytes of information, and large organizations can get up to petabytes, so that's a challenge for capacity planning and information lifecycle: how do you store it, get the performance you need, and dump the data once it's stale or no longer adding value to the organization? The amount of data will change with filtering. The typical dynamic is that you start out collecting everything, or at least a lot of stuff you're not sure you need, you look at the logs, and then you filter out what you know you don't need. That's a bit of an art and a science, as Eric indicated: you could say "I'm not interested in successful login events," and maybe you're not, but at some point you may be asked, "we saw a login failure; how many successful logins were there?" Oh, we're not capturing that log information. It depends on what you're trying to do and what questions you're trying to answer. So kick off an evaluation process: commercial, open source, in-house, outsourced.

Okay, here is a typical logging architecture. We've got our search engine on the right; that could be whatever, Elasticsearch, Splunk, an Oracle or SQL database, wherever you decide to store your data. Then you've got your endpoints: operating systems, firewalls, whatever you're pulling data in from. And then, optionally, you can have a log shipper. That's a bit of middleware with some intelligence; it does parsing, filtering, and enrichment, and it's limited by how many events per second it can process, but it's optional; the endpoints could send data directly into the search engine. Some companies use one model or the other, or a hybrid. We're going to talk about this three-tier architecture, which I think has a lot of value, but it's not an absolute.

Okay, selecting a platform: these are the things it's going to need to be able to do. Collect and ingest data. Parse it: in many cases the data comes in different formats, maybe JSON, maybe comma-delimited, maybe a flat file, any number of things. Filter it: maybe you only want certain types of events and not others. Enrich it: I won't say too much about that, but let's say you're getting data from your PCI network and you want to tag that traffic with the keyword PCI; now when you're analyzing your logs it's easier to look at just PCI data, or to find PCI data leaking over into another network. You can add those additional tags as the data passes through that middleware layer.
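A minimal sketch of that kind of enrichment in a Logstash filter, assuming a made-up field name and subnet for the PCI network (an illustration, not the speaker's configuration):

    filter {
      # Pretend 10.50.0.0/16 is the PCI network; src_ip is an already-parsed field.
      if [src_ip] =~ /^10\.50\./ {
        mutate { add_tag => [ "PCI" ] }   # tag the event so reports can filter on it
      }
    }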
Obviously the platform also needs to transport the data and get it into the database, and encryption is key; a lot of large organizations will say, "great, you're shipping log information out, but that's sensitive, and as Eric said it might contain passwords, so it can't go over the network unencrypted." You have to store it, you have to think about scalability, you need a reporting interface, and you need to support ILM, information lifecycle management: you're generating a ton of data and you can't store it forever, so as data gets stale you delete it, archive it, purge it, what have you.

Identify the back end. These are a number of options, certainly not all of them. There are commercial SIEMs, a number of them out there, and as Dan said you can easily spend a million dollars on some of them; I happen to know there are financial organizations here in town running million-dollar commercial SIEMs with Elasticsearch right next to them, because those commercial SIEMs can take hours to return a result or run a report, while Elasticsearch is so fast and efficient that it just does it faster. SolarWinds, we've worked with that. Hadoop is similar to Elasticsearch, can run in the cloud, and is designed to deal with petabytes of data; it's usually overkill for most organizations. We're not going to go through all of these, but suffice to say that's the landscape of options for systems that can collect and manage logs.

Datastore: the next question is, do you build your own, or do you call up a service and say "set me up with a log analyzer so I can ship my logs to you"? That solution works for a lot of people. Then you identify a log shipper, and a lot of these can be mixed and matched; you can use one company's log shipper with another company's database, as long as it doesn't malform the data to any great extent. And consider your data lifecycle, which we've talked about; we're generating probably 250 GB per month of log data.

Elasticsearch: this is why we're here. We went through this evaluation criteria, we selected Elasticsearch, and now we're going to talk about how we would go about implementing it. Briefly, it's an open-source NoSQL database; it stores all of its data in flat files rather than a SQL database, and it's a search engine at its heart. It does have commercial support, through Elastic and an ecosystem of other companies that provide support. It accepts unstructured data as documents. I've worked quite a bit with SQL databases, and if you have too, you know you have to get your data formatted correctly before you can do a SQL INSERT statement; with Elasticsearch you can just dump your data in. You do want to spend some time knowing your fields and mapping them to data types, though; it's going to make your life a lot easier on the reporting end, and we'll talk about that.
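To make that "just dump documents in" point concrete, here is a hedged sketch (not from the talk) of indexing one JSON document over the REST API; the host, index name, and fields are illustrative:

    # Index a single document into a daily index over HTTP.
    curl -X POST "http://es1:9200/winlogbeat-loc1-2019.06.09/_doc" \
         -H 'Content-Type: application/json' \
         -d '{ "@timestamp": "2019-06-09T14:07:02Z", "event_id": 4625, "host": "dc01" }'

Without an explicit mapping or template, Elasticsearch guesses the field types on first sight, which is exactly why the talk recommends mapping fields to types up front.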
It scales to terabytes or petabytes, depending on how much hardware you have and how fast your disks are, and it can produce sub-second response times if it's optimized. We're querying about 300 GB of data, and some of the example reports you'll see are coming back in about 3 seconds; you just can't do that with a lot of other systems. It's extremely scalable and it's used by governments and Fortune 500 companies all over the world; a showcase customer, Sprint, is processing 3 billion events per day across 200 dashboards, which is a tremendous amount of data. So Elasticsearch is a dominant market leader, and AWS and others have mature hosting solutions. Recently there's been a buzz in the news: AWS decided to fork the code, and that was a bit of a hostile event for the Elasticsearch community, because they see AWS saying "we're going to decide which direction Elasticsearch goes." Elastic just recently released 7.1, or I think it's pending.

A note on security: by default, with the free version of Elasticsearch, there's no security. You don't get encryption, you don't get user authentication, and machine learning you have to pay for as well; that's been a big sticking point. We've all heard in the news that the country of Panama was recently breached: they had an Elasticsearch database open to the world with 88 million of their citizens in it, with all their personal information. That's not an uncommon thing, and part of the problem is that if you fire up Elasticsearch, by default it's just open to the world. That is changing with 7.1, which includes the encryption and authentication, and that's exciting to the Elasticsearch community; I actually believe it's a direct response to AWS forking the code, because you don't get that with AWS, so they're trying to stay relevant in that market, I guess. And machine learning: I won't talk too much about it, but it's an enhanced feature you get with the commercial version of Elasticsearch; it looks at the data coming in, establishes a baseline, and tells you when things go outside that baseline without you having to program it. That's common in a lot of commercial SIEMs as well.

Okay, deployment: do it yourself, engage a service, or both. In this case we did it ourselves. I've actually got Elasticsearch, Kibana, and Logstash running in a VM on my laptop; you do not need a big, heavy piece of iron to take this code for a spin. Now, once you start working with a lot of data, yes, you've got to scale up appropriately: you can have dozens and dozens of servers and multiple tiers, and depending on where an index is, how much data is coming in, and whether it's actively being written to, you can move it through that tiered model. Large deployments typically have front-end servers full of SSDs in a RAID configuration, and that's where your data lands while it's hot; then every day you might increment your index, it rolls over, and you move the older indices to warm and cold nodes, eventually to long-term storage, and then purge them, if that's what your information lifecycle dictates. Obviously with a lot of data there's capacity planning: you've got to size your CPU, memory, disk, and IOPS, and that's a whole challenge we're not going to get into. Suffice to say there are providers that will say, "make that our problem: sign up for our service, dump your data on our system, and we'll scale it on the back end," and we'll talk about at least one we're using to do that.
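The hot-to-warm hand-off described above is usually driven by a node attribute. As a hedged sketch (assuming nodes are tagged with a box_type attribute, and an illustrative index name), moving a rolled-over daily index onto the warm tier can be a single settings call:

    PUT ls-fg-2019.06.01/_settings
    {
      "index.routing.allocation.require.box_type": "warm"
    }

Newer releases can automate the same rollover with index lifecycle management policies instead of manual calls.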
Okay, deploy ELK. ELK is short for Elasticsearch, Logstash, and Kibana; it's the tech stack made up of those three items. The code runs on Java, so it's actually OS-independent: you can run ELK on Linux, on macOS, or on Windows, or some combination, it doesn't matter; whatever your favorite OS is. The Elastic Stack is based on the Apache Lucene engine; that's an Apache project, basically a search engine, and the team that built Elasticsearch said "that's a great search engine for keyword searching, we're going to put Elasticsearch on top of it and use it as a platform." So it uses text-based indexes: each log event is a document, documents are made up of fields, and field names are mapped to types.

Okay, as we build our hypothetical stack environment we'll use a graphic; this animated slide shuffles in all the components, and then we'll go through them step by step and build it. This is an example of our environment. We started out with a domain controller in our data center and a firewall, and then we deployed our Elasticsearch stack. We decided to use three nodes as masters so we have a quorum; those basically don't do much with the data, they just control the cluster. Then we have three data nodes. It's a very small deployment, kind of a proof of concept: a cluster with three managers and three data nodes. Then we loaded Cerebro, which is a web-based management tool (we'll talk about that), and we fired up Logstash, which is the middleware tier we talked about; the firewall and the domain controller send their data to Logstash, and it sends it on to Elasticsearch. It's just log data. Winlogbeat is an agent (we'll talk more about that); it installs on your Windows system, creates a service, and ships the Windows event logs either to Logstash or directly into Elasticsearch. And then Kibana: that's a reporting engine, and it's going to be our web dashboard. Grafana comes later in the slides; that's another reporting engine that's not part of the Elastic Stack, an open-source graphing tool which is just amazing, and I'll show some of the things we can do with it; they complement each other.

And just a sample: if you've worked with Elasticsearch you probably recognize this graph, a typical chart of documents per unit of time. Here you can see we're looking at about 50,000 documents every 30 minutes in this particular index, which is a lot of individual data points. Then our challenge as a managed service provider is that we have all these clients out in remote locations. Most of them are virtualized; they have domain controllers, virtual hosts, and a firewall, and again we're interested in the firewall traffic and the traffic from the domain controller. So we install a virtualized Logstash system at the client location, and it does double duty as a relay: we're already running an agent there pulling data from the site, and we use that same Linux instance to ship logs back. That's an example where we didn't have to stand up a dedicated Logstash server; we found an existing Linux system that was doing other work and had capacity, and we installed Logstash on it. Syslog and Winlogbeat go to Logstash, and Logstash then ships that through a reverse SSH tunnel directly into our Elasticsearch. And as a trial we also have this company, Logz.io; they're an example of a provider that says "don't build your own Elasticsearch cluster, it's complicated and hard to scale; send your log data to us and we'll do it." So in some cases we're shipping the log data both to our Elasticsearch and to them; you can split the output, it doesn't have to go to just one location, you can say "send the data here and here," and we're doing that as a trial.
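A hedged sketch of what that split looks like in a Logstash output block; every event is copied to every output listed, so the same data lands in the local cluster and at a hosted provider (hosts, index name, and the provider endpoint are made up here):

    output {
      elasticsearch {
        hosts => [ "http://es4:9200", "http://es5:9200", "http://es6:9200" ]
        index => "ls-fg-%{+YYYY.MM.dd}"
      }
      elasticsearch {
        hosts => [ "https://ingest.hosted-logs.example.com:443" ]   # hosted trial
        index => "ls-fg-%{+YYYY.MM.dd}"
      }
    }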
Anyway, we're going to go through building all these components. And then there's the pricing: most hosted log providers will charge you by how many gigabytes per day you're pulling in, so 10 GB per day is about 500 bucks and 30 GB per day is about 1,500 bucks. That puts an emphasis on filtering: if you can filter your data and slow down what's going into your provider, you're saving money, because you don't have to go to that higher tier.

Okay, deploy ELK: install Elasticsearch. It's a typical Linux RPM command-line install, or you can compile from source if you like. These are the basic steps when you're creating an ELK node. You first define the cluster name, and every node gets the same cluster name; that's how it knows it's part of that cluster. You give it a role: it can be a master, a data node, or an ingest node, or all three, any combination. There's a box type, hot, warm, or cold, which dictates whether an index is actively being written to, is done being written to and just used for reporting, or has gotten old enough that you're not reporting on it in real time; indices can move through those tiers. I think we've configured our node set to allow up to 16 nodes; we've only got six in the cluster right now, but because it uses unicast discovery, if I create a new node with the right name on that network, the cluster, which is continually polling for new nodes, automatically detects it and adds it to the cluster; it's a seamless process. And we define a minimum number of masters for the quorum: if two of the master nodes went offline and we only had one, Elasticsearch would go offline, because it would not have enough master nodes to agree on what to do with the cluster. The reason for that is to avoid a split-brain situation: if one of your master nodes gets isolated from the others due to a networking problem or whatever, you don't want it to say "well, I'm the only one alive, I'm going to be the master and modify the data." You have to have at least two masters, and that's a common approach. We also talked about the permissions being wide open by default, so this is heavily protected; call it a DMZ, an isolated security zone that can only be accessed from our internal network, so it's not publicly exposed. Of course, many other companies say that too, and then they show up in Eric's cyber-threat rundown, and there you go.

Okay, deploying the ELK stack: you've done all the steps and started the service, and you hit the node (we call ours es1; we have es1 through es6) on its port, and it returns a JSON fragment. Here's the cluster name, in this case pt1, and the cluster UUID, which is unique to the cluster. We're running 6.6 (I think 6.7 is out now, and 7.0), and it reports its backward compatibility, so it knows it can play nice with other Elastic clusters back to version 5.6. That's it: once you hit the node from a web browser and get that JSON response, your cluster is alive, you can start ingesting data, and you're fully operational.
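Pulling the node setup described above together, a hedged elasticsearch.yml sketch for one of the master nodes might look like this in 6.x terms; names, addresses, and the box_type attribute are illustrative:

    cluster.name: pt1
    node.name: es1
    node.master: true                    # master-eligible
    node.data: false                     # masters hold no data in this design
    node.attr.box_type: hot              # hot / warm / cold tiering attribute
    network.host: 10.10.10.11
    discovery.zen.ping.unicast.hosts: ["es1", "es2", "es3", "es4", "es5", "es6"]
    discovery.zen.minimum_master_nodes: 2   # quorum of the three masters; avoids split brain

Once the service starts, hitting the node on its HTTP port should return the JSON fragment with the cluster name and version, as just described.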
When you start doing anything like this, I strongly suggest you install Cerebro. It's a third-party tool, not from Elastic; some folks got together and wrote it, and it's an amazing thing. It uses that JSON interface to basically give you a nice, clean web interface for looking at your cluster. This is the health indicator for the cluster: these are our six nodes, and you can see the three that are masters and the three that are data nodes. The data nodes carry a load; the masters are doing almost nothing from a CPU standpoint, because their job is just to police the cluster. You can see heap usage and disk usage; I have 250 GB on each of these, the load is spread across them, and they've been up for a month. Anyway, that's Cerebro.

Okay, create and manage indexes. Index templates are really key to how Elasticsearch works; they're pattern templates applied when data is ingested. Down here is a good example, ls-fortigate, and I've scrolled down: basically it's JSON which says that if this field named "ip" comes in, give it a type of ip, and these others get half_float; you can also do integers, text fields, and so on. You might want to do mathematical operations on a field, and you can't do that if it's a text field. That's why, as you're ingesting your data, you want to make sure the fields are properly mapped: integers come in as integers, floats as floats, and if they're IP addresses you can do CIDR arithmetic and say "I only want to see things from this CIDR block." There are some preset templates, but in many cases you'll start ingesting data, look at what you have, remap the field types, delete the data, and start ingesting again; the mapping is applied when the index is created.

Okay, now we're on to a log shipper: Logstash. Logstash is CPU-intensive, depending on what you're having it do; all the logs flow through it, and it's doing that parsing, filtering, tagging, and enriching, then forwarding to Elasticsearch (or it could forward to yet another Logstash server and eventually get into a database). It works with pipelines, and in this example we're going to have two: one pipeline for Windows data and one pipeline for firewall data. Why is that important? I have different rules, that data is parsed differently, and I want different logic. Also, because this is Java, each pipeline gets its own separate Java process and memory pool, so I can shut down one pipeline and restart it without affecting the other; that gives me some resiliency.

This is an abbreviated version of our Winlogbeat pipeline, and it's pretty simple. We've got an input section, which tells us which TCP or Beats network port this pipeline is listening on; the data is going to come in on these ports. Winlogbeat, if you install the agent on your computer, defaults to shipping on 5044; we're not really using the other one. Then it turns around and outputs to Elasticsearch; those are the three data nodes we talked about, so it's load-sharing and resilient: one of those nodes could go down and it would say "okay, I'll just skip that one, I've got two that are up," and as long as my master nodes are up and arbitrating the quorum, everything works fine. The index name down here is key: it ends in year, month, day, so every day in this scenario I generate a new index. I'm not storing all my data for the whole month, or indefinitely, in one index; that way it rolls over and I can implement some information lifecycle management.
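A hedged sketch of that abbreviated Winlogbeat pipeline; hostnames and the index prefix are illustrative, not the speaker's exact configuration:

    input {
      beats {
        port => 5044                               # Winlogbeat's default shipping port
      }
    }
    output {
      elasticsearch {
        hosts => [ "http://es4:9200", "http://es5:9200", "http://es6:9200" ]
        index => "winlogbeat-loc1-%{+YYYY.MM.dd}"  # a new index every day, for rollover
      }
    }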
Okay, now on to my syslog pipeline for the firewall, which is a little more complex; there's actually quite a bit of fancy stuff here for parsing and enriching the data. We're not going to get too deep into it, but one detail: I had to listen on port 5514 because I'm running Logstash as a non-privileged user, and a non-privileged user can't bind to the low ports (below 1024), so I'm using an iptables NAT rule to redirect the standard low syslog port to the higher one. Suffice to say the data comes in, and on the message we're using regular expressions to basically parse it out by key-value pair, and then send it to Elasticsearch, same as before. On the firewall side the setting was simple: we basically just turned on "send the data to syslog," gave it the IP address, and made sure the right boxes were checked; that's it, your firewall is sending syslog data.

Installing the agents: Winlogbeat installs on Windows with a PowerShell script; it's part of the Beats framework, it runs as a service, you configure it, and then it starts shipping to either Elasticsearch or Logstash, whichever you choose.

Once you start doing that, you're ingesting data. Here is the Cerebro web interface for this cluster, pt1: 6 nodes and 138 indices, so all the different clients and all the different firewalls are coming in, each in their own buckets. I'm not going to get into shards and docs; suffice to say Elasticsearch breaks the data up into chunks, and that's how it's managed on the back end. Elasticsearch stores its data in flat files. For example, I might have winlogbeat-loc1 and winlogbeat-loc2, but if I do a search against winlogbeat-loc*, I include all the indices that match; it's pattern matching, and that shapes how I search and how I organize my data.

Installing Kibana: now we're getting into the reporting interface. Again it's Java, and it can run on Windows or Linux. Our configuration for Kibana is four lines: which port it listens on, and the cluster's IP and name. That's it; Kibana talks to the cluster and you can start reporting. So that's a review of what we just built: you go through all the steps and all of this is running. I think we're getting tight on time, so I'm going to try to breeze through the rest of this.

Understanding your data: you need to understand your data types and know what fields you're going to report on. Log formats come in a lot of different ways, so you've got to be able to parse the data, with regular expressions for example, because eventually you're going to get it into fields in the database. This is an example of firewall syslog, FortiGate in this case, and you can see the key-value pairs, like vd=vdom; Logstash will parse those out and put them all into separate fields.
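The real syslog pipeline is more involved than this, but a hedged sketch of its core, parsing FortiGate-style key=value messages with Logstash's kv filter (the port, index name, and field handling are illustrative, not the speaker's exact configuration):

    input {
      udp {
        port => 5514                    # high port; iptables redirects 514 here
        type => "fortigate"
      }
    }
    filter {
      kv {
        source      => "message"        # e.g. "vd=vdom srcip=... action=deny ..."
        value_split => "="
      }
    }
    output {
      elasticsearch {
        hosts => [ "http://es4:9200" ]
        index => "ls-fg-%{+YYYY.MM.dd}"
      }
    }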
Kibana is your main reporting engine; it does Discover and Visualize and has all these other applications, and we'll cover some of them here. Using Kibana, you create an index pattern to match your indices, and those patterns then become available for search. This is the Kibana interface. Up here is the time range, and this is critical: it's the time frame I'm reporting on, which could be the last 15 minutes, the last 24 hours, the last 7 days; if you don't have it set right, you're not going to see the data you think you're looking at. And then this drop-down over here is my index pattern; in this case I'm using ls-fg-pt, which happens to be Logstash, FortiGate, Phreedom Technologies, so that's all of our internal firewalls, not our client firewalls. We're getting about 4,000 events every 30 seconds coming into the system, these are all the field names it's ingesting as you scroll down, and this is one record that I can expand to look at the data.

Okay, now I've put a filter on here: deny. This is every time the firewall denied a packet, and you can see we're now down to about 160 events every 30 seconds, quite a bit less than before. And back to that particular query: if you look at it, it took about three seconds. Elasticsearch keeps statistics on its own performance, so you can graph them and report on how well the stack is performing.

Okay, building reports: what fields do I have, what are the data types, what do I want to filter on, what do I want to group by? Some of this will sound familiar to anybody in the SQL world; these are some of the same concepts. Interesting questions: we sat down with one of our clients and I asked what kinds of questions they have, and these are the kinds of things that came back. They want to know how many failed login attempts there are per user and per IP, how many logins are happening outside of business hours, and you can go down the list; it's all the anomalies, the things that really shouldn't be happening in your environment. If you catch one of these, the next question is why it's happening, and you dig a little deeper; we'll talk about how you can use this tool to do that.

Briefly, this is Fortinet's documentation: all the different types of events a firewall can report on, with a description of each. There's a similar thing for Windows event channels; a PowerShell command will give you this listing, and if you've ever gone into Windows Event Viewer it will look very familiar. There's a plethora of these channels; they store all your events, and many of them never get anything written to them, depending on whether your server has that role, but they're all available to pull data out of and ship into Elasticsearch through Winlogbeat.

This is kind of key. The NSA, as much of a fly in the ointment as they are, still tries to do good things for the community, and they've created a white paper, "Spotting the Adversary with Windows Event Log Monitoring." It's a great document: 54 pages, and a little dated since it came out in August 2015, but it focuses on where to start and what to look at in Windows event logs if you're trying to answer the question "have I been compromised?" It's a great starting point for looking at this data, and that's the link to the PDF.

Somebody has basically taken that white paper and written a PowerShell script, which is available on GitHub, and I strongly recommend educating yourself on it. We're using it with Nagios, running it on our key servers, so if somebody goes in and makes a firewall change, or does any number of other things, it trips that wire and generates a security event. We're now going to translate that logic and try to answer the same questions with Elasticsearch data, so instead of doing it on each individual server, we can do it across all our event log data in one central place. This is just a short list of the types of things it checks.
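As a hedged illustration of that translation (these are not the speaker's queries), a couple of the script's checks could be expressed in the Kibana query bar using pre-7.x Winlogbeat field names, one for failed logons (event 4625) and one for new service installs (events 7045 and 4697); adjust the field names to your own mapping:

    event_id:4625
    event_id:7045 OR event_id:4697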
Discovering new Windows data, finding interesting log events in Windows: in this case we set a fair sampling of data, 12 hours, and you'd see that 12 hours right here. We select the correct index pattern on this drop-down; you can see it's changed, it now says winlogbeat-*, which means we're getting all Windows event log data. We're only getting it for my domain controller, and I'm excluding anything that says Audit Success, so really I'm looking for anything that has an audit failure for a logon on this domain controller. I'm getting about 3 events every 10 minutes, and those are login failures on my domain controller, so I want to see what's going on there. On the left I can scroll down and type "target," and this field comes up: Winlogbeat's event_data.TargetUserName. That's going to tell me the target user name, and if you click on the field, Kibana will show the top values out of the 184 matches: these are the user names of the users generating those logon failures. I can see my own user name right there, because I was testing just before I did this report and fat-fingered it, so it's working. The point is that this doesn't necessarily mean there's a problem, but at least you can get a sense of what is normal; that's one of the first things you start asking yourself when you're collecting log data.

Okay, interesting firewall traffic: finding failed logins to the firewall itself. Again, you don't see it here, but it's set to a 12-hour range, the index pattern has changed, and we're looking at an entirely different set of data. We filter for just level:alert and we have one match: one failed login from my IP, which was our terminal server, where I'd done the same thing; it says so right down here. Same search, different index: I changed it so that now I'm looking at all of our client firewalls, and now we're getting a few interesting things. An administrator login failed because of an invalid password, and look, we got a hit on an application scanner signature (Masscan); that's a malicious attempt, somebody is scanning us, and we dropped that packet and logged it. One of the things we've done here: you see this URL link; if you click it, it takes you directly to Fortinet's page about that alert. That's us enriching the reporting interface. Say you have another system and an interesting IP; you could click on it and get a report from that other system: what is this IP doing, did we scan it last night? You can basically integrate this with other systems you might have in your environment.

Okay, on to Visualize: pie graphs, bar graphs, there are a number of graph types you can do, and here's an example of a heat map. Kibana is immensely powerful, but you have to kind of know what you're doing. In this case I'm looking at the top talkers, which destination IPs are talking to which source IPs, and where it's darker green, that's most of the traffic. This happens to be OpenDNS; we're using Cisco Umbrella to do DNS filtering, so a lot of our traffic goes there. I know that, so I could click on it and discount it or remove it.

Okay, Kibana Timelion; I'm only going to briefly touch on this. If you want really advanced filtering, you kind of need to go to Timelion. You basically use a query language, and it's similar to Lucene on the back end. As an example, this is the query language for a query that only shows total received traffic per interface: I'm specifying the log index here, ls-fg, and I'm splitting the traffic on the source interface keyword, so it draws a stacked graph showing, by color, how much of the traffic is on each interface. This is also a good example of doing some math: the metric is a sum function over the number of received bytes, and if received bytes were mapped as a keyword and not an integer, this query would fail. That's one of the reasons you want to type your data properly.
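A hedged reconstruction of that kind of Timelion expression; the rcvdbyte and srcintf field names follow FortiGate's log schema but are assumptions here, as is the index pattern:

    .es(index=ls-fg-*, timefield=@timestamp, split=srcintf.keyword:10, metric=sum:rcvdbyte)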
And this is a great website, the Timelion "from zero to hero" tutorial; if you've ever done any programming, in ten minutes you'll basically be able to make some really amazing graphs.

Okay, now on to Grafana. I think this is the most impressive tool; most of our security dashboards are moving to Grafana, so that's what we'll talk about for the rest of the presentation. It installs on Linux, it's web-based, and it integrates with all this stuff: Azure, CloudWatch, Elasticsearch, Graphite, InfluxDB, SQL Server, MySQL, Postgres. So you can tie it to a lot of back-end data stores, whereas Kibana primarily works only with Elasticsearch, I believe. On the left we're configuring Elasticsearch as a data source: I say "add Elasticsearch," give it a name, and point it at http://localhost, because I'm running Grafana on one of my master nodes; I just have to tell it to talk to localhost on port 9200 and it can talk to the cluster. All right, and these are the different types of graphs we can do, heat maps and more; I won't get into all of them, but it's pretty flexible.
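The Elasticsearch data source just described can also be captured as a Grafana provisioning file; this is a hedged sketch (not from the talk), and the exact fields vary by Grafana release:

    apiVersion: 1
    datasources:
      - name: Elasticsearch-PT1
        type: elasticsearch
        access: proxy
        url: http://localhost:9200
        database: "ls-fg-*"           # index pattern to query
        jsonData:
          timeField: "@timestamp"
          esVersion: 60               # Elasticsearch 6.x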
So I'm going to look at the Windows logon-failure event ID, again with the audit-failure filter on the event log; this is the same thing as those login failures we were looking at in Kibana. These are the settings I have on my graph, and it's giving me a standard stat graph. Then I changed that same graph to be a heat map; I kind of like the color coding, because I want to see what's happening on which server at which point in time. Somebody said heat maps are great but they don't really give you the rich detail, you have to dive in, and that's true; this is an indicator that interesting things are going on and that I may need to dig deeper. I can see that this server here has had an excessive amount of failed logins, so I might want to dig deeper and look at that.

A similar thing, a firewall policy heat map: this one was kind of fun. I wanted to know, of all the policies in my firewall, which ones are getting tripped, and it was interesting; you see that these are daily patterns. This is over seven days, so you can see the workdays. And because we're trying to save data, we filtered out the normal events, which changed the pattern here; I decided I wasn't interested in that data (I'm happy I collected it, but I don't want to see it going forward), so we changed that three or four days ago, and that's the reason for the shift.

Firewall session count by PC: this was an interesting one. One of our engineers said, "we've got a user with a lot of active sessions, it seems to be an anomaly, we just want to track it over time," and we were able to do that through the firewall. So now I can see each user endpoint and how many sessions it has (we eliminated our domain controllers because they had a tremendous number of sessions); some of our engineering workstations just grind away 24/7 and some turn on and off, so you see those patterns and get an idea of what's normal.

Final steps: assemble all these reports into a rotating SOC dashboard. Create your panels, organize them, create a playlist, add new data sources, and (my staff hates this) start asking pointed questions: why am I seeing this, what's going on with this client's dashboards? "That's normal, it's supposed to do that." Well, prove it to me. I've found some anomalies that way that we had to go clean up.

So, comprehensive dashboards: here's an example of a SOC dashboard, kind of the end result, the grand finale. Basically it's a combination of Windows event log traffic and firewall traffic for different things. I'm looking at all of our LAN traffic coming in and out of our data center and which client it's associated with, so if somebody is exfiltrating data I'll see an anomaly here, something will stand out. Windows Firewall: who's filtering how many packets at the Windows Firewall level, and why; we've got one server that sits in a very noisy environment. And failed logins per user: which users are getting the most failed logins, so I can dig in and ask why. Maybe that user is having a problem and needs some extra help, and it's not a real security issue; but if you're getting 100 failed logins a minute, that's probably somebody running a scanner. Anyway, we don't need to go through all of these; you can mix and match these dashboards however you want once you understand your data, and this is all free, with no licensing.

Summary: set goals; it's really easy to get lost in the weeds when you get into building these things, so keep in mind what you're trying to do and what problem you're trying to solve. Know your data sources; syslog and Windows are everything we're doing here. Understand your data: structure, field mappings, log volume, capacity planning. And keep your eye on the ball for the business: be able to communicate how there's ROI value to the business, so you can get funding to keep this thing going. That's it.
Info
Channel: Southwest CyberSec Forum
Views: 19,242
Rating: 4.9673471 out of 5
Keywords: SWCSF, John R Nash, Elasticsearch, ELK Stack, Logstash, Kibana, Phreedom, Log Analysis, SIEM, NIST, John Nash
Id: rXjm-iKeIlM
Length: 50min 6sec (3006 seconds)
Published: Sun Jun 09 2019