Webinar: Splunk for IT Operations

Captions
My name is Priya Balakrishnan and I'm with Splunk. I'm responsible for the product marketing of the IT operations solutions, including the virtualization technologies, storage, operating systems, and all of that good stuff. In today's webinar Paul and I will be co-presenting. In the first part of the webinar we will be talking about Splunk itself, and Paul will then dive into a demo of the Splunk solution for you.

But before I get started, I want to introduce you to some of the challenges customers are typically facing. If we drill down into the priorities of a CIO, who is essentially at the very top of the organization, we see that they're focusing on some key initiatives. According to a survey by Gartner that was put out in 2012, the top priorities for CIOs include increasing enterprise growth, delivering on business solutions, achieving analytics and business intelligence, and improving the customer experience. But the biggest problem they have is that they have to do all of this while cutting costs and eliminating tools.

CIOs are under a lot of pressure to deliver on these priorities, and they still have to deal with all of the challenges around day-to-day operations. They need to keep the lights on and the business running. They need to support the organization so that not only are the top priorities met, but teams also have the tools and information to troubleshoot issues properly, analyze their data properly so that it can tie to strategic business initiatives and business decisions, and be proactive about their environment, not only from a troubleshooting and analytics perspective but also from the perspective of making sure that these top four priorities are met. The challenge is that there is so much
complexity in the environment today. This increasing complexity, with so many interconnected technologies, layers of IT software, and individual silos, presents a tremendous problem, because now you have tools that cater to storage, or tools that cater to the server, or APM tools that provide visibility into your application cycle and user experience. But when it comes to diagnosing a problem, you have individual teams that each have to look at their individual technology stacks.

This complexity is increasingly hard to manage because the technologies are siloed, and all of these solutions are disconnected or outdated. If you look at the trend of the last few years, virtualization and big data are some of the biggest trends in the market, but if you use the traditional tools that are out there to manage these environments, you're making decisions in a silo, in a vacuum, because the traditional technologies are not equipped to handle the dynamic nature of these environments.

The other piece is that if you want to connect these tools together, you're going to need extreme customizations on your tools. If you want insights into your application layer, then you need insights into these technology stacks, which either requires intrusive ways of collecting the data and probing into the internals of that technology stack, or gives you only very high-level visibility that really doesn't provide the insights you're looking for. For example, applications may be written in Java or Python and may be running in virtual machines, and your
Java management technology is probably not written well enough to extend to the virtualization piece, and the virtualization piece is connected to storage and network layers and servers. So you need that cross-visibility across these various tiers, and right now all of the solutions are so disconnected that if you want to deal with a particular problem, you spend a great deal of time on root cause analysis simply because you don't have a way to look at that data holistically.

When I was in support a long time back, we had the problem that you're seeing on the slide. We would have a problem and the service desk would be called. The service desk would look at some high-level data and say, "My tool is showing green and I don't think there's any problem, but someone is calling about a user experience issue." They would then escalate the issue to an application support team. The application support team does not really know what to do, so they escalate to the application developer team, and then it goes down to the systems administrator. Eventually it comes down to either the applications team looking at the data or the IT operations teams looking at the data.

Like I was saying, when I was a support engineer in the application support team, I would have my Perl scripts that would extract information from the logs. I would look for keywords in the log like "failed" or "error", then determine the appropriate timestamp and see if it had any associated impact on the user who was calling in about a particular problem. Now, assuming that there is a particular problem in that application stack, at that point I don't have visibility into which part of the various IT tiers actually contributed to that problem. So then I
essentially have, like, 30 people on the call for eight hours, everybody looking at their individual silo and saying, "My tool is saying that this is okay." The biggest challenge is that the tools are showing green, yet there are tons of people sitting on a call trying to diagnose a problem, simply because there is no holistic visibility across the various IT stacks: from the end-user experience to the application, to the web layer, to the infrastructure that supports these applications, to back-end pieces like the databases. Your infrastructure could even be in a cloud environment, where you have even more limited visibility, say if you're hosting applications in AWS. Your infrastructure is becoming extremely complex and troubleshooting becomes incredibly hard.

So think about it. Every part of your infrastructure is generating so much data within your organization; the biggest challenge is how you make sense of it. You have data coming in from your logs, from your configuration files, from user clickstreams, and all of that good stuff. When you want to diagnose a problem, like I said, I would write my Perl scripts to extract the data, but there are ways to interconnect that data across these various tiers. From your order processing to the middleware to the web server, there may be common IDs that could help make sense of that data.

Take, for example, purchasing a product on your tablet or your smartphone. If the purchase transaction fails, you call the call center and then you tweet about the experience, or you call the call center and say you had this particular problem. Now there
are various pieces within your infrastructure that can connect the data across these various stacks, collect that information, and make sense of it, but right now you don't have an easy and automated way to do this. For instance, in this example we have a single customer ID associated with the order ID, and if you look at it from the order processing, middleware, and web server perspective, you have the customer ID or the product ID across these various pieces of logs. Additionally, you have information saying there was a "connection refused" in the middleware layer, in the web server layer, or in the database layer. Now your challenge is trying to figure out the connectedness across these various pieces of software.

So what if you had an easy way to search through this morass of information generated across your various infrastructure pieces? What if there was an easy way to access the data to determine what parts of the product the user actually traversed, and what kind of experience they had across these various pieces? What if there was a way to record that behavior and use it not only from a troubleshooting perspective but also from an analytics perspective, to say, "These are the pieces of the product or the software that I see users constantly accessing, so maybe that's what I need to focus on for the next feature, from a product perspective"? What if there was a way to get that end-to-end visibility across these various technology tiers and consolidate all your monitoring solutions into a single place, so that you can get this comprehensive visibility and gain a useful means to actually
make sense out of the data? If you look at the trend out there, there's this big hue and cry about Hadoop. What does Hadoop do? Hadoop is an easy batch-processing way to collect all of that data within your organization and store it somewhere successfully. The biggest challenge is then what you do with that data, and that's where a big data and analytics solution makes sense, because this data, as you just saw in the previous example, can provide you a tremendous amount of insight. That's where Splunk comes into play.

What is Splunk? Splunk is essentially a tool that provides you the ability to transform your data, essentially your logs, your config files, your user clickstreams and all of that, into useful dashboards and reports. It provides you the means to look through your data and gain intelligence out of it. It provides real-time visibility and correlates data across multiple technology tiers. As Paul will show in his demonstration today, you will be able to see how Splunk, one, provides you the ability to collect data from any part of your organization successfully and, two, lets you use it for proactive monitoring. Not only that, it provides you the ability to use that data for analytics and intelligence. It enables you to gain real-time insight and also trend on your raw, underlying data to make sense of that data.

So what can Splunk do? Think of it as a real-time platform for your machine data. So what is machine data? Machine data is any data that's within your organization. It can range from customer-facing data to Windows operating system registry events, virtualization events, application events, databases, any of that. So, just to
give you some background on how the word Splunk actually got formed: it's an abstraction of the word spelunking. Spelunking is exploring underground caves; the abstraction is that you're exploring your underlying IT data. Because Splunk only requires the data to be in some kind of flat-file format, any data can be ingested and indexed within Splunk. That is what Splunk provides the flexibility to do: collect all of this data across your entire environment without any upfront schema. What do I mean by that? We do not have a database in the back end, and therefore you're not in any form or fashion required to pre-normalize the data to fit into specific schemas.

We do not have any costly custom connectors. Because Splunk only requires the data to be in some kind of flat-file format, you do not need intrusive collectors to collect data from your environment. Think of it as pretty much like a tail on your log file, if you are familiar with the UNIX command. If you're collecting logs, as an example, what it's essentially doing is kind of syslogging that data. You can use syslog or TCP/UDP to collect the data, but it's so non-intrusive that it's like a tail on your log file, and essentially you're forwarding that data on to the Splunk indexers to index it.

Splunk has its own custom MapReduce technology, much like how Google works, and, very similar to Google, we have a very simple search engine which allows you to search through insanely huge volumes of data very easily. Paul will demonstrate that in his demo today.

So what does Splunk actually allow you to do? Splunk enables the connected data center. What do I mean by that? It allows you to do ad hoc searching. Like I said, it gives you a
very simple search, but once the data is in Splunk you will see that the foundation for creating alerts, for creating reports, for creating dashboard views, for everything in Splunk, is the search. Once you do that, what we notice is that customers say, "You know what, I was able to figure out that there was a problem that I wasn't even aware of," because you're searching through all of that data within your organization. So they say, "Let me go ahead and index additional data."

That is done through apps. Splunk apps basically sit on top of your data. What I mean by that is that an app gives you a framework to collect data from specific technology tiers, because not every technology's data may be in a flat-file format. Take VMware as an example. If you're running VMware in your environment, you have data in your hypervisors, your hosts, and your virtual machines, and when you want to collect that data across these various pieces of your VMware infrastructure, it becomes really hard. So what we also provide is a framework, called an app, which allows you to collect the data from your VMware environment, as an example, put it into Splunk, index that data automatically, and get useful visualizations right out of the box.

So, going back to the connectedness piece: once customers start searching and investigating, they say, "Let me index more data, because maybe I have other pieces of my infrastructure that have alerts and problems that I'm not aware of." They start indexing more data and proactively monitoring it, because now you can easily and quickly set alerts on top of that data. Then, once they do that, they realize, you know what, I can actually get insights into
the data, in the sense that they can get actionable dashboards and actionable reports and analyze that information. I can give you an example. One of our customers, Bosch, was supplying medical equipment to hospitals. The hospitals were frequently complaining that there was something wrong with the medical equipment because it was frequently not providing the capabilities that were expected of it. Then Bosch started Splunking the data from the medical equipment, and they realized, as an example, that nurses were not fully charging the equipment, or that nurses were leaving the equipment in very random places. Insights like that can be easily gained.

Another example, one that our CEO loves to give, is a building management company that was Splunking data from their elevators. They used the data to do an analysis to figure out, based on where the elevators were stopping, which tenants were going to move out and put houses on the market. There are very powerful ways in which you can group and analyze that data, because Splunk is very ad hoc in its reporting capability.

So from a maturity path perspective, that's how we typically see our customers grow. They start off with searching and investigating their data. Once they start searching and investigating and figure out that the power Splunk brings to the table in its search-and-investigate capabilities is phenomenal, they start indexing more data, either through our apps or through a custom methodology that they
have implemented. Once they do that, they're able to get intelligence and business insights out of that data because of the various reports and role-based views that you can provide across the organization.

The other piece is that Splunk is a platform for IT operational intelligence. What do I mean by that? One of the things that our CEO hates for us, as a marketing team, to do is use the word "platform", and he barred us from using it for a very long time, because unless customers really call us a true platform, it's very hard for a vendor to tout that it is one. In the last year or so we've seen more and more customers use us and call us a platform, which is why we're now talking about ourselves as a platform. When we first started, we were a log management tool. Then we moved on to being an engine for machine data, and in the last year we've essentially become a platform.

What I mean by that is, like I said about the app framework, an app is essentially a means to collect data from specific technology tiers and provide a starting point for useful visualizations and dashboards. You can completely and easily customize these apps if they don't meet your needs, because Splunk is very ad hoc in its capabilities. At the same time, sometimes you may not know what to do with your data or what you want to start looking at, so these apps provide a starting point that you can then easily customize. We have over 320 apps right now, most of which are free. At this point in time only two apps are paid for: our Enterprise Security app and our PCI Compliance app. We have apps for servers, for storage, for network, for operating systems, for mobile applications, for your cloud and AWS infrastructures,
but not only that, we're seeing more and more partners and other vendors that want to partner with us to create apps. For instance, we have solutions from ExtraHop, AppDynamics, and Palo Alto Networks, who are creating Splunk apps and supporting those apps, saying, "We want to be able to Splunk that data so that there's a unified way to look at the data across these various technology tiers."

So with all of these apps indexing data within Splunk, you have that holistic visibility. But you also have a very comprehensive API. Splunk provides these very comprehensive APIs to export the data out of Splunk if you so desire. You may have executives within your organization who are very used to a specific way of looking at the data, or who want to export it into a centralized internal portal, and it's easily possible to do that using the Splunk APIs, because we have very powerful SDKs that allow you to export the data outside of Splunk. So we're not saying you need to use Splunk for everything, although that would be wonderful; you can use it however you want. This is what truly makes it a platform, and we truly believe this is just the beginning. We're seeing more and more customers and partners wanting to partner with Splunk to make Splunk the central platform for all machine data.

Because Splunk is very ad hoc in its reporting capabilities, it can be used across various use cases. It provides role-based access. One of the biggest challenges we hear from customers is: I don't want my development team or my helpdesk team to have access
to my production environment. You no longer have to do that. You can have all of that data within Splunk and provide them access just to Splunk to look through the data. Not only that, you have the ability to allow them to see only specific portions of the data with role-based fields. So now your helpdesk team can look at the data very quickly and easily, your application support team can look at that data from a middleware layer perspective or an infrastructure perspective, and your system administrator can look through the logs very quickly. I stand corrected, in that it's not just logs: you can collect any kind of information that resides within your organization, and your system administrator can look through firewall data, logs, or config files and see that information.

Additionally, we have tremendous insights into security. For instance, we have customers saying, "I want to find out something very simple: what versions or what patches are running for specific technology tiers across my entire environment." It's as simple as doing a search across the information that you've collected, creating a tabular list of that data, and exporting it.

Lastly, because you can provide very useful visualizations, and Paul will be demoing this very shortly, you can gain powerful insights from your data and drill down from the end user right down to the underlying infrastructure and the raw data easily with Splunk. At this point I'm going to pass control to Paul, and he's going to be running a demo.

Thank you, Priya. We do have a question. Laurie has a question
of "Can Splunk proactively alert me on any problems?"

Absolutely. Like I said, the foundation of Splunk is search. Once you figure out what to search or how to search the data, you may sometimes realize that one of your monitoring tools has not alerted you to a particular situation. Now you can say, "I was able to figure it out with this particular search," generalize that search, and create an alert. Paul can maybe quickly show you that as well, so you can see how you can use that search to create reports, alerts, dashboard views, and role-based access to your data.

Thank you, Priya. Paul, you may begin.

Okay, thanks. That is a really good question, and I'm actually going to touch on it. I want to come back to IT operations. I'm going to focus on one aspect of IT operations, which is all of the critical applications that you manage today and that probably consume a large portion of your time. I know that's only a portion of IT operations, but we'll focus on that. I want to explain the environment we have, and I'm going to come back to a use case we talked about at the very beginning: the service desk gets a call about slow application performance, customers are complaining, how do we resolve that, and how does Splunk play an important part in getting to a quick time to resolution, what we call MTTR.

So what I'm looking at here is the actual environment. I'll explain some of it and what we're collecting for data, then I'm going to go into the core Splunk searching capability and come back to the app. We're representing a very common business application that is multi-tiered, multi-layered, with different technologies. We see this all the time. You probably have several running in your environment today, and you're looking at this thinking, "Yep, that's my application environment." In this case it's an Apache-based web tier
with a MySQL database behind the scenes. They're running a middle tier that's processing all of these product orders. We've taken an online store and represented that, but it could represent any of your applications. We're representing this as running on physical hardware, but as Priya talked about a few minutes ago (I was just with her last week at an IBM user conference), Splunk is also very, very good at Splunking your virtualized environment and reaching in and getting insights out of your applications running in those environments. So we can take the entire big picture of everything, bring that in, and make it usable.

What it does here: we take the forwarders and we collect the data. The forwarder sits out on the source. It's a very lightweight universal forwarder, and it does two things: it takes the data as it is and sends it up to the indexer, where we put a time-series stamp on it so we know when that event occurred, and now it's ready to search. Splunk extracts different fields that we can report on, and you'll see that in a few minutes. Then I'll be interfacing through what we call a search head.

So let me start with search, because that's where Splunk gets its power. It is a platform for search over your data. We've laid this search capability over this data, and what I want to do is just look at what's come in over the last 60 minutes. You're going to see very quickly here everything that is showing as far as events go, and I get this really great flash timeline. I love this for troubleshooting issues. I'm using it in-house in our own environment as well as when I'm on-site with customers. I have done several POCs lately, and we're immediately finding value, or issues, in their data. One of them was in response to a production go-live, and in one day's time, when we were on-site for just a couple of hours, we were already starting to point to the latency issue that they were having, but we were also showing them other things that they didn't know were going on
in their data. A little more about that story in a few minutes, but first I want to point out what's in this data. I know it's coming from eight different hosts. These are fields that Splunk automatically extracts and makes available to me. I know that there are different sources of this data: some of it is coming from the database, some of it from the Linux server, netstat, and so on. So I have some really valuable information about the sources, but I can also go down here and see what's going on within this data. Let's look at product ID.

Customers are interacting with this business application, ordering products. This is a good example of non-IT data that we're bringing in and supplementing through a database lookup, or by simply bringing in a CSV file in a batch every night. Whatever way that data is out there, Splunk can get to it, bring it in, and make it searchable and usable. I don't know these product codes the way a product expert would see them at a glance, so most of the time I'd have to go looking in a catalog. Customers don't have to do that, or in this case the analyst doesn't have to, because all we do is add knowledge and translate those product IDs into the specific products.

I can quickly turn this into a visualization. Let's take a look at the top products overall. All I did was put in a search command here for top products. Say I have to give this to my manager: I'm going to format this into a pie chart, and I can then simply save it into a dashboard. That's the other thing: having been a developer for years on other platforms, I love how fast Splunk allows me to develop. I'll just put a title on here and put it in a dashboard I've been building out for a few minutes while we started the demo today. I'll choose an existing dashboard that you'll see in a few minutes, order processing,
and I'm looking at it around performance itself. I don't just want to know what they've been ordering in the last 60 minutes; I'm going to put some other things with it. We'll say a pie chart, but there are all kinds of other visuals that you can create. I can even accelerate this search so it runs really fast, even faster than what we were seeing on the screen, because there are ways that it keeps this information readily available for search.

Now I'm going to come back to this issue in a few minutes, but let me put in a very common search. I've saved part of this just to save a little bit of time. I've placed it here, and I'm going to search for all "error", "failed", "severe". I'm also narrowing my source types, and I want to look not only at whether it's called a failed, error, or severe, but also at the error codes. It finds quite a few, more than I really want to know.

Now, you may recall Priya said there's no database behind the scenes and there's no schema. That means we're able to use what we call schema on the fly. Schema on the fly allows me to decide how I want to present this data at this time. All I have to do is come in here and say, let's create a table. With the table we put in the fields that I want to include in my dashboard. I'm going to say that I want the host, the file, the status, and the URI, because I want to know what query is actually causing this on my system, and we'll just hit enter.

So what we see here is the results in a table. This table I've defined in the search, and I can turn around and make it available to someone. Let's say you put it in a dashboard and a manager or another technician comes back and says, "I'm missing these data elements." If this was a database, I'd have to contact the DBA and have it redefined. Here, no: I just simply go in and say
well, I need to know the method, and enter: there's the method. I'm going to add this to the dashboard real quick, to the existing one, make it a table, and I'm just going to accelerate this one real quick. The end result, which you're going to see, is a dashboard that I've added several things to. Two of these I added before we went into the demo; these are just total page access time and total products. And oh, by the way, before I give this to my manager, I want to move current products up here so they know what's going on. You can notice I can edit, I can move things around. I could take this link right now, send it to my manager, my team, staff, and say, here you go: how many people are accessing the application in the last 60 minutes, what's being accessed, and most importantly, what are we getting for errors and what is causing the error. I've just put some valuable insights about my production application at the fingertips of my team so that they can start acting on these issues. That's why I wanted to spend a few minutes with searching and show you how quickly you can turn searches of your data, across the vertical and the horizontal of your data center, into powerful insights that someone can look at and at a glance say, I know who I need to contact, or I know how to resolve this issue. Customers using Splunk today see on average around a 50 to 70 percent reduction in MTTR. The reason for that is that we spend so much time in the mean time to know: okay, I know what, but why did this happen, and how do I quickly resolve it? So let's take the other perspective on this. Let's go back to application management, and I want to start from IT operations. Now you understand what's behind all of these dashboards: every single one of these visuals is an active search running on this environment, looking at my data, looking at multiple sources, and this is a really good example. So let's take this
application up a level and say, for the IT manager who wants to know how it is performing: we've promised the business this level of performance; are we meeting that right now? Especially when an application is reaching outside to customers, beyond the main part of the business. I know right now my SLA is not looking good, and this is an SLA looking at transactions and throughput. I can use statistical commands; the search language itself has 150-plus search commands, and it's very easy to learn and also powerful beyond just your very basic searching, allowing me to do stats and transformations and, in this case, transactions. I can look at a transaction moving through and say, well, an acceptable rate is this number of seconds, and in this case here I'm infringing upon the SLA. I can also look at failed transactions. These aren't too bad; we see some errors, you saw those a few minutes ago, but this isn't really the concern. It's somewhere else. And I just got a call saying, my customers are complaining that it's taking a long time to submit an order. Why is that? I want to resolve this quickly. Well, I know it's not a capacity issue, because this is a summary of memory, CPU utilization and throughput, and it's 47, 45 percent; that's not an issue. I also see my seven-day average for transaction volume, and my transaction volume is not over what it normally is. We're actually looking pretty good, but the SLA keeps bouncing above, so something's going on that's holding up these transactions. How many are failed and successful? This looks fairly good. And that's a great thing about Splunk: I can make these dashboards fit the audience, so they don't have to go to searching directly. They can quickly triage an issue in your critical production apps or IT operations, your networking environment, all of these things that you would want to manage. I just simply click
on a link I've added to that dashboard in the XML, and I'm on to the next layer. So now we look at the middle layer: here's the web server, the middle tier and the database. Now, at a glance we know it's right here, but I want to walk through this. So 575 seconds, and that's my transaction volume; it's all right, it's not what I would like. So where is the issue? I start glancing here really quick. We do have some errors; I would obviously want these down to minimal or none, but it looks like these are the servers, and I can look at the performance of an isolated one. I can potentially click on any of these and drill down into them. All of these are actionable, real-time data dashboards, not dashboards that were refreshed a day ago, 24 hours old, like we see in other apps. This is your data, and it's being searched. I've worked a lot with Tomcat and Apache environments, and we know the JVM can be challenging; in this case memory utilization and the heap size are looking good, so it isn't here. I come over to the database. You choose what you want to have: you can have a quick summary of what the metrics are, we could have all of the different queues Splunked out, we could have the error logs. We just put a quick summary of what you potentially could do, and here we put the purchase queue for our example. Now, the purchase queue is way above where it should be; that's my known issue. So instead of the usual: okay, call the web team, let's escalate to the web team; no, it's not them; all right, let's get everyone on a SWAT call. Now we're going in towards midnight and we're all on a call, and I'm saying, all right, you go over and check those logs, I'll go over and check here. That's our typical SWAT call, and it takes hours to get to the resolution. Whereas as a tier 2 support person, maybe even tier 1, I can
come here very quickly and know I need to route this ticket to the database team to look at, and I've given them additional insight. So I'll go down one more tier. Is it the database environment, the server? No, CPU looks good, memory looks good, but the queue is really high. Why is that? Well, here's another example of adding very valuable data from IT operations that's outside of your normal log files, and that is your change and configuration data. Splunk is very good at taking in this data and mashing it up so that I can know at a glance what the configuration values were. Here they increased, and increased, but here this doesn't look right. So I know I need to go have a talk with Simon; I can tell the DBA that Simon's the one who caused this issue. And also, was there a change ticket? It's a simple query into, in this case, the Remedy ticketing system, but whatever your ticketing system is, if you have a change management environment, you basically match up on the product, the server, the application, and you're able to look and say, was there a change ticket? In this case there isn't one. So we can go down to the raw data and actually look at the search, so that next time this occurs, whenever there is no change ticket, I can alert on that. I'll just quickly show you where that is; I won't create the alert for the sake of time. So one more thing I want to show you: if I wanted to alert on this, I could say, all right, this search, when do you want to schedule it? Well, I'll run it once every hour looking for this. And next, I'll show you what can happen: I can send an email to an alias for an administration or DBA team, or I can run a script. I work with several customers that are actually creating tickets in their ticketing system, and it's simply a script that kicks off a web service. However it
integrates, we're able to go over and create that ticket. So now we have this monitoring, and this is beyond your basic-level monitoring because this search is considering multiple source types. I have yet to find many other monitoring tools that have this broad reach. Splunk is really good here: it can come along and complement your existing monitoring tools and fill the gaps you have, or, for other customers, they're retiring their legacy ones, because those are great "what" engines: they tell me what went bump in the night, but they don't tell me why. I know why right now: this was an unauthorized change by this person on this server. For the sake of time, before I hand this back over, I want to come right back to searching. Another way we could have come at this investigation is with the actual IP address. Let's say I do know the IP address of the server that's having the issue, and I simply put in the IP address; this would be like a level 2 support person. Oh, I see a spike here, 503 errors, not what I want, and that's right around the time window when they reported this issue. I can simply go ahead and drill in. I'm just looking at one event; I'm going to come all the way down so I'm sitting on the timeline right there, at one second. Now all I see is one record, but all I have to do is come back and put a wildcard on, and I look in that one second of time, or one minute, however your time span is. I see not only that event; I'm looking at other source types. I bring in my other source types from the database, the combined logs, and there is my issue. This happened because SQL shut down, and we're seeing that it could not write to it and there was an error writing to the file, and this error caused the degradation in service because it had to hold them in the queue. This is further expanding upon the why of what's going on, and we know the why because we added in the configuration data. Splunk
can take in configuration files, so I'm going to come over to another app called Splunk on Splunk. We can use Splunk to look at its own environment; a lot of customers use this, and it's a good example of how you can look at config files. I can simply pick all the config files out of Splunk and compare two files; I can look at this server and at another server. This type of interface is easy to set up; it's in the XML, I just basically add my menus and the searches I want to reference, and I can start. My configuration management team, from an IT operations perspective, has a valuable way of getting to that, almost like closed-loop change management: being able to look at configurations and changes as they occur and after they occur, and better manage those. I wanted to mention that, for the sake of time, to give you some additional insight. One last thing, and that is Splunkbase. Splunk is great about putting information at the fingertips of developers and users of Splunk. The documentation is great, take a look at that; there's a getting-started guide there. But I'm going to come over to Splunkbase. Splunkbase is a place where you can add apps. Maybe I have my Linux servers or my Windows servers and I need to start bringing that data in; as Priya mentioned, it's a great starting point. These apps are out here; you can download them in seconds, have them in, and start getting that data in. With that I want to hand this back over to Priya. Hopefully this gave you an idea of what's possible with Splunk. I'll hand it back to Priya, who'll wrap up, and we'll go to a Q&A session at the end. So Priya, want to take it back? And if you have any questions from the demo, please go ahead and put those in the chat and we'll take those in the Q&A once we wrap up. Thanks.

It has been handed to you. There we go. Sorry, I was actually talking. Yes, thank you very much, Paul. You know,
I thought that was an awesome demo, and I was just saying you could essentially see how ad hoc Splunk is in its capabilities, right? You start from search, then you move on to creating alerts and dashboard views and reports, but you can also use it from an analytics perspective, because now you can persist that data for however long you want. If you can throw cheap hardware at it, cheap memory at it, you're essentially going to be able to store that data in its raw format and use it for planning and analytics. As an example, one of our customers, a health care provider, has a federal mandate that they have to store log files about their customer interactions, so that if there's a claim five years down the line or so, they have the ability to go back up to five years in time and look through that information. They're using Splunk to search that data and then store that data for five years. That's how ad hoc Splunk can be, and that's how you can use that raw data however you want. So essentially, to deliver that operational intelligence, you need three primary capabilities. One, you should have the ability to get real-time visibility into your data, correlate transactions and events across these multiple IT sources, monitor them proactively, track them against the SLAs, and drill down into your data so that you can diagnose problems. The other piece is that you need powerful navigation capabilities so that you can actually get to that needle in the haystack, to be able to troubleshoot and identify the root cause very easily. As you saw with Paul's demo, you just enter simple terms like "error", and then
you can keep typing, or keep adding on to your search query, to get to that data very easily. And the last piece is to be able to analyze historical as well as live streaming data. You just saw a minute ago there was a report which allows you to see the historical trend versus your real-time trend, identify patterns, and prove compliance or identify security vulnerabilities. Just Splunk, with its core capabilities, can support these three main workloads and provide you extreme operational intelligence across your environment. So we have a few customer case studies. I'm sorry, before I go to the customer case studies, I do want to talk about our apps. Like we mentioned, our apps are skins on top of the data. We have over 320 apps, only two of which are paid for; everything else is free. They're pretty much plug and play for various technology tiers, and they've been built by the community, by partners, by Splunk, by a whole bunch of folks, and you can easily leverage them. From a customer story perspective, as an example, Ping Identity: they're a technology company that provides cloud identity security solutions. They had very cumbersome processes and limited visibility through their environment to identify whether they were compliant with SLAs. Some of their SLAs included things like having to recover from a failure within 30 minutes. Before Splunk, they were looking at independent systems monitoring solutions and siloed capabilities, and they were not able to get that visibility into their service very easily. With Splunk, they were able to achieve that much-desired cross-IT-tier visibility,
and among the most important benefits, they were able to save time and effort. Simple tasks like log extraction from hosts and VMs for system-level insights would take them hours, because they had to go to individual hosts and individual virtual machines and collect that data. With Splunk this time was reduced by 70%, and they gained tremendous efficiencies. Splunk also helped them consolidate monitoring tools and scale very easily for centralized monitoring. If you look at Vodafone, they had a problem where their service desk could not quickly respond to customer issues, and they were struggling with very time-intensive searches on exceptions or errors in their Java and J2EE infrastructure. What they wanted was to give their service desk the ability to quickly access the information and troubleshoot issues, or escalate accordingly, so that they could improve customer satisfaction levels, and they were able to deliver these rapid troubleshooting capabilities with Splunk. Cisco is a big user of Splunk. They're a worldwide leader in networking, and one of their internal teams found it too costly and time-consuming to track security incidents across their employee base. They were struggling with dozens of consoles for disparate devices, tools and security systems, with no easy way to correlate the data across these various tiers, and they wanted a centralized way to look at that data. With Splunk they were able to proactively assess threats and mitigate issues in terms of incidents, and they were also able to analyze their security architecture, identify whether they had the appropriate firewall rules in place and such, detect incidents, and
actually proactively manage the response as well. Why should you consider Splunk? Like I said, it can scale from desktop to enterprise, and it's a free download. You can quickly access it, identify the specific use cases you want to target, and then grow over time. You don't have to immediately buy a large volume of licenses to look through all of your data; you can very easily scale based on demand. We have thousands of users of our free software, and we have 5,200-plus paying customers in more than 90 countries, and more than half of the Fortune 100 are Splunk customers, across various verticals. We're a growing company, and we're a very transparent company: you will see that Splunkbase, Splunk questions and the Splunk user groups are very active, and customers, partners and Splunk folks participate very actively. Additionally, we also have our .conf in Vegas in September. It's a very technical conference, not a sales conference. We have enablement, and we have customers presenting on how they've used Splunk to solve specific use cases and so on. In fact, we mostly see existing customers attending. So Splunk essentially enables IT operational intelligence. I like to call it an end-to-end platform for machine data: machine data because it can collect any data from your environment; a platform because it gives you the ability to easily extend into your existing monitoring framework; and end-to-end because it can connect the dots very easily across your environment
to gain operational intelligence. Thank you, Priya. We do have a question: how long can I keep the data before it's aggregated within Splunk? Want to take that one? So, like I said, Splunk doesn't have a database in the back end, so you don't have to aggregate the data anytime soon. You can store it in its raw format if you so desire, but if you don't want to, we also have something called summary indexing, which can aggregate the data in whatever manner you choose. So it's really customizable, and it can scale very easily. For instance, one of our largest customers indexes 100 terabytes of data a day, which means they're searching across petabytes of data at rest. So it's not like you have to aggregate the data because you're constrained by a database. You can assign cheap memory and store it, but just because you're putting in cheap memory and storing it doesn't mean you're constrained in scale as well; you can report on that data very easily within Splunk. Thank you, Priya. The next question is: what is the impact on my production environment when I use Splunk for data collection? Great, so that's a very frequently asked question. The impact is actually extremely minimal. From a memory perspective it's negligible, and from a CPU perspective we have found that it's anywhere between one and three percent, depending upon the amount of data you're collecting and the kind of data you're collecting as well. Thank you. And our last question comes from Colleen: who do I contact if I have questions after I install Splunk? Good question. I'm going to take that one for you, if you go to the next slide. And by the way, as we wrap up, because we're coming up to the top of the hour,
if you have follow-up questions, there's a good contact email, and you can always get back to us; we'll be glad to talk to you. You can try Splunk out today. I really like this; this is how customers get their hands on it and see the value. They can use the free version and start getting insights out of their data. You can go out to our website and click on Try Splunk; it'll take you over to the download link, and you can go ahead and install it on your desktop. We say that it runs from the desktop to the enterprise, and that's where customers start out, solving issues there, and then they implement it. The other thing we offer, and we've done this for quite a few customers, is the QuickStart. We help you get started: let's say there's a certain use case you want to solve; we'll help you get it installed, get the first of your several log files in and extracted, and you can start working with it. So just send us an email and we're glad to set that meeting up and work with you. The other thing, if you go to the next slide, Priya: as we wrap up here, I want to thank everyone for taking time out of a busy day to join us, and just consider how Splunk can help you start solving the data challenges you have in IT operations. As a follow-up to this session, we're going to be hosting two lunch-and-learns, and these are the first two on the schedule. If you don't see your city there, let us know if you're interested, or whether a city is close by; we're picking a lot of the major cities. Nashville's the first one coming up, in May, here in a couple of weeks, and then Seattle in the first part of June. This is a chance to come and meet some of the Splunk experts who will be with us. We'll get a chance to talk about your data challenges, there are some additional demos, and there's a chance to have lunch too. So please join us for that, and if you have any questions whatsoever, don't hesitate to send us an
email, and thanks for joining today.
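As a recap of the first demo step (translating product IDs through a CSV lookup and then counting the top products), here is a minimal Python sketch of the same idea. It is not Splunk's search language or API; the product IDs, product names, host names and the helper names `load_lookup` and `top_products` are all made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical lookup table (product_id -> product_name), standing in for
# the nightly CSV file mentioned in the demo.
LOOKUP_CSV = """product_id,product_name
P-100,Laptop
P-200,Monitor
"""

# Hypothetical order events; in the demo these come from indexed machine data.
EVENTS = [
    {"host": "web01", "product_id": "P-100"},
    {"host": "web02", "product_id": "P-100"},
    {"host": "web01", "product_id": "P-200"},
    {"host": "web03", "product_id": "P-100"},
    {"host": "web02", "product_id": "P-200"},
    {"host": "web01", "product_id": "P-999"},  # no lookup entry on purpose
]


def load_lookup(text):
    """Read the CSV text into a dict mapping product_id to product_name."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["product_id"]: row["product_name"] for row in reader}


def top_products(events, lookup, limit=10):
    """Count events per product and return (name, count, percent) tuples.

    Product IDs without a lookup entry are reported under the raw ID.
    """
    counts = Counter(
        lookup.get(event["product_id"], event["product_id"])
        for event in events
        if "product_id" in event
    )
    total = sum(counts.values())
    return [
        (name, count, round(100.0 * count / total, 1))
        for name, count in counts.most_common(limit)
    ]


if __name__ == "__main__":
    for name, count, percent in top_products(EVENTS, load_lookup(LOOKUP_CSV)):
        print(f"{name}: {count} ({percent}%)")
```

The count-and-percent output is the kind of result that would feed the pie chart described in the demo; unknown IDs fall through as raw codes, which is what an analyst would see before the lookup knowledge is added.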
Info
Channel: EffectTechConnect
Views: 34,148
Keywords: Bigdata, Splunk, IT Operations, ITSM, Application Management
Id: pPWmvJTZxLY
Length: 57min 10sec (3430 seconds)
Published: Thu May 09 2013