Performance Testing in Agile and DevOps by Gopal Brugalette

Captions
So, as I said, I'll be talking about how to do performance engineering in Agile and DevOps. I work for a company called SoFi, or Social Finance, and we are an online financial services company doing all the sorts of things that you would expect of banks.

A little bit more about me: I've been doing performance engineering for quite some time, in e-commerce, financial, SaaS, and consulting. But before I got into IT I was a nuclear physicist. I did experiments in Japan detecting solar neutrinos, as well as accelerator experiments over at CERN and in the United States, looking for quark-gluon plasma and trying to understand more about the Big Bang. I also live on a farm; that's me on my tractor. I grow a lot of fruits and vegetables and can them to make jams and preserves and pickles, and sell them at farmers' markets and to my office mates. I also have a woodworking business; that's one of the puzzles I make. I make decor items, furniture, and toys, and sell those online and at craft fairs as well. So I stay pretty busy.

The traditional approach to performance engineering, or performance testing, in the standard waterfall way is that you do all your development and testing, then maybe you do some performance testing, and you go into production. Or, more likely, you do everything, you put it into production, and you realize, "Oh, I've got a performance problem; maybe I should do some testing." That doesn't work very well, for a number of different reasons. There are a lot of technical challenges, but fundamentally, in this process we are trying to test performance into the application: we're just doing everything, then doing performance at the end and hoping that we are performant. There are different ways to think about it.

As I said, there are a lot of technical challenges in this type of approach to performance testing. Performance environments necessarily need to be very large-scale, so they're very hard and expensive to maintain, and just getting code
into there is an ongoing challenge. The data volumes that you need to deal with are quite large: again, expensive, hard to generate, and hard to manage. And of course, in any type of automated testing you have scripts that you have to constantly maintain and update, and that's just a never-ending ordeal. All of these things mean that you spend a lot of time trying to do your performance testing, and all of these challenges bring into question the reliability and accuracy of your test results.

So we need to start thinking about performance differently. Normally people will just categorize performance as one of those non-functional requirements, right? The very word itself is strange: "non-functional" literally means "does not work." Essentially, what that ends up meaning is that nobody works on it and you ignore it, and that really doesn't work very well for the customer. What you really need to do is start thinking about performance as a feature, because your customers and your users certainly do. Not a lot of customers will give you a five-star rating in your app store because you're really, really fast, but a lot of customers will give you a one-star rating because you're really, really slow. And they're not hesitant to go onto social media, your Facebook page, Twitter, and complain about how slow your application or your website is. So clearly performance really does matter to customers. If you're in e-commerce, you can see it in this sort of canonical e-commerce conversion-rate graph: as your load times increase, your conversion rate decreases substantially. Again, this is the way people tell you that performance really matters to them.

As you start to think about performance as a feature, it really becomes clear that everyone has a responsibility for performance. It's not just performance engineers or QA
engineers. As a feature, it's important to your customers, so it's important for everyone to think about it. That means you really need to think about performance throughout the entire lifecycle of your product or your application. So what does that look like?

Now we're not just thinking about performance in testing; we're thinking about performance all the way at the beginning, even in the realm of business or marketing. Every company is generally trying to grow. That means increasing usage and increasing customers, so you've got to start thinking about that growth: what do your growth projections look like, and what does that mean for your scalability?

Marketing campaigns are especially tricky, and even recently I got dinged by a campaign. We decided to purchase the naming rights to an American football stadium, so that has a very large audience. The marketing team was very unclear that this was going to happen. We got about four hours' notice: "Oh hey, by the way, we're making this huge announcement. Millions of people in the United States are going to see it, and they're probably all going to go to our web page." So we did some testing, and we thought it went well. The marketing campaign launched, and our site went down. I'll talk more about exactly why it went down, but it turned out that not only did we get only a few hours' notice, they didn't even tell us about the entire campaign. It wasn't the announcement of the naming that brought us down; it was all the commercials that we put on television throughout the day, which we didn't even know were coming. So, definitely, we need to think about performance and scalability very, very early on.

Then, with this in mind, as you start planning out your products and planning out your development, be thinking about your SLAs. This is really important; again, don't let it come at the end. I had an experience where we were testing a page. This was our product page, a very important page, and the testing found that the load time of the page had increased by half a second. If you think about that curve I showed earlier, half a second is not very desirable, so we said, "Well, this can't go live into production." The product owner came and said, "I want this product live. I've got a date, I've got a target, I want it live." We said, "Well, it doesn't meet our SLA. You're not going live." He said, "Well, how bad is it?" I said, "It adds half a second to the page load time." He said, "Half a second doesn't sound that bad. I'm okay with that. I'm the product owner. Go live; I need to hit this target date." Again I said, "No, it's not within our SLA. You're not going live." And he challenged me and said, "Who are you? You're just a performance architect, you're just the performance team. I'm the product owner, I make the decisions. It's going live." So I showed him that graph and said, "You're right, I'm nobody. But here are our customers telling us what the SLA is, and they will not accept a slower page. You're the product owner; if you want to put your page in half a second slower and cost us millions of dollars, go right ahead." So obviously he didn't go live. We fixed it, and it went live in a week or so. But you need to be having this conversation about SLAs early and be aware of that.

Another common question or challenge is, "There's no way you can do performance in agile, right? Performance testing has to come at the end, and agile is really fast. Performance testing by nature is really slow. Therefore you can't do agile performance." Well, you can; obviously that's not true. There are many ways to integrate performance engineering into the agile process. I was working
with a team that was just starting agile, and they said, "We're just starting agile. We barely know what we're doing, let alone think about performance testing. Gopal, you're just a dreamer, you're just an architect. This is real life. Come on, we can't do performance testing in agile." I said, "Well, let's start small." In agile, the definition of done is really, really important, so I said, "Let's include performance testing in your agile definition of done." They said, "Again, that's too complicated. We can't performance test everything." I said, "No, we're not talking about performance testing; we're talking about performance engineering. Let's just start by making sure that for every single story you have, you think about whether there is a performance impact." That's a major, major step. So they agreed: "Okay, we can do that. We can think about whether a story has a performance impact." We were using Jira, so I worked with the Jira admins and we added a little checkbox to every story: thought about performance impact, or not. When there was a performance impact, we went ahead and created stories in Jira to handle all the performance engineering activities.

This has a lot of benefits, because it integrates performance into the sprint. It makes it part of the sprint planning, the sprint grooming, the sprint retrospective. You just bring it in, instead of making it something that comes afterward that the team hands off. Integrate it into everything. And again, performance engineering is not just about testing. When you get into production you need to have a lot of monitoring. You can create the stories for that monitoring right in your sprint and make them part of your work.

Design: now, this is very, very important.
Think about scalability. Think about things like horizontal scaling and implementing load balancing. Think about caching. Caching is something very, very easy, but I'm always surprised by how often people hammer their database to get static data, or serve static JPEG images off their web servers instead of implementing a CDN. That, by the way, is how our website failed when we did that stadium naming announcement. There was an oversight: our CDN wasn't configured correctly and we weren't caching any assets, so everything went back to origin and brought our web servers down.

Think about testability in your design as well. Testing, again, should not come as an afterthought. I really think that one of the attributes of software should be that it is testable, easy to test. Especially when you're doing automation testing or performance testing, managing data is very hard and getting the scripts written is very hard. There's no reason you can't make your code easy to interface or interact with your automation tools or your performance testing tools. I'll talk a little bit about testing in production. There are a lot of challenges in testing in production, but if you actually design your code to allow testing in production, to be able to identify test transactions and handle them appropriately, it becomes very, very easy. But these are things that are hard to do at the end; they're much easier to do at the beginning.

As you build, observability is very, very important. You want to be able to collect useful information out of your applications while they're running. I'm always surprised that simple things, like a good naming convention, can help. I've worked on sites where basically every single URL and every single page has essentially the same name,
or every service call has the same name, so you can never tell exactly what the users are doing. It just looks like they do the same thing over and over again. And of course, when you're trying to understand your user workflows, get some metrics, or troubleshoot, it's very hard when your logs just show the users doing the same thing every time.

Resiliency is something that isn't thought about very often. I know there are some talks about chaos engineering coming up tomorrow. There's a lot you can do to better handle failures, and it can be simple things. I worked at a company where they just left the default timeouts, five minutes I think, on the web servers. So whenever anything slowed down, every application's threads started to wait, it became thread-starved, and the whole thing would lock up. A funny story there: I was really pushing, "You've got to set good timeouts, nothing longer than 30 seconds, but as short as possible." We deployed a page, and immediately upon deployment that page got 30 seconds slower. It took us a while to figure out, because it was exactly plus 30 seconds. We finally realized that the developer, instead of putting in a 30-second timeout, had put a 30-second wait into the code. So at least it was an easy fix. Just think about how you can handle failures.

Eventually, since you're doing agile, you're going to want to do CI/CD testing. And again the question is, how can you do performance testing in CI/CD? Performance tests are big, they're long, they're slow. Well, think about a different type of performance test, one that's very fast and focused on just a few key metrics. It's not a scalability test; it's just a performance test where you're establishing baselines for your code by running a few transactions against it. Do not use averages. Averages are very bad, really just always bad, for performance. But we can,
if there are questions around that, talk about it. You want to use something like a 50th percentile; it's much more accurate. In addition to thinking about response times, you can also look at other metrics like counts: how many calls to third-party services are you making? How many calls to the database are you making? Changes in these call counts can have huge impacts on performance and scalability across your entire architecture, and they're very easy to catch if you're looking for them. You can do this in CI/CD: catch the big issues, and let smaller issues through that you can handle quickly.

When you're trying to test very fast and do this analysis, it's very helpful if you can build a framework to make it easy to do this testing. There are a number of open-source and off-the-shelf or commercial products out there. JMeter and Gatling are the big open-source tools. You can use something like a Git pipeline or Jenkins to orchestrate, and integrate with your APM tools like Dynatrace or Datadog, or some of the other ones; they're all out there. Then have a way to analyze the results very quickly; you can use tools like Elastic or Splunk to help with that. Then do your CI/CD testing, your small-scale testing.

There is still a place for laboratory, large perf-environment testing, but it's changing. The attitude now is: don't try to test everything. The reality is that 99.9 percent or more, at least three nines, of your check-ins, of your code commits, are not going to have any performance impact, so don't try to test them. It's just a waste of time and money. What you do want to test in this way is your major architectural changes: you're changing your framework, you're changing your database, you're changing a database driver, you're introducing
a new service or a new application. Or you're trying to test for your unique events. In the United States we have Black Friday and Cyber Monday, or you might have big sale announcements, or big registration events where all your customers or users are coming in, making choices or having to register. These are the types of things that are challenging to prepare for if you can't test in an experimental way beforehand.

I'm also a very big fan of testing in production. It's really great because your test environment is perfect; it's a hundred percent accurate. In most cases you have free test scripts and test users: your customers. They kind of pay you to test. And there are a lot of different ways you can do it safely. You may be familiar with A/B or blue-green deployments, or a gradual rollout where you redirect 1% or 5% of your users, see how it goes, then ramp up to 10%, 50%, and eventually a hundred percent, gathering metrics and information all the time.

It's also a great place to do synthetic load testing, again because your test environment is as perfect as you could want. But as I said, it's important to design your code for that. I was at an e-commerce website, and we wanted to test for Cyber Monday, which was our biggest sale day of the year. So we actually put code in that could recognize a test transaction and route it appropriately, so that we never actually shipped items out anywhere and it never went to our inventory management system. When I first said, "Hey, let's do performance testing in production," management thought I was crazy. They just looked at me like, "You're crazy. We might just fire you right here and now for being unreliable." But over very, very careful testing we got to the point where we were running Cyber Monday load, which is our peak load
of the year, twice a week in production without any issues. And we actually found a lot of issues that we couldn't have found otherwise, because we were literally testing in production.

Real browser testing is very hard. I think there was a talk earlier today about how to do browser testing. It's very hard in any case, so you can do something called real user monitoring, or real user testing, where you collect browser performance metrics from all your customers or users right in production. It can be very effective if you can react fast enough.

Of course, you will never be able to test everything, nor should you, but you really need to monitor everything. This goes beyond testing. In production you need to monitor what's important, and that's the customer experience. It means going beyond CPU and memory utilization monitoring and thinking about what your customers are seeing, how that is affecting their behavior, and what that means to the business. So, metrics like: how many orders are being placed? How many reports are your users completing? How many accounts are they opening? You won't see things like this from CPU and memory. I've seen a lot of failures where a system is down and you just can't do anything, and CPU is at 5% utilization. It looks perfect, if you're only looking at CPU. So make sure you monitor what your customers are doing. Obviously you need dashboards to do that.

And here's the very simple formula for creating alerts. Look at your production, look at what your peak level is for your CPU or memory or something, and set your alert threshold just a little bit above that; you should be okay. Enable your alert. Then go into your email and set up a rule to delete all the alerts that you're going to get. So maybe some of you
have had this experience. Setting alerts is more of an art than a science, but there are some really interesting machine learning applications now which can help you dynamically manage those alerts. If you haven't checked those out, a lot of the tool vendors are putting them in; it's definitely a good thing to look at.

So I've talked about how to do performance engineering in an agile environment. I want to talk a little bit about how to do performance engineering in an agile way. One of the agile themes is individuals and interactions over processes and tools. Typically, when teams implement agile, they think agile is this: you have to do Scrum, you have to use all this terminology, you have to do project planning, you have to do this and that, and if you do all these things you're doing agile. But don't be an agile zombie. There are different ways to approach agile.

So when should you do Scrum? Now, this is not for development; I'm only talking about performance engineering. Scrum is applicable when your performance engineering team is strongly aligned with a development team that's doing Scrum. If you're an enterprise-wide engineering team this may not be the case, but if you can align with your development teams, go ahead and do it, because then you get very predictable, planned work, and all these Scrum-type approaches really work well. If your company or your team is new to agile, Scrum is also a little bit more friendly. But there are still a lot of ways you can go wrong with Scrum.

I've found a lot of success in using Kanban. Why? Because a lot of the time, as an enterprise performance engineering team, you're not aligned with development teams. Currently I work with about 50 different development teams, and they're
all on their own schedules, so my workload is very unpredictable. If I were to have sprints that start and stop on specific dates and had to go through all that planning process, it wouldn't work very well. So now work can just come in, and based on the priorities we can start working on it or push it into our backlog. We don't really need a lot of planning or to go through all those Scrum ceremonies, because we always do the same work. With performance engineering, we meet with the team, we get the work, we come up with an approach, we update the scripts, we execute the tests, we analyze the results, and then we're done. We don't need finger voting and T-shirt sizing. They have their place, but think about whether you really need to go through all of that.

It's very important to manage your backlog, especially in Scrum. I always think of the backlog as a black hole: you put stuff in there, it never comes out, and it just gets bigger and bigger. So don't use your backlog as a dumping ground, and be very careful about managing your WIP, your work in progress.

One way to manage your WIP is another, different technique. How many of you are familiar with a hackathon? I did my first hackathon a few years ago, and I was really blown away by how the team was able to come together, really focus, and accomplish a lot of work. So I thought to myself, what if I did a hackathon not just once a quarter or once a year, but almost all the time, or a few times a week? I started calling them perf-a-thons, and we started doing just that: the team would come together, take a problem, work on it intensely, and get it done. That's actually a recognized work method. It's called swarming or mobbing, and it's completely different from what a lot of people are used to. Instead of
divide and conquer, it's combine and conquer. This is really great, too, when you have a geographically diverse team. Currently I have a team spread out across North America as well as Europe, so it's a way for us all to come together, work together, combine our varied expertise and background levels, and get things done very quickly, as well as train and share knowledge. It takes a little bit of practice, so give it a chance if you want to try it. It seems sort of counterintuitive, but it can really work when applied appropriately. And again, think of it like a hackathon, which is a very familiar approach.

A little bit about putting together a performance engineering team; this is what I've settled on. You have an architect who provides the technical leadership and vision, does some of the TPM, or technical project management, work, and engages across the enterprise. Of course I always have one, because that's me, so I can't really be on a team without that happening, but I think that role is important. You've got the performance engineers, and data analysis is now a key skill for performance engineering, as well as the other things. It's really great if you have a manager to handle all the HR stuff and the politics, because that's really no fun for the technical folks. And if you're managing a test environment or developing tools and frameworks, having a developer-type role can be very useful.

I haven't seen a lot of success in bringing developers into performance engineering teams long term. I've done it a few times. They get bored, they start coding crazy stuff, and then they leave and we throw it away. So I haven't seen that work very well. And then QA: performance engineering is not like test. It's
not the same as functional testing. So it can be challenging for people who really like QA testing to transfer over into performance engineering, because the primary skill is the results analysis, not actually the testing.

You're all here because you're interested in performance, but I recognize that not everyone in your companies, when you go back, is going to be so interested. Part of that is their prioritized responsibilities: they've got to do all this other stuff, and then you go back and say, "You've got to think about performance," and they're like, "I don't know." So if you want to be able to influence people at your company, it's really about educating and enabling them in performance. A key way to do that is to keep score. Present interesting metrics; present reports that are relevant to the teams and relevant to the business. When teams don't even know what their performance is, they're not going to be very interested. But as soon as they know, it's in front of everyone, and it's a data-based decision.

But make sure you target the right teams. Performance doesn't matter equally for every team, every application, every product, or every service. If you have a web page that just displays your company's locations and hours, or your store locations and hours, maybe that's not so important for performance, whereas a home page, a product page, or a login page is. If you focus on the right teams, you will get much better traction and much better results.

When you start focusing on them, you really have to get in there and get hands-on. You can't just stand off and be an ivory-tower, theoretical center of excellence. You really have to be able to do work; that's what gets things going. So start by writing their scripts, running their tests for them, showing them how to analyze the results, and
then eventually train them how to do it. Train them on the tools, and then hand off to them. Just telling people, "Oh, you should do performance," I haven't seen a lot of effect from. But saying, "You should do performance, and we're going to get you started, write all your scripts, and execute your tests," works very well.

When you have successes, or when you have failures, write them up. This gives you a good medium for sharing a couple of things. One is that when you're working with one team, you need to spread all your findings and lessons learned to the other teams, as well as advertise your services: what you can do, why performance is important, and what you've accomplished. It's also important because there's always a lot of turnover in most companies and in IT. I constantly think, "Okay, I've worked with every team now, I'm very successful, everyone knows about performance and they're really excited about it." Well, new people come in, and they've never even heard of performance or performance engineering. So this is a good way to embed it into the culture: have some cultural assets that you can indoctrinate new people with.

So if you take just a few takeaways from this talk: think of performance as a feature, and think about performance across your entire lifecycle. Thank you.

[Applause]
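
As an illustration of the fast CI/CD check described in the talk (a few transactions, a percentile rather than an average, plus call counts), here is a minimal Python sketch. The names `percentile`, `Baseline`, and `check_build`, the 20% tolerance, and the sample numbers are assumptions made for illustration; they do not come from the talk or from any particular tool.

```python
import math
from dataclasses import dataclass


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


@dataclass
class Baseline:
    """Hypothetical baseline recorded from a previous known-good build."""
    p50_ms: float
    db_calls: int
    third_party_calls: int


def check_build(timings_ms, db_calls, third_party_calls, baseline, tolerance=0.20):
    """Return a list of failure messages; an empty list means the build passes."""
    failures = []
    p50 = percentile(timings_ms, 50)
    # Response time: compare the median, not the mean, against the baseline.
    if p50 > baseline.p50_ms * (1 + tolerance):
        failures.append(f"p50 {p50} ms exceeds baseline {baseline.p50_ms} ms")
    # Call counts: any increase is flagged, since extra calls multiply under load.
    if db_calls > baseline.db_calls:
        failures.append(f"database calls rose from {baseline.db_calls} to {db_calls}")
    if third_party_calls > baseline.third_party_calls:
        failures.append(
            f"third-party calls rose from {baseline.third_party_calls} to {third_party_calls}"
        )
    return failures


if __name__ == "__main__":
    baseline = Baseline(p50_ms=100.0, db_calls=3, third_party_calls=1)
    # One slow outlier drags the mean to 1089 ms, while the median stays at 110 ms.
    timings = [100, 105, 110, 130, 5000]
    print(check_build(timings, db_calls=5, third_party_calls=1, baseline=baseline))
```

With these sample numbers the median stays within tolerance, so only the increase in database calls is reported, which is the kind of change the talk suggests catching early.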
Info
Channel: DATA MINER
Views: 1,887
Id: LH7Gz7eiy3Y
Length: 33min 10sec (1990 seconds)
Published: Tue Oct 29 2019