Flaky Test Management with Cypress

Captions
Hi everyone, welcome to Flaky Test Management with Cypress. I'm Cecilia Martinez. We'll get started in just a minute while a few more people join. We'll be using Slido today for our question-and-answer session: go to sli.do and the event code is #flaky. I'll pop that information into the chat as well. Feel free to introduce yourself in the chat and let us know where you're from; just make sure you're sending to "panelists and attendees" if you're using the chat function in Zoom. It looks like we have quite a few people from all over today, and the chat is busy.

So let's get started. Again, my name is Cecilia Martinez, I'm a technical account manager at Cypress, and you can find me on Twitter at @ceciliacreates. I spend my days talking to Cypress users about their test suites, helping them get up and running with the Cypress Test Runner and the Cypress Dashboard, and helping them overcome technical challenges. I'm joined today by Mike. Did you want to introduce yourself?

Yeah, Mike Cataldo. I'm a DX engineer, that's developer experience, at Cypress, working with you all on being successful and working with the other internal teams at Cypress to make the product better. I live in Utah, and I'm glad to be here.

Great. I forgot to mention that I'm in Atlanta, Georgia, which is where our Cypress headquarters are. Again, I'm sharing the slides in the chat, and we'll be using sli.do for questions and polls today. We're going to have a couple of polls, so make sure you get logged into Slido if you'd like to participate.

The agenda today: we're going to talk about flake, all about flake, and I'm really excited about this because it's one of the topics that comes up most often when I'm engaging with our users. We'll dive into what flake actually is and how we define it, some of the causes of flake in your test suite, and how you can manage it. Then we'll have some time at the end for a Q&A with myself and Mike, so if you have any questions, please use Slido to ask them. If you have any technical issues, you can't hear me, or my microphone goes out, drop that in the chat; we have a few moderators as well. And if you'd like to access these slides, they're public at cypress.slides.com/cecilia (the deck is called "Flaky Test Management"), so feel free to follow along or reference them going forward as a resource.

So let's dive right in with: what is flake? Starting with a dictionary-style definition, we define a flaky test as one that passes and fails across multiple retry attempts without any code changes. For example, a test is executed and it fails; then it's executed again, nothing has changed, no test code has changed, the environment is the same, but this time it passes. All of a sudden we have a test that's considered flaky, because we don't know for sure whether it's going to pass or fail deterministically.

Let's take a look at an example. The Cypress Real World App is our flagship demonstration app for Cypress use cases and best practices.
It's built by our DX team and it's a full-stack React application: a Venmo-style payment application that you run locally and that lets users send transactions back and forth, comment on them, and like them, very similar to Venmo or Cash App. It's open source, we make all of our Dashboard runs for the Real World App public, and you can access it at github.com/cypress-io/cypress-realworld-app. One of the things our DX team has done is add a branch with examples of flaky tests, and that's the example we're going to walk through today, taken directly from that branch in the GitHub repository.

Since this is a payment-style application, the test case for our flaky test is: User A likes a transaction of User B, and User B gets a notification that User A liked the transaction. I'm going to zero in on just a portion of this test to demonstrate. In order to force a flaky test for the demonstration, we've added a delay in the code for the likes API: whenever a user likes a transaction, there's a random delay before the API responds and actually processes that like. Sometimes it's half a second, sometimes it's five seconds, it could be any number. That's how we're creating a flaky scenario, and it's a very real-life situation too; a lot of the time we're not really sure how long an API call might take to respond.

Our test looks like this: a cy.contains grabbing the likes-count selector, then we get the like button and click it, and we assert that a successful like should disable the button and increment the number of likes. So if the default Cypress timeout is four seconds and we have an arbitrary delay on our API, do we know whether this test is going to pass or fail? If we run it, the answer is: it depends. If the delay takes less than four seconds, the test passes. If we rerun the test and the delay takes eight or ten seconds, it's going to fail, because we won't see the like registered in time.

So what happens when you have a flaky test like this, a test where you're not sure if it's going to pass or fail and it has nothing to do with the actual application? Flake can impact your test suite in a few different ways. It can cause longer deployment processes, because you have to rerun tests or restart CI builds when flake causes a failure in order to determine whether it's actually an issue you need to address before releasing. It can cause a longer debugging process: if you get a failure and you have flake in your suite, I hear this a lot, people say "we have to investigate whether it's a real failure or not," or whether it's just the machine being down or something else going on. And it can cause reduced confidence in your test suite. The main reason we test is to have confidence that our application is going to work the way it needs to for our users, so if you have flake in your suite, it raises two questions: do failures actually represent regressions and bugs, or, on the other side, is flake hiding underlying issues in my application or test suite that I might not be aware of? These are all ways flake can impact your test suite and hurt its overall health.
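To make the like-button scenario described above concrete, here is a rough sketch of what such a test might look like. The route, the data-test selectors, and the expected count are assumptions for illustration; the Real World App's actual flaky-examples branch differs in its details.

```js
// Rough sketch of the like-button test described above. Selectors, route,
// and setup are assumptions; this is not the Real World App's exact test.
it('likes a transaction and increments the count', () => {
  cy.visit('/transaction/some-transaction-id') // assumed route

  cy.get('[data-test=like-button]').click()

  // With the default 4-second command timeout, these assertions only pass
  // if the (artificially delayed) likes API responds quickly enough.
  cy.get('[data-test=like-button]').should('be.disabled')
  cy.contains('[data-test=likes-count]', '1')
})
```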
So let's go ahead and start with our first poll. Again, we're at sli.do and the event code is "flaky." The first question is: how big of a problem is flake in your test suite? I'll give a little bit of time here for people to answer.

While we're waiting for the poll results, I wanted to make a comment on confidence. When you have flaky tests and your test runs are part of your CI process, and you have all these test failures and you're not sure whether they're flake or not, I think a lot of times you get used to that constant failure and eventually become numb to it, to the point where you don't know what to react to anymore. So when there's a real failure, you don't do anything about it, because you just chalk it up to "oh, it's just flake, nothing's wrong." That's something I've experienced and talked to other people about, and it's a huge issue.

Yeah, and based on these poll responses it's a problem for a lot of the people here, and I'm seeing comments in the chat relating to that experience of giving up when flake is a persistent problem. For almost 90 percent of the people here it's at least a moderate problem, and for almost ten percent it's a huge blocker. For the two percent who are flake-free: feel free to drop your tips in the chat and tell us how you did it, because I'm sure that would be very helpful; and hopefully you still have a few things to learn today.

So let's move forward and talk about some causes of flake. I typically group causes of flake into three categories, which I've identified by talking to our users over and over again about their test suites and the kinds of flake they experience. I'll go ahead and stop the poll there.

The first type of flake we see is DOM-related flake. This occurs when there's inconsistency in how, or how quickly, elements are rendered to the DOM. It can be the result of a couple of different things, which I'll get to in just a second, but these are some of the errors you might see with DOM-related flake: a cy.click fails because the element is disabled, when it probably shouldn't be, because it doesn't become enabled within the timeout some of the time; "expected to find element, but never found it," which typically indicates the element hasn't been rendered to the DOM yet; and errors about tests failing because an element is detached from the DOM, which you may have seen before as well.
Some examples of DOM-related flake: an element loads within the cy.get timeout sometimes and other times it doesn't, which was the example we looked at earlier. That could be because of rendering instability or because of the API, but ultimately the element isn't always rendering within those four seconds. Or, like the first error we talked about, the test goes to get an element and when it tries to click it, it's disabled; maybe another process is happening in state management and it hasn't yet enabled that button within the time frame. Another one that's common enough to call out: the DOM re-renders between a cy.get and an action command. This is where you'll see the "detached from the DOM" error. There are a lot of re-renders happening in the application; Cypress grabs the element, but by the time it goes to click it, the page has re-rendered and that element no longer exists. We see this a lot with dropdowns, especially dropdowns that have search fields. For example, if you type into an input field but the application updates its state with every keypress event, the element may re-render and detach from the DOM while it's still processing that keypress. These are all examples of how DOM-related instability can cause flake.

Then there's network-related flake. This occurs when a network request responds inconsistently. It can be an internal API, your own back end, or a third party; if you have a serverless or microservices architecture, or you're hitting an integration endpoint for something like Stripe or your login provider, you may experience flake because sometimes the request responds in time and sometimes it doesn't. This is where you may see errors about too many elements being found, because the response that would have removed them didn't come back in time, or a timeout reached while retrying a wait: cy.wait has a default timeout of five seconds, and if a network request doesn't respond within that period you can get a failure even though it's not your actual application at fault, it's the staging API or a third-party API that's inconsistent.

Some examples: a slow API response results in a DOM element loading outside the default timeout, which was the example we used earlier, where we forced an inconsistent delay. A slow response from a third-party login provider can cause tests to fail, especially if it's the first thing that happens or it happens in a hook of some sort, which can cause a cascading issue where all the tests fail. And a microservices endpoint, which I see more and more: if you have a Lambda that you only hit every once in a while, it may have a cold-start delay, so it fails the first time because it takes a while to respond initially, but on subsequent attempts it passes because it's warmed up and ready to receive requests.
Finally, there's environment-related flake. This is flake that's specific to the way you're running your tests. It can happen, for example, if you have inaccessible or inaccurate environment variables in your CI provider, or if you're running tests across different-sized machines: sometimes the application runs slower and causes failures, other times it runs fine, and other times it runs faster than you expect and that causes problems too. You could have a failed dependency install, where something goes wrong and the binary just doesn't install properly. And last but not least, inconsistent data management across the environments you run tests against: if you're sometimes hitting a staging environment, sometimes QA, sometimes dev, sometimes local, and the user data you're trying to access is only available in certain environments, you can have flake. None of this is related to your application, and none of it is related to how the tests are written; it's only a result of how you're running the tests and the infrastructure you're running them on.

I'm seeing a lot of questions in the chat, so just a reminder: we're using Slido, that's sli.do, and "flaky" is the event code. Thank you, Olivia, for posting that.

Now that we've taken a look at the types of flake, let's go to the next poll: which type of flake do you most experience? The options are DOM-related flake, network-related flake, environment-related flake, not sure, or a combination of all three; I talk to people who experience all of these across the board. Anything you wanted to add, Mike, on the different types you've seen out in the wild?

I would probably say DOM-related flake is the top one for me, but network-related flake is something a lot of people struggle with and have a hard time getting around. You can put in waits and things like that to try to circumvent it, but Cypress gives you full control over the network, over stubbing, and over waiting on specific requests; maybe that's why, for you Cypress users, network-related flake is ranked so low, because Cypress does a great job of eliminating it.

Yeah, and I like what you said about how sometimes there's just no way around it, because that's a trend with the people I talk to as well. With environment-related flake, sometimes you just don't have any options: you're tied to a certain CI provider, a certain number of machines, or certain resource constraints. A lot of times there are also issues with the underlying application: if you have a slow staging API or a slow QA API, that may be something you can't really get around. Flake, whether it's DOM-related or network-related, is typically an indicator of underlying causes, and sometimes you can't fix the underlying cause, so what we talk about are ways to get around it, ways to approach it from the test-writing perspective.
It looks like almost half of you say DOM-related flake is the most common for you, which tracks with what we were just talking about, and almost a quarter are experiencing a combination of all three. So let's take a look at how to tackle some of these. I'll turn off the poll, and it looks like we have quite a few questions as well, so make sure you get those in; we'll be doing a Q&A at the end.

All right, let's talk about managing flake, starting with DOM-related flake, because it was one of the most common answers and it's one of the most common ones we see. It also tends to be one of the harder ones to diagnose, especially if it's the result of underlying issues in the application and you're not able to make changes to the source code, if you're stuck with the application you're given. There are a couple of ways you can approach this, and the first thing I want to do is talk a little bit about how query command retry-ability works in Cypress.

Here's a very simple test written with the Cypress syntax. It's a to-do application, and we're adding two items: we get the new-todo input field, type "todo A" and hit enter, type "todo B" and hit enter, and then we assert that there should be two list items in the to-do list. You've probably seen this test if you've ever done the Cypress testing workshop repository or spent time in our docs; we use this example quite a bit.

Let's focus on this section: cy.get on the to-do list items, asserting they should have a length of two. A couple of things can happen. If the assertion that follows the cy.get command passes, the command finishes successfully: we get the to-do list items, there are two of them, both of the to-dos we added are on the page, and the test passes. If the assertion that follows the cy.get command fails, then cy.get re-queries the DOM again and again until the assertion passes. This is one of the really powerful things about Cypress that makes it such a good tool for combating flake.

Think back to our original test: we type "todo A," enter, "todo B," enter. I mentioned that if your application processes each keypress and updates state, it may take some time to actually see the re-render in the DOM, especially if the application is running slowly or you're in a slow environment. If Cypress grabbed that to-do list while only one item was on it and then proceeded with the test, the list might have a length of zero or one and the assertion would always fail. Instead, Cypress continues to re-query the DOM, getting that element over and over, until the assertion passes or the default timeout of four seconds is reached. Just by understanding how retry-ability works at the command level, you can see how it reduces a lot of DOM instability. In the GIF playing on this slide, you can see that for a period of time the assertion is actually failing ("expected to have length of 2, but got 0"), but because of the built-in retry-ability Cypress keeps re-querying the DOM for up to four seconds, and ultimately the assertion passes. That would have been a failing test, but retry-ability saved it.
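For reference, the to-do test being described looks roughly like this; the selectors are assumed from a standard TodoMVC-style app like the one used in the Cypress docs and workshop.

```js
it('adds two todos', () => {
  cy.visit('/') // assumes the to-do app is served at the configured baseUrl

  // Type two items into the new-todo input.
  cy.get('.new-todo')
    .type('todo A{enter}')
    .type('todo B{enter}')

  // cy.get re-queries the DOM until this assertion passes,
  // or the default 4-second command timeout is reached.
  cy.get('.todo-list li').should('have.length', 2)
})
```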
So how can we use this information to write tests that are more resistant to DOM-related flake? One thing to keep in mind is that only the last query command is retried, so we recommend using a single query command instead of chaining. You may see something like cy.get('h1').contains(text): in that situation, if there's an assertion after it, only the .contains is retried. Same with cy.get(selector).its(...): only the .its gets retried; the initial .get will not be retried with that kind of chaining. So if you get an h1 and then the DOM refreshes and updates, you're always holding that stale initial h1 and the assertion will fail. Instead, you can use cy.contains, which lets you pass both a selector and the text, or you can use cy.get with a very specific selector or a regular expression, so you don't have to chain queries. If you write cy.contains(selector, text), the entire query retries: Cypress keeps querying the DOM until it finds an element that both matches the selector and contains the text.

The other thing you can do is alternate commands and assertions. For example, take cy.get(selector).should('have.length', 3).parent(): the .get retries until the assertion passes, and only then do we go get the parent, so we know we have the correct element before proceeding. If we grabbed the selector while it only had a length of two, we'd know it wasn't in the right state yet; and if we chain a .should after the .parent(), that assertion is tied to the parent, so only the parent query retries. By alternating commands and assertions this way, you ensure you have the correct element in the DOM before proceeding with the next command in the test.

This is a general best practice, but you can also apply it specifically in the areas where you're seeing flake in your test suite. We have an entire series of blog posts about flake, but essentially you have to think about: what state does the DOM need to be in before I can proceed with this action? And think about it like the user. If a user is filling in a username, a password, and then a submit button, and halfway through the password the page freezes and they notice the field hasn't filled in yet, they're not going to hit submit until they see that everything has loaded. So you could put an assertion on those input fields, checking that the value is what was typed, before you proceed. That ensures everything has settled the way it's supposed to before the test moves on. Write your tests the same way a user would interact with your application.
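Here is a short sketch of the two patterns just described; the selectors and text are placeholders rather than anything from the slides.

```js
// Chained queries: if the DOM re-renders, only the last query is retried.
cy.get('h1').contains('Dashboard')        // only .contains retries

// Prefer a single query so the whole thing retries together.
cy.contains('h1', 'Dashboard')            // selector + text retried as one

// Alternating commands and assertions: assert on the state you need
// before moving on, so each step starts from a known-good element.
cy.get('[data-test=nav-item]')
  .should('have.length', 3)               // .get retries until this passes
  .parent()
  .should('be.visible')                   // this assertion ties to .parent()
```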
As I mentioned, we have an entire series of blog posts related to flake at cypress.io/blog/tag/flake. I also want to point out a really great webinar we did a while back about using code smells to fix flaky tests in Cypress. As I said, a lot of the time DOM-related instability is the result of your underlying application, so you can take those flaky tests and learn from them; Josh Justice and Gleb walk through causes of DOM instability with specific examples and show how to address them. I definitely recommend checking that out, as well as the blog posts, if you want additional examples of how to wait for the right state before moving forward.

All right, let's talk about network-related flake. As I mentioned, network-related flake is when you have an API, or any kind of network request, that responds inconsistently. There are a couple of test-writing practices we'll cover, and both of them leverage cy.intercept. With cy.intercept you can spy on and stub all network requests; if you've been using Cypress for a while you may know it as cy.route, which was the previous iteration. cy.intercept lets you intercept all network traffic and is very powerful in how you can handle both requests and responses. There's a lot you can do with cy.intercept; I'm going to talk specifically about how to leverage it for network-related flake.

The first thing you can do is wait for a long network request before proceeding. If you know, for example, that a certain API response involves a really complex query or just takes a long time, you can intercept that request, essentially spy on it, and set up a wait. In the code example on the slide, we set up an intercept for GET requests to our API endpoint and save it under the alias getAccount. Then we visit our page, and then we tell Cypress to wait for that request to go through, and instead of the usual five seconds, to wait up to 30 seconds, because we know it may take a while. It won't wait the full 30 seconds if it doesn't have to; it waits up to 30 seconds. This is powerful because it says, for certain routes you know may be slow, take as much time as you need, or at least more time (maybe not three minutes; at that point, get a new API). It ensures the request has completed before proceeding, because the next command, getting the selector and asserting it has the account name, will time out after the default four seconds.

You may be wondering why we increase the timeout on the network request wait rather than on the DOM selection. Pinging the DOM is more expensive from a resources standpoint. If you're waiting on a network request that could take 20 seconds, with a 30-second timeout, Cypress waits on the request for those 20 seconds and only then moves on to the DOM; whereas pinging the DOM over and over for 30 seconds is a much less efficient test run. By waiting for the network request first, you know everything has completed before you start querying the DOM.
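A sketch of the wait pattern being described; the endpoint, alias, selector, and text are assumptions standing in for the code on the slide.

```js
// Spy on a known-slow request and give the wait, not the DOM query,
// the extended timeout.
cy.intercept('GET', '/api/account').as('getAccount')

cy.visit('/account')

// Wait up to 30 seconds for the aliased request to complete,
// instead of the usual five seconds mentioned in the talk.
cy.wait('@getAccount', { timeout: 30000 })

// By now the data has arrived, so the default 4-second command timeout
// is plenty for the DOM assertion.
cy.get('[data-test=account-name]').should('have.text', 'Jane Doe')
```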
I've seen some very long timeouts, so I'm actually curious: pop into the chat what the longest timeout in your test suite is.

The other way you can use cy.intercept is to stub inconsistent network requests and control the response. In this case we're not just spying on the route, we're stubbing it: we're saying, no matter what happens in real life with this request, we're going to tell our application it was successful. If you normally go out to a third party to process payments, for example, you probably don't want to actually do that in a test environment, not only because you don't want to process real payments, but because it could be inconsistent, and the result isn't really about your application under test, it's about the third party. So you can say: whenever we hit this inconsistent URL, just send back a success response, tell the application everything is good to go, and you're essentially tricking your app into moving forward. If you have inconsistent URLs, or certain endpoints that are always problematic for you, you can stub the response and mock it out instead.

I also want to mention an additional approach I've seen that addresses the cold-start problem. It isn't covered here, but if you have services with a cold start, I've seen people use their CI pipeline to make a request to the API to warm it up before starting the tests. If you have an endpoint you know is going to need some time, it makes more sense to start it earlier in the CI process than in the middle of a test run, if you can.

I think the top answer in the chat so far is 300 seconds. Nice; I think mine was maybe 90 or 100. One situation I ran into was a test environment that wasn't being cleaned up. We would run tests that, say, create a user, and the user table in the database got into the thousands, so some tests would query for all of the users and it would just take forever, and I had to use this approach to get around it. But really that response wasn't pertinent to the test, so I probably could have just stubbed it out and responded with some fake data instead. Yeah, as someone just said in the chat, you should be cleaning up the data, not increasing the wait. And that's absolutely right; test data management is another one of the top topics I talk to people about.

Another thing to keep in mind about cy.intercept: you can pass through a fixture file. If you have static data, you can use the Cypress fixtures folder, keep a set of fixture files as JSON or however you like to organize them, and pass one through as the stubbed response. You can also use a callback function, so if you need to do something more specific, you can pull information from the request and response bodies and use that in the stub if you need to.
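A sketch of the stubbing options just mentioned; the payment endpoint, fixture name, and response shapes are made up for illustration.

```js
// Stub a flaky third-party endpoint with a static success response.
cy.intercept('POST', 'https://payments.example.com/charge', {
  statusCode: 200,
  body: { status: 'succeeded' },
}).as('charge')

// Stub with a fixture file from cypress/fixtures/users.json.
cy.intercept('GET', '/api/users*', { fixture: 'users.json' }).as('getUsers')

// Or use a callback to inspect the request and craft the reply.
cy.intercept('GET', '/api/account', (req) => {
  req.reply({ body: { name: 'Jane Doe', balance: 100 } })
})
```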
Again, there are a lot of different options in cy.intercept; I definitely recommend checking out the docs, there are plenty of examples, and the cy.intercept text on the slide links to the API page.

So we've talked about DOM-related instability and network instability; now I want to talk about some ways you can combat all types of flake. The first is test retries. Test retries is a Cypress Test Runner feature that will retry an individual test a specified number of times before failing it. You can configure test retries at the global level, which says: for every single test in the suite, retry it a certain number of times. You can also specify the configuration at the block level: retry only this test, or only this describe or context block, because we know it's specifically flaky. I'd recommend enabling test retries globally unless you have a very specific strategy where you only want to retry in certain areas.

To turn on test retries you do have to be on Cypress version 5.0 or higher. In your cypress.json configuration file, the absolute easiest way is to set "retries" to the number of times you want a test to retry: one, three, five; I've seen all kinds of numbers, every test suite is different. You can also configure it by mode, so in cypress.json you can set a different number of retries for cypress run versus cypress open.

When you're using the Cypress Dashboard, retries don't just retry the test and show in the output that it made multiple attempts and finally passed on, say, the third one; in the Dashboard you can also see the artifacts from the attempts. On the right-hand side of the slide there's a screenshot showing attempt one failed, attempt two failed, and we have the screenshots and the video of the entire run from those attempts.

I wanted to chime in real fast: head over to the documentation on configuration for more details, but you can also set a local override for retries. Say you're using Cypress on your local machine and you don't want retries, or you want more retries in open mode; you can set that locally, it will override what you have set for CI, and it's just something local that you don't commit to version control. That's something I've done that can be helpful.

That's a really great tip, and it also lets different developers have different preferences, since it isn't checked into source control, so everyone keeps their own setup. It looks like people in the chat are saying one, two, or three tend to be the most popular answers for how many retries they have turned on.
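For reference, the cypress.json setting being described looks roughly like this (the numbers are just examples). A single number applies to both modes, while the object form sets cypress run and cypress open separately.

```json
{
  "retries": {
    "runMode": 2,
    "openMode": 0
  }
}
```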
And again, this can be helpful for all types of flake. If you have DOM-related instability and the test misses by one second, maybe the second attempt is faster. If you have the cold-start API that takes ten seconds and times out the first time, the second time it's fine; test retries help with that. If something goes wrong pulling in an environment variable, or your staging environment is really slow and having issues, retries can help accommodate that too. It's a really powerful, blanket approach; I call it a horizontal approach because it affects all three categories of flake. And again, you do have to be on version 5.0 or higher.

When you have retries enabled and you're also using the Cypress Dashboard (the Cypress Test Runner is our free, open-source test runner; the Cypress Dashboard is our product that builds on top of it and records all of your tests to a centralized location where you can review them), the Dashboard identifies flaky tests and flake severity based on the number of times a test is retried. A test is considered flaky if it fails but a subsequent attempt passes. Test retries must be enabled in order to see flake detection in the Dashboard, but once they are, you can see artifacts from failed attempts on flaky runs (that's the image I just showed with the multiple attempts and their screenshots), and you also get flaky test analytics: the most common flaky tests in your suite, the most common errors, the historical flake rate, and the test definitions.

To give you an idea of what that looks like, let's come over to this tab. This is the flaky test view for the Cypress Real World App; as I mentioned, the Real World App is open source and we make all of our Dashboard runs public, so that's what we're looking at right now, and we've added some intentionally flaky tests to demonstrate. We can see that 20 percent of our test cases are flaky. Not great; again, we added some on purpose; but we can also see that the majority of them have a low severity. Severity is based on how often the test is flaky. Our highest-severity flaky test was flaky in 38 of the last 74 runs and was last flaky five days ago. If we click into that individual test case, we can see the runs where it was flaky and the most common errors: in the runs where it was flaky, we got the same assertion error 47 times, a timeout while retrying to find content. That gives you immediate insight: this test always fails with the same error. Sometimes you'll look at a test case and it has multiple types of common errors; for example, this other flaky test case has four different types of assertion errors, and that may indicate a different kind of problem, maybe something more related to the network or the environment rather than one specific flaky DOM element. You also get the flake rate, so you can see historical information, and you can see the test code.
The other thing you can do with this information is use the Dashboard's Jira integration to create a Jira ticket directly from a flaky test. This lets you do more than say "all right, it was flaky but it passed eventually, let's move on with our lives": you can actually quarantine and address these flaky tests before they cause more and more issues in your test suite and start to build up. And by getting a sense of the most common errors and having that bird's-eye view, you might spot trends and say, "hey, this API endpoint is causing a lot of problems, maybe we should dedicate some time to clearing out that technical debt." So that's a way you can use flake detection, together with test retries, on the Dashboard.

This is huge, because, and I don't know about you all, a lot of times I'll be working on a test and the test itself is flaky, but it's a whole different ball game when it's not actually your test that's flaky: you're running the whole suite locally and it's a totally different test, and maybe you don't want to take the time to go address another test while you're working on something else. So this is great, because it lets you see that flakiness, have it recorded in the history, and create a task so you're accountable for taking care of it at a later time, so you don't have to constantly play whack-a-mole with your flaky tests.

That's the perfect way to describe it, playing whack-a-mole; definitely something to avoid, save it for the carnival. I also want to talk about some of the other features we have around flake management. With the Dashboard you also get GitHub and Slack flake alerts: we'll post a comment back to the GitHub pull request specifying which tests were flaky, with a link directly to the flaky test, and the same with our Slack integration, we'll post back to a Slack channel and say "hey, you've got some flake." You could even have a dedicated flaky-test channel that you point the Slack alerts at.

I also want to talk about a new feature we have coming out called test burn-in. This is going to be a new flake-management feature, in addition to flake detection, that identifies whether new tests are flaky before they're introduced into your test suite, before they become regular tests. It will recognize that a test is new ("this is a brand-new test case, we haven't seen it before") and then automatically retry it a configurable number of times: five times, ten times, however many times you'd like, to check for flake. Because again, flake is when a test fails but then passes, and if you run a test once and it passes, you don't necessarily know it isn't flaky until you've run it multiple times and start to see failures. So burn-in lets you battle-test your tests as soon as you write them and introduce them to the suite, so flake is easier to fix early, before it becomes an issue, before the test starts failing sometimes and popping up in your flaky test analytics.
It lets you know right off the bat: we've run this test 15 times and it passed all 15 times; this is not a flaky test, and in the future, if it fails, it's a real failure, pay attention to it. So again, it's something that will help increase overall confidence in your test suite.

I'm excited about that, because there have been times when I run through my tests just hoping they pass, and this is one of those things that really gives you that confidence. You don't have to assume "okay, it ran through, it's fine, everything's good, we're good to go"; it's going to really battle-test it, so that, like we were saying, you address it sooner rather than later.

Absolutely, and it looks like a couple of people in the chat are excited for it as well, so keep an eye out for that.

Okay, let's take a look at our next poll; I just need to turn it on here. The next poll is: which of the flake strategies we talked about will be most helpful for your test suite? You can choose multiple options; I'm just trying to get a sense of what was helpful for you and what you're looking to introduce to your test suite. Right after this we'll kick off the Q&A, so this is your last chance to get questions in at sli.do, event code "flaky." And just so you know, we will also be sharing the link to these slides; I did see a comment about the screenshots, thank you for pointing that out, we'll keep it in mind for the future, but if you want to visit the slides, feel free to view any of those images there as well.

Nice, Mike checked every box, I love that. Okay, so it looks like alternating commands and assertions is a popular one; absolutely a best practice, I recommend getting into the habit of doing that, it can solve a lot of issues with DOM-related flake specifically. About half of you are interested in using cy.intercept for network calls, whether that's intercepting and waiting for them or stubbing them; quite a few people are also interested in test retries, and around 30 percent in flake analytics and test burn-in. So it looks like test burn-in is going to be popular, Mike, once it releases.

All right, awesome. I'll close this poll out and we'll answer some questions. Again, I'm Cecilia Martinez, technical account manager at Cypress, joined by Mike, a DX engineer on our DX team. Let's see what questions you all have. In Slido you can also upvote questions, so if you see one that pertains to you, please give it a thumbs-up so we know which questions people are most interested in getting answered.

I can take this first one. It says: "I have flakiness five percent of the time because of a slow response from an outside API we're using for part of our app. How do you deal with such cases?" It's at the top of the list, and it is a common problem. What I would recommend, if you can, is stubbing that. I'm trying to think of a good example of an external service.
Typically I see payment services as a big one, using PayPal or Stripe integrations, and also things like maps: if you're hitting an external API to get Google Maps data, or anything that isn't actually built into the application, I've seen issues with those as well.

Right, so in the case of a payment, if the most important thing before you can continue is the status of that payment, then rather than waiting for PayPal to respond, you can use cy.intercept to return a fake response that includes that status. You can even stub a bad status and test that path as well, and then you don't have to worry about any of the gymnastics involved in trying to get it to fail; you have full control, even though it isn't your third-party API. That's what I would suggest.

Yeah, absolutely, I agree; that's what I'd recommend in most cases. Let's talk about the next one, about flakiness in the before and after hooks. This is a really good one for me to point out, because if something fails in a beforeEach hook, the remaining tests won't run, and that's by design: if something fails in a beforeEach hook, it's going to fail before each test. What you want to understand is why you're experiencing flakiness in that hook and what you're using the hook for. A lot of times I see people use hooks to set up test data, maybe hitting an API endpoint to set up state for the test. Mike, I don't know if you have any specific approaches to this, but typically it's about understanding what's causing the flakiness in the hook, making sure you're using hooks correctly, and leveraging them the way they're intended to be used.

Yeah, I agree. Hopefully we're answering this, but if you don't want the whole suite to fail as a result... actually, I think having the whole suite fail is an advantage, because you don't want to continue on and have something else break because a prerequisite hasn't been met. So if the thing that's failing isn't really a prerequisite, you might want to put it in the test itself. That's my best answer.

Yeah, I agree: if you have a failure in a beforeEach hook, it makes sense that it stops the entire run. There are certain things you can do to tap into events, like handling a test failure when it's a certain error type, things like that, but in most cases I'd say think about what the use case is and whether it belongs in the hook or in the test.

I always err on the side of putting things in a hook if I feel like they're not really part of the test, like I was saying, a prerequisite. That really helps delineate between the test and what's required for setup, and it also takes advantage of how hooks work and how they fail the whole suite.
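A sketch of the kind of data-setup hook being discussed, kept to true prerequisites; the seeding endpoint and payload here are hypothetical, not part of any Cypress API or the Real World App.

```js
beforeEach(() => {
  // Prerequisite for every test in this suite: seed known test data
  // through the API instead of the UI. The endpoint is an assumption.
  cy.request('POST', '/api/test-data/seed', { scenario: 'default-user' })
    .its('status')
    .should('eq', 200) // fail fast, and loudly, if the setup itself is broken
})
```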
Just a reminder: you can upvote questions so they pop to the top if you see ones you like, and make sure you're using Slido for questions rather than the chat.

This next one: a test fails because an element won't load the first time. In my opinion, test retries would handle that situation; also, if it's not loading the first time, that may be an application issue, so I'd definitely want to check that out, but automatic test retries would address that specific symptom.

Then, how to handle flakiness in iframes. Do you want to talk about our iframe roadmap, Mike? I know we added some additional support there to the Test Runner roadmap.

I'm actually not sure.

Okay. As far as flakiness in iframes goes, again, retries will help if you have an issue on the first attempt. What I was hinting at is that we've added full iframe support to the roadmap for the Cypress Test Runner, so keep an eye out for the issue: on our roadmap we show both the GitHub issue we're tracking and whether a pull request has been submitted, so you can always check that for the latest updates on new features.

And if there's something specific, feel free to reach out to us; we'll put information on how to get in touch on the last slide.

The next question: can we increase the timeout with every increment in retries? Wow, I haven't seen that one before. It sounds like what you're asking is: if it's on retry number two, then increase the timeout to, say, 60 seconds. I know you can leverage the Module API with Cypress to count the number of attempts, so you could probably code something where, if it's attempt two, you reset the default timeout value in the Cypress config to 60 instead of 30, but it would require some customization. There's probably something in the global context that states which retry you're on, but I'd have to think about that one; it's an interesting question.

I actually like that, and if there's enough demand for it we might be able to come up with something everyone can take advantage of, but yeah, I'll have to think about that one too. Somewhat related, in case you don't know: you can actually increase retries at a per-suite or per-test level, just by adding another argument; I believe it's the second argument to an it or a describe. You add retries in an object and set it to whatever you want, and it applies only to that test or suite.

Nice, thanks.
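For reference, the per-suite and per-test retries configuration Mike mentions looks roughly like this; the test names and values are just examples.

```js
// Applies only to this suite: retry up to 2 times in cypress run, never in open mode.
describe('notifications', { retries: { runMode: 2, openMode: 0 } }, () => {
  // Applies only to this test: retry up to 3 times regardless of mode.
  it('shows a like notification', { retries: 3 }, () => {
    // ...
  })
})
```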
Looking at some of the other questions: how do you manage inherently flaky tests that depend on third parties without sacrificing confidence by mocking? If a test is inherently flaky because it depends on a third-party API, think about the purpose of what you're testing. If you're testing your UI, and you want to make sure your UI doesn't have regressions because you pushed code, does that UI test really need to exercise the third-party API? A full end-to-end test, or contract testing where you specifically make requests to the third party to verify it, may be better places for that than your UI tests, because the purpose of the UI test is to test your UI. Mike, I don't know how you feel about that, whether you agree with that approach.

I have an idea. I've always erred on the side of pure end-to-end, in that I wouldn't mock anything out, but that was based on my own circumstances and the teams I've been on in the past. One approach you could take is to have two versions of the test: one that mocks, which you run most of the time, and another that actually hits the real back end, which gives you a little bit of both. Because I can understand how, if you're mocking a third-party service and that service changes but you don't update your mock, you're going to have problems when you go live. So if you also have a true end-to-end test that hits that back-end service, or a test that ensures your contract with that third party is right, that you're making the right assumptions about the third-party service, that could be the way to go.

That's great, yeah, thank you. We're coming up toward the end of the hour, so I want to answer one more question; it's another one that got a lot of upvotes, and it's a bit different from the pure flakiness questions: "My team is experiencing increased flakiness the more we scale our test code. What are the likely causes?" I have some ideas, but I'll let you go first, Mike.

I don't know what that scale looks like, but maybe the flakiness is increasing because the spec sizes are increasing, or it could be happening as you scale because of environment issues, like what I experienced before. Those are the only guesses I have.

Those are honestly the top two that I see. The first one you said: spec files becoming really long. How long is too long is a little arbitrary, because it depends on what you're trying to run, but the more it blocks and tests you have in a single spec file, the more you may start to see those tests run slower over time, which usually means performance issues; the application may run slower, it may start crashing halfway through, and that's going to cause flake. To that point, you can also have infrastructure issues on the machines you're testing on: as the test suite scales, the resources on the machine may not be enough to run your application, the tests, and Cypress, and you might need to bump those up to prevent crashing or flakiness there.
The other thing I've seen is that it depends on the kinds of abstractions you're using in your test code. If you're using custom commands, utility functions, or selectors that are getting harder and harder to maintain, or if you have things hard-coded into your tests while a lot of changes are happening on the code side, that can break things. It's worth taking a big-picture look and asking: how are tests organized, how are we splitting up our spec files, how are we leveraging abstractions, and how are we picking up what I like to call the low-hanging fruit, the things you're doing in the UI in your tests that you don't actually need to do. Logging in is a big one: if you're logging in via the UI for every single test, that's inefficient, and the more tests you have, the more resources it uses. If you're only logging in to set up state for your test, you can do that programmatically, either through the API or through your front-end state management tool if you use one (a rough sketch of the programmatic approach follows at the end of these captions). In the Cypress Real World App we have three login commands, login by UI, login by API, and login by XState, and we use them throughout the test suite so we can be efficient where we can. That speeds up the test run and prevents performance issues over time as the suite gets bigger.

Awesome. So we're out of time. Thank you all so much for the great questions, the great comments, and so much participation today. We'll be looking at all the questions saved in Slido, and I'll be answering some of them on Twitter. If you'd like to follow us, I'm @ceciliacreates on Twitter, and Mike is @mccateldo; that's two c's, keep that in mind. Also feel free to reach out to us: hello@cypress.io is our general email address, but we're also on Twitter and Discord, and you can get both of those links in the documentation as well. I definitely recommend Discord; reach out to us there and stay active in the community. We're here for you.

Yeah, and Cypress is what it is because of the community, because of all this great participation and our open-source contributors, so please keep giving us feedback and questions; we're here to help.

Oh, people are asking for the Discord link; let me see if I can grab that quickly. Matt got it for us, thanks Matt. Awesome. Well, thank you so much everyone for your time, and have a great day. See you later.
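As promised above, here is a minimal sketch of the kind of programmatic login command described near the end of the Q&A. The endpoint, response shape, and storage key are assumptions; the Real World App's actual loginByApi and loginByXstate commands differ.

```js
// cypress/support/commands.js
Cypress.Commands.add('loginByApi', (username, password) => {
  // Hypothetical auth endpoint; adjust to whatever your back end exposes.
  cy.request('POST', '/api/login', { username, password }).then(({ body }) => {
    // Persist whatever the app expects (token, session, etc.); an assumption here.
    window.localStorage.setItem('authToken', body.token)
  })
})

// In a spec: log in as a precondition without driving the UI form.
beforeEach(() => {
  cy.loginByApi('test-user', 's3cret')
})
```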
Info
Channel: Cypress.io
Views: 4,025
Keywords: Flaky Test, Test Management, Testing, Cypress, Cypress.io, Test Automation
Id: AhgkBjOF5Ts
Length: 62min 32sec (3752 seconds)
Published: Thu May 20 2021