Entity Framework Community Standup - Performance Tuning an EF Core App

[Music] Hello, welcome to the EF Core community standup. Hopefully we are live here; we've had a few technical issues in the last few minutes. So if we're not live, guys... okay, we're not live, because it says live on my thing. Forget that, it didn't happen, and we'll start again. Right. [Music]

Welcome to the EF community standup. We have a perf-focused session today. Jon P Smith is with us, author of Entity Framework Core in Action, and he's going to show us performance tuning of an application using EF Core, which is great. And then we have Jeremy, myself and Shay from the EF team here, as per normal. Since we've got quite a lot to cover today and we might run over, let's get started and do the state of the unicorn. I can hide that.

So what's going on right now? Hopefully everybody on this call is aware that EF Core 6.0 Preview 1 is live on NuGet now. It has about five thousand downloads as of this morning. We haven't had any bad feedback, and there's no real reason not to use it: it's the best version of EF Core there has ever been, because it's got all the latest code in. So get the daily build now, or get Preview 1, and start using it. Also, 5.0.3 is out there with patches, so if you're using 5.0, make sure you upgrade to 5.0.3 to get the latest bug fixes. So that's EF Core 6.0 Preview 1.

What is the team currently working on? Andriy is currently working on metadata refactoring for compiled models. If you look at our metadata infrastructure, it's basically set up to allow different access to different kinds of model at different times. When you're running the application you get a read-only version of the model, not mutable, but when you're building the model it's mutable. That's all been done with interfaces and extension methods, and it's being refactored, because currently you get the same model at runtime as you do when you're building. With the compiled model the actual implementation will be
different under the surface, and so Andriy is doing a bunch of refactoring to make that happen. If you're interested, you can look at a bunch of PRs from Andriy.

Maurycy is currently working on temporal tables prototyping. We have the design discussion on the issue there, so go read that if you want to figure out what we're planning to do. Maurycy is now making sure that we can execute those queries, that they do what we expect, and that it all works. So that's pretty exciting.

Perf: we're working on perf. In fact, Shay is working on perf, and there's this great issue, which is linked from the EF Core 6.0 plan. This is basically the issue that Shay has for improved performance on TechEmpower, and down here you can see we have the pull requests and the improvements they made. Do you want to quickly go through this, Shay?

Sure, just a few words. Perf is obviously very complex. This specifically is about improving EF Core runtime performance. So this specific thing is not about generating better SQL, for example, which could also improve EF Core performance and is also very important; that's covered elsewhere. This is really about reducing the overhead of what happens when EF Core sends queries for you. It's also very specifically about what is called the TechEmpower Fortunes benchmark. This is an industry-standard scenario which is used to measure basic web performance with a database backing it, and so on. It's something that a lot of people look at, and it's important for us because this specific kind of scenario is also a common user scenario, and that's an area where we don't want EF Core to add a lot of overhead. So our goal, let's say, which we're not sure we're going to be able to reach, is to have the same kind of performance as Dapper, which is a micro-ORM where you specify raw SQL. It's
a very, very thin kind of thing compared to EF Core, which is a very different kind of beast. So it's a very ambitious target, but if you're interested in the kind of improvements, take a look at this issue and you can look at the specific PRs. This is basically about making EF Core just run faster for everybody doing queries with EF Core, especially non-tracking queries.

Excellent. And you can see here that we started with 69,600 requests per second; Dapper was at 93 thousand. Even with the low-hanging fruit, I guess you would call it, that we've done so far, we're up to 80,000, so we're already creeping up on that performance. Whether we'll get there, Shay and I were talking about that this morning, but we've got lots of ideas still to go, so it's looking good. Okay, let's go back to here. So that's perf.

Doc of the week. We have a doc of the week this time. We're not going to spend a lot of time on it, but this is performance guidance for EF Core. We had an EF6 white paper; this is essentially the same kind of guidance for EF Core on how to do performance. Here it is. Shay, do you want to just say a few words, since this is basically your doc?

Sure. You can see this is a top-level section in the EF docs; if you look to the left, it's very easy to find. There's a whole lot of content here. It's split into different sections. I think the most important thing is to first read the introduction and the diagnosis. The diagnosis is about how you know whether you have a problem, and what you do in order to even pinpoint it. So I know something is running slow, but how do I know which query is problematic, for example? That's going to give a lot of tips on how to do that: via logging, via tracing, via SSMS if you're using SQL Server, via which kind of tools. Once you know you have some sort of problem,
then you can go into the efficient querying. That, by the way, is an execution plan, so this is a good way of knowing whether a specific query is maybe problematic and maybe you're missing indexes, for example; that's exactly how you're going to find out. Then, once you've pinpointed a problematic query, there's a whole lot of tips. If we could go into efficient querying super quickly; again, I'm not going to go through it, but you can see some of this is quite basic for people who have done database performance and who know how databases operate. But it's surprising how many issues we see from people who don't apply a lot of the basic things, because it's very easy to miss them. It's not because they don't know; it's that you have to think about it. So there's a whole lot of very important, you could look at them as tips and tricks, or pitfalls to avoid, and all that kind of stuff. Look through it. It's supposed to be very accessible. This is not a hardcore, under-the-hood kind of thing; it's really stuff that you as a user can apply very directly in order to make your application work more efficiently.

There's another, way thinner section on efficient updating, because there's generally a bit less to know, but still there are some interesting things. There's one more for modeling, and another one for advanced performance topics. I'm not going to go into it; just take a look at it, everybody. Anybody who's interested in perf, I really recommend it. There's also a whole lot for people who are interested in how EF Core works internally. Perf is always the best way in, in a way: in order to do perf well, you also have to understand exactly how things are working. That's the only way you're going to understand why things are slow or not slow. So this also gives quite a bit of insight into how EF Core actually does stuff
with your queries: for example, what the different phases of a query are. You compile it, then you cache the outputs of that compilation, and then you execute it. Perf is all around that kind of stuff. For people interested in the under-the-hood side, I think the perf documentation is a pretty good place to get that. I think that's enough.

Absolutely, thanks, Shay. So that's the state of the unicorn. Let's go and look at our community links now. I have them up here, and the first one is what we just looked at, so if you didn't catch the link to that, you can find it from the community links.

The second one is a post by Khalid, I believe at JetBrains. Let me... yes, I agree to that. This is a really good overview of things to look out for when you're using EF Core: potential pitfalls, things that have changed, things that you might not think about. I'm not going to go into all the details here, but there's a lot of great information. Coming from outside the team, it has a slightly different perspective than what we would write, but there's some really good stuff here, so definitely go check that out.

The next one we have on our list is Hot Chocolate and Strawberry Shake for GraphQL. I'm going to let Jeremy talk to that.

Sure. This is a podcast episode that you would listen to, but we've been doing a lot of investigation of GraphQL, and especially how it fits into the .NET ecosystem. You may have seen some of my ad hoc Twitter polls. Hot Chocolate is one of the solutions for .NET; it provides a server and a client, and this is an in-depth talk about how they relate to .NET and GraphQL, the problems they're looking to solve, and things like how they are trying to maintain parity with the schemas for GraphQL, etc. So definitely a good listen, and we'll be posting and sharing more
GraphQL content moving forward.

Absolutely, good stuff. So this is a CODE Magazine article by Julie Lerman, and there's a lot of good stuff in here. It's mostly about looking at and debugging metadata, change tracking and queries, I suppose. So there's looking at query strings, how you can see what query is generated, and we've gone over that before in the community standup. Logging details: the things that we log and how we do that. Responding to events: we've got various events here, so there's some good information on that. Metrics with event counters: some good information on using event counters there. Interception. And then also debug views, which is one of my favorite features, mainly because I use it myself so much when I'm debugging issues. So if you don't know about debug views, go read this, because they're super useful. Good stuff from Julie there. Oops, that's the wrong thing.

I just wanted to jump in and add: we haven't posted this out there yet, but she will be joining us on our next community standup. We are opening an ask-me-anything, so it should be great, and we'll share more details about that after the show.

Yeah, I'm looking forward to that. Finally on the list, we have the main program, which is Jon P Smith and his performance tuning for EF Core, which he's going to go over with us today. If you want to find the link to it, it's right there on our community links. And then, as always, we have the link to former EF Core community standups here, so go back and look for the ones you've missed. And I think with that we can pass over to Jon, and let's start on the real content here.

Okay. So yeah, I've put this up at the beginning: this link here will take you to an article that I've written that goes with this video, because there's a lot to cover, and I want to make sure I get things out. I've just finished writing the second edition of Entity
Framework Core in Action for Manning, and I have three chapters about performance tuning, about 90 pages. I go through a series of work to take an application and make it better and better and better, or, well, actually giving it more to do and seeing how it goes. So that's what I'm going to go through. I'm a kind of open source guy, so you don't have to buy my book. I've got links; all the data is there. If you want to buy my book, that's fine, you don't have to, but if you want to, you can get 40% off by going to my site and getting a coupon.

So we'll start off. In chapter 15 I build a book app, and I've got data from Manning, so it's got 700 pages, uh, 700 books in it. This is trying to be an e-commerce site selling books, and you can do things like sort by publication date, and tags, C#. So there you are. And this is old data; I would imagine Jon Skeet would have been doing this version. And you can do things like add a review, etc. Let's just turn that off. Sort by votes: this is my book, and if I click on it you get more details, which was on the Manning book, and you can see it's very similar. It's a different layout, but it's got the same information there. So we're working with real data.

What we want to see is how it performs. What I do is look first at the HTTP result, how long it takes to come to the screen, because that's what the users see. And I've instrumented this: if I repeat it, I'm pressing F5 to just keep doing it, I have a thing here which should catch the timings, and you can see here it's taking about 52 milliseconds to put that up. That's great, that's really quite fast. Obviously I'm running it locally. But how did I get a book query that was fast? That's the question.

So firstly I'll show you... oh yeah. This is not the way to do it. So you
could, and I insist this is not the way to do it, but I want to show you could do this. This would get all the data that's needed, but what you're doing is loading all the author links and the authors and all the reviews and all the tags and everything. It's going to be very slow, because you're loading too much data, and you'd have to sort and filter in software: you'd have to load all 700 books and then do it in software. You do not want to do that. You might get away with it at 700, but...

So how would you do it? I've got five rules for making a good read-only query. The first bit is: I use a Select. Let's go and have a look at the code. These are called query objects: they take in an IQueryable and they push out an IQueryable. And you can see here I'm taking all the specific properties that I need, so I leave everything else behind. Forget all these comments on the side; that's for the book, for Manning. So that's the first thing. The second one was: don't use Includes. So on the Select one...

We often refer to that as a projection, right? Select is how you do a projection in LINQ, meaning you're projecting out only certain fields from the database, not the entire row.

Yep, very important.

And it's important to recognize that EF Core understands how to translate this and doesn't just bring back the entire row and then give you the bits you want. It only requests these things from the database, which is why this can be so good for perf.

Yeah, I use it all the time. The other thing is, I want to get a string of the authors, comma delimited. So rather than read all the stuff in, I just read the name of the author, in the right order, and I use string.Join here. And for tags I just get the tag ID, which is a string that has C# or whatever in it. So that's the second one.

Sorry, I interrupt again, but I think it's
interesting to look, and Shay can correct me if I'm wrong here. If you go back to that code that you were just showing, the string.Join there is, I believe, one of the places where EF Core 5.0, 3.0-plus, will do client evaluation. And that's fine, because it doesn't require any more data to be brought back from the database, so it's not inefficient to do it there, and yet it's very convenient to be able to do things like string.Join and create one string from the results. So that's just a little tidbit about how EF Core translates that.

Yeah. So the other one, which is more interesting, is: if possible, move calculations into the database. This gets a little bit more complicated, but it's well worth it if you can do it, and it makes a big difference in this application. If I say Reviews.Count, EF Core will say, I can do that in SQL, so it will produce the SQL to do that. When it comes to getting the average of the NumStars in the reviews, it can do that as well, but you have to remember that this is working on a database. If you run Average in software, using C#, and your collection has got nothing in it, you'll get an exception. If you do it on a database, you'll get a null. And to make this work you have to cast it to a nullable value, otherwise this will not work. Unless you've changed it, but I don't think you have.

No. This is a tricky area. Nullability differences between the database and the language, C#, are tricky at the best of times. But when you're talking about these aggregate functions, and what LINQ expects them to return and what the C# compiler expects them to return versus what the database can return, because, as Jon said, it's different, it's not clear at all what the best syntax or the best way to handle that in LINQ is. I think we've come to a reasonable place, where most of the time we deal with the nulls correctly, but
you're required to cast it to, for example, a nullable type if that's what you expect to get back.

Yeah, but it makes a big difference, so it's worth it. If you know that the database can do average or max or min or whatever, you've got to think there must be a way to do it. It took a while to do this, and it's a double on SQL Server, I think; I tried it on something else and it wasn't a double. So it's a tricky area, but it's going to make a big, big difference to what you're doing, so it's worth pushing through on that.

Absolutely. I'll also add, about the general point of pushing calculations into the database, there's a very strong, very powerful point here. It's not just about the fact that it's a calculation: if the count and the average are calculated in the database, that saves you from having to bring all those reviews and all that information from the database. If it were just a matter of adding two things, it doesn't really matter whether you do it at the client or at the server; it's not a big deal. But we're talking about aggregate calculations here. Aggregate calculations allow you to express the operation in the database over a whole lot of rows, and then you get just the result. This is a huge win, and it's always worth thinking about.

I really like this. And also, when you're looking at books, sorting on votes is going to be a pretty important thing, isn't it? That's what we do.

So that is absolutely important. Now, you might not have that; Count I've used loads of times on client stuff, so look at that. Next thing: add SQL indexes to any property you sort or filter on. I even got a soft delete, which is a boolean, and I thought, oh, that'll be fine. That was wrong: once there was a bigger amount of books, it really hit it. So you
really want to put them on. Shall I show you how I do that?

Yeah, such a classic. The boolean one is surprising to me as well, though; you wouldn't expect that to make so much difference. But I mean, you have to go through all the rows, right?

Yeah, when you think about it, exactly. I tell you, at half a million books, if you have that on, a count takes 250 milliseconds; if you don't, it takes over a second.

Wow, that's a big change.

That is really important, and I've changed my automatic stuff to add that. So, next thing: AsNoTracking. That's already been talked about. I think it's important. I will show you my code; I put it in there. My tests say sometimes it doesn't make any difference, but if you're loading relationships it can make a big difference, both in terms of making it quicker to load and also not filling up your DbContext with tracking stuff. Now, in this case I don't actually need it, because I map to a class called BookListDto (data transfer object; view model is another term). I don't really need it, but I still put it in, because maybe I'll come back and change something and put something in. So I always put it in, so that I know it's going to be right.

So I was going to talk a little bit about when to use AsNoTracking and when not to use it. I think there's an important thing that you say at the beginning of this, which is that you're basically doing a read-only query. Essentially, what that means in terms of a web application is that you're reading stuff from the database and then sending it to the client, and you're not doing updates to it immediately. That situation is what we call a disconnected scenario, because you are reading stuff and then sending it off to another tier, in this case the web client, where it's disconnected from
EF Core. In that case the correct thing to do is almost always to use AsNoTracking, because there's no point in tracking that. As Jon says, if it's just a pure projection, it wouldn't track it anyway, but from a conceptual standpoint there's no point in tracking it. Now, if you are doing database updates, where you have a single unit of work to do the update, then you definitely want to use tracking for the query that does the update. That may actually apply in a disconnected scenario when you're bringing data back to do an update, depending on how you do it; there are a lot of patterns that work there. But the important thing to understand is this: if you are doing a no-tracking query for, say, a perceived perf improvement, and you then take the entities returned from that no-tracking query and attach them or update them on the same context instance, so you're essentially doing a no-tracking query and then saying 'track these things', then that's wrong. We see a lot of people doing that, and I'm not entirely sure why in a lot of cases. It may be that one of the reasons is that they perceive no-tracking to be faster, which is true in a very general sense, but not if you're then going to turn around and track the entities, because the overhead of querying and then tracking is higher than the overhead of querying and tracking together, and it also loses fidelity. So use no-tracking when you're doing a read-only query and sending it to the client. Don't use no-tracking if you're then going to use that context to do updates to the database. Hopefully that made some sense.

Yeah, that's good. So we've looked at that. What I want to do now is take things up a bit. This is 700 books, and I've shown you a good thing, but I'm going to go up to a hundred thousand books and half a million reviews, and that will start to make things a little bit harder to work with. So here it's coming up.
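
The read-only query rules covered up to this point (project with Select, let the database do the counting and averaging, cast the average to a nullable type, index what you sort or filter on, and add AsNoTracking) can be put together in a short sketch. This is a minimal illustration, not the code shown on screen: the `Book`, `Review`, `BookSummary`, `BookContext` and `MapToSummary` names are hypothetical, and it assumes the Microsoft.EntityFrameworkCore package with a relational provider configured elsewhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity classes, for illustration only.
public class Review
{
    public int ReviewId { get; set; }
    public int NumStars { get; set; }
    public int BookId { get; set; }
}

public class Book
{
    public int BookId { get; set; }
    public string Title { get; set; }
    public DateTime PublishedOn { get; set; }
    public ICollection<Review> Reviews { get; set; } = new List<Review>();
}

// Hypothetical DTO holding only the columns a list page needs.
public class BookSummary
{
    public int BookId { get; set; }
    public string Title { get; set; }
    public int ReviewsCount { get; set; }
    public double? ReviewsAverageVotes { get; set; }
}

public class BookContext : DbContext
{
    public BookContext(DbContextOptions<BookContext> options) : base(options) { }

    public DbSet<Book> Books => Set<Book>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Index a property that the list page sorts or filters on.
        modelBuilder.Entity<Book>().HasIndex(b => b.PublishedOn);
    }
}

public static class BookQueries
{
    // Query object: takes an IQueryable in and hands an IQueryable back,
    // so sorting, filtering and paging can still be added before execution.
    public static IQueryable<BookSummary> MapToSummary(this IQueryable<Book> books)
    {
        return books
            .AsNoTracking() // read-only query: nothing is updated through this context
            .Select(b => new BookSummary
            {
                BookId = b.BookId,
                Title = b.Title,
                // Counted in the database, so the reviews themselves are not loaded.
                ReviewsCount = b.Reviews.Count,
                // Cast to a nullable type: a book with no reviews gives NULL in the database.
                ReviewsAverageVotes = b.Reviews
                    .Select(r => (double?)r.NumStars)
                    .Average()
            });
    }
}
```

Under these assumptions, a call such as `context.Books.MapToSummary().OrderByDescending(b => b.ReviewsAverageVotes).Take(10).ToList()` would leave the sorting and the aggregates to the database and bring back only the ten projected rows.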
So, a hundred thousand books, over half a million reviews, etc. I now have this, which is the original query that you saw, and I've got some others which I'll explain. The first thing I'm going to do is sort by votes. Oh, just to say, for these other books I've got a little thing up here called generate books, which will take the original 700 and just create new ones to work with. By the way, guys, this is really quick: it can put 50,000 books and a quarter of a million reviews into the database in one minute. I think that's pretty good.

That's good. There's more perf to be found there.

Yeah, it's worth saying there's already quite a bit of optimization. One of the differences between EF Core and the old EF6, by the way, is that updates are actually batched. So when you're inserting a lot of stuff, it's not as fast as using some low-level technique, let's say, but it's pretty fast already, because everything is batched; not necessarily in one go, but in big batches. When we're sending all those books, we don't send one book and then one book and then one book; we send a huge quantity of books and then another huge quantity of books, and that's where a lot of the perf comes from. There are still techniques that are faster than what EF Core does: SQL Server has a dedicated, very special SqlBulkCopy kind of thing, and so does Postgres, and those are still going to be faster for insane amounts of books, but EF Core still does a pretty good job. So that's one of the nice things EF Core actually does compared to EF6.

Yeah. We have a quick question here from the community that I want to look at, about no-tracking. I know we've moved on, but just quickly: is the statement redundant, and does it increase parsing time, if I add AsNoTracking when it's not needed? My opinion on this, because as we were talking about AsNoTracking I was actually thinking in my head: well, the query
pipeline actually does some things more efficiently when it knows from the beginning that it's a no-tracking query, and so it may actually be higher performing to use AsNoTracking even if you're doing a projection. I don't know that that's true, and I suspect it's negligible for most people, but my point is that the cost of adding AsNoTracking will also be negligible for most people. The LINQ parser has a lot more to do than that. So measure: if you think it's important, measure. But don't just assume, oh, we shouldn't add it because the parsing is going to be slower, because you don't know what's actually going on under the hood, and it could be faster with it added.

Okay, sorry, I'm going to quickly go through each one of these, explain them, and see their performance on votes, because sort by votes is the hardest thing to do on this. So I'm going to do the original query that we talked about, and I'm going to run it multiple times; I'm pressing F5. And if I go to this, it's taking, oh yeah, a long time, because there's half a million reviews it has to work through. So that's interesting, and for my e-commerce site that's bad; that's not a good thing.

My first step is to do a little change to it: I use user-defined functions. I think Paul Middleton, yes. This is a way that you can keep your LINQ, but if you know something you can improve, you can change a little bit of the SQL by adding a UDF. What I found was, I said, oh, is there a way in SQL where I could make a comma-delimited list like that? And I went to MSDN, the docs... no, I went to Stack Overflow, and I found something that will do that. So I created a function, a UDF, and I have a migration where I added this code manually. Don't be afraid to add your own code in here. So this will go into the database. I then tell EF Core about it, and the brilliant thing is, if I
go down here, I can show you. The authors-ordered and tags strings don't have all the LINQ they had before; each just has this definition of the UDF, and EF Core will then turn that into SQL. So let's have a look at what it did. I'll sort by votes, but... oh yeah, that's right. So here we are. It still takes a long time, but it's going to be a bit better. You can see there, that's the author UDF and that's the tag UDF. It's not going to do anything about working out the average, but it's going to make it a simpler SQL to run. And if I again get some timings...

I love the little helpers to show the logs, by the way. That's really cool.

Yeah, I was doing so much of this that I built it all in. You can see here that it was about eight hundred before and now it's down to seven hundred. So not great, but it was a very simple thing to do. Look at UDFs if you want to just tweak something: it doesn't take much effort and it can give you a good gain.

I think this is great, in the sense that it shows how you don't have to completely discard all of your LINQ just because you want some of it to be written as raw SQL. What Jon has done here is take snippets of the SQL, implement those, and map them to UDFs, user-defined functions, so that they can be used in your LINQ query. The rest of the LINQ query is still there; it's just using a function, and in LINQ it just looks like you're calling a C# function to do something. And obviously, because of the way it's mapped, EF Core knows to map that to the function in SQL Server. So that's a very good way of incrementally changing bits of your query for perf, without having to say, oh, I need to dump LINQ and go use FromSql and write the entire thing as raw SQL, which you may do in other situations, but it's not always necessary.

Yeah, I think it's a great thing. You know,
it only goes so far, but sometimes that's all you need. So, we've already talked about Dapper, and I've got Dapper, and I'll show you how I do it in a minute. I studied in detail the SQL that EF Core creates, and I found one thing that it didn't take into account. If I run this, I get this. This is getting very SQL-y, but this is what you've got to do. In a SQL query you can use a value from the SELECT in the ORDER BY, but you can't use it in a filter, a WHERE. EF Core doesn't take that into account (Smit knows about that; there's an issue up), but it gives me a chance to improve this, because I only have to do that once, whereas EF Core has to do it twice. So if you look at the timings... [Music] So now we're at 475. That's near, not quite, half the time; it was 850, say. So that's a really good thing to get. What I want to say, though, before you get too happy about that, is that I looked at other queries that didn't include votes, and Dapper was not really much quicker. So don't think, oh, EF Core is slow, I'm going to take that and put it in Dapper. If it's the same SQL, you're going to gain a few milliseconds, and when you're perf tuning you want fifty percent, at least double; you really want to pull it back.

I'm going to jump in with just one thing. Stepping back, there's something that a lot of people miss when we're talking about perf, and Dapper being faster than EF, and so on. In the normal case when you're programming, the overhead of anything like Dapper or EF Core is in most cases going to be negligible compared to everything else that's going on. So if you're actually talking to a database and sending a real query that causes your database to go read stuff from the disk, in a real
world scenario the overhead that ef core adds compared to any of these frameworks is likely going to be negligible now i'm not going to say it's always negligible and that's also why we're spending the time and effort in this release on reducing that overhead but don't assume that in a real world application and something that's not like a tailored benchmark that these things actually are going to matter to your final like business results so to speak that's a very important kind of fallacy that a lot of people sometimes think about absolutely can i add something to that too so i'd also like to add it comes back to the old adage that we've mentioned already which is basically test your perf you know because yeah you might not get an improvement in dapper over ef core that's useful to you so make sure you test i also think that you see there's things like the techempower benchmarks that we mentioned earlier which are usually doing relatively simple tasks and so you know the fortunes benchmark that we often look at again is correct me if i'm wrong shay but it's reading a single entity from the database and returning it to the client is that correct it's not one row it's i think about eight rows eight right but a small number it's not just that i mean it's a small number it's also so the network is basically negligible because the database is like connected via super low latency link everything is cached at the database the database is never doing any actual i/o so this is not what we call a real world scenario yeah it's a benchmark let's be clear though it is a real world perf scenario for certain people like having a very high throughput application that serves simple queries very quickly can be an important scenario for people it's not typical for most applications that we see
customers working with but the reason for bringing that up is those simple scenarios are where you do see the overhead of ef core or dapper playing a big part because as i said before you're not doing all of that complex stuff on the database that's going to be slow so then the overhead percentage wise that your framework uses is more and so in a sense when you look at benchmarks like the techempower fortunes benchmark that's kind of the worst case for something like ef core and in a lot of cases as john is showing in a real world scenario that overhead that you see the difference between dapper and ef core is not relevant [inaudible] so it's important to really understand your scenario and what you're doing i think that is the take home from that i'll just show you the dapper code because it's not trivial to write it's very easy to run dapper in ef core there it is you get the context get a dbcontext and away you go but look at all this i have to manually build all the sql in the right order and put it all together and everything it's a bit of a pain right so just be aware of that okay i've got one more no i'll drop that so i'm gonna go to another level i'm gonna just shut down so are we there yet no we're not we're going to go to half a million books so let's start this up so i wanted to push it again and it's starting up so oh yeah i left that over so what i'm doing now and this is in chapter 16 i wanted to go another step up so i used command and query responsibility segregation cqrs is easier to say which is a way where it talks about the reads being different to a write and what i have done is just for the display of books when anything happens to the book the reviews whatever it does this projection which is like oh i didn't show the cache one we're running
a bit behind time that's a pity we can go back to it i mean like you said we can run over a little bit if we need to okay yeah all right i will okay sorry i was just looking at the time so i will just go back to where i was and start again so there was one thing i didn't show you sorry about that and it's sql cached so what i do here is i pre-calculate some values particularly things like the average votes and the counts and all that sort of stuff and i call it sql cached some people call it denormalization but i want to call it cache because everybody knows caches are difficult well no they're easy but they can go wrong yes so that's why it's one of the two hardest things in computers right cache invalidation and naming yeah and what i've done here i'll show you the code uh dndd um here books going to the book what i've done i'm not using redis or anything like that i've added extra properties into my book entity right so it's in the database i quite like that because if you're running multiple instances of your asp.net core application it's all in the database it's going to work for everybody right so that's what i do i fill all those in and go back to the code you can see particularly this one's going to help us isn't it because i pre-calculate it right so if i now i'm on sql cached sort by votes one two three four five six bang now that's what i call fast wow yeah that is fast and i've used this with clients and it's a great way of working you have to be very careful and i'm not going to explain all about this now we haven't got the time but in the article there's a link to another article that explains all about this right it's a great way of working so we started off with how many it was about 50 when we started that was with 700 books yeah and now we're at half
a million yeah and it's still the same speed because of the changes you made yep that's awesome that's how it should look yeah and that's great so i use something that jimmy bogard came up with called domain events so when i add a review or remove a review or anything like that it goes in the normal way but i send an event and an event handler says oh it's changed i'd better update these cached values right and so that's how i get it to do it the thing to watch out for is multiple concurrent updates right that's the thing and there's a couple of ways you could handle it you could do it with locking in the transaction but that has problems on indexes so i did it by concurrency catching and that's explained in that article it's a great way to do it but it is a bit of hard work right so yeah denormalization is i think really one of the most powerful techniques for speeding up a database a lot of people think a database should be normalized like in database design and that's very true so some people kind of stay away from this kind of thing but denormalization isn't actually contradictory to normalization for people coming from the more theoretical side and it's extremely important for perf i'll just point out in the performance docs which we discussed in the very beginning basically most of the page on updates is about this so it shows there are various tools so john here showed a more manual approach to doing cache invalidation that hard problem which is where your application detects when something is going to be changed and invalidates or recalculates the value that is cached there are some things that relational databases provide you that can be used to make life a bit easier there's computed columns which can work if what you're caching is in the same row you have materialized views for completely
different views of you know something that's cached that you can kind of recalculate like twice a day if that's enough for you or once an hour or whatever so really i love what john is showing there and it's also quite important to show that it's possible to do it reliably via the application take a look at all these mechanisms and in general it's a great way to enhance performance yeah so i'll go back to this ding ding so i really like this i used this in the first book with a nosql database called ravendb and i wanted to do it with cosmos so this projection is like a caching but i store it on cosmos db which is a nosql database why do i do that because cosmos db and nosql in general you can make them more scalable by having multiple versions of them you can have multiple sql servers but it's hard work because you've got transactions and acid and all that sort of stuff with nosql databases there's this thing called eventual consistency which basically means that it might be out of date a little bit so what i like about this is the sql always has the perfect absolutely right stuff right it's gone through transactions and all the rest of it it's not going to be wrong and then you bang it out to this within a transaction so if it fails the sql fails as well but the nice thing with cosmos db say you were selling in the usa you could put a cosmos db on the west coast in the center and on the east coast and make it go to those so you get two benefits you spread the load and you've also made the data closer to the user so that's really leveraging the power of the cloud as well as it being a nosql database yeah exactly just to test things i've got in london i'm running a sql server on azure and a cosmos db and i tried to make them about the same price yeah so that i could do some tests against them
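The write path described above, where the SQL database stays the source of truth and the projection is pushed out to the NoSQL store inside the same transaction, can be sketched roughly as follows. This is not the code from the demo. It is a minimal Python sketch that assumes sqlite3 as a stand-in for SQL Server and an in-memory `ProjectionStore` class as a stand-in for Cosmos DB; the table layout and every name here are hypothetical.

```python
import json
import sqlite3


class ProjectionStore:
    """Hypothetical stand-in for a NoSQL read store such as a document database."""

    def __init__(self, fail=False):
        self.docs = {}
        self.fail = fail  # set True to simulate the read store being unreachable

    def upsert(self, doc_id, doc):
        if self.fail:
            raise ConnectionError("read store unavailable")
        self.docs[doc_id] = json.dumps(doc)


def make_db():
    """Create an in-memory SQL database with one book (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE books(id INTEGER PRIMARY KEY, title TEXT);"
        "CREATE TABLE reviews(id INTEGER PRIMARY KEY, book_id INTEGER, stars INTEGER);"
        "INSERT INTO books(id, title) VALUES (1, 'Example Book');"
    )
    conn.commit()
    return conn


def add_review(conn, store, book_id, stars):
    """Insert a review and refresh the book's projection as one unit of work."""
    try:
        conn.execute(
            "INSERT INTO reviews(book_id, stars) VALUES (?, ?)", (book_id, stars)
        )
        # Build the read-side document from the (still uncommitted) SQL state.
        title, count, avg = conn.execute(
            "SELECT b.title, COUNT(r.id), AVG(r.stars) FROM books b "
            "LEFT JOIN reviews r ON r.book_id = b.id WHERE b.id = ? GROUP BY b.id",
            (book_id,),
        ).fetchone()
        store.upsert(book_id, {"title": title, "reviews": count, "avg": avg})
        conn.commit()  # only commit once the projection write has succeeded
    except Exception:
        conn.rollback()  # projection write failed, so the SQL change is undone too
        raise
```

One caveat with this ordering: if the projection write succeeds and the SQL commit then fails, the read store is briefly ahead of the SQL database, so a real system would need some way to detect and repair that case.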
and that's what's running now so this is now running in my home and i'm getting the data from 50 miles away so if i run this and i'll just [Music] let's see what the timing is like 83 milliseconds to me [Laughter] so you can see slow at the start and then that doesn't move around a lot oh there you are yeah so you've got quite a lot moving it varies quite a lot because i'm using the internet now but i'm getting 120 130 i think that's pretty good for what i'm doing so why is this slower than the sql cached one that we showed is that just well i'll let you answer so the 100 000 was running on my pc here right yeah right that always works best yeah exactly so to compare sql and cosmos cosmos is only in azure i have to have a sql in azure to do a valid comparison yeah it's interesting to me how we have these various layers where we can cache things so obviously there's the cdn approach where you're caching like the final sql sorry html response or json or whatever which is like the most processed caching the end result or as you've shown here that's actually a really nice intermediary thing where you're caching kind of like the database results in cosmos right which is like a layer back or you can even use something like a materialized view to apply the same thing but in the sql server database itself right i mean there are various degrees of how far you want to kind of push it down it's very interesting yeah quite honestly two or three years ago i worked on an architecture system for a u.s company that was going across the whole of the usa and they wanted to also go outside of the usa and there wasn't cosmos then i would really have liked to have that that would have made that project it was multi-tenant sharding you know and cosmos db would have been a great
tool for doing that so let's do sort on votes i have to say i also like this because it's using cosmos db in a way that makes a lot of sense right it really leverages the nosql thing so sometimes we see people trying to replace their sql server with cosmos db and basically trying to impose like a relational thing onto cosmos db which is not a relational thing so they're not the same kind of data store and this is using both of them in a very complementary way i really like this yeah and also i'm only using cosmos db for showing the views when it comes to ordering it goes through the sql because there's not the same problem yeah it's a great way so anyway i could go through all these things but we'll get going i'm going to go to the results right i will skip over that well we'll invite you for another dedicated cosmos session yeah what i would say is that cosmos any nosql is different to a sql database so there are differences there and also ef core has some limitations and i have to work around that you will see if i go back to this that one of the things that ef core can't do is count so i have to move over to using you know next and previous that's not much of a problem because that's how amazon works anyway so i couldn't count so i got rid of it and i did the same with these so that they would not do a count because the sql takes a lot longer to count so back to this so i think it'd be worth mentioning a little bit about where we're going with cosmos in ef core for those who perhaps are not fully up to speed on that so to shay's point about it's a different kind of database the intention is not just to try to replicate relational functionality on cosmos or to expose that through ef core but to expose sane reasonable correct usage of cosmos in ef core so that you can use it as a nosql database in the
way that you get the benefits from it we're at the point now where i think we've got plenty of feedback from customers saying yes we want to use ef core with cosmos so that question is kind of answered and right now we're looking at these specific problems like the ones john p smith has put here we're looking at those specific limitations and seeing which ones of these limitations really make sense for us to address in ef core which is not based on just people want to use them but also based on is this going to be a good experience and have it be a pit of success for cosmos and so that's kind of where we are with ef core and cosmos and we hope to certainly make some of these improvements in ef core six it's in the plan but then going forward there'll still be more to do i'm sure okay just to say what i did is because there are limitations in using cosmos via ef i built a version called cosmos direct right which uses the cosmos .net sdk so i could then check the two against each other and you can see that cosmos direct can do counting in fact it's flipping fast it's much faster than sql so this is the thing here as i add more to the database does it get slower not really the count costs you about what 20 milliseconds more but sql would add 110 more so that's the one thing just for time i'm going to go straight to this basically once you get to this stage sql and trying to recalculate the reviews manually is a no-no you just can't do it if i try and do that with dapper which is the best one it times out at 30 seconds right so you just can't do it so you've got to do something and you can see across here that cosmos and its count does about the same value all the way across sql's okay but overall i would say cosmos is better than cached because you get this scalability
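The alternative to recalculating review averages on every read is the cached-values approach discussed earlier: store the count and average on the book row, update them when a review is added, and guard the update against concurrent writers. The following is a minimal Python sketch of that idea and not the code from the demo. It assumes sqlite3 in place of SQL Server, a hypothetical `version` column for the optimistic concurrency check, and a plain function where the demo uses a domain event handler.

```python
import sqlite3


def make_db():
    """Books carry cached review values plus a version number (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE books(id INTEGER PRIMARY KEY, title TEXT,"
        " reviews_count INTEGER NOT NULL DEFAULT 0,"
        " reviews_avg REAL NOT NULL DEFAULT 0,"
        " version INTEGER NOT NULL DEFAULT 0);"
        "CREATE TABLE reviews(id INTEGER PRIMARY KEY, book_id INTEGER, stars INTEGER);"
    )
    return conn


def add_review(conn, book_id, stars, max_retries=3):
    """Add a review and update the cached count and average in one transaction."""
    for _ in range(max_retries):
        count, avg, version = conn.execute(
            "SELECT reviews_count, reviews_avg, version FROM books WHERE id = ?",
            (book_id,),
        ).fetchone()
        new_count = count + 1
        new_avg = (avg * count + stars) / new_count  # incremental average
        conn.execute(
            "INSERT INTO reviews(book_id, stars) VALUES (?, ?)", (book_id, stars)
        )
        # The update only applies if nobody changed the row since it was read.
        cur = conn.execute(
            "UPDATE books SET reviews_count = ?, reviews_avg = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_count, new_avg, book_id, version),
        )
        if cur.rowcount == 1:
            conn.commit()
            return
        conn.rollback()  # stale read: undo the insert and try again
    raise RuntimeError("could not update cached values")


def top_books(conn, limit=10):
    """Sorting by votes reads the stored column, with no join or aggregate."""
    return conn.execute(
        "SELECT title, reviews_avg, reviews_count FROM books "
        "ORDER BY reviews_avg DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Because the sort reads a stored column, an index on that column is what keeps the query time flat as the number of books grows; the price is the extra care needed on every write that touches reviews.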
and i'm done you can get this on that article i give you an idea of how good it was in terms of performance and how difficult it was for me excellent stuff this has been really fantastic there's lots of great real world experience in here real world advice i think jeremy's put up the link there to go find all of the show links and stuff i'm not sure if that's what that one is or not but i'm sure it's interesting so i guess that's all we have prepared we can answer questions if they still have any thanks for the appreciation of alice my cat that was nice she likes to oh she's back there now again anyway yes the official performance gato of ef core alice is not a performance cat mac the orange one who was here a little while ago he's the performance cat alice is she like the reliability cat then yes very consistent she's consistent and reliably slow okay so yeah i think that's it for us so we can sign off unless anybody else has anything else to say this has been great i love this content it was awesome thanks for sharing john yeah i really appreciate having you here john as always i'm sure we'll have you back again in the future for more good stuff and remember in two weeks time we're gonna have julie lerman ask her anything and that's gonna be really fun so i'm really looking forward to that so hope we can see you all then bye [Music]
Info
Channel: dotNET
Views: 9,713
Rating: 4.9557195 out of 5
Id: VgNFFEqwZPU
Length: 65min 44sec (3944 seconds)
Published: Wed Feb 24 2021