Learn about MariaDB system internals

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

[Music] hello my name is Manjit Singh I'm bleedin surprise architect at Murray and EB and we're going to be speaking about how the queries move through the Morea DB engine so let's talk about users you may have a an application user and you may have an application end user and you may have a DBA and you may also have your cron jobs let's say and what these guys are going to do is connect with various users so this might use the connector and the DBA may use the my sequel command line the cron job maybe it's using my sequel admin and what they do is connect to the my sequel Dee daemon so the my sequel deed daemon looks for data and how does it do that it does that with the application the handler interface the handler interface connects two storage engines storage engines and if you don't know about storage engines they basically allow you to do many different workloads within the same database with the same sequel and that's that's the value of these different types of storage engines so the handler interface kind of sits right here but let's talk about the internals of your storage engine before we do that dig a little deeper so the query comes from here goes through my sequel D sort of a black box here to the storage engine so what's inside of my sequel D what we have first is your query actually goes to the thread cache thread cache basically stores your threads as you connect caches them and allows and keeps track of those so normally a query a single query for a single thread single CPU thread right and with thread pool actually which could replace this thread cache you can have better than caching you can reuse threads within that pool with with varying connections instead of one connection per thread but let's talk about one connection per thread great so these guys connect through my sequel D to the thread cache from the thread cache the query goes to the query cache now this is if you have query cache on I highly recommend disabling query cache for most workloads and the reasoning is is this is not multi-threaded the query cache feature is actually single threaded it compares queries exactly as they are with comments spaces everything and if they don't match it will ignore it but it still has to go through that to find that match so what I would do is disable that with query cache type and query cache limit off but if you have a good use case for it keep it on so from there what happens is the query goes to the first level of the parser and what the parser does is it creates this little parse tree it token eise's your query validates the sequel make sure that the semantics and whatever makes sense what we'll do is before I talk a bit more about that we'll talk about the second part of the parser and the second part validates our objects our table names or grants and so so the first part basically it takes the query so let's say you do select XY & z from table 1 it takes that and it token eise's it says okay this is a read-only query this is you know looking for columns X column Y column C and table 1 once it's done that says ok this is what we have this is our game plan let's go to validate do these objects exist and these objects here we're gonna say ok does table 1 exist oh yes it does ok does column X exist does column Y exist as column Z exists and then next what user is trying to access these does this user have permission to access column x y&z on table 1 so that's that's what that second part does now from there the query moves to the the optimizer and this is where a lot of people get hung up and the optimizer is very complicated I'm not going to go too much into detail optimizer but basically that parse tree goes to the optimizer and the optimizer is a cost-based optimizer what it'll do mainly is determine your join order it will evaluate your algebraic equivalencies it will optimize your count your min your max it'll do some sub-query optimizations and some of the modern versions of mariadb can do some pushing out of the sub-queries evaluate them first in other threads and bring them back and then it outputs your query execution plan so just a little game plan you know like where do we where do we go what do we do so this plan is is a data structure just like these are but now that it's it's basically been optimized and there's much more that the optimizer does there's many variables that will control this from the optimizer we go to the handler so the handler is really just the query execution engine and what this does is it follows the plan that the optimizer gave it it says okay now I need to do this then I need to do that then I need to do that so let's look at the simple case of X Y Z X Y Z in table 1 so let's say this is an in ODB table we have an N ODB storage engine you know dB what the handler will do is say hey you know DB give me columns XYZ from this table you know DB will say here's Row 1 here's Row 2 here's Row 3 and it's actually just a fire hose and it will continue going then what the handler will do is it will hand this back if you're using the query cache it will create a copy there of each row of the response and it will hand it back to the thread and then the thread hands it back to the happy user and that's pretty much how queries move through Mariya DB one of the other things that you may notice is this was I was speaking more about reeds when there's a write query it's pretty much the exact same plan it has to has to parse it it has to see validate the objects and the syntax and then it does have to figure out how do we insert these rows to the storage engine so it will follow the exact same game plan but instead of returning rows it will return a result you know is it good or bad did it work and then the storage engine handles where how it is it is stored and even when it is stored and that's that's pretty much how Maria DB works and let's there's a few other storage engines that all I'll add in here so there's Aria and then there's column store which is our massive parallel analytics engine Aria is great for reads and then there's my rocks and my rocks is a highly write optimized engine that I would look - as well so that's how it works and thank you for your time

Info

Channel: MariaDB

Views: 328

Rating: 5 out of 5

Keywords: MariaDB, technology, innovation, cloud, database, DBaaS, education, tutorial

Id: jRX7ycVr_zE

Channel Id: undefined

Length: 8min 50sec (530 seconds)

Published: Mon Jun 08 2020