SQL Server Training - Data Page Internals - (By Daniel AG- US Based DBA)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
good evening guys this is Daniel AG again with this particular video I'm going to talk about SQL server page internals I'm getting a lot of because from instance to have do you want to have a better knowledge on how externally data is actually stored internally so what I'm going to do is the lignum in I don't want to use this particular spreadsheet so normally I don't use any fancy slideshows I only use Excel and sequel server management sphere which is more productive in terms of teaching and learning so well first of all let's do let's go to some jargon here so we already know what a data record is data records are nothing but data stored in a row format in sequel server specifically sequence over data files now a row row row can be a you know a record from a cable a an index information index data or metadata database boot pages the boot information is their ghost record forward record lob records or version records these all these can be stored as a row in sequel server now when we talk about storing rows in Brooklyn in sequel server we always use a technique called we are always used a format called as fixed war format the name came from the fact that that row consists of both variable and fixed data so that's why it's called as a fixed what format fixed variable format now we already know that a sequel server database can contain one or more data file extension of the data file can be either MDF or and DF file now each data file can be subdivided into multiple 8 kilobyte pages now as I said these pages can contain index data or table data metadata ghost record and all those kind of information can be stored but a single 8 kilobyte page can be multi divided multiple divided into multiple 8 kilobyte data pages now let concentrate more on what house equalent in store data in this 8 kilobyte pages so I'm going to open another worksheet here alright so this is a representation of 8 kilobyte page I'm not going to call this 8 kilobyte frequently I'm going to call this one as data page ok that's more convenient for you and me so it data page consists of well the actual size moving every time we call it as 8 kilobyte page but in fact the actual size of a data page is 8 1 9 2 bytes out of that 8 1 9 2 bytes it's actually divided into two categories the first category is header header consists of 96 bytes now what information is stored in the header part error contains or the metadata for an individual equal abide page so metadata corresponding to an 8 kilobyte page is what's stored in the header part of a data page we will see how you can read data direct or how you can see what is there in each 8 kilobyte or database we will be seeing that through some dbcc statement probably 5 10 5 10 minutes from now and the second kans area is called as body of data page which is 8,000 96 bytes now the body of data page also contains another type of information is what we call does work record offset array now when you are inserting a when sequel engine insert a record into a data page well that record is can be residing in the body of database but in addition to that there is going to be an entry made into record offset array each element in a record of cetera is what we called as a page slot now each page slot size is going to be 2 bytes so it is the single data page consists of mud consists of an area called as record offset array so this array consists of multiple two byte information which is what we reach to by this what we call paid slot now what is the significance of this to byte information stored in each of those page slot so when I record get inserted into body of data page the starting position of that record in terms of byte is going to be storing that page slot for example when I insert when sequel engine insert the first record into this data page well that record is can be inserted in the body but at the same time there is going to be a value inserted into the page the rightmost page slot clearly explains ok which from which byte that particular record is going to start that record just got inserted where it is what is the starting position in terms of bytes in the body of that data page now the page start always going to start with 0 the first slot number the page load number of first page slot number is going to be 0 not 1 is always going to start from 0 now the order of the record the physical order of the record in turn vary every time when you when sequel engine insert a recording to the body of data page don't think that it is always going to be in a particular order for example the 54th record inserted might be the first record in that body of the data page but the first record get in certain might be the 34th record so the the actual order of record the way the record is stored in the body of data page is not going to be in a particular sequence or a particular order but the page slots are always going to be whatever the physical order in which records are kept in that particular table okay if we have a particular order in which record need to be stored in a table then the slots are going to be in that same exit or not the actual record will be seeing that there shortly all right now the fully qualified name of a record Amin if you want to add this when sequel engine looking for a particular record in a particular table in a particular database well the fully qualified name is also what we call this record pointer it's going to be file ID called an page ID colon slot number file ID means the unique ID of a particular data file in that particular database each database each data file is having a unique file ID and so the file Ari : then each file consists of each data file consists of multiple data pages so the each database is having its own page ID so that is what the second part is now each page is having multiple slot number so each slot number is pin point Lee savor that particular record is consistent so a fully qualified pointer for a record in a particular table in a particular database is going to be file ID : page ID : slot number now let's so that is about some general information about 8 kilo byte page now let's go to my next worksheet here now let's talk about the records right so the slot number contains a pointer pointing to the start of each record in that particular database now how exactly records are stored in our in that data page now let's assume that I'm going to I have a table consists of three columns two columns are the two columns are let's say integer and car another column is worker now don't think that I have three columns or only three column where those three column values what going to be either in each record that is not true in addition to those three column values there is going to be bunch of additional metadata stored along with each record in that table so that's what we are going to investigate okay what are the additional metadata going to be stored in each record all right so this is how typically sequel engine is going to store a record in database each global database so when a record get inserted the first two bytes of that particular record indicate what type of record that one is now when I say what kind of record a record can be different category a record can be a record from a table your record from a index or record from metadata or it can be a ghost record or forward recognize so many different categories now but here we are only dealing with a record from a table or and record from an index for now so the first two bytes going to indicate what type of record it is it is also going to indicate what is the location of the bitmap now we haven't talked about bitmap but we are going to do very shortly now the third information going to be the in that status by this whether this particular record contains any variable length value now a record can contain fixed length values as well as variable length values so that the first satisfied by also going to tell us whether that going to contain this particular record contains any variable length columns now the second set of bytes is going to store the total number of fixed length columns in that record if I have that same example I have three columns out of that two of them are fixed length and one is variable definitely the value stored here is going to be two because I have Li two fixed length columns or values here story is going to be two then after that whatever that that two column values that is going to be stored in the next batch of n byte okay that's what the third batch the fourth batch here the number the number of columns in null bid map now let's talk about what is the null bitmap is now sequel when or when a record can store null values now secretly is not going to consider null as a fixed length or variable length it is going to cancel considers more like a boolean value true or force is a null or not right what is a null is not a value no null is not a value it's a state of value right so we always make this mistake saying that hey that's a null value but null is not a value it's a state of value unknown value known or unknown value all right anyway so for C colon T is going to store a sequence of bits indicating what is the status of null for each column in that record so let's consider the same example I have a column I have a table with three columns if the first column is not null second column is null third column is not null then the bitmap image is going to be 0 so the first 0 1 0 because those two columns which are not null value is going to get a value of 0 the column with null value is going to get a 1 so the bitmap is going to be 0 1 0 so the fourth batch of biases can just tell us the total number of columns in the null bitmap that is equal to the total number of columns in that record all right so it's only going to give you the total number of columns in the null bitmap but the actual bitmap is storing keep the batch of bytes all right so next n bytes is going to actually store that bitmap which I explained so one bit per column that's what I explained like a couple of seconds ago so the second and third batch of biases going to store the variable the fixed value the fourth and the fifth batch is going to store the null information in that record and the next three batch of by Cisco and store variable length data now talking about variable length data the first batch the this particular batch is going to store the total number of variable length column in that particular record if I'm just coming up with another example so I have a table with our 10 columns out of that 10 columns seven of them are variable length and three of them are fixed length so the value you are going to see here is seven because that table contains seven columns which are variable length now the most important part of this discussion is this one if I have seven variable length values variable length columns the next batch is going to contain or is going to store what is going to be the ending position of each of those variable length value if I have seven variable length values so this particular best section is going to store seven end points each end point represents where exactly the variable length value stops and you also need to understand that each column offset we call it as arm offset ok column offset so each column offset is going to be stored in two bytes so if I have seven column offset value need to be stored the total number of bytes needed is seven x two dot log fourteen okay so each column offset need two bytes to store very very important now after I have after storing this column offset for each variable length value then after that that last batch is going to contain all that variable length value store in the sequence which we'll be seeing shortly we are going to see how to how to Cori this data page directly but then you will see whatever I am talking about now I want to concentrate more on this variable length part so I want to come up with just a very simple example here so if you look at this example let me get my pen okay hold on a second all right so this is the this is going to indicate that the total number of variable total number of variable length columns so in this example 0 0 X a decimal 0 2 0 0 indicates that it is there's going to be 2 2 columns all right now so that is that is equivalent to this column right so this column indicates that there is going to be 2 columns now which are which contains variable length data now the next part is going to be the offset right so there's going to be two variable length columns so we need to have to offset numbers so what is an offset number the ending position of the ending position of that data for that particular column so the first offset is going to be 14 the second offset is going to be 17 so these are the two offset now how many what is going to be the how many how much storage needed for this one so here this information is stored in 2 bytes right you can see that this is 2 bytes right so this is also going to be 2 bytes because offset is always going to be storing 2 bytes this is also going to be 2 bytes so metadata of that variable and metadata of variable length column need in this particular case need 6 bytes not the data this is a metadata of the variable length column all right okay now after this the actual this from here onwards you will see the actual variable length data but you know to retrieve the data if this can because because here I said that okay the 14th byte is going to be the ending location or ending position of the first column 14 byte but when I say 14 byte well we are not talking about just a 14 14 it is actually 14 plus all the metadata size so what is the metadata size here the metadata size is 2 plus 2 plus 2 66 plus 14 20 so the actual end point of the first offset is going to be 14 plus 6 24 here to the 20th position this is going to be the first column value what is going to be there now so that's the first column first variable length column how about the 17 17 is not just a 17 17 plus that 6 metadata offset 2 plus 2 plus 2 right 17 plus 6 23 so the next column value variable and column values starting from here to 23rd byte all right so this is how the variable length is actually stored and retrieved well hopefully you understand this if you have any question you know you can certainly put that on the the comment section I am more than happy to explain it further okay all right so now the next thing is we need to create couple of tables and then we hopefully we could able to visualize you know whatever we discussed so far I'm going to erase all this get my cursor back and going to my secret seven management studio so I'm going to make a very simple example for you guys so what I'm going to do is I'm going to create a very simple table consists of three columns customer ID integer so in de jure is a fixed length data because each integer is going to be it's going to be we need four bytes just respective of what data store it is always going to consume four bytes the second column is going to be customer name so it's a worker or 50 so it is a variable length and customer state is a car of do so it's going to consume two bytes a fixed length so only variable length here is the customer data so let me go and create this table is created and I'm going to insert bunch of records here so for records are inserted and let's see so all those records are in center now the first thing is I want to find out if I want to see how data is stored internally in all in eight color by page or internally storing data page corresponding to TBL customer if I just create a table select start from TBL customer I'm seeing the data but that's not what I want I want to see how information stored inside that that 8 kilobyte page for that TBL my customer table so what we need to do first is we need to turn on a flag called as 3 6 0 4 so this is going to arm if I am currying that I mean if I'm going to use the D BCCI and D and D BCC case well from the sequence of a management screen if you want to get more information for under see all this information you need to turn on the trace so BB CC preys on 3 6 0 4 that's what I'm going to execute first so I could see the output of the remaining two DBC statement so that's why turn on the trace now if I want to find that the data pages or 8 club ID pages associated with a particular table then I need to use BB CC IND that's DBC say statement the first parameter is going to be the database second parameter is going to be the table -1 means if there is an index just include that index information too so if I execute this alright so these are the two 8 kilobyte data pages associated with this particular table now here I am only looking for our page type one page type one is the data page so you can see two of them the page type one is actually that is the IMF I am file which we'll be discussing later so the page file we always want to look for a page type is always going to be ten so if you look at here you can see the page ID is two six three three six okay so that's what the page ID is I'm going to copy this and keep it somewhere which we can use so the page ID of this particular table is two six three three six now if I want to get more information about if I want to see what is inside that page we need to use a dbcc statement called as dbcc page the first parameter is going to be the database second parameter is going to be the file ID now how do I know that the file ID is one when what we need to do is if I want to see so select star from sistered database files and point into adventure work database all right so that's going to show us a two files associated with adventure works database one data file and one one log file I mentioned that each data file is having a unique file ID so the file ID is going to be have only one data file here so obviously the pilot is one so we know the file ID the file ID is going to be one file ID is going to be one and page ID we already find out that what the page ID is right we got the page ID file ID so when you use the dbcc statement the first parameter is going to be the database name second parameter is going to be the file ID which is one the third parameter is going to be the page ID which is two six three three six and the four third one is what we call the status the information what information you want to display well there is going to be zero one two three depends on what value give you will be getting different set of information the three the style value three is going to give all the information out from that particular 8 kilobyte page so I am going with three now bit that sorry this part is an optional component if you want to display everything in a table format then you can say with a tabla result um in table result but if you don't want you can comment that is an optional component I'm going to go without that and later we will input and see what the differences so the DBC is the deficiency page is a command you can use to see how data is stored inside that 8 kilobyte page and DBC I and D it can be used to identify all the data pages associated with a particular table so you need to use these two to do to get the result we are looking for all right so I executed that dbcc page and here we go this is what or this is what you stored in that page all right so you can see that the page ID is going to be 1 : 2 6 3 6 6 and the total number of slots we inserted for records right so there is going to be 4 slots so this is a page header by the way I forgot to mention that so this is the page header the page header contains the page ID it's also going to contain the total number of slots now let me see last log sequence number and the ghost number of rock ghost records well there's no what is a ghost record ghost records are records already been officially deleted but still kept in the 8 club by base sequel engine will later go and clear it out now so I mean I mean there are couple of other information which are we are not really worried about now if you come down all right so this is a slot zero slow but it's a slot zero slot zero is the the first record right slots always slot number always start with zeros or slot zero now what is the record type the record time is a primary record that means it's a actual table record now bitmap now record attribute whether this particular column contains bitmap and 3cs this particular column this particular record contains variable column and CCS all right record size is going to be 30 in this case now if you look at the memory dump you will see all the information now let's go back to my insert statement for a second so what is the information about the first record first record is customer ID one customer name is Tom Thomas so George Thomas and the New York so if you look at here you can see information like in New York George Thomas etc right so you can see that but if you really you know follow this you can see that ok what is going to be slot 0 column 1 alright what is the value customer IDs 1 start 0 column 2 customer name is George and slot 0 column 3 customer the customer state is New York right so you can you can go inside each of this and then you can and if you scroll down you are going to see other information as well now if you want to see all this information in a tabular format now uncomment this with our table results execute the same information you are seeing it in a more nicer tabular format goody okay all right so let's do a quick recap here now there is a concept called as now how now based on this particular record what is going to be the size of each record well if you look at this column data type customer ID is going to be four right customer name is worker of 50 so I'm going with a maximum size of 50 so 50 4 bytes plus 2 dot log 56 bytes so apart from this 56 bytes how many more metadata bytes needed so if I go back here so two bytes so we already have 50 images at 50 plus 4 54 plus 256 now to 2 bytes for status so now 56 now 1558 light yeah 50 plus 454 plus 256 now 2 bytes for the status now it is 58 now 2 bytes to store the total number of fixed length column so now it's 60 right now this why this one is going to be we already calculated that I am going to ignore it because that's all be part of that 56 V or calculator or let's start over again I mean we can follow this one okay so let's forget about all those column data type so 2 bytes here another 2 byte for now how many fixed column well we have a integer so it is going to be 2 plus 2 4 plus 2 4 4 integer it's going to be 8 then 2 4 car that's going to be 6 so I can say let's do so here 2 bytes this is going to be another 2 bytes this is going to be 4 plus 2 equals 6 alright so number of that can be two bytes number of bitmap now the number of it might we have only three columns but it is going to be stored as a one byte so it's going to be another one byte here the number of number of variable length column it's always going to be to hear how many variable length columns are there there's going to be one so but still going to be two because each column it's offset is going to be two and the number of variable length column is going to be the size of the variable length column is going to be 50 so each record in this particular table it is going to be 2 plus 2 4 plus 6 10 plus 2 12 plus 1 13 15 16 + 58 dot log 66 fights weight 16 case I mean if the number of Records our number of characters in customer name is less that's fine but we have we are calculating this for the extreme situation right so the maximum size of a record in this particular table the table which we create a TBL customer is going to be 66 bytes all right now you tell me how can you tell me how many records can be stored in that 8 kilobyte page well the total size is going to be nine eight let's make it as 8000 okay loudly to eight thousand so eight thousand divided by eight thousand divided by sixty six right so let's my calculator so eight thousand divided by 66 so 109 twenty-one that's probably two hundred twenty-one right 121 records can be stored in each data page corresponding to that particular table okay so this number the maximum number of records can be stored for and for a data page for a particular table is what we called as fan-out okay n o UT that's a fan-out the total number of records per page possible for a table is what we called as felt alright so that is all about the internals of a data page I will also be coming up with another video on you know i GS m and a lot of other things need to be covered in general so I'll be using another video for that for now this is um this is about the internals but again if you liked the video please subscribe my video if you have any question you can email me a G dot Danielle at yahoo.com or you can call me two six seven seven one eight one five three three and the best thing is that you can put a comment under my video that would be great as well okay alright thank you very much for watching this video you take care guys bye bye
Info
Channel: EmpireDataSystems
Views: 1,443
Rating: 5 out of 5
Keywords: SQL Training, SQL Server Training, DBA Training, sql tutorial, sql server tutorial, learn sql server, sql server course, SQL Server 2019 Training, learn sql server 2019, sql server 2019 course, sql server dba training, sql server training, learn sql, sql server 2019 classes, sql server classes
Id: gIuV-qYAdm0
Channel Id: undefined
Length: 33min 47sec (2027 seconds)
Published: Mon May 29 2017
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.