10 Million Rows of data Analyzed using Excel's Data Model

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
if you haven't heard of the excel data model or powerpivot then this is going to amaze you let's go i'm going to import 10 million rows of data into excel or more specifically the excel data model and analyze that data slice and dice it in a whole bunch of ways that traditionally would have required maybe a hundred million vlookup formulas but now with the data model in excel you can do it easily i'll even build a 3d map so my data let me bring it in here we go we have a whole bunch of csv files each one's got about half a million to a million records in each okay totaling about 250 megabytes of data i'm just going to grab this folder path and pull that data in and using power query in excel and power query in excel can stack the data each file can be stacked on top of each other consolidated into one nice table and then i've already got if i pop out the power query panel i've already got a little calendar table loaded for me to slice and dice my data by year and by month and by day of the week whatever i need i've got a location table loaded which is simply a little lookup table with my location code and what country state city postcode that actually means and a small cost center table with a cost center code and what cost center that is and what manager that is and what all these things you traditionally use vlookup formulas to give you so those are already loaded i need to get my data so under the data tab we go to get data from file but not from text or csv from folder so i click on from folder i've got a whole nother video explaining this from folder i'll get a little link to pop up in the video here so here's my folder path and i'm clicking ok it'll then prompt me with a little list of all the files that are in that folder and then give me some options down here the one i always choose is combine and transform because it gives me a chance to check my data before i load it then it'll ask me what type of csv file it is i never know so i just always click ok so i click ok then because i said combine and transform it'll open up the power query window and it's opened it on a different screen so let me just pull that across so here we have it all my data starting to be stacked on top of each other okay and i'll just rename this as just to show you and the only thing i'm going to change is this cost this is multiple decimal numbers and when i was just preparing for this demo if i load 100 million sorry 10 million records with this level of decimal places the excel file at the end is about 250 megabytes if i change it here to currency that truncates the decimal numbers and the file halves in size so the excel file at the end of this would only be 125 meg but to be honest if you go whole number the actual file shrinks down to about 15 megabytes so that's 250 megabytes of source files shrunk down now if you need the decimals for accuracy then go for it but if you don't and if you're building a bi solution or a report do you really need those decimals the compression and the performance would be so much better so i'm changing that to whole number then i pick close and load two now this is going to be 10 million rows of data excel can handle about a million in a sheet so you cannot load it to a table you have to tick the add to the data model only create connection okay that means it won't get loaded as a sheet and i click ok and down the bottom right hand corner you can start to see the consolidation happening here we go 240 300 400 000 records now to save you just sitting there watching this tick away it'll take about 60 seconds i'll just speed this up about four times and here we go 10 million rows loaded that's incredible you really need 64-bit version of excel to do this successfully okay if you're not sure what version you got you go to file account and click about excel and in the top here it should say 64-bit that can use a lot more ram on your machine than the 32-bit version right so what do we do with this well now we go to the manage data model diagram view and here is my consolidation table and now i can do all sorts of lookups so i want to go from my location code to my location code now that's 10 million records doing the equivalent of one two three four lookups so that's 40 million vlookups essentially replaced or x lookups these days then i'll do the same here here's another 10 million times another three so that's another 30 million lookups and if i also hook it up to my calendar so that i can analyze things by year month quarter financial year day of the week then i'm into the you know over 100 million formulas being replaced there that's it that's my data model awesome so if i go into a new sheet and i now insert a pivot table i can close this little panel on the right and here's my values so i can go to my consolidation bring in my cost go to my calendar do it by calendar year and month let me just make this a bit wider and let me format my numbers okay actually not month number i wanted calendar year excellent and then i'll turn that into a chart i'll go to one of my templates that i've set up here's one that i use all the time beautiful and there's my chart from those 10 million rows and let's say i want to see this by manager so let's just go to the cost center right click on manager add as a slicer and here's my manager slicer okay and i can just happily slice and dice this chart by whichever manager i want pretty awesome so let's finish off with a quick 3d map so if i go into insert 3d map now you might have to go into add-ins and switch this on it's called power maps or 3d maps i click on this and then it'll just kick me in to a map and i can collapse my little tables here so i want to do country and i could go to state level if i wanted to or i can go city or even down to postcode and then for my height of my bar i'd like to go to my consolidation and put the cost in there this is pretty nifty and you can start zooming in okay and you can zoom around with your map and you can get down into individual suburb levels etc pretty cool change the angle zoom back out rotate the earth pretty awesome okay and you can even add a time axis so if i go into my calendar and i put date into the time field i then get a play access and i can click play and see my maps and my data growing over time that's pretty awesome so the data model in excel can handle millions of records i'd recommend you need 64-bit excel and power query can stack all these tables on top of each other from a particular folder for example or from any data source and relationships between tables replace all those vlookup formulas final note the reminder the cost column if i kept all the decimal places my excel file when i save it so let me just go here and do a file save as okay and i'll just save it into this folder as final and go save let's see how big this excel file is because i took out the decimals here is the file and this file 14 megabytes in size even though it's consolidating 250 megabytes of data so do you really need those decimals it's a good question so if you like this please click on the big access analytic icon to subscribe give this video a like and tell all your friends we'll catch you later you
Info
Channel: Access Analytic
Views: 100,085
Rating: undefined out of 5
Keywords:
Id: Od9ev90PB1w
Channel Id: undefined
Length: 10min 56sec (656 seconds)
Published: Fri Mar 26 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.