Complete Python NumPy Tutorial (Creating Arrays, Indexing, Math, Statistics, Reshaping)

Captions
How's it going, everyone, and welcome back to another video. Today we're going through the NumPy library, which is considered to be kind of the fundamental package for all scientific computing in Python. So it's a super, super important library. It's kind of the base for a lot of the other major data science libraries in Python; pandas, for example, which I've done a video on before, builds pretty much entirely off of the NumPy library. Because it's important as this base, the way we're going to do this video is I'm going to start out with some background information on how NumPy works, and I think really having that intuition is helpful for when you actually start writing code with it. So we'll do the background information, and then after that we'll jump into all sorts of useful methods and ways you can utilize this library as far as actual code goes. As always, make sure to smash that subscribe button, throw this video a thumbs up, follow my Instagram, Twitter, and GitHub, hit the bell for notifications, and throw this another thumbs up.

To begin, NumPy is a multi-dimensional array library. What that means is you can use NumPy to store all sorts of data in one-dimensional arrays, two-dimensional arrays, three-dimensional arrays, four-dimensional arrays, etc. The common question I'm asked when you first bring up NumPy is: why do you use NumPy over lists? The main difference comes from the speed. Lists are very slow; meanwhile, NumPy is very fast. So why are lists slow and NumPy fast? Well, one reason is because NumPy uses fixed types. What that means is, imagine we have this three-by-four matrix, three rows, four columns, and it's all integer values, and we're going to look at how the integer values differ between NumPy and lists. So let's just zoom in on that five that's there in that matrix. Our computers don't see five; they see binary
that represents five. So this is the number five in binary, and it's 8 bits, which makes up a byte. Our computers read information in bytes. When we use NumPy, this 5 is actually by default going to be cast to the int32 type, which consists of 4 bytes, so it represents 5 in a total memory space of 4 bytes. By default it's int32, but you could even specify that you don't need all 4 bytes to represent this value: you could specify within NumPy that you want maybe int16, which is 16 bits or 2 bytes, or, if you have really small values, int8, which is just a single byte.

On the other hand, with lists there's a lot more information you need to store for an integer. Lists use the built-in int type for Python, and that built-in int type consists of four different things: the object value, which has its own bits associated with it; the object type; the reference count, which is how many times that integer has been specifically pointed at; and the size of that integer value. If we break this up into the actual binary that it represents, we can take the object value, and that's represented as a long, which is like 8 bytes; the object type, same deal; the reference count, same deal; and then the size I believe is a little bit smaller, I think it's only 4 bytes. But as you can see, a single integer within lists using the built-in int requires a lot more space than NumPy. So basically the takeaway from this is that because NumPy uses fewer bytes of memory, the computer can read fewer bytes of memory quicker, obviously, so it's faster in that regard.

Another reason that I didn't specifically mention is that when we're iterating through each item in a NumPy array, we don't have to do type checking each time. In Python's built-in lists, you could have a list of an integer, then a float, then a string, then a boolean, and you'd have to check what type each element you're looking at is, but
in NumPy we don't have to do that. So another reason it's faster is that there's no type checking when iterating through objects.

Moving on, another reason that NumPy is faster than lists is because NumPy utilizes contiguous memory. What that means is, imagine that this array-like structure is our computer's memory, so we could store information in any one of these memory blocks. With a list, the way that would look in memory is that our list would be kind of scattered around. Maybe we have a list that takes up eight memory blocks. The thing is that these memory blocks aren't necessarily next to each other: you have some information here, some information here, a good amount of information in here, then you skip a block here, and here you skip two blocks, and you have some information here. So it's all kind of scattered around. If you have an eight-item list, that list is actually just containing pointers to the actual information that's scattered around our computer's memory. All the information is not right next to each other, so you have to bounce around your computer's memory, and it's not super fast to rapidly go through and potentially perform functions on all items at a time, or on subsets of the items. A NumPy array, however, uses contiguous memory, so all eight blocks in this case would be right next to each other, and this has all sorts of advantages. Also, just to mention real quick, you'd also have to store somehow where the start of that memory is, and then the total size and the type of memory block, but it's a lot easier than this pointer structure that's up here.

The benefits of NumPy using this contiguous memory are a couple of different things. The first benefit is that our CPUs have these SIMD vector processing units, and so when this memory is all right next to each
other, we can utilize this unit. What SIMD stands for is Single Instruction, Multiple Data. So if we have to do an addition of a lot of values, instead of just doing one addition at a time, we can use this SIMD vector unit and basically perform computations on all of these values at one time, so it's quicker in that regard. Another reason it's quicker is that we more effectively utilize our cache, the quicker memory in our computer. Basically, if we load in all these values, we can keep them close to where we need to access them and perform all sorts of operations, whereas in the list case you maybe load in half of this, but then this other half, because it's scattered around in different places, you'd have to go back and reload into your cache. It would just be slower overall, because you'd have to do longer memory lookups within your computer.

OK, so we went over some of the performance benefits, but how are lists different from NumPy? Well, with lists we can do insertion, deletion, appending, concatenation, etc., and we can also do those same exact things in NumPy. I guess the big difference, though, is that within NumPy we can do all that and lots more, and we'll see the lots more throughout the video. As a simple example, imagine we have these two arrays. One thing that's really nice in NumPy: if we try to multiply these one item at a time, we couldn't do that with lists; you couldn't multiply 1 and 1, 3 and 2, 5 and 3, etc. But when we do the exact same computation within NumPy, it allows us to do these item-wise computations, which is pretty neat and pretty useful. So that's one example, and you'll see a lot more throughout the video.

So, applications of NumPy. There are all sorts of applications. The first one that comes to my mind is that it's kind of a MATLAB replacement: you can do all sorts of mathematics with
NumPy. I should say that I think the SciPy library has even more mathematics functions and whatnot, so if NumPy isn't cutting it for you, try looking through the SciPy documentation; you might be able to find even more. But yeah, the math that NumPy can do is pretty powerful. It's useful in plotting. It's the back end of many different applications: for pandas, which I've done a video on before, it's the core component of the library; it really allows pandas to work. If you've seen my how to program Connect Four video, I use NumPy to store the board. And then, in future videos that I'm going to do, you can actually store images through NumPy: for PNG images, you can use NumPy to store all the information and do all sorts of cool stuff that I'll post future videos on. Let's see. Another useful reason to know NumPy is that it's pretty important for machine learning applications, both directly and also kind of indirectly, because one of the key concepts you learn with machine learning is the idea of tensors, and the tensor libraries are pretty similar to the NumPy library. It's just a way to store all sorts of values. So knowing NumPy will help you be able to do some stuff with machine learning.

All right, to get started with the code, the first thing you want to do is import the NumPy library. Just so we're on the same page, I'm using a Jupyter notebook to code this up, but you can use whatever editor you prefer. Also, all this code that I'll be going through will be on my GitHub, and the link to that will be in the description. OK, so import numpy as np. If that works for you, great; if it didn't work, you'll have to do a pip install, so you can go ahead into your terminal and type in pip install numpy. It's already installed for me. If pip doesn't work for you, try pip3 install numpy; that should work. So the first thing that's
important to know is how to initialize an array. We'll just say that a equals np.array, and then within this we basically pass in a list, so 1, 2, 3. This would be a one-dimensional array containing the values 1, 2, 3, as you see, and if you're not using a notebook you can go ahead and print a. OK, cool. We could also initialize a little bit more complex arrays, so we could do a 2D array of floats, and I could do that the following way: we have a list within a list. So here are some floating values, and then we're going to make this two-dimensional, so here are some more float values, and let's go ahead and print b. Cool. You can keep doing this; I can nest lists within a list within a list to create a three-dimensional array, etc.

Now that we know how to initialize arrays, some other useful things to know: how do you get the dimension of your NumPy arrays? If I do a.ndim, this tells me that a is one-dimensional, and if I did b.ndim it would be 2. Shape is another important one. So, get shape: if we do the first one, a.shape, this is a vector, so it's only going to tell me the one dimension because it only has one dimension, so it's size three. If I do b.shape, it's going to tell me the rows and the columns, so this is two rows and three columns, and it should print out two by three, as it does.

OK, other things: we want to know how much memory our NumPy arrays take up, so we can get the type and also the size. If we want to get the type, we just do a.dtype: int32 by default. So even though these are small values, by default it specifies that it should take up 4 bytes, being int32. If we wanted to specify what type we want it stored as, maybe because we knew that we didn't have many big values, we could do an int16, and that would take up less size, and you can see the difference in size. So right now it's int16, and if I want to see the size,
there's a couple of different, I guess, important functions for this. We could do a.itemsize, so this should tell me two bytes, as it does. If we left this as an int32, it will tell me four bytes down here, as it does. You can also get the total size: a.size is the total number of elements, so the total size would be a.size times a.itemsize. Another way to do that is just a.nbytes, and as you see, that's the same thing. You can also do this with b, so b.itemsize. These are floats, and I believe that this is an 8-byte type, so if I do b.itemsize, as you see, it's 8. So floats are bigger than integers, usually, unless you define the integers as int64. I usually don't even worry about the data type too much; I don't specify it. But if you really want to be efficient, try to specify it so that it fits all your data as tightly as possible.

All right, now that we've gone through some of the basics, let's actually show how we access and change specific elements, rows, columns, etc. Imagine we have this array; it's going to be a two-dimensional array. I'm going to make this kind of long, and you'll see why in a second. OK, so this is a two-by-seven array if I print that out, and I could prove that it's two by seven by doing a.shape; that's just a reminder. So what if we wanted to get a specific element? To do that, we can use this notation of row, column: this is the row index, this is the column index. Let's say I wanted to get this 13 right here. That would be in the second row, but because Python indexing starts at zero, it would be row index 1, and then 0, 1, 2, 3, 4, 5: column index 5. So yeah, that gives us the 13, as you see down here. One thing that's kind of cool is you can also use negative notation, similar to lists, so I could also say the negative second element
would be 13 as well, because this would be negative one and then negative two. So there's a couple of different ways to do this, but we'll stick with the first one.

OK, let's say we wanted to get a specific row. That's pretty straightforward as well. In this case, if we wanted the first row, we would do 0, and then, because we want all columns, we use the basic slice syntax, similar to lists: I can just do a single colon, and that will get me everything in the row. That's nice. What if we want a specific column? Well, if you know how to do rows, you probably know how to do columns. Let's say we wanted this column right here, 3 and 10: that would be all the rows and then, 0, 1, 2, column 2. That gives me the 3, 10.

From here we can do some even more tricky stuff, getting a little more fancy. We have the start index, the end index, and then finally the step size; this is just a reminder. So if I wanted to, let's say, get the numbers between 2 and 6, every other element, so 2, 4, and 6: I want the first row, and then I want to start at the first element, the 2, and I actually screwed that up, it should be 1. So I start at the 2, then I want to end here at the 6, and the end is exclusive, so I want to actually go to index 6, and then I want to step by 2, because I want 2, 4, 6. So I do 1:6:2, and that gives me 2, 4, 6. I can also use negatives here and do, like, negative 2. What happened there? Oh, see, that was going to give it to me backwards; I didn't want to change it there, I wanted to change the 6 to be negative 2. OK, it's exclusive, so I want this to actually be negative one. That's a little bit more of a fancy way to do that.

OK, so that's how you access elements. And then if we wanted to change something, it's pretty straightforward too. Say I wanted to change that 13 that I originally accessed: I can just set it to 20, and if I print out a now, that original element that was 13
is now 20. You can do the same thing for a series of numbers, like an entire column. Let's say we wanted to replace this 3, 10 column: I would do something like a[:, 2] equals, let's say I wanted it to be all fives, I could start like this, and as you see it's all fives. And then if I wanted it to be two different numbers, you just specify the same shape as what you've subsetted, so I'd do 1, 2, and now you see that we have a 1, 2 in that position.

Really quickly, let me just show a 3D example. We'll say b equals np.array of all this, and I'll print b. If we want to get a specific element here, the recommendation I have is to work outside in. Let's say I wanted this 4 right here. The farthest outside would be which one of these do I want, and I want the first set, so I want this area right here. If I wanted that, I would do b, 0, and now that I'm in here, I want the second row, so I want the 3, 4, so that would be 1. And now that I'm within this, I want the second element, which is index 1. So that gives me the 4. You can do similar stuff with the colons in here, so for each one of these dimensions that you're indexing, you can be all fancy with how you access elements. I can do something like this and get 3, 4, 7, 8. You can play around with this and see how changing different things changes what you get. And if you wanted to replace in this case, you basically just have to create a subsequence that's the same dimension. So if I did b[:, 1, :], this gives me 3, 4, 7, 8; let's say I wanted to change that to 9, 9, 8, 8. As long as it's the same dimension, it's going to work, so 9, 9, 8, 8. If I try to do something that isn't the same dimension, it's going to have an error.

All right, so that's the basics of indexing. At the end of the video I'll do a little challenge problem on some advanced indexing, so look at the end of the video for that.
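The indexing steps described above can be sketched roughly as follows. The array contents are an assumption, reconstructed from the values mentioned in the walkthrough (13 at row 1, column 5; the 3, 10 column; 3, 4, 7, 8 in the 3D case), and the variable names are only illustrative.

```python
import numpy as np

# Assumed 2x7 example array, reconstructed from the values mentioned above.
a = np.array([[1, 2, 3, 4, 5, 6, 7],
              [8, 9, 10, 11, 12, 13, 14]])

elem = a[1, 5]                  # row index 1, column index 5 -> 13
elem_neg = a[1, -2]             # same element via a negative column index
row = a[0, :]                   # the whole first row
col = a[:, 2]                   # the whole third column (3 and 10 at this point)
every_other = a[0, 1:6:2]       # start:end:step, end exclusive -> 2, 4, 6
every_other_neg = a[0, 1:-1:2]  # same selection with a negative end index

# Assignment uses the same notation.
a[1, 5] = 20                    # replace a single element
a[:, 2] = [1, 2]                # replace a column with a same-shaped sequence

# Assumed 3D example: work from the outside in.
b = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])
inner = b[0, 1, 1]              # first block, second row, second element -> 4
second_rows = b[:, 1, :].copy() # second row of every block: [[3, 4], [7, 8]]
b[:, 1, :] = [[9, 9], [8, 8]]   # replacement must match the selected shape
```

Note that `row` and `col` are views into `a`, so they reflect the later assignments; `second_rows` is copied so that it keeps the values from before the replacement.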
Let's go through how to initialize all sorts of different types of arrays. To start off, let's initialize an all-zeros matrix, and to do that there's a nice built-in function called np.zeros. All we really need to do is specify a shape. If I did np.zeros(5), it's going to just give me a vector of length five, but I can also pass in a more complex shape, so if I wanted it to be a two-by-two, or two-by-three, let's say, as you see there. I could do three-dimensional, 2 by 3 by 3; I could even do four-dimensional if I wanted to, 2 by 3 by 3 by 2. Yeah, it gets pretty crazy, but you can do all sorts of zeros with that. Next let's do an all-ones matrix, pretty similar to the last one: np.ones of, let's say, four by two by two, and there you go. You can also specify the data type here, so if you wanted all ones but int32, you can go ahead and do that. So, all ones, all zeros. However, you might want to initialize some matrix that's not ones or zeros but some other number. For that you can do np.full, and this one takes in two parameters. The first is the shape, so two by two, and the next is the value, so if I wanted all 99s, then it's a two-by-two with 99s. And that has a data type too, so set that to be float32; there you go. I'll put a link in the description to a list full of these array creation routines.

Another one that's useful to know: there's also this full_like method, and basically that just allows us to take a shape that's already built. Let's imagine we wanted to reuse the array a that we had in the last section; I think that's still loaded. I can make an array that's the same size, of fours, let's say, by doing full_like. Actually, I don't even have to pass in a.shape; I just have to pass in a. There you go. If I didn't use full_like, I would have to do np.full of a.shape. I don't know if
that's that useful for you, but I guess it's potentially good to know.

OK, next one. Let's say we wanted to initialize an array or a matrix of random numbers, random decimal numbers to start. To do that, we do np.random.rand, and we specify the shape, so let's say 4 by 2. Oh, OK, yeah, this one's a little bit different: instead of passing in a tuple, you pass in the integers of the shape directly. It's kind of a weird thing to remember. So if I did the four by two this way, I would actually pass it in like that. When you get errors like this, oftentimes you can just do a quick Google search and realize that that's what you need to do. I can even keep going, so I could do a four by two by three of random numbers between zero and one. I could also pass in something like a.shape; I don't know if this would work, let's try. Yeah, so if you wanted to pass in a shape, you can do np.random.random_sample of a.shape, and that, now you see, gives us the same shape as our a from up here. So there's rand, and then there's random_sample, which is another method. We'll keep it as a rand of four by two.

OK, what if you didn't want just decimal numbers, but you wanted random integer values? To do that, we can do np.random.randint, and in this one we're going to pass in the start value, or if you don't specify a start value it's going to just start at zero. And if you don't specify a shape, then it's just going to do one number. So let's say we wanted a three by three. What did I do wrong? This is not shape, it's actually size. All the documentation has these; you're not expected to memorize all of these things. What I think is helpful is to see that you can do these types of things, so when you're thinking about a problem, you can kind of point back, like, oh, I remember that's possible, and maybe do a Google search on how to do it. But yeah, randint
from 0 to 7 with size 3 by 3 is here. You could also specify a different start, so let's say I wanted 4 to 7, and if I keep running this, it's kind of cool, you can see it changing. It looks like that 7 is exclusive, so if I wanted to include 7, I would stop a little bit later. You could also throw in negative numbers here. Cool.

All right, what else? Other than random integers, maybe you wanted the identity matrix: you do np.identity of 3. This one only needs one parameter, because the identity matrix by its nature is going to be a square matrix. What else is useful? Maybe it's useful to repeat an array a few times. To do that, say we have the array 1, 2, 3, and let's say I wanted to repeat that three times: you use np.repeat, passing in the array you want to repeat, and then let's print r1 and see what happens. OK. And then if I specify axis equals zero, it didn't do anything different. What I can do is make this a two-dimensional array; I think because it was a vector, it didn't do what I wanted. What I wanted is 1, 2, 3; 1, 2, 3; 1, 2, 3. So now I've made this a two-dimensional array, and it will repeat the inner part on the zeroth axis, basically making it rows. There you go. And if I make the axis equal to one, that's going to be what we saw before. Cool.

OK, so next, here's a picture of an array I want you to try to initialize using everything that we just went through, all these different methods. Look at this picture and then try to put it together without just manually typing out all the numbers, because this one isn't too big, but if you got into a matrix that was massive, you'd want to know how to build it up using these fundamental concepts.

OK, so here's the solution to that. I can do output equals, and I'm going to start with making everything ones, so np.ones, and we have a 5 by 5 of ones. Print output; this is what I have now. OK, and now basically what we're going
to do is fill in this middle part with zeros. So z, I want to say, equals np.zeros, and that's going to be a three by three, and if I print z, now we have this. Now what I can do is fill in the middle element, so that's 1, 1, with a nine, and now if I print z we get this. Then finally we need to replace the middle part of the ones matrix. So, output, the middle part: that's going to be the first row to the third row, and I want the same thing with columns, because it's the middle, first column to the third column. And actually the end is exclusive, so it needs to go to four. That's going to equal z, and now what happens when I print output is, yay, we got what we were looking for. One thing that I think is nice is that instead of using four, I can also do negative one, so basically from the first element to the last. And as you see, it didn't change.

The last thing on initialization I want to go through is a little bit different; it's the concept of copying, something you've got to be really careful about, so I'm just going to quickly mention it. OK, so imagine we have an array, let's call it a, and a is just a normal array, as you can see. Let's say we want to make b a direct copy of a. So I'm going to just do b equals a and then print out b, and as you can see it's still 1, 2, 3. So I'm like, OK, I have this copy, things are cool, it's fine. I want to change the first element in b, so I'm going to do b[0] equals 100. Here's the issue: I print out b, looks good. The issue lies in if I print out a. Look what happens: I just printed out a, and a now has a 100 instead of the 1, 2, 3 that I initially set it as. That's because when we did b equals a, we just said that the variable name b points to the same thing as a does. We didn't tell NumPy to make a copy of the contents of a. So that's why: because we're just
pointing at the same exact thing that a is pointing at, when we change the value, it also changes the value of a. If we want to prevent that, we can use the .copy function: b equals a.copy(), and then when we run the cell, as you can see, 1, 2, 3 is still there, because now we're just copying the contents of what's in a, and if I print b, it has the 100, 2, 3.

OK, so one of the big uses of NumPy is all the math capabilities it offers. Just to show some of that, one thing it can do is element-wise arithmetic: element-wise addition, subtraction, and so on. (I want to make this four values.) Here we have a printed out, and if I wanted to do something like a plus 2, it adds two to each element. You can do a minus 2, which subtracts two from each element; a times 2, as you can see; a divided by 2 divides everything by two. You can also do stuff like a plus-equals 2, so now if I printed out a, it's going to be two plus everything. It's kind of cool; you can do the same type of math that you can do in Python. You can also create another array, b equals np.array of, let's say, 1, 0, 1, 0, and I can do something like a plus b, and that should be 2, 2, 4, 4. Oh, and because I added to a earlier, I need to rerun this. OK, 2, 2, 4, 4, like we expect. So all sorts of useful things. You could even do a to the second power: 1, 4, 9, 16, and that might have made it a bigger data type, I'm not sure. Cool. We can do stuff like take the sine of all the values, so let's say we had a: we do np.sin, pass in a, and it gives us the sine of all those values, and you have the cosine of all those values too. All sorts of useful things that you can perform on an entire array or entire matrix all at once. If you want all the different things that you can do, I'll paste in a link here. As I mentioned before, I have this on my GitHub, so if you look in the description you can
find this exact notebook. So yeah, look up the routines right here for math; all sorts of cool stuff.

All right, moving on. We're going to still be in math, but let's jump into linear algebra type stuff. What we just did was kind of basic functions you do on elements. Linear algebra: I feel like when I'm using MATLAB, I'd be doing this linear algebra type stuff. Let's say we have two matrices. The big difference with linear algebra is that we're not doing element-wise computation. In the earlier case, with a times b, we were doing element-wise computation; in linear algebra you're trying to multiply matrices, and that's a different process. So let's say we have two matrices. We'll have a, and I'm going to use the syntax we learned about earlier. I want to say this is a 2-by-3 matrix of all twos; actually, let's make this a 2-by-3 matrix of ones. So we have a, as you can see, and then we'll have b, which is equal to np.full; it's going to be a 3 by 2, and it's going to have the value 2. So if I print out b, now we have this. If you remember linear algebra, you have to have the columns of the first matrix be equal to the rows of the second one. As you can see, this has three columns and this has three rows, so we're good there. We would multiply this row by this column, and you do the process of matrix multiplication; I won't walk through the whole thing, but we should end up with a 2-by-2 matrix at the end. If we want to do matrix multiplication, it doesn't just automatically perform it: if you try to do a times b, it's not going to work, because these are different sizes. What we can do is use np.matmul, the matrix multiply function, and if I pass in a and then pass in b, we get 6, 6, 6, 6. Did I say enough sixes? I don't know. But yeah, it multiplied those two matrices. And if I try to switch up this dimension in the middle, it's not going to work, because it's now incompatible. That's matrix
multiplication. You might also want to do some other stuff with matrices, so let's imagine I wanted to find, let's say, the determinant of a matrix. As a sanity check, we could make c equal the identity matrix, and if you're familiar with linear algebra, you know that the identity matrix has a determinant of one. So if I do np.linalg.det of c, we should get one: 1.0, as we get. So that's find determinant, and there's all sorts of other good things like eigenvalues, or the inverse of a matrix, so what do you multiply a matrix by to get the identity matrix. All sorts of good stuff on that. If you want information on the other types of linear algebra functions, here is some useful information; definitely go to this link. As I've said a couple of times in this video, this notebook is on my GitHub page, so you can find all this there. But yeah, there are so many different things that you can do with matrices and linear algebra using the NumPy library.

OK, continuing onwards, let's look at some statistics with NumPy. The easiest things we might think about when we think about statistics are min, mean, max, etc. So let's say we have this array, and let's say we wanted to take the min of it: you can just do np.min of stats, and that's going to give us the one that you see there. You do np.max of stats: six. You could also do it on a row basis, so if I said axis equals one, that's going to give me the min of the first row and the min of the second row. Or maybe this is a better way to see it: if I said axis equals 0, it's going to give me all the values that are up top here, because those are all the mins. So yeah, you can do all sorts of cool stuff with min and max. Same thing with max: let's say axis equals one, and we get 3 and 6; 3 is the biggest value in the first row and 6 is the biggest value in the second. You can also do np.sum of stats. If I do
it just as is, it's going to sum up all of the elements in the matrix. And same thing, I can do it by row or column, so axis equals 0 is going to add up all these terms going downwards.

Next, let's talk a little bit about reorganizing arrays. I would say the key method within reorganizing arrays is reshape. So if I have an array, I want to call it before, and let's say that it's equal to this value right here. So we have before; I'll print before out, and it looks like that. Now let's say that instead of the shape it currently has, a two by four, we wanted to make it, I don't know, an eight by one or something, or maybe a four by two. There are all sorts of different things we could do. I'll start with eight by one. So we have before, and if we wanted to make it something else, we can do after equals before.reshape, and then we pass in the new size we want it to have. If we wanted it to be an eight by one, we pass it in like that, and we can print out after. As you can see, it's an eight by one now. I could also say maybe I wanted it to be a four by two, so now you've got that. You could even pass it in as a two by two by two. As long as it has the same number of values, it's fair game. So as you see, two by two by two still works with reshape. What doesn't work is if you wanted it to be, say, two by three: the values don't fit. So when you get errors using reshape, it's usually because there's a mismatch between the shape you're trying to resize it to and the original shape.

Moving onwards, let's look at vertical stacks, so vertically stacking vectors or matrices. Dimensions are important in vertical stacking as well. Let's say we had these two arrays. If I wanted to stack 1, 2, 3, 4 on top of 5, 6, 7, 8, I can do np.vstack and pass in v1 and v2, and as you see, now they're part of the same matrix, and 1, 2, 3, 4 is on top of 5, 6, 7, 8. What I can even do is keep passing these
in. So let's say I wanted three copies of this 5, 6, 7, 8 and only one copy of the other, or I could interweave them. That's a vertical stack.

Horizontal stacks are pretty similar. And also note here that I can't do that if the sizes mismatch. So yep, horizontal stack, very similar. Let's say we had, and we'll use some notation we've learned before, these two matrices. If I printed out h1, you get that, and then h2 is this. Well, if I want h2 to be on the back of h1, I can just do np.hstack, horizontal stack, and pass in h1 and h2. And that did not work, because I did not surround this in parentheses. Either parentheses or brackets; I think they both work. Yeah, there you go. So now we've horizontally stacked the zeros on top of, er, to the right of the ones.

All right, let's get into some miscellaneous things. First off, imagine you have some sort of text file with all sorts of data, and for whatever reason you don't want to use pandas, but you want to load all that data from that file into a NumPy array. Well, we can do that without too much trouble. I have this text file that I created, as you can see here. This is on my GitHub page; you can download it there. It's just really simple data, but it shows kind of what you can do with it. It's all delimited by commas, and it's called data.txt. What I can do is use this function called np.genfromtxt. I pass in the name of the file, which is data.txt, and then I pass in a delimiter, which is the separator, and that's a comma. If I do that, I see that I get the data I just showed you (let me increase the zoom here), and I get it as an array. So that's pretty nice. I'll just call this filedata. One thing you'll notice, though, is that it automatically cast it to a float type. What if I wanted it to be an integer? Well, I can use another function, astype, which basically copies all the data into whatever format
you specify. So I'll say int32, and as you can see, now all of this shows as integers. But if I go ahead and print filedata now, it is back to what we had originally. The reason it's back is that astype actually makes a copy: because the float type and the int32 type are different sizes, it can't just copy everything in place; it doesn't really make sense to. So if I did filedata equals filedata.astype of int32 and then printed out filedata, as you can see, now it's all integers. So that's how you load data from a file. You could change up this delimiter based on how your data is split, and I think genfromtxt will handle your newline breaks properly if that's how it's formatted. Write in the comments if you have any questions about this.

Okay, the second thing I want to go through... what happened there? I didn't want that to be markdown. The second thing I want to go through in this miscellaneous section is some advanced indexing. There's some really cool stuff you can do with NumPy; I'm going to call it boolean masking and advanced indexing. So what can we do here? Let's say I wanted to learn where in filedata the value is greater than 50. If I just type in filedata greater than 50, it's pretty cool that you get False or True based on whether that specific location was greater than 50. As you can see, there are four Falses and then a True. If we go to our data: four Falses, and then 196 is in fact greater than 50. So that's one way, and you can do all sorts of cool stuff with this, like greater than or equal to, all sorts of different combinations. One thing that's pretty neat is that you can take filedata and index it based on where it is greater than 50, and by doing this you grab only the values that are actually greater than 50. So that is pretty cool. And kind of the reason this works is something I did not mention until now: you can index with a list in NumPy, which is pretty cool. So if you have the array one,
two, three, four, five, six, seven, eight, nine, and I wanted, say, the zeroth spot, the second spot, and then the last spot... let's say that this array is a. I could do a of 0, 1, and... or say I wanted 2, 3, and 9, so I would do 1, 2, and then 8. As you see, that gives me 2, 3, and 9. I passed in a list and it indexed those spots. It also works if you had Trues and Falses: basically, if it's True, it knows to take the value, and if it's False, it doesn't. So that's why this up here works.

We could do all sorts of other things. Let's say I wanted to figure out if any value in any of these columns was greater than 50. I can do np.any of filedata greater than 50, with an axis of 0. That should tell me, if we looked downwards on all of these, whether any of the values are greater than 50. Let's see what happens. So: False, False, False, False, True. That's correct. True: these two values are greater than 50, even though this one isn't. False, True. Yeah: True, True, False, True. Cool. So this is telling us which columns have a value greater than 50. I could also do np.all, and as you see, there are fewer Trues in this case. I think the only time that all of the values are greater than 50 is right here. Yeah, you see there's one True in the fifth spot, which corresponds to this right here. What else can we do with this? If you did axis equals 1, you're just going to get rows.

You can also do multiple conditions. So I could say I want filedata to be greater than 50 and filedata to be less than 100. This syntax is very similar to pandas, and as I said before, NumPy is the base of pandas, so it makes sense. No: "ValueError: The truth value of an array with more than one element is ambiguous." How do I do this? I think if I do something like this, it will work. No... let's see. No, I need to "and" that (the & operator). Yeah, cool. So this is all the values that are greater than 50 but less than 100, and so the first True should happen at the sixth spot: one, two, three, four, five, six, as you
see. And I could do something like this if I wanted to negate all of it: this symbol here means not, so this is not (greater than 50 and less than 100). This is going to be the reverse of what we just did. So yeah, now the sixth spot is the first False. So this meant not. There's all sorts of cool stuff you can do with this boolean masking and advanced indexing, with any sort of condition. I'll put a link to some more information about this.

All right, a quick little quiz on indexing. This uses all sorts of the advanced stuff that you just learned in that last section, and then also some of that original stuff. Basically, pause the video after I ask each question and try to figure out what the command would be. First question: we have this matrix, and how would you index this part of it? This is the second and third row and the first and second column, or rather the zeroth and first column. So it looks something like this: rows, columns.

Next question: how would you index this? This is something we haven't done before, but with that last section you might have an idea. If not, no worries. To do this one, you need to use two different lists within your indexing. So it's going to look something like this: we need the zeroth, first, second, and third rows, and then the first, second, third, and fourth columns. That's what that is.

And then the final question: how would you index this? This is also something we haven't directly looked at, but you might be able to get it, especially after that last one. Take a second. All right, that would look something like this, where you get the zeroth, fourth, and fifth rows, and then you want columns three onwards. So writing 3 followed by a colon works; you could also do 3 to 5, or you could pass a list of 3, 4. But yeah, that's one way to do it. It's a fun little quiz; I guess it's good to revisit this type of thing and think critically about it.

All right, thank you guys very much for watching. I think this is
all I have for this video. If you've learned anything, make sure to throw this video a thumbs up and also subscribe, because I'm going to be making some very useful tutorials (I hope, at least, that they're useful), and I'm also going to be demonstrating some cool projects that I'm working on. So it'd be awesome if you subscribed. That's all we have today. Peace!
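For readers following along without the notebook, here is a minimal sketch of the calls narrated in this part of the video. It is a reconstruction from the spoken description, not the author's notebook: it assumes NumPy is installed and imported as np, the variable names (product, row_mins, nums, and so on) are chosen here for illustration, and the two rows of sample text standing in for data.txt are hypothetical, since the file's contents are not reproduced in the captions.

```python
import io
import numpy as np

# Linear algebra: matrix product rather than element-wise product
a = np.ones((2, 3))            # 2x3 matrix of ones
b = np.full((3, 2), 2)         # 3x2 matrix filled with the value 2
product = np.matmul(a, b)      # 2x2 result; each entry is 1*2 + 1*2 + 1*2 = 6

c = np.identity(3)
det_c = np.linalg.det(c)       # determinant of the identity matrix is 1.0

# Statistics: whole-array and per-axis reductions
stats = np.array([[1, 2, 3], [4, 5, 6]])
row_mins = np.min(stats, axis=1)   # one minimum per row    -> [1, 4]
col_mins = np.min(stats, axis=0)   # one minimum per column -> [1, 2, 3]
col_sums = np.sum(stats, axis=0)   # sums going downwards   -> [5, 7, 9]

# Reshaping: the total number of values must stay the same (8 here)
before = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
after = before.reshape((2, 2, 2))

# Stacking: the arrays go in together as one list or tuple
v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])
stacked_v = np.vstack([v1, v2, v2])   # shape (3, 4)
h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))
stacked_h = np.hstack((h1, h2))       # shape (2, 6)

# Loading comma-delimited text. The values below are HYPOTHETICAL
# stand-ins for data.txt; with the real file, pass 'data.txt' instead.
sample_text = io.StringIO("1,13,21,196\n75,4,88,3")
filedata = np.genfromtxt(sample_text, delimiter=",")  # floats by default
filedata = filedata.astype("int32")   # astype returns a copy, so reassign

# Boolean masking and indexing with a list
over_50 = filedata[filedata > 50]                 # -> [196, 75, 88]
mask = (filedata > 50) & (filedata < 100)         # & plus parentheses, not `and`
between = filedata[mask]                          # -> [75, 88]
any_col = np.any(filedata > 50, axis=0)           # per-column check
nums = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
picked = nums[[1, 2, 8]]                          # -> [2, 3, 9]
```

Writing the condition as (filedata > 50) and (filedata < 100) would raise the ambiguous truth value error read out in the video, which is why the sketch combines the two masks with & and wraps each comparison in parentheses.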
Info
Channel: Keith Galli
Views: 496,762
Rating: 4.9622455 out of 5
Keywords: Keith Galli, Python, Python Programming, Python3, NumPy, Numerical Python, SciPy, Matrix, Array, NumPy vs Lists, NumPy Tutorial, Linear Algebra, Math in Python, Why use NumPy, Beginner Numpy Tutorial, numpy tips, numpy tricks, Linear algebra python, how to use numpy, scipy tutorial, matlab, benefits of numpy, numpy lecture, Advanced NumPy, Numpy lesson, how to learn numpy, array library, lists, python lists, fast, performance, dimension, data science, initialize, analysis, data, science
Id: GB9ByFAIAH4
Length: 58min 40sec (3520 seconds)
Published: Wed Jul 10 2019