COS 302: Matrix Basics

Captions
So now we're going to talk about matrices. Unsurprisingly, matrices are incredibly important to linear algebra: they're the basis for many of the computations, and many of the ways we think about solving problems with vectors. Here I'm just going to talk about them in the abstract, along with some of the algebraic properties you'll need for other things we do in the class.

To start off, you can think of a matrix as just a table of numbers. In the book and in material for the class, you're generally going to see matrices written with a bold capital letter. I can't write bold here, so at the blackboard or lightboard I'll put two lines under a letter to indicate that it's a matrix, much like I put one line under a letter when I write vectors. We can think of a matrix A as a big table: the scalar entry a_11 in the corner, then a_12 and so on out to a_1n if there are n columns, and down through m rows to a_mn, giving an m-by-n matrix. We often write this as A ∈ ℝ^(m×n), where ℝ is, remember, the set of real numbers. This is analogous to the way we talked about vectors living in ℝ^n: these matrices live in the space ℝ^(m×n), with m rows and n columns.

Sometimes a matrix has only one row or only one column; in that case we might call it a row vector or a column vector, respectively. A matrix B that happens to have only one row (m = 1) looks like [b_11 b_12 ... b_1n], and we might call it a row vector. Similarly, a matrix C with only one column, with entries c_11, c_21, down to c_m1, we might call a column vector.

Matrices add element-wise, much in the same way that vectors do, at least vectors in ℝ^n. If two matrices A and B are the same size, then A + B is the matrix with a_11 + b_11 in the corner, out to a_1n + b_1n across the first row, down to a_m1 + b_m1, and the same across all the entries. So addition just applies entry by entry; nothing surprising here, just what you would expect from the vector case. The other thing we can do, of course, is multiply by scalars, again just as with vectors in ℝ^n: if we have some scalar γ, then γA just multiplies each of the entries in the matrix, giving γa_11, ..., γa_1n, down to γa_m1, ..., γa_mn.
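The lecture itself doesn't use code, but a minimal NumPy sketch of these element-wise operations can be handy for checking the algebra (the particular arrays here are made up for illustration):

```python
import numpy as np

# A 2x3 matrix: m = 2 rows, n = 3 columns, so A lives in R^(2x3).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
B = np.array([[10.0, 20.0, 30.0],
              [40.0, 50.0, 60.0]])

# Addition is element-wise: (A + B)[i, j] == A[i, j] + B[i, j].
print(A + B)

# Scalar multiplication scales every entry: (gamma * A)[i, j] == gamma * A[i, j].
gamma = 2.0
print(gamma * A)

# A row vector is a 1 x n matrix; a column vector is an m x 1 matrix.
row = np.array([[1.0, 2.0, 3.0]])      # shape (1, 3)
col = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1)
print(row.shape, col.shape)
```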
One thing we often do with matrices is transpose them, which makes sense because they're tables of numbers: transposing flips the table, so rows become columns and columns become rows. If A ∈ ℝ^(m×n), then its transpose, which we write with a little T superscript, satisfies A^T ∈ ℝ^(n×m). Each entry gets the opposite indexing scheme: a_11 stays up in the corner, but now at the end of the first row we have a_m1, at the bottom of the first column we have a_1n, and a_mn stays in the far corner. What's happening is that any entry in the matrix is basically hopping across the diagonal, and vice versa.

One special kind of matrix is a square matrix (m = n) that is equal to its transpose. If A = A^T, we call A a symmetric matrix.

One of the things that's kind of funny about matrices is that they multiply in a particular, special way, and it generalizes the dot product that you're probably already used to. Presumably you've seen things like x · y for two directed-line-segment-type vectors: if x = a î + b ĵ and y = c î + d ĵ, then the dot product x · y is the scalar quantity ac + bd. Of course, this is a thing we can easily generalize to arbitrary vectors in ℝ^n: if x ∈ ℝ^n and y ∈ ℝ^n, then x · y = Σ_{i=1}^{n} x_i y_i, the sum over the indices i from 1 to n of the i-th component of x multiplied by the i-th component of y. We'll use this summation symbol quite a lot; it's just a concise way to write a sum over all the components.

Now let's take this a step further and imagine that x is a row vector [x_1 x_2 ... x_n] and y is a column vector with entries y_1, y_2, down to y_n. Multiplying them gives a scalar that is the same quantity as before: the sum from i = 1 to n of x_i y_i. It's just a different way of writing the dot product of these two things. In fact, in linear algebra and machine learning we often don't use a dot product symbol at all; we write it in a slightly different way. We imagine that all of our regular vectors, things like x and y that live in ℝ^n, are column vectors, so y is a column vector and x is a column vector that has been transposed, and we write the whole thing as x^T y. This is the same thing as writing x · y in other situations. The appeal of writing things this way is that it connects the dot product you're used to to matrix multiplication.

So then we can just do the same thing with matrices. Matrix multiplication, just to remind you, is the idea that we can multiply two matrices that share a dimension. Imagine a matrix A with dimensions m by n, and another matrix B that we want to multiply it by, where n is its row dimension and p is its other dimension. When we multiply these together, we take every one of the m rows of A and multiply it by the associated column in B, and we do that for all possible pairs of rows and columns. This gives us a new matrix, in this case m by p, where each entry corresponds to the inner product between a row vector in A and the associated column vector in B, for all possible pairs of row and column vectors.
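Again as a sketch rather than anything from the lecture, here is the transpose, the x^T y view of the dot product, and the shape rule for matrix multiplication in NumPy (arrays chosen arbitrarily):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # A in R^(2x3)
print(A.T.shape)                  # (3, 2): rows and columns swap

# A square matrix is symmetric when it equals its own transpose.
S = np.array([[1.0, 7.0],
              [7.0, 4.0]])
print(np.array_equal(S, S.T))     # True

# Treating x and y as column vectors, the dot product is x^T y.
x = np.array([[1.0], [2.0], [3.0]])
y = np.array([[4.0], [5.0], [6.0]])
print(x.T @ y)                       # [[32.]] == 1*4 + 2*5 + 3*6
print(np.dot(x.ravel(), y.ravel()))  # 32.0, the same scalar

# Shapes must share the inner dimension: (m x n) @ (n x p) -> (m x p).
B = np.random.rand(3, 4)          # B in R^(3x4)
print((A @ B).shape)              # (2, 4)
```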
This row-times-column picture is potentially a little awkward to think about, so it's a case where writing out the summation convention can be more convenient. Let's call the product matrix C. Then the (i, j)-th entry of C is the sum c_ij = Σ_{k=1}^{n} a_ik b_kj, where n is the inner dimension. The key thing to notice here is that in A we're grabbing a row, with the dummy variable k running over the second index, and in B we're grabbing a column, with the dummy variable k running over the first index.

Some key properties to be aware of. The first is that, in general, matrix multiplication does not commute: AB in general does not equal BA. In fact, in the simple example here you can see that the dimensions wouldn't even make sense, because for BA the inner dimensions would be p and m. However, matrix multiplication is associative: (AB)C does equal A(BC). Matrix multiplication is also distributive: A(B + C) does equal AB + AC. The other thing to realize is the way matrix multiplication interacts with the transpose: if I have (AB)^T, what happens is it transposes each of the inner matrices and then flips their order, so we get B^T A^T.

Ninety percent of the time, when you're multiplying matrices, you're doing it the way I just showed you, computing the inner product between row and column vectors for all pairs of rows and columns. However, sometimes in machine learning you actually do need the element-wise product. That is to say, if I have a pair of matrices A and B of exactly the same size, sometimes I'll write an element-wise product C = A ⊙ B, by which I mean the entry c_ij is just the ordinary product of the two entries a_ij and b_ij: all we're doing is multiplying the entries together directly. Sometimes people call this a Hadamard product. When in doubt, it's always the regular matrix product and not the element-wise product, but you do sometimes see notation like this in machine learning. Note again that for this to work the matrices have to have identical dimensions, and note also that this product does commute in general, since it's just a bunch of scalar multiplications.

The last topic I want to talk about here is the idea of a matrix inverse. Matrix inverses don't always exist, and when they do exist, they only apply to square matrices. The first thing you need to know, though, is the idea of an identity matrix: a matrix that does not change the value of another matrix when it's multiplied by it. This is a matrix we often write as I, and since identity matrices are square, if it's n by n we might write I_n, which is what the book does. It has ones along the diagonal and zeros everywhere else, and it has the property that if I take some matrix A that's compatible with the dimension of the identity matrix and multiply it by the identity, I just get A back: AI = A. It's also true, subject to dimension compatibility, that if I pre-multiply by the identity, I get A back: IA = A. This is the key property we need in order to be able to talk about matrix inversion.
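To make the index formula concrete, this sketch computes one entry of C = AB by the explicit sum over k and checks it against NumPy's built-in product, along with the transpose rule and the Hadamard product (the arrays are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 3))   # m x n
B = rng.random((3, 4))   # n x p
C = A @ B                # m x p

# c_ij = sum over k of a_ik * b_kj: row i of A against column j of B.
i, j = 1, 2
c_ij = sum(A[i, k] * B[k, j] for k in range(A.shape[1]))
print(np.isclose(c_ij, C[i, j]))           # True

# (A B)^T == B^T A^T: transposing a product flips the order.
print(np.allclose((A @ B).T, B.T @ A.T))   # True

# The Hadamard (element-wise) product needs identical shapes; it commutes.
X = rng.random((2, 3))
print(np.allclose(A * X, X * A))           # True

# The identity matrix leaves a compatible matrix unchanged: AI = A, IA = A.
print(np.allclose(A @ np.eye(3), A))       # True
print(np.allclose(np.eye(2) @ A, A))       # True
```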
So, if I have a square matrix A, and if there is a matrix B such that AB = I, equal to the identity, then B is the inverse of A, which we usually write as A^(-1). Again, let me emphasize that the inverse does not necessarily exist, but if it does exist, it is unique.

A couple of useful properties to know, in the case where the inverse exists. It commutes with its matrix: AA^(-1) = A^(-1)A. If I take the inverse of a product AB, and both inverses exist and the inverse of the product exists, then (AB)^(-1) = B^(-1)A^(-1): the inverse of each factor, with the order switched. And in particular, a thing to note is that the inverse of a sum is not the sum of the inverses in general: (A + B)^(-1) is not going to equal A^(-1) + B^(-1). You can sort of convince yourself why this must be true by imagining that A and B were 1-by-1 matrices, that is, scalars, where it clearly wouldn't work in general. Say A were 2 and B were 3: then (A + B)^(-1) is 1/5, but that is not equal to 1/2 + 1/3. That's just a simple example to demonstrate why we would not expect this to be true.
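A final sketch, checking these inverse properties numerically; np.linalg.inv raises an error for singular matrices, matching the point that inverses don't always exist (the matrices are arbitrary invertible examples):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])

A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

# The inverse commutes with its matrix: A A^-1 == A^-1 A == I.
print(np.allclose(A @ A_inv, np.eye(2)))   # True
print(np.allclose(A_inv @ A, np.eye(2)))   # True

# (A B)^-1 == B^-1 A^-1: order flips, just like the transpose of a product.
print(np.allclose(np.linalg.inv(A @ B), B_inv @ A_inv))   # True

# The inverse of a sum is NOT the sum of the inverses, in general.
print(np.allclose(np.linalg.inv(A + B), A_inv + B_inv))   # False

# A singular (non-invertible) matrix has no inverse at all.
try:
    np.linalg.inv(np.array([[1.0, 2.0], [2.0, 4.0]]))
except np.linalg.LinAlgError:
    print("singular matrix: no inverse exists")
```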
Info
Channel: Intelligent Systems Lab
Views: 333
Rating: 5 out of 5
Id: 2KAxE47AOVY
Length: 18min 43sec (1123 seconds)
Published: Sun Jan 31 2021