Word Embedding - Natural Language Processing | Deep Learning

Video Statistics and Information

Captions
Hello, my name is Krish, welcome to my YouTube channel. Today in this video we will be discussing word embedding. In the natural language processing playlist I have already completed bag of words and TF-IDF, and in the deep learning playlist I have covered the theoretical part of LSTM so far. The main thing now is the implementation, so this topic will be added to both playlists, because it is important with respect to NLP as well as deep learning. When we come to the practical implementation of LSTM and recurrent neural networks, suppose I want to do sentiment analysis on text data; at that time I can efficiently use the embedding layer that is available in Keras. But before that, we need to understand what exactly word embedding is.

If I go back, I have already taught you bag of words and TF-IDF. You have seen the bag-of-words representation and you also understood its disadvantage: in bag of words you do not get much semantic information, and in TF-IDF you get only a bit of it. The third technique, which overcomes these disadvantages, is word embedding. In word embeddings there are two main techniques: one is Word2Vec and the other is GloVe. I will be covering both in the NLP playlist; I have already uploaded the explanation of Word2Vec, and we will also take up GloVe. So let us understand what exactly the embedding technique is and how it overcomes the disadvantages of TF-IDF, because the most important thing in NLP is text preprocessing: converting text data into some vector representation so that the algorithm is able to generalize over the words to do predictions, sentence generation and many other things.

Let us first understand some basic terms with respect to word representation. Suppose I have a dictionary with around ten thousand words. If you remember bag of words, I can also call this a one-hot representation. Suppose I take a word like "man", and it is present in this dictionary at index 5,000 (remember, in the dictionary all the words are stored in sorted order with an index). If I want to convert "man" into a vector in one-hot form, I take a ten-thousand-dimensional vector in which every position is zero except the position at index 5,000, which becomes one. Similarly, suppose I have one more word like "human", and it is present at index 9,000; its vector will look the same, except that the one will be at index 9,000.
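As a minimal sketch of this one-hot idea (not code from the video; the 10,000-word vocabulary and the index positions 5,000 and 9,000 are just the illustrative numbers used in the example):

```python
import numpy as np

# Toy one-hot representation. vocab_size and the index positions are the
# illustrative numbers from the example, not a real dictionary.
vocab_size = 10_000
word_index = {"man": 5_000, "human": 9_000}

def one_hot(word: str) -> np.ndarray:
    vec = np.zeros(vocab_size)
    vec[word_index[word]] = 1.0   # a single 1 at the word's dictionary index
    return vec

man_vec = one_hot("man")
human_vec = one_hot("human")
print(man_vec.shape)   # (10000,) -> very high-dimensional
print(man_vec.sum())   # 1.0      -> sparse: one 1, everything else 0
```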
So with the help of one-hot representation we can convert a word into this kind of vector. Remember that bag of words and TF-IDF convert words in different ways, which again have their own disadvantages; the efficient way to do it is word embedding, but before understanding word embedding I am just showing one-hot representation as an example. So this is the vector representation for "man" and this is the vector representation for "human". Understand one thing: the size of each vector is around ten thousand dimensions, because there are ten thousand words, so we are converting "man" into a ten-thousand-dimensional vector, and this gives a very sparse matrix. If I have many, many words, every representation will be of this form: zeros everywhere and a single index holding a one.

Now, what is the main problem with this representation? When a machine learning or deep learning algorithm is applied, it is very difficult to generalize from these vectors. If I want to find similar words, it is very difficult, because in every word's vector only one position is one and all the rest are zero, so the similarity between words cannot be found; there is not much semantic information. By semantic information I basically mean being able to find similar words. Let me take a good example. Suppose one sentence is "I want to eat a pineapple cake" and another is "I want to eat an apple cake", and my training data contains the sentence "I want to eat a pineapple cake". With this one-hot representation I cannot conclude that "pineapple" and "apple" are similar, so if I want to predict what comes after "apple", the model will not be able to generate the sentence; that is just an example. So this representation has a lot of disadvantages: it is sparse and the size is very big, around ten thousand dimensions. Just imagine ten thousand dimensions: it becomes very difficult to work with, and the model will also not give you very good results. So this is a very high-dimensional, sparse representation; a sparse matrix basically means you have a huge number of zeros and very few ones.
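To make the "no semantic information" point concrete, here is a small sketch (the words and index positions are hypothetical) showing that any two distinct one-hot vectors have zero cosine similarity, so "pineapple" looks no closer to "apple" than to any other word:

```python
import numpy as np

vocab_size = 10_000
# Hypothetical index positions, only for illustration.
word_index = {"apple": 1_200, "pineapple": 6_700, "car": 8_100}

def one_hot(word):
    vec = np.zeros(vocab_size)
    vec[word_index[word]] = 1.0
    return vec

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Every pair of different words is orthogonal, so all similarities are 0:
print(cosine(one_hot("apple"), one_hot("pineapple")))  # 0.0
print(cosine(one_hot("apple"), one_hot("car")))        # 0.0
```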
Now, in order to overcome this disadvantage, there is a concept called word embedding, and in word embedding there is the concept of feature representation. What we do is take all these ten thousand words, or suppose I consider words like boy, girl, king, queen, apple and mango, and we try to convert them into vectors based on some features. What are the features? Suppose we take features like gender, royal, age, food and so on, somewhere around 300 features in total; they can be any features, for example action, or good versus bad. Based on these features, each word is converted into a vector.

How does a feature behave? Gender is related to boy, girl, king and queen, but it is not related to apple and mango; if I take the feature fruit, I can relate it to apple and mango, but not to the other words. Suppose for the gender feature boy gets a value like -1 and girl gets 1, which uniquely represent them, and king and queen get values like -0.92 and 0.93, but apple and mango get values near zero, like 0.00 and 0.01. So you can clearly see that gender gives a meaningful representation for boy, girl, king and queen. Similarly, if I take the royal feature, I can only relate it to king and queen, not to the other words, so for those two words its value will be a little higher, and the values for similar words will be very close to each other.

So instead of the one-hot representation, we focus on creating this featurized representation: for each word, based on its index, we have some feature values, and with the help of those features the word is represented as a vector. What is the advantage? Suppose I consider 300 features; then the vector for each word has around 300 dimensions. So whereas in one-hot representation we have a high-dimensional, sparse matrix, here we have a low-dimensional, dense matrix. Now, why is this useful? I made a video on cosine similarity in my machine learning playlist; you can go and check that video, it will help you find the similarity between two vectors.
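A toy version of this featurized representation might look as follows (the feature names and numbers are made up for illustration, loosely following the gender/royal example; a real embedding has around 300 learned dimensions whose meanings are not labelled):

```python
import numpy as np

# Rows: words, columns: hand-picked "features" (gender, royal, age, food).
# These values are invented for illustration; real embeddings are learned from data.
embeddings = {
    "boy":   np.array([-1.00, 0.01, 0.03, 0.00]),
    "girl":  np.array([ 1.00, 0.02, 0.02, 0.00]),
    "king":  np.array([-0.92, 0.95, 0.70, 0.01]),
    "queen": np.array([ 0.93, 0.96, 0.68, 0.01]),
    "apple": np.array([ 0.00, 0.01, 0.05, 0.97]),
    "mango": np.array([ 0.01, 0.00, 0.06, 0.95]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Dense vectors let us measure similarity between words:
print(round(cosine(embeddings["apple"], embeddings["mango"]), 2))  # close to 1: similar words
print(round(cosine(embeddings["apple"], embeddings["king"]), 2))   # close to 0: unrelated words
```

Each word is now a short dense vector instead of a ten-thousand-dimensional sparse one, which is exactly the low-dimensional, dense matrix described above.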
Now let us take an example. Suppose I want to find the analogy "boy is to girl as king is to what?". How is this computed? I know the vectors for boy and girl; consider the boy vector as x1 and the girl vector as x2. If I do x1 - x2, that means I am subtracting one vector from the other, and I will get something like -2 in the gender dimension while all the remaining values stay close to zero. Similarly, if I take king and queen and find the difference, it will also be somewhere near -2 in that dimension: -0.92 - 0.93 is approximately -2, and the remaining values are approximately zero. I am not saying exactly zero, but approximately zero, because those feature values are almost the same for the two words; they are similar words, and the vectorized representation is built from features. This particular analogy comes mainly from the gender feature, and based on that feature value, if I have the analogy "boy is to girl, then king is to ...", the model, by looking at this difference, will arrive at the vector for queen.

Now, if I try to find the cosine similarity between them (cosine similarity is heavily used in recommendation systems as well), I will see that this distance is very small, and when the distance is very small the model can say that the word most similar to king in this analogy is queen. This is how the whole representation works, and these vectors play a very important role.
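Using the same toy vectors as before, the "boy is to girl as king is to ?" analogy can be answered with vector arithmetic plus cosine similarity, roughly as described above (a minimal sketch with made-up numbers, not the exact procedure of any particular library):

```python
import numpy as np

# Same toy feature vectors as in the previous sketch (values invented for illustration).
embeddings = {
    "boy":   np.array([-1.00, 0.01, 0.03, 0.00]),
    "girl":  np.array([ 1.00, 0.02, 0.02, 0.00]),
    "king":  np.array([-0.92, 0.95, 0.70, 0.01]),
    "queen": np.array([ 0.93, 0.96, 0.68, 0.01]),
    "apple": np.array([ 0.00, 0.01, 0.05, 0.97]),
    "mango": np.array([ 0.01, 0.00, 0.06, 0.95]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "boy is to girl as king is to ?"  ->  king - boy + girl should land near queen,
# because the (boy - girl) difference is roughly the same as the (king - queen) difference.
target = embeddings["king"] - embeddings["boy"] + embeddings["girl"]
candidates = {w: v for w, v in embeddings.items() if w not in {"king", "boy", "girl"}}
best = max(candidates, key=lambda w: cosine(target, candidates[w]))
print(best)   # queen
```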
You will see this in practice in the upcoming classes, where I will be implementing LSTM and recurrent neural networks for sentiment analysis. At that time you will see that I create these embedding vectors from the text that I have, by using the same concept of one-hot representation and then converting that one-hot representation into this kind of featurized vector using Keras; in Keras we have something called an embedding layer. Probably by tomorrow or the day after I will be uploading those videos, because you need to understand this intuition behind word embedding first.

One more thing I want to show: these vectors are 300-dimensional. If I convert these 300 dimensions into two dimensions using some dimensionality reduction technique, you will see one more interesting thing: king and queen will be near each other, apple and mango will be near each other, boy and girl will be near each other, and man and woman will be near each other. So when you reduce the 300 dimensions to two dimensions, you can actually group the words by similarity; all these similar words end up very near to each other. So a word embedding layer helps you find the most similar vectors, because the vectors are created in exactly that way, and this is pretty interesting to learn.

Word2Vec is one of the word embedding techniques; you also have GloVe, and you have libraries like Gensim that help you get this vectorized format of data. You just pass whatever data you have, and it will automatically create these vectors based on the number of dimensions you have chosen (a small Gensim sketch is included after these captions). And again, if you ask, "Krish, internally how exactly is it working?", it is very difficult to say, because you cannot simply extract how many features have been considered or what those features are. But this is the overall idea of word embedding: some features are considered, and based on them these vectors are created.

So that was all about word embedding, and that is it for this particular video. I hope you liked it; please subscribe to the channel if you have not already, and share it with all your friends, because tomorrow I will be coming with the practical implementation of LSTM, where I will specifically use word embedding techniques and show some very simple examples so that you can solve any kind of problem.
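For completeness, here is the Gensim sketch referred to above: a minimal Word2Vec workflow where the tiny corpus and parameter values are made up, and the argument names follow Gensim 4.x (where the embedding dimension is called vector_size):

```python
from gensim.models import Word2Vec

# A tiny made-up corpus of tokenised sentences, only to show the workflow.
sentences = [
    ["i", "want", "to", "eat", "a", "pineapple", "cake"],
    ["i", "want", "to", "eat", "an", "apple", "cake"],
    ["the", "king", "and", "the", "queen", "live", "in", "the", "palace"],
]

# vector_size controls the number of embedding dimensions (e.g. 300 in the video's example).
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

print(model.wv["apple"].shape)          # (50,)  dense vector for the word "apple"
print(model.wv.most_similar("apple"))   # nearest words by cosine similarity
```

With a real corpus you would pass your own tokenised text; the library learns the feature dimensions itself, which is why, as noted above, the individual dimensions are hard to interpret.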
Info
Channel: Krish Naik
Views: 72,427
Rating: 4.9434166 out of 5
Keywords: data science tutorial javatpoint, data science tutorial python, data science tutorial online free, python data science tutorial pdf, python data science tutorial point pdf, what is data science, data science tutorial tutorials point, data science course, natural language processing
Id: pO_6Jk0QtKw
Length: 15min 9sec (909 seconds)
Published: Fri May 08 2020