Live Day 1- Introduction And Roadmap To Natural Language Processing And Quiz-5000Inr Give Away

Captions
Hello guys, am I audible? I'm live, I guess. I hope you are doing fine. Please give me a confirmation if you are able to hear me. We are going to start the live session in some time; till then, let's have some chit-chat and see whether you're able to hear me. I hope everybody can hear me, right?

Today we are going to start the NLP series, so just give me a confirmation if you can hear me. I think I can hear my own voice too, from my YouTube channel itself. Do hit like, because this is going to be an amazing NLP series where I'm going to cover machine learning and deep learning.

Apart from that, today I'm also going to give away 5000 rupees through a simple quiz. I will start the quiz after I complete the session, and once the quiz is done I will distribute around 5000 rupees. All you have to do is make sure that you follow me on Instagram; whoever wins can contact me through Instagram at the end of the day, and I'll send the money live through Google Pay or PhonePe, whichever details you give me. We'll select three prizes: let's say the first prize will be 2000, the second prize 2000, and the third prize 1000 rupees.

The NLP agenda is that we'll cover everything from the basics and go all the way to BERT and transformers. So this is not just a basic session; we are going to cover it in an amazing way. If you are new to this channel, please make sure that you subscribe. And again, after every session I'll make a quiz so that everybody participates
in it. One more thing I want to talk about is how the learning process will work and which topics we are going to cover; I'll be talking about that every day. For Instagram, see the pinned comment; all the information is there. Just click it, or search for Krish Naik on Instagram. [Music]

Hello, hello, hello everyone. Let's start; just give me another minute and I'll tell you the agenda. Today we are going to have a very interesting quiz based on whatever I teach you. It may be a little difficult, but again, it depends on how many people participate. So, without wasting any time, let's go ahead and share my screen. I'm going to use this for all the writing, and the practical implementation I'll also do like that.

So, day one of natural language processing. This is specifically for machine learning and deep learning. Can anybody tell me how many different types of community sessions we have taken from the start? I think from January: we started with stats, then went to machine learning, then deep learning; I also did EDA, time series analysis, and many more things.

The agenda of this session: number one, we are going to see the roadmap of NLP, natural language processing, because if you follow this roadmap you will be able to easily crack any interview. Second is why NLP; I could also make this my first point. Third, today we are going to see a lot of examples, real-world scenarios and all. Then we'll start with some basics, like tokenization, and then we will understand
two more terms, stemming and lemmatization. Finally, the fifth topic is something called bag of words. We'll try to cover all these topics today. For the people asking what this is: bag of words is a technique that will help you convert... just give me some time, I'll let you know about bag of words.

Coming to the next thing: what are the prerequisites for learning NLP? Why we have started like this will all make sense. First, you need to know Python. Second, some amount of stats is required. Third, you need to know at least some machine learning algorithms, which I have already covered in my community sessions. Fourth, you need an idea of ANN; CNN is not required, but you should at least know ANN with all the optimizers and loss functions.

Initially, for three to four sessions, we will focus on NLP with machine learning. Later, when we move into deep learning, you will really need to know RNN, LSTM RNN, and GRU, and we will see a lot of examples with those; we will also have bidirectional LSTM RNN and many more things. These are some of the prerequisites, and again, all of this has been covered in my community sessions. I hope everybody agrees with that.

After we complete all these topics, we are going to have an amazing quiz. We will be giving 5000 rupees: for the first prize, 2000 rupees INR; for the second prize, 1000 rupees INR; and the third prize. Whoever is
coming first, in the top three, I'll give them this amount of money. I'll give the money today itself; that means I'll send it through Google Pay or something similar, and if people are from other countries I will send it through PayPal. To participate, just go ahead and follow me on Instagram, because only there will I be able to take your information. Whoever comes first, second, or third just has to drop me a message so that I can validate that you are the genuine person; otherwise I'll think that you made a new Instagram account then and there just to contact me. As for where to attempt the quiz: I have already made the entire quiz, you can see it here, so that is almost ready. We will start it once the session is completed.

Okay, fine, I can make it 1500 and 1500, because I want to distribute exactly 5000: so 1500 INR and 1500 INR. Whoever participates in the quiz, we'll do it like that. Shall we start? Can I get a quick confirmation? Please stay with me and watch this session till the end; only then will you be able to participate in the quiz properly, because the same topics I teach will come up in the quiz. I'll make it a little bit difficult. So please do hit like; 643 people are watching, so let's take the likes to at least 500, and then we'll start the session. "Sir, I don't have an Instagram ID." Don't worry, guys, create one; it's very simple. Everybody has a Facebook account, so Instagram becomes very easy.

So let's start. The first question is: why NLP? So here we go: why NLP, why natural language
processing? I hope everybody uses Google every day. On Google you search a lot of queries and you see a lot of recommendations. If I open Google News, I see many recommendations, and those recommendations are based on my user profile, on the things I have been surfing. Google captures each and every piece of that information, and based on it, that kind of content is recommended to you. So Google uses a lot of NLP, that is, natural language processing.

I have already created a diagram that shows the differences between AI, ML, and DL. Suppose this is AI, artificial intelligence. The main aim of artificial intelligence is to create an application that can do its tasks by itself, without any human intervention. Consider AI as the universe; at the end of the day we create an AI application.

Second, machine learning is a subset of AI. I have drawn this diagram many times in my community sessions, because it is very important that we learn these things. Machine learning provides the statistical tools to analyze the data, explore the data, do future forecasting, make different kinds of predictions, and many more things.

The third part, which is super important, is deep learning. Deep learning is another subset of AI; sorry, a subset of machine learning. Here the focus is that we create a multi-layered neural network, and with this multi-layered
neural network we try to make the machine learn the way we human beings learn. It's very simple: in deep learning we are trying to mimic the human brain. These are the basics.

Now you may be thinking: where does NLP come into this? Can we learn NLP in machine learning? Can we learn NLP in deep learning? That is the question. The simple answer is that NLP can be used in both machine learning and deep learning, because NLP specifically means that the dataset we are dealing with is text.

Understand this: if you tell a machine, "Hey, go and bring water," the machine will not be able to do that task, because machine language is completely different. The basic language a machine understands is completely different from the words or text we usually give it; how we communicate with each other is completely different from how machines communicate with each other. So we have to convert that text or voice into a separate kind of data, which we call vectors, so that the machine can understand what it is. Machines, in short, understand binary, ones and zeros, so we have to give them numbers. Vectors simply means the data is in a numerical format, and only based on those vectors will the machine be able to understand.

So NLP can be used in machine learning and also in deep learning, because here your dataset is specifically text. We need to create a model in such a way that it is able to understand the text and, based on that, give us some output. It should be giving us
some output: it may be text summarization, it may be a chatbot, it may be predicting the next sentence that should come after a given sentence. It can be different kinds of tasks. It can be language translation: Google Translate is nothing but machine translation, which basically converts one language into another. How is that possible? Somewhere it will be using techniques from deep learning, which I will be talking about. Across these sessions we will focus on understanding the basics of how words are converted into vectors, how the machine is able to understand text, and how it is able to give you a specific output. We'll try to understand this both theoretically and practically.

So what is the demand for NLP engineers? Look at recent research: many of the people doing PhDs at MIT, Stanford, and so on are working on theses and research related to NLP, because it has a huge scope.

One thing that I feel is missing is sarcasm. I may say to you, "Hey, you are brilliant"; that is a positive way of saying you are brilliant. But if I say, "Hey, you are just brilliant," meaning you don't know anything, that is sarcasm. The major challenge that exists right now is that the machine is not able to capture sarcasm properly. Yes, Google is doing an amazing amount of work, but it will still take time, and probably in the coming days sarcasm will also get captured. NVIDIA has come up with an open-source model that can detect some amount of sarcasm, and recently, in the GitHub hackathon that
we had at iNeuron, one of the guys used that and dubbed the entire voice. Just imagine: I'm talking in English, and they can now dub my voice and convert it into French, Spanish, Hindi, Bhojpuri, Telugu, Kannada, Bengali, or any other language. Just imagine how powerful those models are.

So wherever your data is text, at that point you specifically need to use NLP. That is the importance of NLP. At the end of the day we are creating an AI application; in this case the AI application may be a language translator, a chatbot, a support chatbot, and many more things.

Now, coming to the why, because that point is still missing. Why NLP? Because we want the machine to do our work, and how will the machine be able to do our work unless it understands what we are trying to say? If it is not able to understand, then obviously it will not be able to do your work. Nowadays machines are efficient at most automation tasks. Tomorrow you may have robots cleaning your house; tomorrow you may have machines cooking food for you. For that, the communication medium, how they can understand us, how they can understand our sarcasm, will all be required. I hope everybody is able to understand; please give me a quick yes, or a super heart, and we'll keep this fun mood going.

Now let's go ahead and talk about the roadmap of NLP. What exactly is the roadmap? I'll draw it on the second sheet. Here is my entire roadmap of NLP: how do we start with NLP? I will go with a bottom-to-top approach. I'm just
going to go from bottom to top. The first step you really need to know in NLP is called text pre-processing. What exactly is text pre-processing? See, guys, there are scenarios where the text data we get may not be clean; it may be in the form of paragraphs or sentences. In text pre-processing we focus on how to convert those sentences or words into a format we call vectors. Vectors basically means a numerical format, which we will discuss as we go ahead. When we feed that numerical format to the model, it will be able to understand it and try to find out the relationship between one word and another. Based on that you can build a lot of applications, like spam classification, detecting whether a comment is toxic or not, chatbots, and many more.

In text pre-processing we initially start with the basics: we have something called bag of words, we have something called TF-IDF, and there are very good libraries for these, which I'll be mentioning. So we have bag of words, TF-IDF, and Word2Vec. Word2Vec is used in both machine learning and deep learning; in deep learning we call it embedding layers. This initial step is about understanding the basics of text pre-processing, and if you understand text pre-processing perfectly, I think no one can stop you. These basics are very important. Text pre-processing, in short, is about how you can
clean your data or convert it into efficient words or vectors, so that the machine will be able to understand it. Instead of writing Word2Vec over here, we will also be understanding two more terms: stop words, and why stop words are used, and lemmatization. We will learn all these techniques in the basics, and in today's class I will talk about some of the techniques required here.

Now, coming to the second layer, one more step up, which is super important; I will call this step text pre-processing layer two. Actually, let me do one thing: let me rub this out and redo it in a better way so that you will be able to understand. Let's say that in the first step, text pre-processing one, I will show you how to do tokenization, how to do lemmatization... oh, the handwriting is not that good today; I'm not satisfied at all, so I'm going to draw it again. See, until I'm satisfied I'll keep on teaching you; that is my style.

So the first step, as I said, is text pre-processing. In text pre-processing one I have various steps. Tokenization basically means converting a sentence into words. Then we have lemmatization, we have stop words, and we have a technique called stemming. We'll see all of these in text pre-processing one. Understand that here we are mostly cleaning the data, in such a way that when we give it to the model, the model will definitely like that data. We are still discussing the roadmap, and there is a lot more to come. Now, coming to the second layer: the second layer is again text pre-processing.
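As an aside before part two, here is a minimal sketch of two of the part-one cleaning steps named above, tokenization and stop-word removal. It assumes plain Python with only the standard library, not the NLTK, spaCy, or TextBlob calls the series plans to use. The names `tokenize`, `remove_stop_words`, and the tiny `STOP_WORDS` set are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# real NLP libraries ship much longer lists.
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "of", "to"}


def tokenize(sentence):
    """Convert a sentence into a list of lowercase word tokens."""
    # Keep runs of letters, digits, and apostrophes; drop punctuation.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def remove_stop_words(tokens):
    """Drop very common words that carry little meaning on their own."""
    return [token for token in tokens if token not in STOP_WORDS]


tokens = tokenize("Hey, go and bring water!")
print(tokens)                     # ['hey', 'go', 'and', 'bring', 'water']
print(remove_stop_words(tokens))  # ['hey', 'go', 'bring', 'water']
```

Splitting with a regular expression is only one simple way to tokenize; library tokenizers handle contractions, punctuation, and other languages more carefully.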
But I would definitely call it part two. So here I have text pre-processing again, and in text pre-processing part two I am going to focus on how to convert words into vectors. Here I am going to have techniques like bag of words, TF-IDF, and unigrams and bigrams. In short, we are converting the words into some kind of vectors. There will definitely be some problems with bag of words and TF-IDF, which I will discuss as we go ahead, since right now we are discussing the roadmap.

Now, the third step. There are some disadvantages to the text pre-processing I defined in the second step, so this will be my third step of text pre-processing. In this third step we will be using amazing libraries like Gensim (I shouldn't call it a technique; Gensim is a library), and we will be using techniques like Word2Vec and average Word2Vec. I don't know whether you have heard these terms or not, but Word2Vec and average Word2Vec are again ways of converting your words into vectors. There are some problems with bag of words and with TF-IDF, and we are going to remove those disadvantages with the help of Word2Vec and average Word2Vec. So here we are going to learn an efficient way of converting words into vectors. That is my third text pre-processing technique; by this point we will have a good way of converting words into vectors.

After this, we are also going to make sure that we solve some amazing problems in machine
learning use cases. The ML use cases can be spam classification, chatbots, text summarization; I'm going to take various kinds of use cases and we are going to solve them here. So we are still in ML right now; I have not yet moved into deep learning.

After this, it is time to move towards deep learning. I hope you can see this as a pyramid: your basics should be strong, because if you're good at one layer, you will be able to learn the next layer, and so on up the pyramid. So here we are moving into deep learning, and with the help of deep learning you will be able to create efficient models for NLP use cases. But it is always good to know all the earlier things too. Nowadays many people directly use deep learning, but all of this should be taught, because your basics need to be very strong whenever you are implementing anything. Tomorrow, if you have a simple problem statement, why would you directly use deep learning to solve it? If you're getting good accuracy with machine learning and the text pre-processing techniques I have told you about, why go beyond that? That is what we are trying to focus on.

So, once you complete RNN and all; understand that for this I've already told you that you need the prerequisites of ANN, loss functions, optimizers, and so on. Then we go to the advanced text pre-processing side. Here I will write advanced text pre-processing, and here we will start learning about amazing things like word embedding, word embedding
which is super, super important for converting words into vectors. Internally this uses a Word2Vec technique only, but it is far more advanced than the techniques I have written here; there is a lot of difference, and there you'll be able to understand how text is handled in an efficient way. I hope you are following all of this. So word embeddings will be there, which we'll learn.

Then we will move to the advanced deep learning techniques: bidirectional LSTM, encoders and decoders, and with these we'll even create machine translation problem statements; I'll do it practically in front of you. Then we will have attention models. We will learn all these things, and that is why I have not told you that I'll take NLP for only seven days. This will go on for 15 to 20 days, I feel, if I can cover a one-and-a-half to two-hour session every day. I won't go longer than that, because I don't want you to take on the stress of so many things at once. We'll go slowly and try to complete this in 15 days. Then we will be learning about transformers, and the final thing, which is BERT. So this will be our pyramid of learning; the learning process goes from bottom to top.

Now, what libraries are we going to cover? For machine learning, we are going to use NLTK, then spaCy, and one more library called TextBlob. With respect to machine learning we will try to cover these three, and with respect to deep learning we will be covering TensorFlow. If you want PyTorch, give me 1000 likes and I'll also teach you in PyTorch, because if I'm teaching, I'll teach everything, but I need
a thousand likes. If you give me a thousand likes, I will say this is confirmed, both are confirmed; otherwise I'll just teach TensorFlow. Okay, I'm kidding; I'll teach you everything. I hope everybody is clear on this. If you're able to understand, please hit like and share with all your friends; tell them I'm going to teach all these things in a span of just 15 to 20 days. It will be quite amazing. I'm really happy that I took up this task; I had been trying to do it, but I was really busy. The energy level is always high, though, and I really want to teach you in live sessions so that everybody understands these things.

So this was the roadmap; I hope everybody understood it. If yes, make it a thousand likes and I'll also cover PyTorch, I promise. I know it will reach a thousand likes; I think I have given you a simple task.

Now I'm going to show you some examples. We create a lot of applications with NLP: spam classification, chatbots, text summarization, recommendation systems; there are a lot of things. One more thing, I'm sorry: I will also teach you Hugging Face; we'll also use the Hugging Face library. That will only be done when you hit 2000 likes, so please do that. With 2000 likes I will also complete Hugging Face. Don't worry, this will be the best NLP playlist you have seen.

Now let's see some examples, and you tell me what kind of example each one is. I hope everybody has seen Google News. If I click on Google News, do you see the kind of
recommendations that come up? Okay, forget about Google News; let's consider Krish Naik. If I search for Krish Naik, do you see the many things here on the right-hand side? This comes from the knowledge graph that has been built up. You can see my name here, with information collected from all my social profiles, and you can see the recommendations, the related keywords people are searching for. They are also searching about my wife, Wikipedia, salary, age. Yes, my age is 32; I'm not shy about telling my age. So you can see all the information here.

But understand one thing: if I click on Images when I'm searching for Krish Naik, how is this text getting related to images? I hope everybody has heard about DALL-E 2. You just write the text and it is automatically converted into an image. That entire thing is text-to-image conversion, and it will be using NLP in some way. Suppose I search for "cat fighting with cat": you can see these images, and these are not generated images; they are real images. If I search for "comedy cat", how is Google able to understand? Somehow it is able to; see, comedy cat, cat memes. See how it is able to show all the memes that are available on the internet. Don't worry, we will do this problem statement too as we go ahead, because I really want to teach you in an amazing way this time. I have not touched NLP for many days, so my interest is there; I
will do it. Now, if you search "Krish YouTube", you can see the first image is there. I have to add "Krish Naik", otherwise it just gives every random Krish. With "Krish Naik YouTube" you can see all the images from my YouTube channel, and it can even link with other things. If I search for iNeuron Intelligence, you can see it linking with me. Do you see my face? Do you see Sudhanshu's face? Do you see Hitesh's face? Somehow it is able to find information that is interlinked. That is how important all of this is, and they are creating some amazing applications with it. All the examples are in front of you.

And see, iNeuron Intelligence has been linked with Ashan; it is the company that funded us. Again, thank you to Ashan for trusting in us and in our affordable courses. Here you can see artificial intelligence, data analytics, technology, internship. If I search for Krish Naik, I may get other categories here, like Twitter, YouTube, missing values, Facebook, feature engineering; all these videos have been uploaded on my channel. Is this amazing or not? Just imagine. That is why I'm telling you that all of this is super important.

I hope you have got all these things. If I search in News as well, see, News has also been able to pick up my name. I've just searched something; this is like text-to-text recommendation. I've given one text, it has scanned the entire knowledge graph of Google, and it is able to see where my name is linked: with my company, with the things that
we actually do, everything has been linked. Here you can see my photo is coming up everywhere, and here also you can see the name Krish Naik is there. So this text to text recommendation will also be happening, and all these things you can do with the help of an LSTM RNN or a bidirectional LSTM RNN. I'll show you everything; we will do a lot of examples with respect to that. So these are some of the examples, and DALL-E 2, DALL-E, all these things are also available with respect to the kind of things that we are doing. So I hope everybody is clear with the roadmap: why NLP, a lot of examples, and tokenization, bag of words, sorry, this part I will be taking now. Super excited till here? Can I get a quick yes if everything is going fine, because after that I will be announcing the quiz also. So let's go ahead to the next topic. Let's start with something called tokenization. The first step of NLP, whenever you are starting, is tokenization. Now always understand, guys: consider an ML use case, and let's say I am building a spam classifier, say a Gmail spam classifier or a mail spam classifier. For this particular case, what do you think our dataset should be? Can I say my dataset will have an email body and an email subject, and the output will be either yes or no, meaning it is spam or ham? We basically say spam or ham. So I have these three columns: this is my feature one, this is my feature two, and this is my output. Which are my independent features over
here? The independent features are nothing but the email body and the email subject. So I have an email body and an email subject, and based on these two features I need to predict whether the output is spam or ham. Let's say my email body says "You won one million dollar" and the email subject is "Billionaire". In this particular case, what do you think the output will be? It will be spam. Let's say the other email is "Hey Krish, how are you?" So this is my second data point, and the subject I can write is "Hello". Here my output label will definitely be ham; it is not spam. The third one will be about a credit card: "You have won a credit card worth something, you are the winner of something." Obviously, in this case, this is spam. Now think about it: I cannot just give a machine this kind of sentence and say, "Hey, predict it." How will the machine be able to understand these words? That is the main thing. So you need to do some steps. The first step is tokenization; I will explain what exactly tokenization is. The next step that we apply is stemming, and before applying stemming I can also remove something called stop words; I will talk about it. Then finally we apply something called lemmatization. So these are the three steps we specifically do: the first step, the second step along with stop words, and the third step. Now I'll be explaining the first step, that is called
tokenization. So the first step is tokenization. Tokenization is nothing but converting a sentence into words. Everybody clear with this? Now, sentence into words basically means what? Here in my email body I have "You won one million dollar". This is the entire sentence, so what we'll do is divide it into words. Now my words will be "you", "won", "one", "million", and "dollar", each one a separate word. So in tokenization we focus on converting a sentence into words, because these words need to be understandable by the machine learning algorithm, or whatever algorithm you are using. This is the first step of preprocessing: as soon as I apply tokenization, it is just going to take the sentence and convert it to words. Very simple, very clear, nothing complex. Now let's go towards something called stop words. Let's say I have a sentence: "Hey buddy, I want to go to your house." In this sentence, do you find some words like "I"? I'll not say "I", but at least I'll say "to". There may also be words like "of". These kinds of words may get repeated many times, and you may want to remove them because they are not that important for the output that we are going to get. One word can be very important: "not". I may say "I did go to the house" or "I did not go to the house"; these two sentences are opposite. So the "not" keyword can play an important role. But words like "to", "of", "the", "he", "she" are not that important for some use cases like spam classification, while they can be important for some other use cases like text
summarization. For text summarization they can be important, but for some use cases they will not be. So what we do is remove these words, and in order to remove them we apply something called stop words. Now you may be thinking, can you have your own stop words? The answer is yes: you can create a list of stop words and check which of those words are present in your text, just to remove them. But right now, in a library like NLTK, the word "not" is also present in the stop words list. So if I create my own stop words, I will definitely not remove "not". I will create my own list, like "to", "the", "he", "she", "of", the small words that you can see, and I'll make sure that I do not include "not". These words are not giving much meaningful information, and we really need to remove them for use cases like spam classification or toxic classification. In text summarization we may require "to" and all, but in other use cases we will not require them. So that is the reason why we are using stop words. This is also one specific process in text preprocessing, and everything that we are learning here is related to text preprocessing, because at the end of the day I want to make sure that the machine will understand this text. So this is the second step: we removed the small words which are not playing a very important role. Now coming to the third step, which is called stemming. Stemming is super important. In stemming, we try to find the base of a specific word, or the stem of a specific word. Let me show you one example. Suppose I have two separate words like this: "historical"
and "history". Whenever I have these two words, they represent different things; context-wise they are different words, but the base, the stem, is almost the same. So if I apply stemming on these words, they will get converted into something like this: h-i-s-t-o-r-i. Why? It is very simple: stemming is a process of reducing words to their base word stem. So this is my stem. But one disadvantage is that this word may not have any meaning; "histori" definitely does not have any meaning. I hope everybody is able to understand what stemming is. Can I get a quick yes if you are able to understand till here? So in short, what we are doing here is getting a root word. You can say root word, or you can also say base form; there are many terms that can be used: base word, stem, base form. Let's consider one more simple example. Suppose I have three words: "finally", "final", and "finalized". Here you know that the context may be different, but all have the same base root, or base stem. So if we apply stemming on these, they will get converted to something like "fina". Why? Because the base is almost the same. And again the meaning is gone, right? So this is one of the
disadvantages with respect to stemming. Let me take one more simple example where stemming can really be helpful. Suppose I have words like "going", "goes", "gone". If I apply stemming to these three words, I will be getting "go". So in this case I am getting a meaningful word. Understand these things: stemming can also be useful because it is very quick; it is just trying to find the base word and show it to you. So the advantage of stemming is that the process is really fast; it can help you with text preprocessing for a huge dataset. But the disadvantage is that it is removing the meaning of the word. Okay guys, don't say that it should come out as "final". Many people will be thinking "final" should come, but I tried it out with the help of the NLTK library, and there specifically "fina" was coming. So whatever I'm showing you here, I did it, and only then am I showing you. Now, in order to overcome this disadvantage, we have something called lemmatization. So this will be my fourth step, which is called lemmatization; tomorrow we'll try to see the practical also. Lemmatization is super important: here too we try to convert the word, but we will be getting a meaningful word. When I give this to a lemmatizer, in NLTK there is something called the WordNet
lemmatizer. So here I will be getting a meaningful word, "history". You may be thinking, how is it able to do that? It has the entire dictionary of words; it will check against the base word and compare. Now if I also have a scenario like "finally", "final", and "finalized", these will also get converted to the word "final". So that is how lemmatization works. The advantage here is that, yes, we are able to get a meaningful word, but the disadvantage is that it is slow. I hope everybody knows why it is slow: it has to do a lot of comparison against the whole dictionary of words that it has, so it is slow when compared to stemming. Now if I talk about some of the use cases: spam/ham classification, or toxic classification (toxic basically means judging the comment a person is writing), or whether it should get three stars, one star, five stars, like that. In those cases we can use stemming. But lemmatization needs to be used in chatbots. So if I talk about use cases, in the case of stemming, first, I can use it in spam classification; second is comments classification, whether it is good or bad, or I can write review classification. Now based on this you can also give reviews on the lectures that I take; it may be good or bad. It should not be bad, no? I know many people have cleared their interviews; you cannot say it is bad. If you say it is bad, then you definitely require a PhD person to teach you. Now in the case of lemmatization, because here meaningful words are important, I can use it for something like text summarization, language translation, and third, chatbots, because chatbots also require good,
complete, lemmatized words. So all this is super important, and you can definitely check it out. Now, yes, we can also avoid stemming and lemmatization; that we will learn in the deep learning techniques, but these are the basic things. So overall, in text preprocessing these are the steps we do: first is tokenization, the second step is applying stop words, the third step can be stemming, and the fourth step can be lemmatization. Sometimes, if you don't want to do stemming, you can skip it and directly do lemmatization, but again it depends on the use case. This is amazing, because when you understand when to do this and when to do that, it brings out amazing curiosity, and then you'll start questioning yourself: this is good, this is bad. Perfect. Now we have completed this; I hope everybody is able to understand. After doing these steps, I want to convert these words into vectors. Step one is this text preprocessing; step two is that we convert words into vectors. Here we have cleaned the entire text, and now we want to convert the words into vectors, and that we are going to do with different techniques. The first technique is called bag of words. The second technique that we are going to learn is TF-IDF: first we will understand the disadvantages of bag of words, and then we'll go towards TF-IDF. The third technique that we are going to learn is word2vec. There are
some disadvantages in TF-IDF, so we definitely need to know about word2vec; that also we'll cover. So everybody's happy? I thought you would be taking the likes to a thousand. So, words to vectors: shall I start tomorrow, or do you want me to start today? Otherwise I can start the quiz. Can we do step two from tomorrow? Because that time we will try to understand bag of words and TF-IDF, and word2vec if you want. But I think the introduction went well; would you like to rate it out of 10? Please go ahead. TF-IDF basically means term frequency, inverse document frequency. So this step we will take tomorrow. Let's go towards the quiz. I hope you liked this session. And remember, guys, all these materials I will give in the dashboard that is linked in the description of this video. Enroll in this dashboard, because there you'll also be able to find the video link, and all these materials will be given to you. Is my handwriting good? I like this diagram; this is the best diagram that I could ever draw just with my bare hands. Yes, I will be teaching attention models. Dilip Kumar, label to vector also I'll try to cover. So quickly let's go towards the quiz. First of all, everybody follow me on Instagram; then only I'll be able to validate your quiz. I've opened my phone. Now go to this website, which is menti.com, and log in. And remember, please make sure that you follow me on Instagram; the Instagram link is given in the pinned comment, because after you win the prize I will be able to send you the money. So what you have to do is message me on Instagram also. I will be looking for your message, not right now, but hit like. Everybody go
to menti.com, use this code, 6831 7902, and we will start the quiz; whatever I have taught you today will be in it. Again, understand: the first winner will get 2000 rupees, the second winner will get 1500, and the third winner will get 1500. So please make sure that you follow me, and whoever wins can message me later and I will transfer the money from here itself. You have to message me your UPI ID. If one isn't on Instagram, what to do? Then you can drop me a mail at krishnaik06@gmail.com. So go to menti.com and use the code 6831 7902. Quickly, guys, I'll give you two to three minutes. I hope you had an amazing session today. You can directly use the QR code also. Everybody go over here; you'll get a chance to win a cash prize. And this is the best way, because every day I'll be taking a session and creating a quiz like this so that you will be able to learn. These notes will be shared; please make sure that you enroll in the dashboard that is given in the description. I can see 263 people have joined, 266, 267. 704 are live; come on, all 700 people join, it'll be an amazing session. Whoever answers quickly and correctly will be the winners. So this is an amazing thing called Mentimeter: go to menti.com and use this code, and please let me know whether you have joined; just give a thumbs up and hit like. Come on, let's make it 1000 likes so that we make this entertaining. And it's a request, please share these NLP sessions with all your friends on LinkedIn. Understand, we are focusing more on community building, and if community building happens in a better way, don't you think everybody should use this? Many people will not be
knowing. So just try to share it; it will be good for others. Some people may be attending interviews and all. I will talk about my email at the end. So whatever I have taught, that will be the questions. Again, it's a request: I know it may not be useful for you, but it may be useful for others. Share this entire playlist, the live NLP playlist, with everyone; tell them to come and join. I'm spending my one or two hours on all these things. Yes, I'll also be covering CBOW. So 420 people have joined, so let me go ahead and start. Coming to the next part, everyone: which city are you from? This is the first question; everybody start filling it in. This is a word cloud; I'll show you this application in NLP also. So we have people from Toronto also, Pune, Chennai, Bangalore, Mumbai; maximum people are there. Yes, a certificate will also be given in the dashboard, don't worry. Dhanbad, Bhopal, oh my god, people are coming from many different cities. Sonipat, wow. Kathmandu also, even from Nepal. This is amazing, guys; many people have joined. 482 people have joined; come on, I want more people to join. 693 are watching live; I want everybody to go to menti.com. The code is 6831 7902. From Germany, amazing, so we have people from foreign countries also. One day I want to see 10,000, 20,000 people live over here; it'll be quite amazing. From Paris, wow. Dallas. Bangladesh, very nice. So 500 people; I'll give you another one minute, so everybody join, and then I'm going to start. So here this is a word cloud, and I will show you with the help of code how to develop this word cloud. Maximum people are from Bangalore, then Hyderabad, Chennai, Mumbai,
Pune. Amazing. Bangkok also, good. Okay, let's start, and here we go. So guys, let's start the quiz, everybody. Everybody may have got these kinds of logos; only 70 or 75 of those are shown, but let's start the questions, and everybody participate. Here we go with the first question, and if you still want to participate, go to menti.com and use the code. Answer fast to get more points. Everybody ready? Natural language processing is used in: text classification, topic modeling, chatbots, all of the above. You have 24 seconds; whoever answers first and right will be at the top of the dashboard. And the answer is, three, two, one: all of the above. So 401 people have given the right answer, and 106 people have picked text classification, topic modeling, or chatbots. Guys, I had actually covered chatbots and text classification. Topic modeling no one said, but many people said chatbots or text classification. But the maximum number of people are following the classes, so amazing. That was the first one; let's go to the second one. But again, understand, NLP can be used in all of the above with respect to the options given. Now everybody, fingers fast. Let's start the second question; answer fast to get more points. What is tokenization? Very simple, and here are your options: breaking sentences into words, converting words into vectors, changing the meaning of words, none of the above. And time's up. 434 people have said it right: breaking sentences into words. This is nice; whatever I've taught you, you are following. Is this a good way or not? Just tell me, guys. You're getting money for studying, just imagine. So yes, more than 509 people, right? For those people who are new, go to
menti.com and participate. Perfect, so let's go to the leaderboard. Who is the first person? We have Abhijit Dey first, Paddy second, and god of dudaria third. So these are the first three. Thirty seconds is more than sufficient, guys; these are simple questions. Here I'm giving you very easy ones; in the upcoming sessions I will give you difficult ones, and at that time I'll keep one minute. So Abhijit Dey, if you continue like this, you will definitely be able to get it. Before that, guys, so that I can keep track of everyone and validate, please follow me on Instagram; then only will I be able to send you money. So please follow me on Instagram, krishnaik06; the link is given in the pinned comment in the chat. Now, first of all, have you followed me on Instagram or not? Because I really want to see people over here so that I'll be able to validate, because I want to give you the cash prize live. No one has done that, but I really want to do it. Okay, perfect. "Can anyone give the quiz link again?" Go to menti.com and use the code 6831 7902. Now let's go to the third question; we have five questions overall. Here we go with question number three, on the screen: why do we use named entity recognition in NLP? This is a little bit complex; I really wanted to give you a complex one. Classifying entities into predefined labels, creating a set of vocabularies, breaking sentences into words, or none. Perfect, so many people have said it right: it is classifying entities into predefined labels, which is quite amazing. You can see 322 people have got it; most of them are right. Now this becomes a competition. It's okay if you don't use Insta, you can drop me a mail. Now let's go
to question four. So many people got it right. The code is very simple, guys: go to menti.com and type the code 6831 7902. I can also give you the link in the comments, but just go to menti.com and use the code. Perfect, now the fourth question, everyone, a simple one, and here we go. Google Translator is an application of: you have 15 seconds to answer. Sentiment analysis, information extraction, information retrieval, machine translation. Quickly, and time's up. Whoa, 378 people have got it right: it is used for machine translation, not for information retrieval, guys. Google Translator, you know, machine translation; I specifically used this word during the session, didn't I? D, machine translation, is the answer. Now in the leaderboard we are going well; it will be quite fun. Let's go to the next one, and before that we'll see the leaderboard. Oh my god, Kaushal is in the first position, I guess. Yes, Kaushal is in the first position, then we have Nitin, then we have Parth, then we have Jim. Guys, please make sure that you follow me on Instagram so that I can give you the money. Over here you can see there is a very minute difference: 3798, 3785, 3727, 3711, 3697. So the leaderboard is quite close. And here you have Gokul Krishna. So these people will definitely be able to win it. Coming to the final question of the day, and by this you can determine who will be the winner. Before that, hit like, make it a thousand, come on. This will be the final question, and here we go; answer fast to get more points. Which is the correct order of preprocessing in natural language processing? So you
have: stemming, tokenization, lemmatization; lemmatization, tokenization; tokenization, stemming, lemmatization; none. I've covered this, guys, just a while back. And 445 people have said it right; 47 said it wrong. Now see why I've kept less time: because these are very easy questions. That is the reason I kept it, and I definitely wanted you all to try it out. So tokenization, stemming, lemmatization is the answer. I hope you liked this quiz. Let's see the winner now. The winner is Kaushal; Kaushal is first, Gokul Krishna I think is second. So we have the winners: Kaushal, Gokul Krishna, Nitin, ping me on Instagram. Claps, Kaushal is the first winner. Kaushal, Gokul Krishna, Nitin, please ping me on Instagram; daily we are going to have this kind of session. Kaushal, ping me on Instagram quickly with your UPI ID. I can see it, Kaushal Sharma. Yes, I can see your message; please drop me your UPI ID, Kaushal, first of all. And then we have Gokul Krishna; your name should be almost the same, guys. And then we have Nitin partial, did you message or not? Yes, Nitin, I can see his message; send me your UPI ID, Nitin. Naveen Reddy is saying he is sending his UPI ID. Okay, Naveen sir is also there. Gokul Krishna, are you there? Gokul Krishna's also I have received; Gokul Krishna sent me a UPI ID, Kaushal's is also there. And please make sure that you send the screenshot, everyone, the first three winners. So Kaushal's UPI ID I'm getting, Gokul Krishna's UPI ID I got. Gokul is in the second place, so I am going to transfer to Gokul first. Gokul, just confirm, or write in the chat, whether you've got the money or not. So here I'm pasting it [Music] so 1500 to Gokul, quiz winner. You can see, guys, I have transferred it to him. I'm transferring to him, just a second. I don't know
why it got rejected. Okay, I'll do it by PhonePe then, one by one. But I hope everybody liked it; I hope everybody had fun. Just a second, I'm doing it, everyone. Okay, so for Gokul the money has gone; everybody can see it over here, 1500. Gokul, can you just confirm whether you have got the money or not? Now I will go with Kaushal. Nitin has also messaged me; Nitin, please send the screenshot. Before that, I've got Kaushal's UPI ID. Kaushal is the first prize, yes, 2000 for Kaushal. Kaushal, can you confirm whether you've got the money? Everybody can see this also. Some problems are happening. Yes, Kaushal's is also done; Kaushal, I think you've got it. Gokul says "received, sir". Nitin, I've also sent it; please share the screenshot with me. Nitin also won 1500. Perfect, so that's the final transaction. I hope everybody liked it. Now it's your turn, guys: please share with everyone. Nitin, can you just confirm? I think you've also got it. Congratulations to all the winners, and yes, we will be doing many more things like this. The questions were quite simple; I asked you questions on whatever I have taught you. And the reason I'm doing this is so that you don't miss the sessions after this and you try to cover everything. So overall, I hope everybody liked this session and learnt from it. Please make it a thousand likes. One special request is that you share it on LinkedIn and tag me over there; share it with all the people who require this help. Someone may be in need of learning all these things, and unless you share, they'll not be able to know about it. You cannot reach the entire world just through YouTube; word of mouth is definitely required. So this is my
request to every one of you: sharing is caring. Because of you sharing with someone, it may help them to get jobs in an amazing way. This has happened; many people say, "Krish, your channel was shared by one friend." So tell them I'm going to take this entire month for NLP sessions. So thank you; this session was amazing, I guess, and I will also make sure that every time I have a quiz so that everybody learns and earns. This is just my contribution from my side. Today I gave 5000, tomorrow I can make it 10,000, but what I'm going to do is send all the YouTube money to you all. Okay, this was it from my side, guys. I hope you liked this video. Please make sure that you subscribe to the channel, make it a thousand likes, and I'll see you all in the next session, tomorrow at 7 pm. Thank you one and all, keep on rocking, bye. And hit like and share with others, and also make sure that you subscribe to the Krish Naik Hindi channel, because there I teach data science in Hindi. So yes, this was it, ta-ta, bye-bye, keep on rocking, and I'll show you how to implement a word cloud in just a couple of days.
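The four preprocessing steps from this session (tokenization, stop words, stemming, lemmatization) can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions, not the NLTK code the session refers to: the stop-word list, the suffix rules in toy_stem, and the small TOY_LEMMA_DICT are all made up for the example, and a real stemmer or the WordNet lemmatizer will give different outputs (the session itself reports "histori" and "fina" from NLTK).

```python
import re

# Hypothetical custom stop-word list built from the words named in the
# session; "not" is deliberately left out so negation survives.
CUSTOM_STOP_WORDS = {"to", "the", "he", "she", "of"}

# Suffixes tried in order by the toy stemmer (an assumption, not Porter).
TOY_SUFFIXES = ("ical", "ized", "ing", "ly", "es", "y")

# Tiny stand-in for the dictionary a real lemmatizer consults.
TOY_LEMMA_DICT = {"going": "go", "goes": "go", "went": "go"}


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def remove_stop_words(tokens, stop_words=CUSTOM_STOP_WORDS):
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in stop_words]


def toy_stem(word):
    """Strip the first matching suffix; the result may not be a real word."""
    for suffix in TOY_SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look the word up in the dictionary; unknown words pass through."""
    return TOY_LEMMA_DICT.get(word, word)


if __name__ == "__main__":
    tokens = tokenize("I did not go to the house")
    print(tokens)
    print(remove_stop_words(tokens))
    print([toy_stem(w) for w in ["historical", "history", "finally", "finalized"]])
    print([toy_lemmatize(w) for w in ["going", "goes", "went"]])
```

The toy stemmer only chops suffixes, so it is fast but can return non-words, while the lemmatizer needs a dictionary lookup and always returns a real word, which mirrors the speed versus meaning trade-off described in the session.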
Info
Channel: Krish Naik
Views: 131,317
Keywords: yt:cc=on, nlp session, natural language processing for machine learning, nl for deep learning, krish naik data science
Id: CG9iLLhqQF8
Length: 80min 50sec (4850 seconds)
Published: Tue Jun 14 2022