P Value and Hypothesis Testing Simplified|P-value and Hypothesis testing concepts in Statistics

Video Statistics and Information

Captions
Some topics are unnecessarily scary in the world of data science. One such topic is p-value and hypothesis testing. Welcome to Unfold Data Science. My name is Aman and I am a data scientist. If you watch this video till the end, I am going to make these two things extremely simple for you. Let's start the discussion.

So what is p-value by definition? I have written the definition here: p-value is "something" of "something" being true. Let me fill in these two blanks. P-value is the probability of the null hypothesis being true. Now, I'm sure all of you are aware of the concept of probability by now. If not, there is a video created by me, the link is right here; go ahead and watch that video to understand the basics of probability.

That is about probability. Coming to the null hypothesis: what is a null hypothesis? Again, I am writing the definition here. A null hypothesis is an assumption that treats everything as equal and similar. This is very important, guys: an assumption that treats everything as equal and similar.

What would be some examples of a null hypothesis? Can you think of one right now? For example, we are going through a lockdown due to the current pandemic situation. So a null hypothesis may assume that the situation before the pandemic and the situation after the pandemic are the same. In other words, a null hypothesis will be a statement like this, I am writing it here: GDP before the pandemic and GDP after the pandemic are the same. I am talking about global GDP here. This is a statement which treats the before-pandemic situation as similar to the after-pandemic situation. That is the default assumption.

Now, logically, do you think this is a correct statement to make? Will GDP before the pandemic and GDP after the pandemic be the same? You must be aware that a lot of research is going on which says that global GDP might fall by five percent, and country-wise it will be in a similar range as well. So this is a null hypothesis. Now what is the use of the p-value? Using the data, you prove this hypothesis to be wrong, and then you accept the alternate hypothesis. So what will be the alternate hypothesis? GDP before the pandemic is not equal to GDP after the pandemic.

So till now we have discussed that the p-value is the probability of the null hypothesis being true. And what is a null hypothesis? An assumption that considers all the situations to be similar and equal.

Another example can be: frontbenchers are equally intelligent as backbenchers. This is another assumption; it means all the students in the class are equally intelligent. How can you prove this statement wrong? You compare their marks and then you say, you know what, frontbenchers are scoring more, hence they are more intelligent and backbenchers are less intelligent. So with the data and the p-value, you either accept the null hypothesis or reject the null hypothesis. This is by definition.

Now let us see what process we follow when we do hypothesis testing. The process that we just discussed is known as hypothesis testing. So what is hypothesis testing? The process of accepting or rejecting the null hypothesis.

How do you do hypothesis testing practically? In our GDP example, if you have to prove whether the null hypothesis is right or wrong, what will you do? You will go and collect the data, right? So the very first step is: collect data. What data will you need to prove that GDP before the pandemic is different from GDP after the pandemic? You will take the GDP of different countries in the world, and that becomes your data.

The next step is the significance level, which you have to define. Now what is the significance level? I am sure if you have been into data science for some time, or you have been watching some tutorials, you know there is a standard value of 0.05 for the p-value. That is the significance level. The meaning of this 0.05 is: if I take the data of a hundred countries at random, then for five percent, or five countries, the GDPs will be the same, which means that for five percent of the cases the null hypothesis will hold true. That is the significance level that we fix with the p-value. To put it another way: if you take a hundred countries' GDP before the pandemic and GDP after the pandemic, then the null hypothesis will be true for how many cases? Five cases. So that is, by definition, how the null hypothesis and the significance level can be used.

What next? If you have a significance level and you have the data, then you can either accept or reject the null hypothesis. The meaning of accepting the null hypothesis is that you accept what has been stated as the null hypothesis. The meaning of rejecting it is that you accept the alternate hypothesis. The null and alternate hypotheses are written as H0 (H-naught) and Ha: H0 stands for the null hypothesis and Ha stands for the alternate hypothesis. In our case, H0 will be "GDP is the same before the pandemic and after the pandemic" and Ha will be "GDP is not the same before the pandemic and after the pandemic". That is how, at a high level, p-value and hypothesis testing work.

Now there is one step in between. Once you have defined the significance level, you have to do certain tests to get this p-value. So what are those tests? There are some statistical tests you would have heard of; I am just going to write down some of them here. You may have heard of something known as the t-test, something known as the chi-squared test, something known as ANOVA, and something known as the z-test. All of these are different tests that are done on the data to obtain the p-value. In my upcoming videos I will explain on what kind of data and in what kind of scenario each of these tests is used.

Now how do we interpret the p-value using the significance level? We have the data, we have the test, and we have the p-value as output; then we can decide whether to reject the null hypothesis. So what significance level is normally chosen? One thought here on how the p-value tells you the strength of the evidence against your null hypothesis, or with what confidence you can reject or accept the null hypothesis. In industry there is a common practice:

If your p-value is less than 0.01, then you have a very strong case against the null hypothesis. The meaning of this is that only in one percent of the cases will your null hypothesis hold true; in 99 percent of the cases your null hypothesis will be false.

If your p-value is in the range of 0.01 to 0.05, then we say that there is strong evidence against the null hypothesis.

If this range expands further, for example if your p-value is in the range of 0.05 to 0.1, then you can say there is mild evidence against the null hypothesis.

And if your p-value becomes more than 10 percent, that is, more than 0.1, then you can say there is no evidence against the null hypothesis, and you accept the null hypothesis in this case.

What you would have seen in general, on internet blogs everywhere, is that 0.05 is the boundary that they set, and below it they say that there is strong evidence against the null hypothesis. This is how p-value and hypothesis testing work.

I wrote down some of the tests that are performed on the data to get the p-value. In all the tools, R, Python, everywhere, you will get packages which will give you the p-value if you throw the data at them. But how do we interpret that, and how do we choose the right data for the right algorithms? That is what I will discuss in my next video.

If you have any doubts on this topic, write to me in the comments. This topic was requested by many of you, hence I created this video. I will go through all those tests one by one and show you how to do them. So write to me in the comments if you have any doubts, and give me a thumbs up if you like the video. I'll see you all in the next video. Till then, stay safe and take care.
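The steps described in the captions (collect data, fix a significance level, run a test to get a p-value, then decide) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the video: the GDP figures are made up, the function names `paired_z_test`, `evidence_against_null` and `decide` are hypothetical, and the z-test with a normal approximation is chosen here simply because it needs nothing beyond the standard library (for a sample this small a t-test would be the more usual choice). Strictly speaking, the p-value computed this way is the probability of seeing a difference at least this large if H0 were true.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Step 1: collect data. These GDP figures are MADE UP for illustration
# (arbitrary units, one pair of values per hypothetical country).
gdp_before = [100, 250, 80, 430, 60, 310, 150, 90, 520, 210, 75, 340]
gdp_after = [95, 240, 78, 405, 58, 296, 141, 86, 497, 200, 73, 322]

# Step 2: define the significance level (0.05 is the boundary from the video).
ALPHA = 0.05


def paired_z_test(before, after):
    """Two-sided z-test of H0: the mean change (after - before) is zero.

    Uses the sample standard deviation with a normal approximation,
    so it is only a rough stand-in for a t-test on small samples.
    Returns the z statistic and the p-value.
    """
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    z = mean(diffs) / standard_error
    # Probability of a z value at least this far from zero if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def evidence_against_null(p_value):
    """Map a p-value to the strength labels quoted in the video."""
    if p_value < 0.01:
        return "very strong"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "mild"
    return "none"


def decide(p_value, alpha=ALPHA):
    """Reject H0 when the p-value falls below the significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Step 3: run the test; Step 4: interpret the p-value.
z, p_value = paired_z_test(gdp_before, gdp_after)
print(f"z = {z:.2f}, p-value = {p_value:.6f}")
print("evidence against H0:", evidence_against_null(p_value))
print("decision:", decide(p_value))
```

With real data you would swap the two lists for actual before and after figures; "do not reject H0" corresponds to what the video calls accepting the null hypothesis.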
Info
Channel: Unfold Data Science
Views: 18,154
Rating: 4.9240508 out of 5
Keywords: p value, P Value and Hypothesis Testing Simplified, P-value and Hypothesis testing concepts in Statistics, What is hypothesis testing?, significance of p-value and hypothesis testing, Null hypothesis and alternate hypothesis, Steps to do hypothesis testing, Significance of p-value, p-value hypothesis testing, p-value explained, p-value statistics, hypothesis testing statistics examples, hypothesis testing statistics, hypothesis testing, p value in hindi, unfold data science
Id: hg39AqT9Hdc
Length: 10min 18sec (618 seconds)
Published: Wed Jul 22 2020