Introduction to Discrete Random Variables and Discrete Probability Distributions

Captions
Let's take a look at an introduction to discrete random variables. Suppose we're about to roll a die four times and record the number of sixes. The number of sixes in those 4 rolls is a random variable that will eventually take on a value. We know that when we carry out the rolling and count it up, we're either going to get 0 sixes, 1 six, 2 sixes, 3 sixes, or 4 sixes. We simply don't know which one of these five values is going to happen. So the number of sixes is a random variable that will eventually take on one of these 5 possibilities. And these five possibilities are going to have different probabilities of occurring.
A random variable is a variable that takes on numerical values according to some sort of chance process, meaning there is some sort of randomness involved. Random variables can be either discrete or continuous. And this is going to be an important distinction for us because we're going to have to handle these situations a little bit differently.
Discrete random variables can take on a countable number of possible values. So if we think back to our die example, we had the possible values 0, 1, 2, 3, and 4. There were 5 possible values, so that's a countable number of possible values. Another way of thinking about this is that discrete random variables can take on a value from a set of distinct possible values like these. Now discrete random variables do not always represent a count. We might have a discrete random variable that takes on the values 1.5, 2.5, and 3.5. These aren't count data, but we have these three possible values that our random variable can take on, and as such it's going to be a discrete random variable.
Continuous random variables, on the other hand, can take on all values in an interval. So suppose we had some container that had a maximum capacity of 2 litres, and we went outside and put that in our backyard, intending to come out in a couple of weeks and see how much water was in there. Well, the amount of water in that container is going to be a random variable that can take on any value in the interval 0 to 2 litres. It can take on any value in there, so not just 0, 1, and 2, not just distinct values like the ones up here, but anything in this interval. So there's an infinity of possible values in here: 0, or 0.3259862 litres, or any value between 0 and 2. There is a continuum there, an infinity of possible values between these two endpoints.
Here are a couple of examples of discrete random variables. The number of lottery tickets purchased until the first winning ticket. Well, we might get a winning ticket on our first ticket. Or we might have to wait until our second ticket, or we might have to wait until our third ticket. And there's no upper bound, so this is going off to infinity. But even though this is going off to infinity, this is still a countable number of values. This is still a discrete random variable.
How about the number of courses a randomly selected university student is taking? Well, the possible values here are 0 (in some situations a person could still be considered a university student if they're taking 0 courses, for instance some graduate students), or 1, or 2, or 3. There's some upper bound here, maybe it's five or six or seven or eight, but that would depend on the specific university. So that is still a countable number of values, so it is still a discrete random variable.
Here are a couple of examples of continuous random variables. The time until a newly released website gets its first hit. Well, that's got to take on some value greater than 0, right, so let's just say greater than 0. And the height of a randomly selected adult Canadian male; again, that's just going to take on some value greater than 0. There is some minimum value, we don't have an adult Canadian male that's less than two centimeters tall, say. But there is no natural lower bound here, no natural upper bound. So let's just say some value greater than 0, even though in practical cases there's some more practical range of values. In either one of these cases, there aren't discrete jumps. It's not like a person is 174 cm tall and the next possibility is 175 cm tall. There's an infinity of values between those two possibilities. So these are continuous random variables.
Let's go back to discrete random variables and look at an example. Approximately three percent of the United States adult population is under correctional supervision, meaning they're either in jail or on probation or on parole. Now this 3% is approximately true, but let's pretend that's exactly true for the purposes of this example. Suppose we randomly sample 2 US adults, and we let capital X represent the number of adults in our sample that are under correctional supervision. We typically represent random variables with capital letters near the end of the Roman alphabet, so very often we have X, or possibly Y, or possibly Z.
Let's list the possible values of X and their probabilities of occurring. Well, the possible outcomes, which I'm just going to label here as the possibilities: the first person we get is not under correctional supervision and the second person is not under correctional supervision as well. That's one possibility. I'm going to let N represent someone who is not under correctional supervision, and C someone who is. Another possibility is that the first person we get is not under correctional supervision but the second person we get is. Another possibility, of course, is that the first person we get is under correctional supervision and the second is not. And the fourth possibility is that both people that we randomly sample are under correctional supervision.
The value of X, the value our random variable takes on in these spots: well, if you recall, X was the number of people that are under correctional supervision. So here it takes on the value 0, here it would take on the value 1, here it would take on the value 1, and here it would take on the value 2.
Now let's list the probabilities here. If we are sampling randomly and independently, and the probability that a person is under correctional supervision is 3%, then the probability they're not is 97%. And if they're randomly and independently sampled, we can simply multiply those probabilities together. So the probability of getting N and then N is 0.97 times 0.97. The probability of getting N and then C is 0.97 times 0.03. The probability of getting C and then N is 0.03 times 0.97. And the probability of getting two C's is 0.03 times 0.03. Now this works out to 0.9409. This works out to 0.0291. And of course this is the same thing, 0.0291. And this works out to 0.0009.
Now if we were to summarize that with the value of X, it can take on the possible values 0, 1, and 2. And the probabilities of those occurrences: 0 has a probability of 0.9409; 1 happens in these two spots, so this is going to be these two added together, and we get 0.0582; and 2 has a probability of 0.0009. So we might say something like: the probability that the random variable X takes on the value 2 is equal to 0.0009. But we might want to do this in general, so we sometimes write the probability that the random variable X takes on some value, and we call that value little x. The capital X represents the random variable X, whereas little x represents some value of the random variable X.
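The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not something shown in the video: the function name `distribution_of_x` and the constant `P_SUPERVISED` are made up for the example, and it takes the 3% figure as exact and the two people as independent, just as the example does.

```python
from itertools import product

P_SUPERVISED = 0.03  # the 3% figure, treated as exact for this example


def distribution_of_x(p=P_SUPERVISED, n=2):
    """Return {x: P(X = x)} by enumerating every sequence of N/C outcomes."""
    prob = {"N": 1 - p, "C": p}
    dist = {}
    for outcome in product("NC", repeat=n):  # NN, NC, CN, CC when n = 2
        x = outcome.count("C")  # number of people under supervision
        weight = 1.0
        for person in outcome:  # independence: multiply the probabilities
            weight *= prob[person]
        dist[x] = dist.get(x, 0.0) + weight  # NC and CN both land on x = 1
    return dist


for x, px in sorted(distribution_of_x().items()):
    print(f"P(X = {x}) = {px:.4f}")
```

With the default arguments this prints 0.9409, 0.0582, and 0.0009 for x = 0, 1, and 2, matching the values worked out by hand.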
So we could call this value of X, we could sometimes write that as, I can put it in brackets here, little x, and this probability we represent as the probability that the random variable X takes on the value little x, and we sometimes write that as p(x).
Now what we have just done here is create the probability distribution of our random variable X. A probability distribution for a random variable X is a listing of all possible values of X and their probabilities of occurring. This could be a listing like we had on the previous slide, or it can be a formula, or it can be some sort of graphical representation.
We know that all discrete probability distributions must satisfy these conditions. p(x) represents a probability, and probabilities have to lie between 0 and 1. So p(x) is between 0 and 1 for all x. And we are listing all possible values of X and their probabilities of occurring. So if we sum up the probabilities of all possible values of X, we're going to have to get 1. And if we go back to our discrete probability distribution here, we would see that the probabilities do in fact all lie between 0 and 1. And if we added them up, we would see that they sum to 1. Depending on the discrete probability distribution, the values of x can be anything. Here we have 2, which doesn't lie between 0 and 1; maybe it's negative, maybe it's a billion, depending on the situation. The values of x can be anything, but the probabilities have to satisfy those conditions on the previous slide.
Now we very often plot this out to see what this looks like visually. So let's plot these values of x and their probabilities of occurring. Here's a visual representation of our probability distribution of X. Now the probability of 0 was very high, and the probability of 1 was much lower. And 2 you can't quite see here; it's 0.0009, so it doesn't quite show up visually for us, but it's there.
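As a rough sketch, the two conditions and the plot just described can be reproduced in Python. The helper names `is_valid_pmf` and `text_plot` are hypothetical, and the bars are a plain-text stand-in for the chart in the video, not a copy of it.

```python
import math

# Values of x and p(x) from the correctional supervision example.
pmf = {0: 0.9409, 1: 0.0582, 2: 0.0009}


def is_valid_pmf(pmf, tol=1e-9):
    """Check that every p(x) lies in [0, 1] and that the p(x) sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one


def text_plot(pmf, width=50):
    """Return one line per value of x, with a bar proportional to p(x)."""
    lines = []
    for x in sorted(pmf):
        bar = "#" * round(pmf[x] * width)
        lines.append(f"x={x} | {bar} {pmf[x]:.4f}")
    return lines


print(is_valid_pmf(pmf))
print("\n".join(text_plot(pmf)))
```

At this bar width the x = 2 row gets no bar at all, which is the same effect mentioned above: the probability is there, but it is too small to show up.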
So plotting out our discrete probability distribution can help us visualize what's happening. Let's contrast what a discrete probability distribution looks like with what a continuous probability distribution looks like. Here I have our discrete probability distribution from the last page, where we have these discrete jumps between the values. It goes from 0, jumping down to 1, and then jumping to 2 in this case. And we had these three possibilities here. Down below is approximately the distribution of the heights of Canadian males. This is a continuous random variable, and as you can see, it is modeled with a smooth curve; we don't have the distinct jumps that we do for discrete probability distributions. So when we do eventually talk about continuous probability distributions, we're going to have to handle them in a slightly different way.
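Returning to the die example that opened the video: the transcript says the five possible counts of sixes have different probabilities but never works them out. A minimal sketch, assuming a fair six-sided die and independent rolls (neither is stated outright in the video), is to enumerate all 6^4 sequences of rolls. The function name `sixes_distribution` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def sixes_distribution(rolls=4):
    """Return {k: P(k sixes)} for a fair die by brute-force enumeration."""
    counts = {}
    for outcome in product(range(1, 7), repeat=rolls):  # all 6**rolls sequences
        k = outcome.count(6)  # number of sixes in this sequence
        counts[k] = counts.get(k, 0) + 1
    total = 6 ** rolls  # every sequence is equally likely for a fair die
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}


for k, p in sixes_distribution().items():
    print(f"P({k} sixes) = {p} (about {float(p):.4f})")
```

Exact fractions are used so that the probabilities can be checked to sum to exactly 1, the same condition every discrete probability distribution has to satisfy.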
Info
Channel: jbstatistics
Views: 336,476
Keywords: Discrete random variables, discrete, random, variables, discrete random variable, variable, continuous, distributions, introductory statistics, introductory, statistics, intro stats, intro, stats, introductory statistics videos, videos, statistics videos, statistics help, 8 minute stats lectures, 8msl, AP statistics
Id: 0P5WRKihQ4E
Length: 11min 46sec (706 seconds)
Published: Thu Nov 15 2012