The Biggest Ideas in the Universe | 19. Probability and Randomness

hello everyone welcome to the biggest ideas in the universe i'm your host sean carroll and we're continuing today our journey back up to the world of our everyday experience to the macroscopic world of people and puppies and planets from the microscopic world of fundamental physics quantum field theory and so forth and today the big step we're going to take is to admit that none of us is laplace's demon we started this series many videos ago talking about the clockwork universe and our ability to in principle predict exactly what's going to happen in practice we're not able to do that so what we'll be talking about are the concepts of probability and randomness these are ideas that are both very important in physics obviously in science more broadly but mathematicians have a lot to say and philosophy has a lot to say as well so these are subtle topics that we're not going to do complete justice to but we'll get some of the basics across and i want to start let's remind you first about what the clockwork universe was the universe that we thought was there in uh classical mechanics and one in which in principle newtonian laplacian mechanics you might want to call it giving laplace some credit here because he's the one who came up with the idea of laplace's demon a vast intellect that could know in principle the position and velocity of every particle and from that make predictions both about the future and retrodictions about the past okay so this is a certain viewpoint on the world now we know that quantum mechanics is going to mess this up a little bit but let's uh we'll save that for later in this video but the picture you remember was you have space at some time so time is going up and there's going to be some stuff in the universe which is going to evolve you know from these particles moving with these velocities to some other set of particles moving with different velocities and the laws of physics get us there and the relevant law of physics here is simply f equals 
m a the force is mass times acceleration in other words the mass of the particle times the second derivative of its position with respect to time first derivative is the velocity second derivative is the acceleration and the point is if you have a bunch of particles interacting through some forces and you tell me their positions and their velocities this equation tells you how the velocity changes instantaneously over time so in principle i can use calculus to chug forward or backward in time and figure out what the system is going to do the whole system if it's a closed system if i know everything that's going on inside it so i actually uh so we're going to talk about deviations from this viewpoint i mean not that this viewpoint is uh wrong in classical mechanics if classical mechanics were true this would have been correct but it's not the whole story because we're not laplace's demon we don't know the positions and velocities of everything so can we do something or are we just stuck being able to say nothing at all but before i do that i want to give you a footnote it's it's bad to start the video or an essay or a lecture or anything like that with a footnote that's kind of an aside but in this case i think it's it's kind of a very important conceptual point and that point is that even this newtonian system classical mechanics is actually not deterministic in the way that you might have hoped you know we've talked about it as if it is uh it turns out when you get into the equations a little bit more carefully you find that this hoped for determinism isn't quite there it is almost there in fact technically speaking it is almost always there but it's not always there and to show you that let's give you a counter example there's a famous counter example called norton's dome uh john norton is a philosopher of science at the university of pittsburgh he's alive contemporary guy hi john if you're if you're listening to this so he wrote a paper about causality and he uh suggested 
this as an example of when it would be hard to pinpoint the cause of something happening and the idea behind norton's dome is it's just a classical mechanical system it's in spherical cow land so there's no friction there's no air resistance nothing like that and we are literally considering a dome so by which we mean uh some sort of structure architectural structure okay in the shape of a dome and there's a point on top and this dome has a very specific shape so if you call the height h i don't know why he did this but h is the distance down from the top of the of the dome there's a specific formula for the height uh as a function of r the distance you are away from the middle and h goes as 2 over 3 g times r to the three halves where g is the acceleration due to gravity so we're just having a dome in earth's gravity that's 9.8 meters per second squared for those of you who remember your high school physics right acceleration due to gravity so of course there is a kind of trajectory for a particle let's put this in color so you can see it there are kinds of trajectories for particles that just sit exactly at the top of the dome forever so remember we're not including quantum mechanics no quantum fluctuations we're not including air resistance or random motions of molecules we're being completely thought experimenty and saying that in principle a ball frictionless could sit exactly at the top forever okay no one is surprised by that but it's unlikely to happen in the real world it's like trying to balance a pencil on your desk exactly it's it's hard to do in the real world it requires strictly infinite precision in putting it there which is hard to achieve in reality there are also of course trajectories where the ball rolls down the hill and what norton showed was that with this particular shape of the dome there's some other shapes that also work but a typical generic shape would not work this is one that does you get the following property that there are 
solutions where the ball literally sits stationary not moving a little bit not moving at all okay it's just sitting there stationary and then it starts to roll down and this is happening without any force this is why he was talking about it in a paper on causality nothing started it happening okay nothing pushed it it just suddenly starts rolling down with a certain trajectory as a function of time and everywhere including at that moment when it starts moving it solves newton's equations okay it solves the f equals ma equation you can figure out what the force is due to gravity on a dome of this size and shape and what happens is the reason why it's possible is because at that point at that point where it just starts moving the position is zero r equals zero okay it's just sitting there at the top the velocity is zero it hasn't started moving yet and the acceleration is zero because there's no force acting on it it's right at the top of the dome but the third derivative of the position with respect to time is not zero in this particular solution that norton was able to find which means that if you wait some finite time later the ball will have moved and that kind of solution is going to be a solution for any moment that the thing starts to move so you can tell me that the ball stays there over time for a certain amount of time and then at some moment that you pick completely arbitrarily it starts to roll that is a solution to newton's equations of motion and furthermore uh there are backward solutions right the laws of physics are time reversal invariant so there are solutions where the ball rolls up and it slows down and then it gets to the top and it stays there forever okay so in both of these cases you can't predict the future from the past and so just to be super duper clear here when the ball rolls up to the top and and sits there it doesn't roll up and go slower and slower and slower it's not like zeno's paradox that it sort of gradually 
gradually asymptotes to the top it reaches exactly the top at some finite time and then it stays there forever or it stays there for some time and then rolls down in a different direction right all of those are solutions to newton's equations and it has to do with the fact that these equations are non-linear in particular ways but the point is that given strictly the construal of classical mechanics defined by newton's equations there is not perfect determinism in there that's not quite how the equations work out now it's obvious from this example or it should be obvious that you really do need infinite precision to set up these initial conditions the technical term is it's a set of measure zero in the initial condition space so it's never going to happen in the real world but it could happen in principle right so that's the that's the kind of thing that philosophers and mathematicians care a lot about in principle it could happen so you can't say that even good old newtonian laplacian physics is perfectly deterministic in that sense however having said that i mean i said that both because it's intrinsically cool and because you know you should know what the limitations are when you say something like newtonian mechanics is deterministic i say it all the time because what you really mean is it's deterministic except for a set of initial conditions of measure zero okay but maybe that's maybe that's interesting maybe uh of course we know that newtonian mechanics is not right quantum mechanics is right so the question will be even more complicated once we get into quantum mechanics okay so you could imagine saying well then even in newtonian mechanics you can imagine that things will happen in the future that cannot be strictly predicted from the past and that's true more broadly that we have situations where you can't predict exactly what's going to happen but things like this things like norton's dome are not the real reason why we have that kind of 
unpredictability in nature uh in the real world we have unpredictability you know when we go gambling okay we have las vegas behind me here uh and this was in fact historically this played a big role in the development of the theory of probability i think that my understanding is as with many things in math the initial steps were taken during the islamic golden age right arabic and persian mathematicians worked out some probability tables a lot more progress was made uh in the european enlightenment and the scientific revolution people like fermat and pascal were literally worried about playing dice and playing cards and that's how they worked out a lot of the theory of probability uh our old friend pierre-simon laplace worked out some of the important issues in probability actually john von neumann i think i might have mentioned this came up on the podcast i did recently the mindscape podcast with maria konnikova where we talked about uh poker and poker theory and a lot of game theory as we currently understand it was invented by john von neumann because he wanted to become a better poker player so the real world aspects of probability come up in gambling and in games of chance and that leads to a mathematical formalization of it so we want to understand how you can have these issues of probability like you don't know whether the coin is going to turn up heads or tails right you say the coin has a 50 50 chance of being heads or tails there are 52 cards in the deck half of them are red half of them are black you say the chance of getting a red card as the next one flipped over is 50 percent things like that what does that mean in a world that in principle is governed by newtonian mechanics it's not this right so i mentioned norton's dome as an example of the very very delicate but precise ways in which newtonian physics is not quite deterministic but don't leave behind the fact that mostly it is deterministic okay like norton's dome and similar kind of 
counter examples are not what's going on to lead to questions like the coin is 50 50 what's going on is that we don't know what the exact laplacian initial conditions are we do not know what the position and velocity of every molecule in our bodies or in the coin or in the deck of cards are so we want to formalize that into a notion of probability so oops we're green here sorry uh so we want to talk about what it means to be a probability and that sort of has two aspects one aspect is the sort of the formal mathematical aspect how do you define probability and put it to work but then there is the more philosophical aspect what do you mean by probability and both of these i was going to say both these are complicated but they're complicated in different ways the mathematical way is just it's figured out we basically know the aspects of probability that matter in simple cases there's some definitions etc you have to remember philosophical underpinnings are still under debate actually people disagree and they disagree in very practical ways that have consequences for how we do science so it's worth thinking about that but let's do the math first okay let's do the sort of the mathematical uh formal aspects we define what we mean by probability okay so we say that here's the picture we have in mind we have a universe u of possible events so we have a universe of events and let's call the events x y z etc so here are some events it's just a set don't you know uh think of this blob as representing anything particular it's just a collection of possible things so the universe might be the universe of possible outcomes for tossing a coin it might only have two events in it heads or tails it might be the universe of possible cards you can draw from a deck and it would have 52 different events in it so here's events x y etc and then there are um subsets we can group them into subsets of events a b etc and the subsets may or may not overlap so you know here's a subset a 
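A minimal sketch of the picture just described, using the 52-card deck from the talk as the universe of events. The particular subsets (red cards and aces), the names `universe`, `red`, `aces`, and `fraction_of_deck`, and the choice to weight every card equally are illustrative assumptions, not something the talk specifies.

```python
from fractions import Fraction
from itertools import product

# Assumed labels for a standard deck; only the count of 52 comes from the talk.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# The universe of possible events: every card you could draw.
universe = frozenset(product(RANKS, SUITS))

# Two subsets of events that happen to overlap.
red = frozenset(card for card in universe if card[1] in ("hearts", "diamonds"))
aces = frozenset(card for card in universe if card[0] == "A")
overlap = red & aces  # the two red aces


def fraction_of_deck(subset):
    """Size of a subset relative to the whole universe, assuming equally likely cards."""
    return Fraction(len(subset & universe), len(universe))


print(len(universe))              # number of events in the universe
print(fraction_of_deck(red))      # half the deck is red
print(sorted(overlap))            # cards that are in both subsets
```

Using exact fractions rather than floats keeps the sizes of subsets easy to compare by eye; that is a convenience choice, nothing more.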
here's a subset b they overlap and what we do is we define a measure on this space of events so a measure is a way of assigning sizes to different subsets of some bigger set uh so i've already used the term measure zero when we were talking about the set of initial conditions that lead to norton's dome they're zero measure they exist these initial conditions but they're of zero measure you would never hit one randomly if you were choosing initial conditions randomly out of a bag and so we want to define a probability measure and the way to do this was formalized by a mathematician uh kolmogorov so we have the kolmogorov axioms for probability and they're actually pretty simple to state well i'm not you know there are different levels of complexity and rigor you can state them at so i'm going to be a little bit casual here like when you have infinitely big or continuous sets of possible events then things get complicated but with finite sets it's pretty straightforward so one axiom is that the probability of any event is always greater than or equal to zero okay the probability measure is never negative the probability for something is not negative now as soon as you say that some mathematician in the back of the room starts thinking hmm i wonder if i can define probabilities with negative values of course you can you can define non-standard versions of probability the kolmogorov axioms say probability is never negative the total probability so if you add up over all x's in the set that you're considering the probabilities of those x's you will always get exactly one so the probability of something happening something happening is exactly one okay we accept that right that seems reasonable for a definition of probability and finally slightly uh more non-trivially but still pretty easy to understand if you have two subsets a and b that don't intersect so that's math notation a intersection b is the empty set okay so this is not true for the subsets i drew here 
but you could imagine two subsets that don't intersect then the probability of being in the union of a and b okay the probability of being in either a or b so you could write that as a probability of a union b uh equals the probability of a plus the probability of b now i'm saying here the word probability because that's what we're going to put this to use for but this is just math okay we don't have any interpretation necessarily that we attach to what this is this will work for probability but mathematicians just think well yeah this is a measure that i can define on a set and it has certain formal properties and i can manipulate it okay so at that level there's a lot you can do just from these very simple axioms i'm not going to do anything complicated here but i'm going to do some lingo that we might find useful later so you can define the joint probability one of the things uh about this whole subject of probability and statistics is that it is so useful that many different subfields use it right obviously physicists use probability all the time pure mathematicians use it applied statisticians use it other kinds of scientists use it information theory computer science they all use it and they all have developed slightly different notations for doing so so the probability notation that you might be most familiar with depending on what you've seen might be a little bit different than what someone else uses so i gotta i've got to use what i'm used to using uh you'll go along with me right so the joint probability is what in my field we write as p of a comma b and this is basically the probability of being of an event uh being in both a and b so when you write um yeah when you write p of a that's the the probability that the event that you randomly choose from the universe happens to be in a when you write p a comma b that's the probability that when you randomly choose an event it is in both a and b so it's in that intersection right there right so the probability of 
being in a and b is equal to the probability of being in the intersection of a and b that's what we call the joint probability and there's also something we can call the conditional probability the conditional probability is written p oops p a bar b and you can read this off as either the conditional probability of a given b or just the probability of a given b it's the probability that if you know the element the random event is in the subset b what's the probability it's also in a okay so that's just the probability that the event is in both a and b so that's the joint probability we already just mentioned divided by the probability it's in b to start with so this is saying given that you know you're in b what is the probability you're also in a okay so it's the probability that you happen to be in both of them full stop divided by the probability you're in b in the first place and so this is very very useful when you do inference when you do statistical inferences saying oh i know one thing is true what's the probability some something else is also going to be true if you're playing poker you're playing texas hold'em and you have two cards in your hand you have one ace you might want to say all right there's six other people at the table what is the probability that one of them also has an ace or two aces or something like that given that you have an ace that's a conditional probability so you can see why gamblers are going to be very interested in these uh this formalization of probability okay but that's so this is this is what you need to do the math of probability you can you know define a whole bunch of more things there's mutual informations and entropies and things like that that you can define for these probability distributions but then you can also ask okay given that this is the mathematical formalization what is probability physically where does it come from what does probability mean and that's much dicier it's not that we don't know the answer well we don't 
know the answer it's not that we don't have an answer it's that we have more than one possible answer and we're not people don't agree on what the right answer is probably you know as is often the case it's not that there is a right answer it's that in different circumstances different answers are appropriate okay people that's not going to stop people from arguing so there are two notions of probability that get a lot of play and maybe this is the exhaustive set like i'm not an expert in the philosophy of probability so i'm trying i'm trying to give you the very basics here not the comprehensive once and for all viewpoint there's something that you might call objective probability and objective probabilities are sometimes called chances or propensities these are not exactly the same thing but i'm not going to tell you what the subtle differences are between them i'm just letting you know the vocabulary words so if you if you thought for example that uh in quantum mechanics the probability that a uranium atom will decay in a certain time okay that's if you could easily imagine i'm not going to say again what's right or wrong but you could imagine it being the case that that's just an objective fact about the world right given a certain time scale given a certain isotope of uranium what is the probability the nucleus will decay that's just a fact about the world okay it's not up to you it has nothing to do with what you know and what you don't know it's just a fact so that viewpoint would go so we call that the chance uh that this is going to happen and there's a view of probability that things have a propensity for happening and therefore they get this objective probability out of that not going to go into those details but these are the words that are attached to that and this point of view that probabilities are objective things about the world goes hand in hand with something called frequentism and frequentism is the idea that you should define a 
probability by imagining you can do something an infinite number of times so when someone says what do you mean when you tell me that the chance that the coin flip is going to give me heads is 50 percent a frequentist would say well what that means is if i flipped the coin an infinite number of times i would get 50 50 heads and tails that's what it means to say the probability and furthermore i mean there'd be a footnote there saying and there's nothing else i can tell you about this structure you know it's not that uh the coin flips a different it's not that it gets tails every time heads was the last one like it alternates heads tails heads tails heads tails that's not quite what they have in mind because then there's something extra you could say that would give you additional information if all you can say is that if you flip the coin an infinite number of times you get 50 50 heads and tails that's when you would say there's an objective probability of getting of heads being point five or fifty percent right okay so that's an idea um there are obviously advantages to this idea this notion of probability any anytime something can be objective that's good right we like it when it's objective the opposite of objective is subjective or you know the contrasting idea and subjective means relative to a person right you know a subjective thing the subject the self the i right gets all very philosophical so um objective is just out there in the world everyone agrees subjective is somehow more personal and it's nice when things are objective i mean that makes them sound like it's science it's not just my feelings that are involved right um but on the other hand the idea that you define probability by imagining you're able to do something an infinite number of times can be a little bit impractical right like you never actually do anything an infinite number of times and some things you know perfectly well are only going to happen once like in my book something deeply hidden when 
i talk about quantum mechanics i talk a little bit about subjective versus objective probabilities and i use the example of the 2020 nba championship and i say that there is a probability that the philadelphia 76ers are going to win the 2020 nba championship now that only happens once right uh but we talk about the probability we bet on it you know you can gamble on probabilities like that now ironically i didn't know at the time there was a chance that nobody would win the 2020 nba championship because of quarantine it looks like it is going to happen as of as of this uh recording that i'm doing right now but still the point being that there are things that clearly are subjective and yet we use probability talk about them and for that matter we couldn't even imagine what it would mean to be a frequentist about something like that if you think it's possible to talk about probabilities of one-time events like the 2020 nba championship um you clearly can't mean the frequency of it happening an infinite number of times because you can't do that experiment an infinite number of times the basketball players would get older you'd not be doing the same nba championship over and over again okay so you mean something else when you say the probability of that event in the future is a certain amount and what you mean is typically taken to be the credence i don't know so i started thinking in the middle of writing there once again always a bad idea credences to be plural just like there that's your subjective degree of belief right the degree that you know the amount you'd be willing to bet that something was going to happen and this you know just to drive home that it's not just uh a frequency you can have credences about events that already happened right what is the credence that uh only one shooter shot jfk right people disagree about whether or not that's true there is an answer i mean you think there's only one answer to it but if you don't know it if you have some ignorance then you 
attach a credence you say well i think there's a certain probability that that was the case a certain probability uh that it didn't happen so this comes from ignorance and not ignorance in you know a pejorative sense there's something you should know and don't something you just don't know like many things about the world you don't know no no person is laplace's demon who knows everything so we talk about probabilities in these two different ways and you might want to know well which is right or which which should we use and i think the answer is you know i'm not sure what the answer is let me put it this way i think it's absolutely necessary to use subjective credence-like probabilities in some cases like i just said you know for the nba championship or for kennedy's assassination or whatever it makes perfect sense to talk in the language of probability and it's clearly not a frequency it's clearly not even objective people can disagree i'm less clear about whether there's anything in physics or in science that uh can really be called an objective probability but maybe there is you could certainly imagine a world in which there is you know we talked earlier about the grw theory of quantum mechanics right the objective collapse model where there's a probability per unit time that the wave function of the electron just localizes onto some point if that theory is right there is some objective chanciness to the universe and then it would be very clear that you could use that kind of frequentist approach to define what you mean by probability what we can do is if you think that there are objective frequencies you can also admit there are sometimes objective probabilities there are also sometimes subjective probabilities credences and then you can talk about trying to relate the two of them to each other so david lewis who's a famous philosopher uh he's extremely famous among other philosophers he's less famous among the general public but he invented what is called 
the principal principle and i know that the name is a joke that was intentional but he he called it that so principal principle oh there's a 50 50 probability that we'll get the right spelling principal first and later principle so the most important principle right and the principal principle the jokey name uh says that when there are i'm not gonna write the whole thing down when there are objective chancy kinds of probabilities in the world and you have subjective credences that you're going to assign to these different outcomes you should align your credences to the objective chances that's what the principal principle says your credences are personal they're subjective you can pick them you can say i don't believe that this is true but lewis is saying when there are objective facts about the frequency of something happening you better if you're smart as a normative guideline you better align your credences to the chances whatever they may be so this is again a philosophical can of worms seems very useful but um i'm not going to go into the details i just want to let you know that you can live in a world where you believe in both objective probabilities and subjective ones and they can happily play along under the right circumstances okay that was both a little bit of the math background what probability is how we talk about it a little bit of the philosophy background uh what it means now we can put it to work in physics okay that's what we're really here for so in physics where probability really came into the game was in the 19th century you know the 19th century is underrated in the history of physics you know we talk about um you know ancient physics with aristotle and we talk about galileo and newton and sometimes if you give a very quick history of physics you tend to skip from galileo newton right to einstein and quantum mechanics um but then in the you know the 1800s 19th century was just chock full of cool stuff with statistical mechanics which you can 
talk about here electromagnetism you know the precursors to special relativity atomic theory lots of cool stuff so we're going to think about statistical mechanics in a very simple-minded way that is to say let's think about a box of gas this is the standard example that we all like to use let me see if my pen will do this make a square there we go that's a box make a little bit more square and then we're going to fill it with gas and we know because we are modern physicists and we've had the last couple of biggest ideas videos we know that there are really atoms in the box of gas but we also know that none of us knows exactly the positions and velocities of all the atoms so we talk about the atoms as a gas instead with certain coarse macroscopic properties so how do we relate the fact that we think that really there is a single description of the positions and velocities of all the molecules to the fact that we don't know what that is and we still hope to do some useful physics with it okay so let's define a few notions here so we have the phase space for a single particle right which we would call one particle phase space that's the information you need to give me to say what's going to happen to one particle if you knew what was happening to all the other ones you need to give me the position and velocity of this particle and then all the forces acting on it um so that is the position and the velocity of that particle remember when we talk about phase space we can move back and forth from velocity to momentum this is all not relativistic so we're just doing newtonian mechanics momentum is m times the velocity so they're interchangeable and this is six dimensional okay three velocities three positions we're in a three-dimensional space i've only drawn a two-dimensional box but you know what i mean and this is called for reasons which will become only slightly more clear in a second mu space so we have a lot of particles in the box but for any one of them its phase 
space is a six dimensional space and we call that mu space we could also talk about the phase space of the entire system right so that also has a phase space every physical system has a phase space that describes it so if you have n particles the n particle phase space is a whole set of things right x1 v1 for particle one comma x2 v2 particle two and then maybe a lot of them uh you could have avogadro's number 10 to the 23 or many many more particles than that um and this is six n dimensional because there's six dimensions for each particle uh it's phase space and this is called gamma space so i actually do not know why mu and gamma are the two letters that are used for these two different ideas but at least it's clear you need two different letters because these are two different things right um the phase space for the whole thing is gamma space the phase space for just any one particle of time is mu space and the reason why we're distinguishing between these is because you might think that the sensible thing to do is to go into gamma space the phase space for the entire system and have some probability distribution on that space and see you know ask what you can do with the physics of that probability distribution and it's true you might want to do that but it turns out to be hard so actually people work in mu space which is much simpler because it's only six dimensional so one point in gamma is the same as n points in mu so in other words you can think of the phase space description of this n particle system as a single point in a big 6n dimensional space or as a cloud of n points all living in the six dimensional space that is completely mathematically equivalent okay uh it's all about what is more convenient for the problem that you're going to do so then you say okay well i have an actual box of gas let's do the following weird thought experiment uh someone comes into your room where you're sitting and they hand you a box and they say there are n particles in 
the box maybe they you can even weigh the box and you can say okay i know exactly how much energy is in the box but that's all you know okay you know the total energy in the box you tell the total number of particles remember the the energy per particle oops the energies are just going to be the kinetic energy this is particle i is one half and let's imagine they're just for simplicity particles of equal mass okay so they're hydrogen atoms or something like that one half m v i squared just the kinetic energy so we're ignoring the gravitational potential energy or ignoring interactions maybe the particles can bump into each other but they don't have long range forces acting on them so there's no potential energy or anything like that and if you imagine that okay that's all you know is it this amount of energy in a certain number of particles maybe so what could you do you could maybe say well i could be indifferent so there's another principle called the principle of indifference which is one of those things where in different circumstances we'll have different different definitions etc but basically it says when you have a bunch of possibilities and there's sort of a symmetry between them okay there's no nothing special that says some possibilities are different than others then you assign them equal probability that's the principle of indifference equal probability to similar circumstances so for example you could imagine describing the box of gas in gamma space by one point in gamma space now some of these points will have different energies than others because the velocities are different so there's some sub manifold of gamma space of constant energy where you know the total energy this is per particle e i e total is the sum over i of one half m v i squared so if you think about it if you fix e total in velocity space this is a sum of squares so you're saying the sum of squares equals a constant which is picking out a sphere there's a spherical region the the 
surface of a sphere in velocity space so on that sphere on that submanifold of fixed total energy the principle of indifference would say give me equal probability to be in any state any positions any velocities that are compatible with that total energy give that equal probability and sometimes people do this this is called the microcanonical ensemble i never remember what the names of various ensembles are but that's one thing that you can do and you can make progress that way but it's a weird thing in some sense i mean who is this person who walked into your room handing you a box of gas and how do you know they aren't tricking you right like how do you know what they did to prepare the gas in the box like they could easily make it so that all the molecules are moving to the left right and they hit the wall and they all move to the right that would not at all be compatible with the principle of indifference that's a very very specific kind of thing so the principle of indifference you know it has a long complicated history i think laplace was one of the people who talked about it i'm not sure he actually completely advocated it although he often gets credit for stating the principle of indifference but you know it's a sensible thing to do probably if you don't know any better but maybe sometimes we know better so there is a situation in which we can do a little bit better and that is instead of someone just handing you a box let's imagine we have a box but it's a slightly more real world box in that the edges of the box are relatively thin and the box is sitting in a room with some temperature in other words there can be a little bit of exchange of energy between the box and the room so particles cannot leave the box they're stuck in the box the number of particles is fixed but there are air molecules outside that are bumping into the sides of the box and the particles inside the box are also
bumping into the edges so there can be slight variations in energy energy exchanging between inside and outside so we imagine that the box is basically embedded in a heat bath you maybe don't usually think of the air in your room as a heat bath but that's what it is so again box of gas particles and outside there's stuff at temperature t that can gradually interact with what's going on inside the box so what's going on inside the box is not completely a closed system it can exchange energy a little bit but the number of particles is fixed and you're not looking inside you're not seeing what's going on inside the box okay so that's how we're going to physically change the situation a little bit we're also going to change mathematically how we deal with it a little bit because this gamma space thing this six n dimensional phase space is just too big so rather than working with a probability distribution in the six n dimensional gamma space the n particle phase space we will work with what's called a distribution function in mu space in the one particle phase space so this is a bad name distribution function because different subfields use the same terminology to mean very different things but this is how physicists use it the distribution function f as a function of x and v so it's a function in mu space so just in the one particle phase space and what it tells you is the density of particles on average in that region of phase space so remember in mu space you can think of the whole collection as a cloud of n particles and n is really big you know 10 to the 23 or bigger than that so it's almost like a smooth cloud right it's really a discrete set of points but you can approximate it and this is what you're doing here as a smooth function in phase space so if you really simplify your life by looking just at two dimensions here x and v then you have some point some little tiny region in phase space and f of x and v is the average number of
particles in that region so but it's important that it's in phase space not in space so f is not the density of particles in physical space in a little cubic centimeter it's the density of particles in phase space so particles both at that location and with that velocity a velocity in a certain direction okay now this is throwing away some information already right there might be correlations or relationships between the particles even if they're at similar points in phase space that this formalism does not keep track of still that's what we do this goes back to people like maxwell and thomson and boltzmann and so on in the 1800s they decided to simplify their lives by working in the one particle phase space instead of the n particle phase space and i should say that just to normalize things this is not really a probability we told you that when you add up all the probabilities of everything you always get one this is a number distribution function so it's related to the total number of particles by saying that the integral of f which is a function of x and v over d3x d3v is not one but the number of particles so you can have a lot of particles in some little region of phase space fewer in some other region and then what you're able to show what you're able to do physically remember we've embedded our box in the heat bath so you say what happens and what happens is physically the gas will approach an equilibrium distribution and the equilibrium distribution we actually know what it looks like it's named after maxwell and boltzmann be embarrassing to misspell boltzmann maxwell boltzmann distribution so this is an astonishing wonderful fact about statistical mechanics this is you know a precursor to all the talk about the second law of thermodynamics which we'll eventually get into in a later video but this simple arrangement where someone gives you a box of gas it doesn't matter how they arranged it doesn't matter how
they started you let a little bit of heat exchange happen across the boundaries of the box and what happens is that no matter what the initial conditions were the probability distribution distribution function sorry inside the box approaches a maxwell boltzmann distribution i'm going to tell you what that is what is it so the maxwell-boltzmann distribution is a particular form of the distribution function f of x and v so it's going to be some function of of x and v the positions and the velocities and what function is going to be well remember we're ignoring gravity right so it's not like there's more density near the bottom of the box than up up there in the box the what this means effectively is the random motions the temperature are more important than the fact that gravity wants to pull things down okay so what that means is that the maxwell-boltzmann distribution is going to be smooth it's going to be invariant under what is the word that i wrote down spatially uniform that's what it's going to be there's no reason to have it be more probable to find a particle over here than over there right it's gonna be the same probability at every point in space so what that means is that f of x and v is actually just a function of v it's not a function of x at all okay there's no difference in probability from one point to the other it's also rotationally invariant or isotropic if you want to say that so we have a function that is just a function of velocity but velocity is a vector it has a direction as well as a magnitude and what this is saying is that there's no preferred direction in the box you know the shape of the box whether it's really a cube or anything like that doesn't really matter particles are all bumping into each other and they're bumping into the sides and what that means is that this function of v is actually just a function of the magnitude of v v equals the magnitude it's the same in every direction okay finally okay so there's some function of v 
that is the maxwell-boltzmann distribution and i'm just going to tell you what it is f of v well there's some pre-factors i'm going to ignore so it's proportional to v squared times e to the minus one-half mv squared which you'll recognize is the energy of that particular configuration of that particular particle divided by kt where t is temperature remember we've embedded our box in a heat bath with temperature t and k is a constant of nature called boltzmann's constant i'm not gonna tell you what that is either because just like h bar and c we will end up setting k boltzmann's constant equal to one we'll try to remember it okay so what this is telling us is that if we plot this very specific functional form here's f and here's v well there's two things going on one thing is that there is a v squared so that's a parabola it looks like this the other is there's e to the minus v squared roughly speaking so one half mv squared is the kinetic energy of a single particle and it exponentially decreases as that energy goes up so if i plot e to the minus one half mv squared on this plot it looks like this and i multiply these two things together to get the maxwell boltzmann distribution and it looks like this it goes up to a peak then it's going to go down again so this is the maxwell boltzmann distribution it has a peak at some particular velocity and what this is saying is what is the number density of particles in phase space right so what is the number of particles with this particular velocity what's the number of particles with zero velocity well it's small okay it approaches zero as you go to zero velocity but only because there's only one way to have zero velocity there's a lot of ways to have a little velocity and if you're thinking about small velocities with kinetic energy much much smaller than kt then the e to the minus one half mv squared doesn't matter it's almost a
constant it's approximately constant compared to the v squared which is growing and what it's basically saying is that there's just more ways to have a large velocity than a small velocity if the energies are not that different from each other okay that's where the v squared comes from remember the we said up here constant energy is like a sphere in velocity space so the reason why it's v squared is because the area of that sphere goes like v squared right areas go like distance squared but then what this is saying is that given that you're exchanging energy with your environment probably the typical energy of a particle in the box is going to be the same as the temperature outside and it's possible for a particle inside the box to have a much larger energy than that but it becomes really really unlikely it becomes exponentially unlikely this is e to the minus the kinetic energy in units of kt and units of the temperature outside so that's the maxwell-boltzmann distribution and it's an amazing fact of physics that a box of gas embedded in a heat bath approaches this distribution and this idea of embedding in the heat bath like this if this is the micro canonical ensemble which says completely equal probability given a fixed energy here the energy is not fixed right because it's exchanging energy with the environment and you get what is just called the canonical ensemble the temperature is kind of like the average energy but there are fluctuations around that okay so this is the the what you reach in the canonical ensemble and why how do you derive this well there's different ways to derive this fact one way is this is the maximum entropy configuration that the gas can go into given that it's going to be at this temperature okay given the other number of particles and there's going to be a temperature is going to reach this is the maximum entropy configuration so the entropy we're not going to go into this in detail right now but the entropy is something you can 
also define given this distribution function there's a formula for it and the formula is s equals minus the integral of f as a function of x and v i'm not going to draw the little vector signs here times the log of f of x and v d3x d3v okay i think i told you about logarithms before right so e to the logarithm of x is equal to x or likewise the logarithm of e to the x equals x that's what the logarithm means so this formula minus the integral of f log f that's the entropy that's one of the ways in which you could talk about entropy you could also describe it in terms of gamma space this is mu space that we're in right now but this is a nice little thing that you can define it's a finite number boltzmann showed i think it was boltzmann who showed i think maxwell guessed or argued his way into the form of the maxwell boltzmann distribution and then boltzmann did some more rigorous investigation of it but it maximizes the entropy and as we'll talk about later that's a way of saying that given the constraints given the fact that there's a certain number of particles given the fact that the temperature is a certain value and the temperature is the average kinetic energy etc most of the allowed phase space is in this maximum entropy situation that's why you tend to go to the maxwell boltzmann distribution because most particles will look like that under this condition so this is a wonderful example of statistical mechanics this is absolutely an example of subjective probability in action okay because we all think that in that box of gas well sorry we all would have thought in the 1800s when maxwell and boltzmann were talking about this that there were atoms or there were molecules and they had some particular positions and velocities but we don't know what they are and so we can assign credences we can assign probabilities to the different possible ways you could configure the points in phase space and
maxwell and boltzmann say well this is the particular distribution that we should end up getting by maximizing the entropy and it's that kind of thinking that helped them invent this whole field of statistical mechanics again maxwell boltzmann also gibbs thomson aka lord kelvin and their friends you know set up this whole formalism of statistical mechanics and it's amazing so what it's telling you is that you can do physics you can learn things you know we learned something we said what happens to a box of gas that's a very physical tangible measurable observable thing even without knowing everything there is to know so this is crucially important to mapping this microscopic physics whether it's deterministic or not onto the macroscopic world can you still say something about the macroscopic world even if you don't know everything about the microscopic goings-on and this is the first indication that yes you can and you can sort of quantify your ignorance and nevertheless say interesting things and statistical mechanics was very controversial at the time you know back in germany people were very slow to catch on to statistical mechanics and boltzmann in particular got a lot of flak from a lot of people because he was in germany like the british caught on to it pretty quickly so maxwell and thomson were fine they were celebrated and boltzmann had fun with them but then when he went back to germany he got a lot of flak because a lot of people in germany led by ernst mach of mach's principle fame didn't want to believe in the existence of atoms okay they thought that things like the second law of thermodynamics were not probabilistic laws that you get from averaging over lots of atoms they thought there was some law of nature that said entropy increased but the statistical point of view won out in the end and it's a wonderful example of the use of probability in physics so that was the 19th century okay so the lesson is we can use
notions of probability because we're ignorant because there's something we don't know about the system then the 20th century comes along okay and we have quantum mechanics and that seems to be a whole other kettle of fish as you know by now quantum mechanics tells us that when we measure an observable feature of a quantum system you cannot predict what outcome you're going to get with perfect certainty the best you can do is to predict a probability or to state a probability of getting different experimental outcomes that seems different you have the born rule that the probability of getting some outcome x is the magnitude squared of the wave function that we associate with x and it seems certainly in the well i almost said certainly in the copenhagen interpretation but the phrase certainly in the copenhagen interpretation should almost never be used because the copenhagen interpretation is not very well defined but certainly in the minds of many people this kind of probability which you'll notice does fit all of kolmogorov's axioms there okay it fits very well as long as the set of events that you're allowed to look at are compatible with each other so you cannot observe position and velocity at the same time but if you look at the probability for all the positions of an electron for example they would satisfy kolmogorov's axioms by the way again parenthetically what you could do rather than what i just did which is to brush under the rug the fact that you shouldn't think of observing position and observing velocities or momentum at the same time when you talk about the probability in quantum mechanics you can instead try to develop a quantum mechanical theory of probability which is different than kolmogorov's theory of probability and that's another way you can go but as long as you just pay careful attention to what you're allowed to observe at any one time ordinary probability works perfectly well okay but it seems like an objective feature of the world so
this is supposed to be the point of something like bell's theorem right the difficulty of finding hidden variables at least local hidden variables is that the reason why there's only a probability we can attach to different outcomes isn't because there's something some fact some values of some parameters that we just don't know it's because they are intrinsically unpredictable okay but then other people you know bohm will come along and say well i can invent a theory that says that there are hidden variables and everett comes along and says something completely different so it's subtle and actually the answer is in quantum mechanics given that we don't know the one true once and for all theory of quantum mechanics that means we don't know where probability comes from in quantum mechanics yet different schools of thought think they know but they don't agree with each other so let's contrast three popular approaches to probability in quantum mechanics and these are just the three approaches to quantum mechanics that we talked about before so one would be hidden variables so the hidden variable theories like bohm's say that in addition to the wave function there are some other variables like positions of particles and you don't know what they are and in fact there's an argument that they should be distributed according to some equilibrium distribution and you can sort of argue about that argument but that's a big part of being a good bohmian and then you use that equilibrium distribution to derive that you get the born rule for actual experiments so in hidden variable theories you have psi plus let's say particle positions both the wave function and these extra variables and these particle positions are unknown so it is exactly like statistical mechanics it's perfectly deterministic there are equations for how psi evolves and for how the particles evolve so let's make that very clear the schrodinger equation h psi equals i d psi dt okay
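As a rough illustration of what it means for the schrodinger equation h psi equals i d psi dt to be deterministic, here is a minimal sketch in Python. Everything specific in it is an assumption made for the example rather than something from the video: a two-level system (like a single spin), the Hamiltonian chosen to be the Pauli matrix sigma_x, h bar set to one, and the helper names `evolve`, `norm_squared` and `close`. For that choice the equation can be solved exactly, so the sketch can check three things: total probability stays equal to one, evolving forward and then backward by the same time returns the starting state, and a superposition evolves as the same superposition of the evolved pieces.

```python
import math

def evolve(state, t):
    """Evolve a two-component state (a, b) for time t under H = sigma_x, hbar = 1.

    Exact solution of H psi = i d(psi)/dt for this illustrative Hamiltonian:
    psi(t) = (cos(t) * identity - i * sin(t) * sigma_x) psi(0).
    """
    a, b = state
    c, s = math.cos(t), math.sin(t)
    return (c * a - 1j * s * b, c * b - 1j * s * a)

def norm_squared(state):
    """Total probability: sum of the magnitude squared of each component."""
    return sum(abs(z) ** 2 for z in state)

def close(u, v, tol=1e-12):
    """Component-wise comparison of two states up to rounding error."""
    return all(abs(x - y) < tol for x, y in zip(u, v))

t = 0.7                       # arbitrary evolution time (assumed for the example)
psi0 = (1 + 0j, 0j)           # start in "spin up"
phi0 = (0j, 1 + 0j)           # a second state, "spin down"

psi_later = evolve(psi0, t)   # forward in time
psi_back = evolve(psi_later, -t)  # and back again

# Linearity: evolving a superposition equals superposing the evolved states.
alpha, beta = 0.6, 0.8j
combo = tuple(alpha * x + beta * y for x, y in zip(psi0, phi0))
lhs = evolve(combo, t)
rhs = tuple(alpha * x + beta * y
            for x, y in zip(evolve(psi0, t), evolve(phi0, t)))

print("norm preserved:", abs(norm_squared(psi_later) - 1.0) < 1e-12)
print("reversible:", close(psi_back, psi0))
print("linear:", close(lhs, rhs))
```

Nothing random appears anywhere in this evolution; in the sketch, probabilities only show up if you take the magnitude squared of the components and read them as born rule weights for a measurement.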
this equation is 100 percent deterministic it is more deterministic than newton's laws remember that was the point of telling you about norton's dome f equals m a looks deterministic but there are these sneaky little exceptions to that the schrodinger equation is linear which is a way of saying that on both sides of the schrodinger equation psi the wave function appears exactly once it's not psi squared or e to the psi or anything like that so the kinds of sneaky set of measure zero non-deterministic behavior that exist in newton's laws are not anywhere to be found in the schrodinger equation given the schrodinger equation it is true that i can evolve it forward and backward in time the reason why we get the born rule is of course because quantum measurements don't obey the schrodinger equation something else seems to be going on and in hidden variable theories in addition to the schrodinger equation you also have a guidance equation that's why hidden variable theories are sometimes called pilot wave theories you can think about the wave function as a wave that pushes around the particles but the guidance equation is also completely deterministic the indeterminism comes in because you don't know the positions of the particles just like in statistical mechanics so this is an example of subjective probability so oops that was terrible oh my goodness subjective probability because you don't know the state of the system there you go but in contrast with that like i already mentioned you have grw slash objective collapse models which just say as part of the postulates of the theory there's a certain probability per unit time that the wave function is going to collapse okay that's what objective collapse means the original grw theory is just truly random in that sense there's other theories like penrose's where the collapses are triggered by something some physical event makes collapse happen but what the wave function actually collapses to is still completely random so these
theories are objectively probabilistic in other words in these theories there's an objective chanciness stochasticity randomness to the world it's a very very different kind of probability than appears in the hidden variable theories and then guess what we have everett slash many worlds the tension mounts which is it going to be is it going to be objective probability or subjective probability so of course everett's got to be difficult it's got to be slightly more complicated in everett as you know the only equation is the schrodinger equation right that's the fundamental equation i just told you the schrodinger equation is 100 percent deterministic so the fundamental dynamics of the wave function of the universe in everettian many worlds quantum mechanics is just deterministic full stop but as you also know because we also talked about it probabilities nevertheless arise and they arise from what we called self-locating uncertainty so imagine that you are on a branch of the wave function here and you have some spin and it's in a superposition of spin up and spin down and then you measure the spin or let's say the spin gets measured doesn't need to be you who does the measuring and so there's a branch where the spin is up and a branch where the spin is down and that's going to happen faster than you know about it okay so the branching happens really really really fast so there's a moment when there are two branches and you don't know which branch you are on okay that's self-locating uncertainty sometimes called indexical uncertainty because you'd imagine a little index that says i am on this branch but you don't know what the index is okay and so this is subjective uncertainty it's something that you don't know but it's a different kind of thing that you don't know okay up here in the um oops where'd it go i lost my little uh well that's too bad i lost my little highlighter i don't know what happened but anyway up here where we were in the hidden variable
world okay there was something we didn't know and the thing we didn't know was the state of the world right we didn't know what the positions of the particles were which is part of the specification of the state in everett you can know the state of the world exactly but there's still something you don't know which is where you are in it so that's a different kind of subjective uncertainty you need to attach credences that's something you need to do but how to attach those credences is a little bit less obvious okay and i would argue i have argued other people have argued that there is a clearly right rational sensible way to attach those credences and that is to use the born rule the probability of being on any branch is the wave function squared but other people don't quite agree with that you know they haven't seen the light yet but what's interesting to me is that these you know pre-existing philosophical debates about the nature of probability and what it means are not only not resolved but every possible option could be true given what we know about quantum mechanics if someone says you know is the world fundamentally deterministic we don't know we don't know the answer to that in grw it's not in hidden variable theories and everett it is there's other theories of quantum mechanics obviously so we don't know whether the world is fundamentally deterministic or not if it is how do probabilities arise in a deterministic world we don't know that either in hidden variable theories they arise because there's something we don't know about the world in everett they arise because there's something we don't know about where we are in the world so it's interesting that you know none of this is yet settled again i like to point out open problems for the youngsters in the audience to solve for their phd theses this is a good set of problems here you know the relationship of probability and quantum mechanics subjective and objective
probabilities and so forth i should say also remember i said before i'm not even sure there are any objective probabilities in the world so that goes along with me being an everettian this is saying that the probabilities that arise in quantum mechanics are subjective just as much as the probabilities that arise in statistical mechanics are and david lewis actually went even further than this you know david lewis the same guy with the principal principle his most famous thing is to talk about possible worlds so he understood you know about everettian quantum mechanics and so forth but that was not his main focus but he talked about the space of all conceivable he said possible i should say possible all possible worlds so by world he doesn't mean a branch of the everettian wave function he means a set of laws of physics being instantiated in some universe some reality okay so he's thinking of really different realities i talked about this a little bit in podcast land in mindscape podcast i had one podcast with ned hall a philosopher and another one with max tegmark a cosmologist and in both we talked about this idea of the set of all possible worlds okay so lewis thought that like there's a possible world which is newtonian mechanics there's a possible world where space is two-dimensional there's all these different possible worlds an infinite number of them and lewis thought about you know how we should think about these possible worlds and he actually thought they were all real he advocated thinking of every possible world as really existing modal realism it was called but you don't need to go that far but he does point out that you know we can think of all probabilities like let's say you know we don't know what the laws of physics are we don't know what the dark matter is is the dark matter a weakly interacting massive particle is it an axion is it something else okay
different people will assign probabilities you know you have to don't say you don't want to if you're an honest rational person you should have credences for things and you know maybe your credence is it's something else i don't know about and maybe your credence is not very specific but if you're a scientist a cosmologist you need to have credences you need to and the reason why is because you choose what to think about right you choose what to work on how to spend your life so if you think that it's very probable that the dark matter is a weakly interacting massive particle then that means you have a high credence that that's true if you think it's an axion or black holes or neutrinos then you think that the credences are different and so what lewis says at least once he said this is that you can think of all probabilities including like the probability that the axion is the dark matter and so forth the probability that a coin will turn up heads or tails the probability that you know a certain radioactive atom will decay can be thought of as self-locating uncertainties in the space of possible worlds so lewis says there's a possible world in which axions are the dark matter there's a possible world in which weakly interacting massive particles are the dark matter when you say i think there's a 60 percent chance that weakly interacting massive particles are the dark matter you're putting a credence on your self-locating uncertainty about which possible world you live in so this is a different way for self-locating uncertainty to arise in everettian quantum mechanics it arises because you don't know where you are in the actual world even though we say many worlds it's still one big wave function of the universe you just don't know where you are in that wave function lewis is saying you consider really different possible disjoint worlds they didn't arise from the same world in the past through schrodinger
evolution or anything like that they're truly different possibilities and we don't know which one is true so that's a kind of self-locating uncertainty another way that probabilities arise i don't know if this is actually true i'm very sympathetic to it i like it but i don't understand it well enough to really say that i advocate for it but i'm thinking about it i think it's something that we should take seriously okay last thing i want to do in this video is to talk about how we decrease our uncertainty you know how you know so we have this uncertainty we have self-locating uncertainty for example anyway we have probabilities we have different theories of what is the dark matter uh you know was there a big bang and stuff like this different probabilities you know what is the right theory of quantum mechanics is it uh pilot waves hidden variables grw everett many worlds we have different probabilities for this and what we're going to do is science we're going to go out there and observe things about the world and learn about it and hopefully figure out which of our probabilities are correct right so we're going to increase certain probabilities decrease others we're going to change our credences in different propositions by collecting data and this is the whole universe of bayesian inference honestly it's amazing i've gotten this far in a video about probability and randomness without mentioning bayes's name once yet so bayes's theorem is is literally a theorem it's not something that can be wrong it's just if you have probabilities probabilities obey a certain rule in particular this is saying um the probability how probability changes when you collect more data that's what bayes's theorem says but you have to distinguish between bayes's theorem and the whole bigger set of ideas called bayesian inference so let's first get to the theorem okay so imagine you have a set of mutually exclusive propositions or let's call them theories because this is very often used in the
context of scientists trying to decide which theory is correct but any mutually exclusive set of propositions uh works the coin will come up heads the coin will come up tails those are two theories okay so call them theories ti one of them is the coin will be heads the coin will be tails those are two different theories there's a prior in bayesian reasoning probability for every theory so this is p of ti so this says even before you've looked at the world before you've done any science before you've done anything before you collected any data before you flipped the coin for example uh you need to already assign a probability you can't just say no i don't know i don't know what it is you need in order to get along in the world you need to be able to assign probabilities to things so you start with a prior probability the coin has not yet been flipped well i'm going to give it 50 50 heads or tails those are my prior probabilities and then i collect some data and i'm going to call the data d so maybe the data is oh someone just gave me a receipt for this person who's about to flip the coin had purchased a trick coin at a magic store that always turns out heads right so that's going to change my probabilities i got some new data in there or oh a weakly interacting massive particle was just discovered at the xenon dark matter detector or something like that okay data can come in in various sorts so what we do is we then define the likelihood the likelihood of measuring that data if each individual theory were true okay so the likelihood is a conditional probability the likelihood is saying if this theory were true if we conditionalize on that particular theory then what would the probability have been that we would have collected that particular piece of information we would have located that data so the likelihood is the probability of collecting the data given the theory okay what we want to get what we want to do is to update our priors with the fact that
we've collected the data so what we want is the posterior probability the probability that our theory is true now that we've collected the data so this the likelihood is something we can calculate right our theory should make a prediction that's what it means the theory predicts that the coin will be heads up 50 percent of the time or whatever um different theories will make different predictions and then this is what we calculate so this sorry this is what we infer so you calculate this you collect the data you start with this where do your priors come from yeah don't ask me this is the big looming question in bayesian inference you know how do you choose your priors in the first place how do you choose to think that weakly interacting massive particles are more likely than axions or whatever lots of different things go into it you know how pretty is it how elegant is the theory how simple uh have you written any papers on that particular model therefore you'd be happy you know this is all stuff that could in principle go into determining your priors um and then you infer from all that the after you've collected the data what is your new probability for that theory and so bayes's law is how to do that bayes's law says the probability of the theory now that you've collected the data is equal to the probability the likelihood the probability that the oops that you would have collected that data under the theory times the prior for that theory divided by the total probability that the data would be collected in some theory okay so the total probability of collecting the data is just the sum over all theories of the likelihood probability of data in theory i times the prior probability of theory i so you can see that the thing on the top of bayes's law is the thing we're summing over to get the bottom so by construction the left hand side is going to be a collection of numbers between zero and one that when you add them together they're going to give you one okay because when you add all
the probabilities together you're just taking the same thing on the top and bottom of this right hand side of the equation so this is guaranteed to give you a good probability and this is every statistician's and many scientists' favorite rule and so this is again this is a law this is a theorem this is how probabilities fit together but it also suggests an attitude it's just a mathematical expression but it also suggests a philosophy of life and the philosophy of life is there's a whole bunch of things that we're ignorant about i mean they may be true they may not be true but we should attach credences to them we should attach priors we should say i think this is probably true this thing is less likely to be true etc and then we should go through life collecting data and updating our priors using bayes's theorem just like this and the sad fact is that people don't do this uh this is this is you know as as uh someone said people call this bayesian rationality but you might as well just call it rationality it's just being rational about a world where you are not sure about things now in fact there's all sorts of psychological barriers to being a good bayesian reasoner there are examples where people are given data that is unlikely according to their favorite theory and they get defensive about it and become more and more wedded to their favorite theory okay it's not doesn't always happen that way but it's definitely something that can happen it's hard to be a good bayesian reasoner certainly if you're playing poker you should be a good bayesian reasoner i mean you might say well i have two pair i think that's the best hand i will raise and someone re-raises you you should update what is the chance that they would re-raise me well it's low if they have a crappy hand and it's high if they have a really good hand so maybe i should lower the probability that my hand is the best one right you should be always doing bayesian updating as you're sitting there playing poker
but it's hard to do in practice let's let's finish this up by running through one example of seeing how it actually works okay so let's use um it's just a modern physics thing that we really care about is you know what we're really annoyed by is the fact that we didn't discover any new particles at the large hadron collider other than the higgs boson which we all suspected was there we haven't discovered supersymmetry or extra dimensions or any of the things we were looking for so let's use that as an example so here's an example let t1 be supersymmetry exists in nature so t1 the theory one is that the fundamental laws of physics whatever they are have supersymmetry as an aspect and t2 is that no they don't supersymmetry is just a theorist's pipe dream it is not part of the real laws of nature okay so imagine you're a physicist and you like supersymmetry imagine you're a physicist 20 years ago we haven't yet turned on the large hadron collider and you have priors so you've thought about it you thought about the mathematical elegance and the explanatory power and the unification prospects of supersymmetry and you said this all looks good to me so you say my prior on t1 is 0.9 and therefore my prior on t2 because these are mutually exclusive theories and they're a comprehensive set there's no third option supersymmetry exists or it doesn't okay so they better add up to one so i better give 0.1 for my prior now this is just an illustrative example you don't have to agree with this particular physicist okay we're just running the numbers um and then your data is we did not see supersymmetry at the lhc we turned on the large hadron collider we did not see supersymmetry now the way to play the game is before you turned on the lhc you should be clear about your likelihoods okay now the the problem so what you want is the probability of getting that data in theory one and the probability of getting that data in theory two okay i could make this a little bit better
written because p's look like these um and in principle what i said up here was you just calculate the likelihood i mean if you have a well-posed theory it should make a statement about what is the probability of getting different data sadly in the real world you very often face theories like these which are not well posed okay theory one is not a specific model of grand unification and symmetry breaking and the whole bit any specific model will make a prediction either probability one or probability zero that supersymmetry is going to be located at the lhc but this is not like that the idea that supersymmetry exists in some model is a wide variety of possible worlds right there's a lot of different kinds of ways that supersymmetry could exist maybe it exists but it's broken at the planck scale so it exists it's part of nature but it's not visible at the large hadron collider maybe it exists but it's broken right there at the electroweak scale in which case you would expect to observe it at the large hadron collider so a certain amount of judgment comes into calculating these likelihoods even though they're supposed to be perfectly well defined so it's a sad fact so let's say that what you would have guessed before we turn the large hadron collider on is that if supersymmetry exists we probably would see it let's say you thought there was an 80 percent chance that you would see supersymmetry at the lhc so the probability of not seeing it the likelihood of not seeing it you write as 0.2 okay and the likelihood of not seeing supersymmetry if there is no supersymmetry is one these are likelihoods they don't need to add to one okay they could be they could both be one in different theories in which case if the likelihoods are the same in different theories then bayes's theorem does not move your priors into different posteriors at all but okay then we calculate the probability of getting that data in the whole set of possible theories right the probability of getting that
data you did not see supersymmetry is uh well the probability of getting that data in theory 1 which is 0.2 times the prior probability for theory 1 which is 0.9 plus the probability of getting that data in theory 2 which is 1 times the prior in theory 2 which is 0.1 and i've done this ahead of time this is 0.28 you can check my math at home my prior that my math is correct is about 50 percent um so that's the probability of getting the data given your priors and your likelihoods and therefore the probability that theory one is true the probability that supersymmetry is still lurking out there to be found given that you haven't found it at the lhc this is kind of fun that we can actually calculate this uh it's uh well what is the formula where is bayes's law the likelihood of that in theory one times the prior for theory one so that is 0.2 times 0.9 divided by the total probability of that data 0.28 and i work this out and i get 0.64 okay so i started the day 20 years ago i thought there was a 90 percent chance that supersymmetry was out there somewhere lurking to be found as a part of the fundamental laws of nature i turned the lhc on i didn't see it i was disappointed but i'm still left with what i would write as a 64 percent chance that supersymmetry is still out there so on the one hand less by a substantial amount than i started with on the other hand still greater than 50 percent this is supposed to explain why there's plenty of physicists out there who think that it's still very likely that supersymmetry is part of nature either we'll find it soon or the next big collider or we're not going to find it because it's broken at the planck scale but it's still out there anyway this is an example of uh good bayesian reasoning the the idea of bayesian inference is that this is the right way to think about all probabilities you should be honest about your priors and this is why people start arguing frequentists versus bayesians
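Written out, the supersymmetry update described above looks as follows. The symbols T_1, T_2 and D are the ones used in the talk; the typeset layout is an editorial sketch and not a copy of what is written on screen in the video.

```latex
\begin{align*}
P(T_i \mid D) &= \frac{P(D \mid T_i)\, P(T_i)}{\sum_j P(D \mid T_j)\, P(T_j)} \\[4pt]
P(D) &= P(D \mid T_1)\, P(T_1) + P(D \mid T_2)\, P(T_2)
      = (0.2)(0.9) + (1)(0.1) = 0.28 \\[4pt]
P(T_1 \mid D) &= \frac{(0.2)(0.9)}{0.28} = \frac{0.18}{0.28} \approx 0.64
\end{align*}
```

The posterior for T_2 is the remainder, 0.10 / 0.28, or about 0.36, so the two posteriors still add up to one.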
frequentists really want probabilities to be objective okay they really want to imagine that i could have flipped the coin an infinite number of times and find the frequency with which it came up heads or tails bayesians say i should admit that probabilities are just not objective and i should admit that i have priors which are subjective and the frequentists say well your priors could be anything you get any answer you want and the bayesians will say yes i could by picking dumb priors but number one real scientists don't pick dumb priors like they might disagree but they shouldn't be wildly inconsistent and number two we should be able to agree on the likelihoods given those theories maybe we can't but at least that's something we should be able to do and so a good bayesian scientist when they write a paper with some new data that points toward a result they will emphasize the likelihoods rather than the priors or the posteriors because those are more subjective anyway it's a useful way to help us think about quantitatively how to locate ourselves in the set of all possible worlds and it's a good way of thinking about how probability can be made compatible with the world even if the fundamental laws of physics are deterministic we don't know if they are i think that they probably are but that's just my prior
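The point about priors and likelihoods can be tried out numerically. The sketch below is a minimal illustration, not anything shown in the video: the function name bayes_update is made up for this example, the 0.9 / 0.1 priors and the 0.2 / 1.0 likelihoods are the ones from the supersymmetry example in the talk, and the 0.5 / 0.5 "skeptic" prior is a hypothetical second physicist added only to show that two people who agree on the likelihoods can still end up with different posteriors.

```python
def bayes_update(priors, likelihoods):
    """Return posteriors P(T_i | D) given priors P(T_i) and likelihoods P(D | T_i)."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood per theory")
    # Total probability of the data: sum over theories of likelihood times prior.
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    if evidence == 0:
        raise ValueError("the data has zero probability under every theory")
    # Bayes's theorem, applied theory by theory.
    return [l * p / evidence for l, p in zip(likelihoods, priors)]


# Likelihoods of "no supersymmetry seen at the LHC" under T1 (SUSY exists)
# and T2 (SUSY does not exist), as quoted in the talk.
likelihoods = [0.2, 1.0]

# Priors from the talk: 0.9 for T1, 0.1 for T2.
talk_posterior = bayes_update([0.9, 0.1], likelihoods)

# Hypothetical skeptic (an assumption for illustration): 50/50 prior.
skeptic_posterior = bayes_update([0.5, 0.5], likelihoods)

# Equal likelihoods in both theories leave the priors unchanged.
unchanged = bayes_update([0.9, 0.1], [1.0, 1.0])

print("talk prior    ->", [round(p, 2) for p in talk_posterior])
print("skeptic prior ->", [round(p, 2) for p in skeptic_posterior])
print("equal likelihoods ->", [round(p, 2) for p in unchanged])
```

Both physicists use the same likelihoods, which is the part the talk says people should be able to agree on; only the starting priors differ, and so do the posteriors.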
Info
Channel: Sean Carroll
Views: 57,718
Rating: 4.9014492 out of 5
Id: mDUgPJrXqRc
Length: 83min 36sec (5016 seconds)
Published: Tue Jul 28 2020