Alice, Bob, and the average shadow of a cube

Reddit Comments

If we're mainly talking about how to actually solve problems, then I think that it's good for Grant to have pointed out the biases that these kinds of videos can have towards slick solutions. He's a smart guy and I think the last thing he wants to be is misleading about math, so slightly self-critical videos like this are great not just for the specific content but also as an example.

Usually how I work is I just do whatever I can in order to get a solution, and it all ends up pretty messy. Kinda like going through a forest without having directions, basically a random walk. Lots of unnecessary steps/computations/ideas etc. But then upon reflection I notice some of these redundancies and find ways to shorten the path by connecting two dots in a more direct way. Thinking and sitting on it like this for long enough will usually end up with a nice, slick proof where the underlying idea that I initially drew on (but maybe was hidden) is showcased. Sometimes the solution spontaneously flips to a totally different method only revealed by this simplification process. In a way, if Bob pays attention to what he's doing, by knowing what his computations are saying, then he can find a homotopy from his solution to Alice's.

There's also more that goes into problem solving than an individual sitting down and doing computations or playing with ideas alone in a dark room (or on a test) - which is another bias that popularizing videos can have. There is way more collaboration and discussion that happens. Why are Alice and Bob working on the problem alone? Why are they not working together while both bringing their different perspectives to the table? We tend to construct mathematics as being done by very smart - slightly crazy - men alone in their rooms (even this video helps with this). Newton gets all the credit for Calculus, but a lot of what we would call calculus was known by his time (especially for algebraic curves), including a version of the Fundamental Theorem of Calculus by Barrow, Newton's advisor. Newton and Leibniz both found ways of using infinitesimals to generalize these results and use them in broader contexts. But infinitesimals/fluxions weren't even meaningful things; they were just things that, computationally, were and were not zero at different times, and it would take hundreds of years to develop meaningful tools for them. It's not like Newton went to the countryside to avoid the plague for two years and came out with the Principia; he collaborated during this time through letters, and it wasn't until long after his prominence that he published the Principia.

In reality, math is created from the diversity of thought and perspectives that many people bring to the table. The Alices, the Bobs, the Carols, etc., working together rather than in isolation. We generally can't solve problems by ourselves; we need each other. Whether it be remembering an idea a friend showed you one time, or straight-up talking on a board together, problem solving is a community effort that does not happen solely in an individual's brain. I typically sort out my problems by explaining them to others, and this process/feedback helps turn it slick or illuminate the key idea needed to finish it. A move away from "crazy, genius, hero mathematicians" to "a healthy community of similarly-interested collaborators with different ways of thinking" would be good to see.

👍︎︎ 109 👤︎︎ u/functor7 📅︎︎ Dec 20 2021 🗫︎ replies

Raise your hand if you're a "Frustrated Bob": likes calculations, starts that way, then ends up with a mess of tangled integrals that you can't solve and rage quits.

Grant's reflection is truly exemplary.

👍︎︎ 30 👤︎︎ u/0nlyg00d 📅︎︎ Dec 20 2021 🗫︎ replies

I'm very happy that this is the direction the video went, not claiming they're different styles but being honest about needing both of them (and probably a little more of Bob).

Far too many educators believe the 'learning styles' idea (where different students have preferred learning styles they're better in) so I'm glad we didn't have some "find out whether you're an Alice or a Bob" kind of thing :)

👍︎︎ 58 👤︎︎ u/asphias 📅︎︎ Dec 20 2021 🗫︎ replies

I just watched this… So, people of r/math, is there a nice slick generalization to when the light source isn’t infinitely far away?

👍︎︎ 12 👤︎︎ u/Nilstyle 📅︎︎ Dec 20 2021 🗫︎ replies

Mathologer also has a video featuring the shadow cube problem

part 1 - part 2

👍︎︎ 7 👤︎︎ u/snillpuler 📅︎︎ Dec 20 2021 🗫︎ replies

3b1b needs a Nobel prize for the wonderful videos he puts on YouTube.

👍︎︎ 18 👤︎︎ u/sossolha9ira 📅︎︎ Dec 20 2021 🗫︎ replies

I really like the Alice approach here. As soon as he reached the conclusion that the constant should be the same for every convex shape, at around 20 minutes, I figured out the solution.

I also really like his conclusion. I think this is a general problem with how math is presented, but it's also necessary to be this way. The reason we can understand so much math in school so quickly is because it's already neatly packed into generalisations and specialisations for the students. In hindsight a lot of this stuff seems easy and there is so much math you learn that it can't be expected of every student to go through every step by themselves. I think it's important to realise that you can still learn a lot about how math works by going straight to the neat solutions.

On the other hand, this creates a picture of how math research works that is just not realistic. The reality is that a lot of trial and error happens, that never shows up in the released papers. In turn students at university are very frustrated when they have to do a lot of work to figure stuff out. I wish there was more time at school to teach that part of doing math is to overcome that frustration and keep trying. I know that I am grossly oversimplifying things here, but I think being able to take that initial frustration is one of the most important traits to become a mathematician.

👍︎︎ 4 👤︎︎ u/DerFelix 📅︎︎ Dec 20 2021 🗫︎ replies

wake up babe new 3b1b video

👍︎︎ 5 👤︎︎ u/Iron_2019 📅︎︎ Dec 20 2021 🗫︎ replies
Captions
In a moment I'm going to tell you about a certain really nice puzzle involving the shadow of a cube. But before we get to that I should say that the point of this video is not exactly the puzzle, per se, it's about two distinct problem-solving styles that are reflected in two different ways that we can tackle this problem. In fact let's anthropomorphize those two different styles by imagining two students, Alice and Bob, who embody each one of the approaches.

So Bob will be the kind of student who really loves calculation. As soon as there's a moment when he can dig into the details and get a very concrete view of the concrete situation in front of him, that's where he's the most pleased. Alice on the other hand is more inclined to procrastinate the computations, not because she doesn't know how to do them or doesn't want to, per se, but she prefers to get a nice high-level general overview of the kind of problem she's dealing with, the general shape that it has, before she digs into the computations themselves. She's most pleased if she understands not just the specific question sitting in front of her, but also the broadest possible way that you could generalize it, and especially if the more general view can lend itself to more swift and elegant computations once she does actually sit down to carry them out.

Now the puzzle that both of them are going to be faced with is to find the average area for the shadow of a cube. So if I have a cube kind of sitting here hovering in space, there are a few things that influence the area of its shadow. One obvious one would be the size of the cube, smaller cube smaller shadow. But also if it's sitting at different orientations, those orientations correspond to different particular shadows with different areas. And when I say find the average here, what I mean is the average over all possible orientations for a particular size of the cube.
The astute among you might point out that it also matters a lot where the light source is. If the light source were very low, close to the cube itself, then the shadow ends up larger, and if the light source were kind of positioned laterally off to the side, this can distort the shadow and give it a very different shape. Accounting for that light position stands to be highly interesting in its own right, but the puzzle is hard enough as it is, so at least initially let's do the easiest thing we can and say that the light is directly above the cube and really far away. Effectively infinitely far so that all we're considering is a flat projection, in the sense that if you look at any coordinates (x, y, z) in space the flat projection would be (x, y, 0).

So just to get our bearings, the easiest situation to think about would be if the cube is straight up with two of its faces parallel to the ground. In that case this flat projection shadow is simply a square, and if we say the side lengths of the cube are s, then the area of that shadow is s squared. And by the way, anytime that I have a label up on these animations like the one down here I'll be assuming that the relevant cube has a side length of 1. Another special case among all the orientations that's fun to think about is if the long diagonal is parallel to the direction of the light. In that case the shadow actually looks like a regular hexagon, and if you use some of the methods that we will develop in a few minutes, you can compute that the area of that shadow is exactly the square root of three times the area of one of the square faces. But of course more often the actual shadow will not be so regular as a square or a hexagon, it's some harder-to-think-about shape based on some harder-to-think-about orientation for this cube.
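The hexagon value can also be checked by hand without the later machinery. The following is only a sketch under stated assumptions: the cube is placed with one corner at the origin and edges along the coordinate axes, the light travels along the long diagonal, and the symbols \hat d, v, r and A are introduced here for illustration. The two corners on the diagonal land at the center of the shadow, and the other six land at a common distance r from it, spaced 60 degrees apart, so the outline is a regular hexagon with circumradius r.

```latex
\begin{align*}
\hat d &= \tfrac{1}{\sqrt{3}}(1,1,1)
  && \text{unit vector along the long diagonal} \\
r^2 &= \lVert v \rVert^2 - (v \cdot \hat d)^2
  && \text{squared distance of a corner } v \text{ from the diagonal} \\
v = (s,0,0):\quad r^2 &= s^2 - \tfrac{s^2}{3} = \tfrac{2}{3}s^2 \\
v = (s,s,0):\quad r^2 &= 2s^2 - \tfrac{4s^2}{3} = \tfrac{2}{3}s^2 \\
A &= \tfrac{3\sqrt{3}}{2}\, r^2 = \tfrac{3\sqrt{3}}{2}\cdot\tfrac{2}{3}s^2 = \sqrt{3}\, s^2
  && \text{area of a regular hexagon with circumradius } r
\end{align*}
```

This agrees with the stated value of the square root of three times the area of one square face.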
Earlier, I casually threw out this phrase of averaging over all possible orientations, but you could rightly ask what exactly is that supposed to mean. I think a lot of us have an intuitive feel for what we want it to mean, at least in the sense of what experiment would you do to verify it. You might imagine tossing this cube in the air like a die, freezing it at some arbitrary point, recording the area of the shadow from that position, and then repeating. If you do this many many times over and over you can take the mean of your sample. The number that we want to get at, the true average here, should be whatever that experimental mean approaches as you do more and more tosses approaching infinitely many.

Even still, the sticklers among you could complain that doesn't really answer the question, because it leaves open the issue of how we're defining a "random" toss. The proper way to answer this, if we want it to be more formal, would be to first describe the space of all possible orientations, which mathematicians have actually given a fancy name. They call it SO(3), typically defined in terms of a certain family of 3-by-3 matrices. And the question we want to answer is "What probability distribution are we putting to this entire space?" It's only when such a probability distribution is well-defined that we can answer a question involving an average.

If you are a stickler for that kind of thing, I want you to hold off on that question until the end of the video. You'll be surprised at how far we can get with the more heuristic experimental idea of just repeating a bunch of random tosses without really defining the distribution. Once we see Alice and Bob's solutions, it's actually very interesting to ask how exactly each one of them defined this distribution along their way.
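The tossing experiment described above can be simulated directly. The sketch below is not from the video: it assumes one particular meaning of a "random" toss (a rotation built from a normalized four-dimensional Gaussian sample, a standard way to draw uniformly from SO(3)), and every function name in it is made up for illustration. It projects the eight corners of a unit cube onto the ground, takes the convex hull of the shadow points, and averages the hull areas.

```python
import math
import random

# Corners of a unit cube centered at the origin.
CUBE = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

def random_rotation(rng):
    """Rotation matrix from a normalized Gaussian quaternion (assumed 'random toss')."""
    w, x, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for a simple polygon given as ordered vertices."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def shadow_area(rot):
    """Area of the flat projection (x, y, z) -> (x, y, 0) of the rotated cube."""
    shadow = []
    for vx, vy, vz in CUBE:
        px = rot[0][0] * vx + rot[0][1] * vy + rot[0][2] * vz
        py = rot[1][0] * vx + rot[1][1] * vy + rot[1][2] * vz
        shadow.append((px, py))
    return polygon_area(convex_hull(shadow))

def mean_shadow_area(tosses, seed=0):
    """Empirical mean of the shadow area over a sample of random tosses."""
    rng = random.Random(seed)
    return sum(shadow_area(random_rotation(rng)) for _ in range(tosses)) / tosses

for tosses in (100, 1000, 10000):
    print(tosses, mean_shadow_area(tosses))
```

Running it with more and more tosses shows the empirical mean settling down, which is the number the puzzle asks for under this assumed distribution.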
And remember, this is not meant to be a lesson about cube shadows, per se, but a lesson about problem-solving told through the lens of two different mindsets that we might bring to the puzzle. And as with any lesson on problem-solving, the goal here is not to get to the answer as quickly as we can, but hopefully for you to feel like you found the answer yourself. So if ever there's a point when you feel like you might have an idea, give yourself the freedom to pause and try to think it through.

As a first step, and this is really independent of any particular problem-solving styles, just anytime you find a hard question, a good thing that you can do is ask "What's the simplest possible non-trivial variant of the problem that you can try to solve?" In our case what you might say is, okay, let's forget about averaging over all the orientations. That's a tricky thing to think about. And let's even forget about all the different faces of the cube, because they overlap and that's also tricky to think about. Just for one particular face and one particular orientation, can we compute the area of this shadow?

Once more, if you want to get your bearings with some special cases the easiest is when that face is parallel to the ground in which case the area of the shadow is the same as the area of the face. And on the other hand if we were to tilt that face 90 degrees, then its shadow will be a straight line and it has an area of zero. So Bob looks at this and he wants an actual formula for that shadow, and the way he might think about it is to consider the normal vector perpendicular off of that face. What seems relevant is the angle that that normal vector makes with the vertical, with the direction where the light is coming from, which we might call theta.
Now from the two special cases we just looked at, we know that when theta is equal to 0, the area of that shadow is the same as the area of the shape itself, which is s squared if the square has side lengths s. And if theta is equal to 90 degrees, then the area of that shadow is zero. And it's probably not too hard to guess that trigonometry will be somehow relevant, so anyone comfortable with their trig functions could probably hazard a guess as to what the right formula is.

But Bob is more detail-oriented than that. He wants to properly prove what that area should be rather than just making a guess based on the endpoints. The way you might think about it could be something like this. If we consider the plane that passes through the vertical as well as our normal vector, and then we consider all the different slices of our shape that are in that plane, or parallel to that plane, then we can focus our attention on a two-dimensional variant of the problem. If we just look at one of those slices, which has a normal vector an angle theta away from the vertical, its shadow might look something like this. And if we draw a vertical line up to the left here, we have ourselves a right triangle. And from here we can do a little bit of angle chasing, where we follow around what that angle theta implies about the rest of the diagram. And this means the lower right angle in this triangle is precisely theta.

So when we want to understand the size of this shadow in comparison to the original size of the piece, we can think about the cosine of that angle theta, which remember is the adjacent over the hypotenuse. It's literally the ratio between the size of the shadow and the size of the slice. So the factor by which the slice gets squished down in this direction is exactly cosine of theta. And if we broaden our view to the entire square all the slices in that direction get scaled by the same factor.
But in the other direction, the one perpendicular to that slice, there is no stretching or squishing because the face is not at all tilted in that direction. So overall the two-dimensional shadow of our two-dimensional face should also be scaled down by this factor of a cosine of theta. It lines up with what you might intuitively guess given the case where the angle is 0 degrees and the case where it's 90 degrees, but it's reassuring to see why it's true.

Actually, as stated so far, this is not quite correct. There is a small problem with the formula that we've written. In the case where theta is bigger than 90 degrees, the cosine would actually come out to be negative, but of course we don't want to consider the shadow to have negative area. At least not in a problem like this. So there's two different ways you could solve this. You could say we only ever want to consider the normal vector that is pointing up, that has a positive z component. Or more simply we could say just take the absolute value of that cosine, and that gives us a valid formula.

So Bob's happy because he has a precise formula describing the area of the shadow, but Alice starts to think about it a little bit differently. She says, okay we've got some shape, and then we apply a rotation that sort of situates it into 3D space in some way, and then we apply a flat projection that shoves that back into two-dimensional space. And what stands out to her is that both of these are linear transformations. That means that in principle you could describe each one of them with a matrix, and that the overall transformation would look like the product of those two matrices. What Alice knows from one of her favorite subjects, linear algebra, is that if you take some shape and you consider its area, then you apply some linear transformation, the area of that output looks like some constant times the original area of the shape.
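Bob's formula and Alice's observation can both be probed numerically. The sketch below is an illustration rather than anything shown in the video: the helper names, the particular angles, and the choice to build the rotation as a spin about z, a tilt about x, and another spin about z are all assumptions. It rotates two different flat shapes by the same rotation, projects them to the ground, and compares the ratio of shadow area to original area with the absolute value of the cosine of the tilt.

```python
import math

def rot_x(t):
    """Rotation by angle t about the x-axis (this sets the tilt theta)."""
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(t):
    """Rotation by angle t about the vertical z-axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """Product of two 3-by-3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def polygon_area(poly):
    """Shoelace formula for a simple polygon given as ordered vertices."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def shadow_ratio(shape, rot):
    """Shadow area over original area for a flat polygon lying in the xy-plane."""
    # A point (x, y, 0) is rotated, then its z-coordinate is dropped.
    shadow = [(rot[0][0] * x + rot[0][1] * y, rot[1][0] * x + rot[1][1] * y)
              for x, y in shape]
    return polygon_area(shadow) / polygon_area(shape)

theta = 2.0  # tilt in radians, deliberately more than 90 degrees
rot = matmul(rot_z(0.7), matmul(rot_x(theta), rot_z(0.3)))

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(0, 0), (3, 0), (1, 4)]

print(shadow_ratio(square, rot), shadow_ratio(triangle, rot), abs(math.cos(theta)))
```

The three printed numbers agree to within rounding: the factor depends on the rotation but not on the shape, and the absolute value is what keeps it positive once the tilt passes 90 degrees.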
More specifically we have a name for that constant, it's called the determinant of the transformation.

If you're not so comfortable with linear algebra, we could give a much more intuitive description and say if you uniformly stretch the original shape in some direction, the output will also uniformly get stretched in some direction. So the area of each of them should scale in proportion to each other. Now, in principle Alice could compute this determinant. But it's not really her style to do that, at least not to do so immediately. Instead the thing that she writes down is how this proportionality constant between our original shape and its shadow does not depend on the original shape. We could be talking about the shadow of this cat outline, or anything else, and the size of it doesn't really matter, the only thing affecting that proportionality constant is what transformation we're applying, which in this context means we could write it down as some factor that depends on the rotation being applied to the shape. In the back of our mind because of Bob's calculation we know what that factor looks like, you know it's the absolute value of the cosine of the angle between the normal vector and the vertical. But Alice right now is just saying, "yeah, yeah, I can think about that eventually when I want to." But she knows we're about to average over all the different orientations anyway, so she holds out some hope that any specific formula about a specific orientation might get washed away in that average.

Now it's easy to look at this and say, "Okay, well Alice isn't really doing anything then!" Of course the area of the shadow is proportional to the area of the original shape, they're both two-dimensional quantities, they should both scale like two-dimensional things.

But keep in mind this would not at all be true if we were dealing with the harder case that has a closer light source.
In that case the projection is not linear. So for example if I rotate this cat so that its tail ends up quite close to the light source, then if I stretch the original shape uniformly in the x direction, say by a factor of 1.5, it might have a very disproportionate effect on the ultimate shadow, because the tail gets very disproportionately blown up as it gets really close to the light.

Again, Alice is keeping an eye out for what properties of the problem are actually relevant, because that helps her know how much she can generalize things. Does the fact that we're thinking about a square face and not some other shape matter? No not really. Does the fact that the transformation is linear matter? Yes, absolutely.

Alice can also apply a similar way of thinking about the average shadow for any shape like this. Say we have some sequence of rotations that we apply to our square face, and let's call them R1, R2, R3, and so on. Then the area of the shadow in each one of those cases looks like some factor times the area of the square, and that factor depends on the rotation. So if we take an empirical average for that shadow across the sample of rotations we're looking at right now, the way it looks is to add up all of those shadow areas and then divide by the total number that we have.

Now, because of the linearity, this area of the original square can cleanly factor out of all of that, and it ends up on the left. This isn't the exact average that we're looking for, it's just an empirical mean of a sample of rotations. But in principle what we're looking for is what this approaches as the size of our sample approaches infinity. And all the parts that depend on the size of the sample sit cleanly away from the area itself. So whatever this approaches, in the limit it's just going to be some number.
It might be a royal pain to compute, we're not sure about that yet, but the thing that Alice notes is that it's independent of the size and the shape of the particular 2D thing that we're looking at. It's a universal proportionality constant, and her hope is that that universality somehow lends itself to a more elegant way to deduce what it must be. Now Bob would be eager to compute this constant here and now, and in a few minutes I'll show you how he does it. But before that, I do want to stay in Alice's world for a little bit more, because this is where things start to really get fun.

In her desire to understand the overall structure of the question before diving into the details, she's curious now about how the area of the shadow of the cube relates to the area of the shadows of its individual faces. If we can say something about the average shadow area of a particular face, does that tell us anything about the average shadow area of the cube as a whole? For example, a simple thing we could say is that that area is definitely less than the sum of the shadow areas across all the faces, because there's a meaningful amount of overlap between those shadows. But it's not entirely clear how to think about that overlap, because if we focus our attention just on two particular faces, in some orientations they don't overlap at all, but in other orientations they do have some overlap. The specific shape and area of that overlap seems a little bit tricky to think about, much less how on earth we would average that across all of the different orientations.

But Alice has about three clever insights through this whole problem, and this is the first one of them. She says, actually, if we think about the whole cube, not just a pair of faces, we can conclude that the area of the shadow for a given orientation is exactly one-half the sum of the shadow areas of all of the faces.
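One way to test the one-half claim before seeing why it holds is to count, for many points on the ground, how many face shadows cover each point. The sketch below is only an illustration with hypothetical helper names and an arbitrarily chosen orientation: it builds the six face shadows of a rotated unit cube as quadrilaterals and records every count that occurs for randomly sampled ground points.

```python
import math
import random

def rotate(p, a, b, c):
    """Rotate point p about z by c, then about x by b, then about z by a."""
    x, y, z = p
    x, y = x * math.cos(c) - y * math.sin(c), x * math.sin(c) + y * math.cos(c)
    y, z = y * math.cos(b) - z * math.sin(b), y * math.sin(b) + z * math.cos(b)
    x, y = x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)
    return (x, y, z)

def face_shadows(angles):
    """Shadows of the six faces of a unit cube, each as four ordered corners."""
    shadows = []
    for axis in range(3):
        u, v = [i for i in range(3) if i != axis]
        for sign in (-0.5, 0.5):
            quad = []
            for du, dv in ((-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)):
                corner = [0.0, 0.0, 0.0]
                corner[axis], corner[u], corner[v] = sign, du, dv
                x, y, _ = rotate(corner, *angles)  # drop z: flat projection
                quad.append((x, y))
            shadows.append(quad)
    return shadows

def contains(quad, p):
    """True if p lies strictly inside the convex quadrilateral quad."""
    signs = []
    for i in range(4):
        (x1, y1), (x2, y2) = quad[i], quad[(i + 1) % 4]
        signs.append((x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1))
    return all(s > 0 for s in signs) or all(s < 0 for s in signs)

def cover_counts(angles, samples, seed=0):
    """Set of 'number of face shadows covering a point' values that occur."""
    rng = random.Random(seed)
    shadows = face_shadows(angles)
    counts = set()
    for _ in range(samples):
        p = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        counts.add(sum(contains(q, p) for q in shadows))
    return counts

print(cover_counts((0.4, 1.1, 2.3), 5000))
```

For this orientation the only counts that show up are 0 (the point is outside the cube's shadow) and 2 (it is inside), which is what the one-half factor expresses. Points landing exactly on the shadow of an edge would be ambiguous, but random samples essentially never hit them.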
Intuitively, you can maybe guess that half of them are bathed in the light, and half of them are not. But here's the way that she justifies it. She says for a particular ray of light that would go from the sky and eventually hit a point in the shadow, that ray passes through the cube at exactly two points. There's one moment when it enters, and one moment when it exits, so every point in that shadow corresponds to exactly two faces above it. Well, okay, that's not exactly true. If that beam of light happened to go through the edge of one of the squares, there's a little bit of ambiguity on how many faces it's passing. But those account for zero area inside the shadow, so we're safe to ignore them if the thing we're trying to do is compute the area.

If Alice is pressed and she needs to justify why exactly this is true, which is important for understanding how the problem might generalize, she can appeal to the idea of convexity. Convexity is one of those properties where a lot of us have an intuitive sense for what it should mean. You know, it's shapes that just bulge out, they never dent inward. But mathematicians have a pretty clever way of formalizing it that's helpful for actual proofs.

They say that a set is "convex" if the line that connects any two points inside that set is entirely contained within the set itself. So a square is convex because no matter where you put two points inside that square, the line connecting them is entirely contained inside the square.

But something like the symbol pi is not convex. I can easily find two different points so that the line connecting them has to peek outside of the set itself. None of the letters in the word "convex" are themselves convex. You can find two points so that the line connecting them has to pass outside of the set.
It's a really clever way to formalize this idea of a shape that only bulges out, because any time that it dents inward, you can find these counter-example lines.

For our cube, because it's convex, between the first point of entry and the last point of exit it has to stay entirely inside the cube, by the definition of convexity. But if we were dealing with some other non-convex shape, like a donut, you could find a ray of light that enters then exits then enters and exits again. So you wouldn't have a clean two-to-one cover from the shadows. The shadows of all of its different parts, if you were to cover this in a bunch of faces, would not be precisely two times the area of the shadow itself.

So that's the first key insight, the face shadows double-cover the cube shadow. And the next one is a little bit more symbolic, so let's start things off by abbreviating our notation a little to make room on the screen. Instead of writing Area(Shadow(Cube)), I'm just going to write S(Cube). And similarly instead of Area(Shadow(a particular face)), I'm just going to write S(F_j), where that subscript j indicates which face I'm talking about.

But of course, we should really be talking about the shadow of a particular rotation applied to the cube, so I might write this as S of some rotation applied to the cube. And likewise on the right, it's the area of the shadow of that same rotation applied to a given one of the faces.

With the more compact notation at hand, let's think about the average of this shadow area across many different rotations, some sample of R1, R2, R3, and so on. Again, that average just involves adding up all of those shadow areas and then dividing them by n, and in principle if we were to look at this for larger and larger samples, letting n approach infinity, that would give us the average area of the shadow of the cube. Some of you might be thinking, "yes, we know this, you've said this already."
But it's beneficial to write it out so that we can understand why it is that expressing the shadow area for a particular rotation of the cube as a sum across all of its faces, or one half times that sum at least... why is that beneficial? What is that going to do for us? Well, let's just write it out, where for each one of these rotations of the cube we could break down that shadow as a sum across that same rotation applied across all of the faces. And when it's written as a grid like this, we can get to Alice's second insight, which is to shift the way that we're thinking about the sum from going row-by-row to instead going column-by-column.

For example if we focused our attention just on the first column, what it's telling us is to add up the area of the shadow of the first face across many different orientations. So if we were to take that sum and divide it by the size of our sample, that gives us an empirical average for the area of the shadow of this face. So if we take larger and larger samples, letting that size go to infinity, this will approach the average shadow area for a square. Likewise, the second column can be thought of as telling us the average shadow area for the second face of the cube, which should of course be the same number. And same deal for any other column, it's telling us the average shadow area for a particular face.

So that gives us a very different way of thinking about our whole expression. Instead of saying add up the shadow areas of the cube at all the different orientations, we could say just add up the average shadows for the six different faces and multiply the total by one half. The term on the left here is thinking about adding up rows first, and the term on the right is thinking about adding up columns first.

In short, the average of the sum of the face shadows is the same as the sum of the average of the face shadows.
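For a finite sample the swap is plain arithmetic, and a small grid makes that visible. The sketch below is an illustration with made-up function names. It leans on two things that are assumptions here: Bob's absolute-cosine formula gives each unit face's shadow area, and a random orientation is modeled by drawing the vertical components of the three rotated face normals as a normalized Gaussian triple (a uniformly random unit vector). Each row of the grid is one orientation and each column is one face.

```python
import math
import random

def face_shadow_row(rng):
    """Shadow areas of the six unit faces for one random orientation."""
    # The vertical components of the three rotated axis directions form a unit
    # vector; here it is sampled directly (assumed uniform on the sphere).
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in g))
    cosines = [abs(c) / norm for c in g]
    # Opposite faces have opposite normals, so they cast equal shadows.
    return cosines + cosines

def build_grid(n, seed=0):
    """n rows (orientations) by 6 columns (faces) of face-shadow areas."""
    rng = random.Random(seed)
    return [face_shadow_row(rng) for _ in range(n)]

def rows_first(grid):
    """Average over orientations of one half the sum of the face shadows."""
    return sum(0.5 * sum(row) for row in grid) / len(grid)

def columns_first(grid):
    """One half the sum over faces of each face's average shadow."""
    n = len(grid)
    column_means = [sum(row[j] for row in grid) / n for j in range(6)]
    return 0.5 * sum(column_means)

grid = build_grid(2000)
print(rows_first(grid), columns_first(grid))
```

The two printed numbers match up to floating-point rounding. The subtlety the narration hints at only appears in the limit of infinitely many rotations, where each column's average has to converge to the same per-face value.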
Maybe that swap seems simple, maybe it doesn't, but I can tell you that there is actually a little bit more than meets the eye to the step that we just took. But we'll get to that later.

And remember, we know that the average area for a particular face looks like some universal proportionality constant times the area of that face, so if we're adding this up across all the faces of the cube, we could think of this as equaling some constant times the surface area of the cube. And that's pretty interesting, the average area for the shadow of this cube is going to be proportional to its surface area. But at the same time, you might complain, "Well Alice is just pushing around a bunch of symbols here, because none of this matters if we don't know what that proportionality constant is!"

I mean it almost seems obvious. Like, of course the average shadow area should be proportional to the surface area, they're both two-dimensional quantities, so they should scale in lock step with each other. It's not obvious. After all, for a closer light source it simply wouldn't be true. And also this business where we added up the grid column-by-column versus row-by-row is a little more nuanced than it might look at first, there's a subtle hidden assumption underlying all of this which carries a special significance when we choose to revisit the question of what probability distribution is being taken across the space of all orientations.

But more than anything, the reason that it's not obvious is that the significance of this result right here is not merely that these two values are proportional. It's that an analogous fact will hold true for any convex solids, and crucially, the actual content of what Alice has built up so far is that it'll be the same proportionality constant across all of them. Now if you really mull over that, some of you may be able to predict the way that Alice is able to finish things off from here.
It's really delightful, it's honestly my main reason for covering this topic. But before we get into it, I think it's easy to under-appreciate her result unless we dig into the details of what it is that she manages to avoid.

So let's take a moment to turn our attention back into Bob's world because while Alice has been doing all of this, he's been busy doing some computations. In fact, what he's been working on is finding exactly what Alice has yet to figure out, which is how to take the formula that he found for the area of a square's shadow, and taking the natural next step of trying to find the average of that square's shadow, averaged over all possible orientations.

The way Bob starts, if he's thinking about all the different possible orientations for this square, is to ask what are all the different normal vectors that that square can have in all these orientations. Because everything about its shadow comes down to that normal vector. It's not too hard to see that all those possible normal vectors trace out the surface of a sphere. If we assume it's a unit normal vector, it's a sphere with radius 1. And furthermore, Bob figures that each point of the sphere should be just as likely to occur as any other, our probabilities should be uniform in that way. There's no reason to prefer one direction over another. But in the context of continuous probabilities it's not very helpful to talk about the likelihood of a particular individual point, because in the uncountable infinity of points on the sphere, that would be zero and unhelpful. So instead the more precise way to phrase this uniformity would be to say the probability that our normal vector lands in any given patch of area on the sphere should be proportional to that area itself. More specifically it should equal the area of that little patch divided by the total surface area of the sphere.
If that's true no matter what patch of area we're considering, that's what we mean by a uniform distribution on the sphere. Now to be clear, points on the sphere are not the same thing as orientations in 3d space, because even if you know what normal vector the square is going to have, that leaves us with another degree of freedom. The square could be rotated about that normal vector. But Bob doesn't actually have to care about that extra degree of freedom, because in all of those cases the area of the shadow is the same. It's only dependent on the cosine of the angle between that normal vector and the vertical, which is kind of neat. All those shadows are genuinely different shapes, they're not the same, but the area of each of them will be the same. What this means is that when Bob wants this average shadow area over all possible orientations, all he really needs to know is the average value of this absolute value of cosine of theta for all different possible normal vectors, all different possible points on the sphere. So how do you compute an average like this? Well if we lived in some kind of discrete, pixelated world, where there's only a finite number of possible angles theta that that normal vector could have, the average would be pretty straightforward. What you do is find the probability of landing on any particular value of theta, which will tell us something like how much of the sphere do normal vectors with that angle make up, and then you multiply it by the thing we want to take the average of, this formula for the area of the shadow. And then you would add that up over all of the different possible values of theta ranging from 0 up to 180 degrees, or pi radians.
But of course in reality there is a  continuum of possible values of theta   this uncountable infinity, and the probability of  landing on any specific particular value of theta   will actually be zero, and so a sum like this  unfortunately doesn't really make any sense.   Or if it does make sense, adding up infinitely  many zeros should just give us a zero. The short answer for what we do instead is that  we compute an integral. And I'll level with you,   the hard part here is I'm not entirely sure what  background I should be assuming from those of   you watching right now. Maybe it's the case  that you're quite comfortable with calculus,   and you don't need me to belabor the point here.  Maybe it's the case that you're not familiar with   calculus, and I shouldn't just be throwing down  integrals like that. Or maybe you...you know,   you took a calculus class a while ago  but you need a little bit of a refresher. I'm gonna go with the option of setting  this up as if it's a calculus lesson,   because to be honest even when  you are quite comfortable with   integrals setting them up can be  kind of an error-prone process,   and calling back to the underlying definition is a  good way to sort of check yourself in the process. If we lived in a time before calculus  existed and integrals weren't a thing   and we wanted to approximate an answer to this  question, one way we could go about it is to take   a sample of values for theta that ranges  from 0 up to 180 degrees. We might think   of them as evenly spaced, with some sort of  difference between each one, some delta-theta. And it's still the case that it would be unhelpful  to ask about the probability of a particular value   of theta occurring, even if it's one in our  sample. That probability would still be zero,   and it would be unhelpful. But what is helpful  to ask is the probability of falling between two   different values from our sample, in this little  band of latitude with a width of delta theta. 
Based on our assumption that the distribution  along the sphere should be uniform,   that probability comes down to knowing the area  of this band. More specifically, the chances   that a randomly chosen vector lands in that band  should be that area divided by the total surface   area of the sphere. To figure out that area,  let's first think of the radius of that band,   which if the radius of our sphere is 1 is  definitely going to be smaller than 1. And   in fact, if we draw the appropriate little right  triangle here, you can see that that little radius   let's just say at the top of the band should be  the sine of our angle, the sine of theta. This   means that the circumference of the band should  be 2 pi times the sine of that angle. And then   the area of the band should be that circumference  times its thickness, that little delta theta. Or   rather, the area of our band is approximately this  quantity. What's important is that for a finer   sample of many more values of theta the accuracy  of that approximation would get better and better. Now remember, the reason we wanted this area is  to know the probability of falling into that band,   which is this area divided by the surface area  of the sphere, which we know to be 4 pi times   its radius squared. That's a value that you could  also compute with an integral similar to the one   that we're setting up now, but for now we can take  it as a given, as a standard well-known formula. And this probability itself is just a stepping  stone in the direction of what we actually want,   which is the average area  for the shadow of a square.   To get that we'll multiply this probability times  the corresponding shadow area, which is this   absolute value of cosine theta expression  we've seen many times up to this point.   And our estimate for this average would  now come down to adding up this expression   across all of the different bands, all of the  different samples of theta that we've taken. 
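The band-by-band sum described above can be tried numerically. What follows is only a rough sketch, not something shown in the video: the function name `average_abs_cos` is made up for illustration, and it measures each band's radius at the middle of the band rather than at the top, which is an arbitrary choice that stops mattering as the bands get thinner.

```python
import math

def average_abs_cos(num_bands: int) -> float:
    """Approximate the average of |cos(theta)| over a unit sphere by
    slicing it into latitude bands of equal angular width (hypothetical helper)."""
    delta_theta = math.pi / num_bands
    total = 0.0
    for i in range(num_bands):
        # Assumption: sample theta at the midpoint of the band.
        theta = (i + 0.5) * delta_theta
        # Band area is roughly circumference times thickness.
        band_area = 2 * math.pi * math.sin(theta) * delta_theta
        # Probability of landing in the band: its area over the sphere's area, 4 pi.
        probability = band_area / (4 * math.pi)
        # Weight the shadow factor |cos(theta)| by that probability.
        total += probability * abs(math.cos(theta))
    return total

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, average_abs_cos(n))
```

With finer and finer bands the printed values should settle toward one half, which is the limit the integral is meant to capture.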
This right here, by the way, is when  Bob is just totally in his element.   We've got a lot of exact formulas  describing something very concrete,   actually digging in on our way to a real answer.  And again, if it feels like a lot of detail,   I want you to appreciate that fact so that you  can appreciate just how magical it is when Alice   manages to somehow avoid all of this. Anyway,  looking back at our expression, let's clean   things up a little bit, like factoring out all  of the terms that don't depend on theta itself.   And we can simplify that 2 pi divided by 4 pi to  simply be one half. And to make it a little more   analogous to calculus with integrals, let me  just swap the main terms inside the sum here. What we now have, this sum that's going  to approximate the answer to our question,   is almost what an integral is. Instead of writing  the sigma for sum, we write the integral symbol,   this kind of elongated Leibnizian S showing us  that we're going from zero to pi. And instead   of describing the step size as delta theta, a  concrete finite amount, we instead describe it as   "d" theta, which I like to think of as signaling  the fact that some kind of limit is being taken. What that integral means, by definition, is  whatever the sum on the bottom approaches   for finer and finer subdivisions, more dense  samples that we might take for theta itself.   And at this point, for those of you who do know  calculus, I'll just write down the details of how   you would actually carry this out as you might see  it written down in Bob's notebook. It's the usual   anti-derivative stuff, but the one key step  is to bring in a certain trig identity.   In the end, what Bob finds after doing  this is the surprisingly clean fact   that the average area for a square's shadow  is precisely one-half the area of that square.   This is the mystery constant  which Alice doesn't yet know.   
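The on-screen notebook work is not captured in the transcript, so here is one way the computation could go. This is a reconstruction under an assumption: the "certain trig identity" is taken to be the double-angle identity sin(2θ) = 2 sin θ cos θ, which the transcript does not name.

```latex
\begin{align*}
\text{average of } |\cos\theta|
  &= \frac{1}{2}\int_{0}^{\pi} |\cos\theta|\,\sin\theta \, d\theta \\
  &= \int_{0}^{\pi/2} \cos\theta\,\sin\theta \, d\theta
     && \text{(the integrand is symmetric about } \theta = \pi/2\text{)} \\
  &= \frac{1}{2}\int_{0}^{\pi/2} \sin(2\theta)\, d\theta
     && \text{(double-angle identity)} \\
  &= \frac{1}{2}\left[-\frac{1}{2}\cos(2\theta)\right]_{0}^{\pi/2}
   = \frac{1}{2}\left(\frac{1}{2} + \frac{1}{2}\right)
   = \frac{1}{2}.
\end{align*}
```

Multiplying by the area of the square gives the average shadow area, one half of the square's area.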
If Bob were to look over her shoulder and see  the work that she's done he could finish out   the problem right now. He plugs in the constant  that he just found and he knows the final answer. And now, finally! With all of this as backdrop,   what is it that Alice does to  carry out the final solution? I introduced her as someone who really  likes to generalize the results she finds.   And usually those generalizations end up as  interesting footnotes that aren't really material   for solving particular problems. But this is a  case where the generalization itself draws her to   a quantitative result. Remember, the substance of  what she's found so far is that if you look at any   convex solid, then the average area for its shadow  is going to be proportional to its surface area.   And critically, it'll be the same proportionality  constant across all of these solids. So all Alice   needs to do is find just a single convex solid out  there where she already knows the average area of   its shadow. And some of you may see where this is  going, the most symmetric solid available to us is   a sphere. No matter what the orientation of that  sphere, its shadow, the flat projection shadow,   is always a circle with an area of pi r squared.  So in particular that's its average shadow area.   And the surface area of a sphere, like I  mentioned before, is exactly 4 pi r squared. By the way, I did make a video talking  all about that surface area formula,   and how Archimedes proved it thousands of years  before calculus existed. So you don't need   integrals to find it. The magic of what Alice  has done is that she can take this seemingly   specific fact, that the shadow of a sphere has  an area exactly one-fourth its surface area,   and use it to conclude a much more general  fact, that for any convex solid out there   its shadow and surface area  are related in the same way,   in a certain sense. 
So with that she can go and fill in the details of the particular question about a cube and say that its average shadow area will be one-fourth times its surface area, 6s^2. But the much more memorable fact that she'll go to sleep thinking about is how it didn't really matter that we were talking about a cube at all. Now, that's all very pretty, but some of you might complain that this isn't really a valid argument, because spheres don't have flat faces. When I said Alice's argument generalizes to any convex solid, if we actually look at the argument itself, it definitely depends on the use of a finite number of flat faces. For example, if we were mapping it to a dodecahedron, you would start by saying that the area of a particular shadow of that dodecahedron looks like exactly one half times the sum of the areas of the shadows of all its faces. Once again you could use a certain ray-of-light-mixed-with-convexity argument to draw that conclusion. And remember the benefit of expressing that shadow area as a sum is that when we want to average over a bunch of different rotations, we can describe that sum as a big grid, where we can then go column-by-column and consider the average area for the shadow of each face. And also, a critical fact was the conclusion from much earlier that the average shadow for any 2d object (a flat 2d object, which is important) will equal some universal proportionality constant times its area. The significance was that that constant didn't depend on the shape itself, it could have been a square, or a cat, or the pentagonal faces of our dodecahedron, whatever. So after hastily carrying this over to a sphere that doesn't have a finite number of flat faces, you would be right to complain. But luckily, it's a pretty easy detail to fill in.
What you can do is imagine a sequence of different polyhedra that successively approximate a sphere, in the sense that their faces hug tighter and tighter around the genuine surface of the sphere. For each one of those approximations, we can draw the same conclusion that its average shadow is going to be proportional to its surface area, with this universal proportionality constant. So then, if we say "okay, let's take the limit of the ratio between the average shadow area at each step and the surface area at each step..." Well, since that ratio is never changing, it's always equal to this constant, then in the limit it's also going to equal that constant. But on the other hand, by their definition, in the limit their average shadow area should be that of a circle which is pi r squared, and the limit of the surface areas would be the surface area of the sphere, 4 pi r squared. So we do genuinely get the conclusion that intuition would suggest, but as is so common with Alice's argument here, we do have to be a little delicate in how we justify that intuition. It's easy for this contrast of Alice and Bob to come across like a value judgment, as if I'm saying "Look how clever Alice has managed to be! She insightfully avoided all those computations that Bob had to do." But that would be a very...misguided conclusion. I think there's an important way that popularizations of math differ from the feeling of actually doing math. There's this bias towards showing the slick proofs, the arguments with some clever key insight that lets you avoid doing calculations. I could just be projecting, since I'm very guilty of this, but what I can tell you sitting on the other side of the screen here is that it feels a lot more attractive to make a video about Alice's approach than Bob's. For one thing, in Alice's approach the line of reasoning is fun. It has these nice aha moments.
But also, crucially, the way that you explain it is more or less the same for a very wide range of mathematical backgrounds. It's much less enticing to do a video about Bob's approach, not because the computations are all that bad. I mean they're honestly not. But the pragmatic reality is that the appropriate pace to explain it looks very different depending on the different mathematical backgrounds in the audience. So you watching this right now clearly consume math videos online, and I think in doing so it's worth being aware of this bias. If the aim is to have a genuine lesson on problem solving, too much focus on the slick proofs runs the risk of being disingenuous. For example let's say we were to step up to challenge mode here and ask about the case with a closer light source. To my knowledge there is not a similarly slick solution to Alice's here, where you can just relate to a single shape like a sphere. The much more productive warm-up to have done would have been the calculus of Bob's approach. And if you look at the history of this problem, it was proved by Cauchy in 1832. And if we paw through his handwritten notes, they look a lot more similar to Bob's work than Alice's work. Right here at the top of page 11, you can see what is essentially the same integral that you and I set up in the middle. On the other hand, the whole framing of the paper is to find a general fact, not something specific like the case of a cube, so if we were asking the question which of these two mindsets correlates with the act of discovering new math, the right answer would almost certainly have to be a blend of both. But I would suggest that many people don't assign enough weight to the part of that blend where you're eager to dive into calculations. And I think there's some risk that the videos I make might contribute to that.
In the podcast that I did with the mathematician Alex Kontorovich, he talked about the often underappreciated importance of just drilling on computations to build intuition, whether you're a student engaging with a new class, or a practicing research mathematician engaging with a new field of study. A listener actually wrote in to highlight what an impression that particular section made. They're a Ph.D. student, and described themselves as being worried that their mathematical abilities were starting to fade, which they attributed to becoming older and less sharp. But hearing a practicing mathematician talk about the importance of doing hundreds of concrete examples in order to learn something new, evidently that changed their perspective. In their own words, recognizing this completely reshaped their outlook and their results. And if you look at the famous mathematicians through history, you know, Newton, Euler, Gauss, all of them, they all have this seemingly infinite patience for doing tedious calculations. The irony of being biased to show insights that let us avoid calculations is that the way people often train up the intuitions to find those insights in the first place is by doing piles and piles of calculations. All that said, something would definitely be missing without the Alice mindset here. I mean think about it, how sad would it be if we solved this problem for a cube, and we never stepped outside of the trees to see the forest and understand that this is a super general fact, it applies to a huge family of shapes. And if you consider that math is not just about answering the questions that are posed to you, but about introducing new ideas and constructs, one fun side note about Alice's approach here is that it suggests a fun way to quantify the idea of convexity.
Rather than just having a   yes/no answer, is it convex is it not, we could  put a number to it by saying: Consider the average   area of the shadow of some solid, multiply  that by four, divide by the surface area,   and if that number is 1 you've got a  convex solid. But if it's less than 1,   it's non-convex, and how close it is to 1  tells you how close it is to being convex. Also, one of the nice things about the Alice  solution here is that it helps explain why it   is that mathematicians have what can sometimes  look like a bizarre infatuation with generality,   and with abstraction. The more examples that  you see where generalizing and abstracting   actually helps you to solve a specific case, well  the more you start to adopt the same infatuation. And as a final thought, for the stalwart viewers  among you who've stuck through it this far,   there is still one unanswered question  about the very premise of our puzzle.   What exactly does it mean to choose a random  orientation? Now if that feels like a silly   question, like, of course we know what it should  mean, I would encourage you to watch a video   that I just did with Numberphile on a conundrum  from probability known as "Bertrand's Paradox".   After you watch it, and if you appreciate some of  the nuance at play here, homework for you is to   reflect on where exactly Alice and Bob implicitly  answered this question. The case with Bob is   relatively straightforward, but the point at which  Alice locks down some specific distribution on the   space of all orientations...well it's not  at all obvious, it's actually very subtle.
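For anyone who wants to poke at the cube result numerically, here is a minimal Monte Carlo sketch. It is not from the video, and all the function names are hypothetical. It leans on two things stated earlier: a convex solid's shadow is half the sum of its faces' shadows, and each face's shadow is its area times the absolute cosine between its normal and the light direction. It also bakes in Bob's answer to the "random orientation" question, by assuming that tumbling the cube uniformly is equivalent to keeping the cube fixed and picking the light direction uniformly on the sphere.

```python
import math
import random

# Outward unit normals of the six faces of an axis-aligned cube.
FACE_NORMALS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def random_unit_vector(rng):
    """Uniform random direction, made by normalizing three Gaussian samples."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:  # guard against a (vanishingly unlikely) zero vector
            return (x / norm, y / norm, z / norm)

def cube_shadow_area(side, light_direction):
    """Shadow area of a cube for light arriving along a unit direction."""
    face_area = side * side
    face_shadows = [
        face_area * abs(sum(n * d for n, d in zip(normal, light_direction)))
        for normal in FACE_NORMALS
    ]
    # Each ray through a convex solid crosses exactly two faces,
    # so the solid's shadow is half the sum of the face shadows.
    return 0.5 * sum(face_shadows)

def average_cube_shadow(side, samples, seed=0):
    """Monte Carlo estimate of the average shadow area (assumes uniform directions)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += cube_shadow_area(side, random_unit_vector(rng))
    return total / samples

if __name__ == "__main__":
    side = 2.0
    print("estimate:", average_cube_shadow(side, samples=200_000))
    print("one-fourth of surface area:", 0.25 * 6 * side * side)
```

The two printed numbers should land close to each other, with the gap shrinking as the sample count grows. Swapping in a different way of sampling directions would be one way to explore the homework question about which distribution is being assumed.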
Info
Channel: 3Blue1Brown
Views: 266,168
Keywords: Mathematics, three blue one brown, 3 blue 1 brown, 3b1b, 3brown1blue, 3 brown 1 blue, three brown one blue
Id: ltLUadnCyi0
Length: 40min 5sec (2405 seconds)
Published: Mon Dec 20 2021