Alice, Bob, and the average shadow of a cube

Reddit Comments

If we're mainly talking about how to actually solve problems, then I think that it's good for Grant to have pointed out the biases that these kinds of videos can have towards slick solutions. He's a smart guy and I think the last thing he wants to be is misleading about math, so slightly self-critical videos like this are great not just for the specific content but also as example.

Usually how I work is I just do whatever I can in order to get a solution, and it all ends up pretty messy. Kinda like going through a forest without having directions, basically a random walk. Lots of unnecessary steps/computations/ideas etc. But then upon reflection I notice some of these redundancies and find ways to shorten the path by connecting two dots in a more direct way. Thinking and sitting on it like this for long enough will usually end up with a nice, slick proof where the underlying idea that I initially drew on (but maybe was hidden) is showcased. Sometimes the solution spontaneously flips to a totally different method only revealed by this simplification process. In a way, if Bob pays attention to what he's doing, by knowing what his computations are saying, then he can find a homotopy from his solution to Alice's.

There's also more that goes into problem solving than an individual sitting down and doing computations or playing with ideas alone in a dark room (or on a test) - which is another bias that popularizing videos can have. There is way more collaboration and discussion that happens. Why are Alice and Bob working on the problem alone? Why are they not working together while both bringing their different perspectives to the table? We tend to construct mathematics as being done by very smart - slightly crazy - men alone in their rooms (even this video helps with this). Newton gets all the credit for Calculus, but a lot of what we would call calculus was known by his time (especially for algebraic curves), including a version of the Fundamental Theorem of Calculus by Barrow, Newton's advisor. Newton and Leibniz both found ways of using infinitesimals to generalize these results and use them in broader contexts. But infinitesimals/fluxions weren't even meaningful things; they were just things that, computationally, were and were not zero at different times. It would take hundreds of years to develop meaningful tools for them. It's not like Newton went to the countryside to avoid the plague for two years and came out with the Principia; he collaborated during this time through letters, and it wasn't until long after his prominence that he published the Principia.

In reality, math is created from the diversity of thought and perspectives that many people bring to the table. The Alices, the Bobs, the Carols, etc., working together rather than in isolation. We generally can't solve problems by ourselves; we need each other. Whether it be remembering an idea a friend showed you one time, or straight-up talking on a board together, problem solving is a community effort that does not happen solely in an individual's brain. I typically sort out my problems by explaining them to others, and this process/feedback helps turn it slick or illuminate the key idea needed to finish it. A move away from "crazy, genius, hero mathematicians" to "a healthy community of similarly-interested collaborators with different ways of thinking" would be good to see.

👍︎︎ 109 👤︎︎ u/functor7 📅︎︎ Dec 20 2021 🗫︎ replies

Raise your hand if you're a "Frustrated Bob": likes calculations, starts that way, then ends up with a mess of tangled integrals that you can't solve and rage quits.

Grant's reflection is truly exemplary.

👍︎︎ 30 👤︎︎ u/0nlyg00d 📅︎︎ Dec 20 2021 🗫︎ replies

I'm very happy that this is the direction the video went, not claiming they're different styles but being honest about needing both of them (and probably a little more of Bob).

Far too many educators believe the 'learning styles' idea (where different students have preferred learning styles they're better in), so I'm glad we didn't have some "find out whether you're an Alice or a Bob" kind of thing :)

👍︎︎ 58 👤︎︎ u/asphias 📅︎︎ Dec 20 2021 🗫︎ replies

I just watched this… So, people of r/math, is there a nice slick generalization to when the light source isn’t infinitely far away?

👍︎︎ 12 👤︎︎ u/Nilstyle 📅︎︎ Dec 20 2021 🗫︎ replies

Mathologer also has a video featuring the shadow cube problem

part 1 - part 2

👍︎︎ 7 👤︎︎ u/snillpuler 📅︎︎ Dec 20 2021 🗫︎ replies

3b1b needs a Nobel prize for the wonderful videos he puts on YouTube.

👍︎︎ 18 👤︎︎ u/sossolha9ira 📅︎︎ Dec 20 2021 🗫︎ replies

I really like the Alice approach here. As soon as he reached the conclusion that the constant should be the same for every convex shape, at around 20 minutes, I figured out the solution.

I also really like his conclusion. I think this is a general problem with how math is presented, but it's also necessary to be this way. The reason we can understand so much math in school so quickly is because it's already neatly packed into generalisations and specialisations for the students. In hindsight a lot of this stuff seems easy and there is so much math you learn that it can't be expected of every student to go through every step by themselves. I think it's important to realise that you can still learn a lot about how math works by going straight to the neat solutions.

On the other hand, this creates a picture of how math research works that is just not realistic. The reality is that a lot of trial and error happens, that never shows up in the released papers. In turn students at university are very frustrated when they have to do a lot of work to figure stuff out. I wish there was more time at school to teach that part of doing math is to overcome that frustration and keep trying. I know that I am grossly oversimplifying things here, but I think being able to take that initial frustration is one of the most important traits to become a mathematician.

👍︎︎ 4 👤︎︎ u/DerFelix 📅︎︎ Dec 20 2021 🗫︎ replies

wake up babe new 3b1b video

👍︎︎ 5 👤︎︎ u/Iron_2019 📅︎︎ Dec 20 2021 🗫︎ replies

Captions

In a moment I'm going to tell you about a certain really nice puzzle involving the shadow of a cube. But before we get to that I should say that the point of this video is not exactly the puzzle, per se, it's about two distinct problem-solving styles that are reflected in two different ways that we can tackle this problem. In fact let's anthropomorphize those two different styles by imagining two students, Alice and Bob, who embody each one of the approaches. So Bob will be the kind of student who really loves calculation. As soon as there's a moment when he can dig into the details and get a very concrete view of the concrete situation in front of him, that's where he's the most pleased. Alice on the other hand is more inclined to procrastinate the computations, not because she doesn't know how to do them or doesn't want to, per se, but she prefers to get a nice high-level general overview of the kind of problem she's dealing with, the general shape that it has, before she digs into the computations themselves. She's most pleased if she understands not just the specific question sitting in front of her, but also the broadest possible way that you could generalize it, and especially if the more general view can lend itself to more swift and elegant computations once she does actually sit down to carry them out. Now the puzzle that both of them are going to be faced with is to find the average area for the shadow of a cube. So if I have a cube kind of sitting here hovering in space, there are a few things that influence the area of its shadow. One obvious one would be the size of the cube, smaller cube smaller shadow. But also if it's sitting at different orientations, those orientations correspond to different particular shadows with different areas. And when I say find the average here, what I mean is the average over all possible orientations for a particular size of the cube.
The astute among you might point out that it also matters a lot where the light source is. If the light source were very low close to the cube itself then the shadow ends up larger and if the light source were kind of positioned laterally off to the side, this can distort the shadow and give it a very different shape. Accounting for that light position stands to be highly interesting in its own right, but the puzzle is hard enough as it is, so at least initially let's do the easiest thing we can and say that the light is directly above the cube and really far away. Effectively infinitely far so that all we're considering is a flat projection, in the sense that if you look at any coordinates (x, y, z) in space the flat projection would be (x, y, 0). So just to get our bearings, the easiest situation to think about would be if the cube is straight up with two of its faces parallel to the ground. In that case this flat projection shadow is simply a square, and if we say the side lengths of the cube are s, then the area of that shadow is s squared. And by the way, anytime that I have a label up on these animations like the one down here I'll be assuming that the relevant cube has a side length of 1. Another special case among all the orientations that's fun to think about is if the long diagonal is parallel to the direction of the light. In that case the shadow actually looks like a regular hexagon, and if you use some of the methods that we will develop in a few minutes, you can compute that the area of that shadow is exactly the square root of three times the area of one of the square faces. But of course more often the actual shadow will not be so regular as a square or a hexagon, it's some harder-to-think-about shape based on some harder-to-think-about orientation for this cube. Earlier, I casually threw out this phrase of averaging over all possible orientations, but you could rightly ask what exactly is that supposed to mean.
I think a lot of us have an intuitive feel for what we want it to mean, at least in the sense of what experiment would you do to verify it. You might imagine tossing this cube in the air like a die, freezing it at some arbitrary point, recording the area of the shadow from that position, and then repeating. If you do this many many times over and over you can take the mean of your sample. The number that we want to get at, the true average here, should be whatever that experimental mean approaches as you do more and more tosses approaching infinitely many. Even still, the sticklers among you could complain that doesn't really answer the question, because it leaves open the issue of how we're defining a "random" toss. The proper way to answer this, if we want it to be more formal, would be to first describe the space of all possible orientations, which mathematicians have actually given a fancy name. They call it SO(3), typically defined in terms of a certain family of 3-by-3 matrices. And the question we want to answer is "What probability distribution are we putting to this entire space?" It's only when such a probability distribution is well-defined that we can answer a question involving an average. If you are a stickler for that kind of thing, I want you to hold off on that question until the end of the video. You'll be surprised at how far we can get with the more heuristic experimental idea of just repeating a bunch of random tosses without really defining the distribution. Once we see Alice and Bob's solutions, it's actually very interesting to ask how exactly each one of them defined this distribution along their way. And remember, this is not meant to be a lesson about cube shadows, per se, but a lesson about problem-solving told through the lens of two different mindsets that we might bring to the puzzle. 
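The tossing experiment described above can be tried numerically. The following is a minimal sketch, not the video's own method: it assumes a "random toss" means a rotation built from a normalized Gaussian 4-vector (a unit quaternion), which is one common way to sample orientations uniformly, and it measures each shadow as the convex hull of the eight projected corners. The function names (`random_rotation`, `shadow_area`, `mean_shadow_area`) are hypothetical helpers introduced for illustration.

```python
import math
import random

# Corners of a unit cube centred at the origin.
CUBE = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

def random_rotation(rng):
    """ASSUMPTION: a 'random toss' is a rotation from a random unit quaternion."""
    w, x, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for a simple polygon given as a list of (x, y)."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def shadow_area(rot, side=1.0):
    """Rotate the cube, drop the z coordinate (flat projection), measure the hull."""
    projected = []
    for vx, vy, vz in CUBE:
        px = side * (rot[0][0] * vx + rot[0][1] * vy + rot[0][2] * vz)
        py = side * (rot[1][0] * vx + rot[1][1] * vy + rot[1][2] * vz)
        projected.append((px, py))
    return polygon_area(convex_hull(projected))

def mean_shadow_area(n_tosses, seed=0):
    """Empirical mean of the shadow area over n_tosses random orientations."""
    rng = random.Random(seed)
    return sum(shadow_area(random_rotation(rng)) for _ in range(n_tosses)) / n_tosses

if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, mean_shadow_area(n))
```

Running it with growing sample sizes shows the empirical mean settling down, which is the heuristic notion of "the true average" used here. Whether the quaternion sampler is the right definition of a random toss is exactly the question being deferred to the end of the video.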
And as with any lesson on problem-solving, the goal here is not to get to the answer as quickly as we can, but hopefully for you to feel like you found the answer yourself. So if ever there's a point when you feel like you might have an idea, give yourself the freedom to pause and try to think it through. As a first step, and this is really independent of any particular problem-solving styles, just anytime you find a hard question, a good thing that you can do is ask "What's the simplest possible non-trivial variant of the problem that you can try to solve?" In our case what you might say is, okay, let's forget about averaging over all the orientations. That's a tricky thing to think about. And let's even forget about all the different faces of the cube, because they overlap and that's also tricky to think about. Just for one particular face and one particular orientation, can we compute the area of this shadow? Once more, if you want to get your bearings with some special cases the easiest is when that face is parallel to the ground in which case the area of the shadow is the same as the area of the face. And on the other hand if we were to tilt that face 90-degrees, then its shadow will be a straight line and it has an area of zero. So Bob looks at this and he wants an actual formula for that shadow, and the way he might think about it is to consider the normal vector perpendicular off of that face. What seems relevant is the angle that that normal vector makes with the vertical, with the direction where the light is coming from, which we might call theta. Now from the two special cases we just looked at, we know that when theta is equal to 0, the area of that shadow is the same as the area of the shape itself, which is s squared if the square has side lengths s. And if theta is equal to 90 degrees, then the area of that shadow is zero. 
And it's probably not too hard to guess that trigonometry will be somehow relevant, so anyone comfortable with their trig functions could probably hazard a guess as to what the right formula is. But Bob is more detail-oriented than that. He wants to properly prove what that area should be rather than just making a guess based on the endpoints. The way you might think about it could be something like this. If we consider the plane that passes through the vertical as well as our normal vector, and then we consider all the different slices of our shape that are in that plane, or parallel to that plane, then we can focus our attention on a two-dimensional variant of the problem. If we just look at one of those slices, which has a normal vector at an angle theta away from the vertical, its shadow might look something like this. And if we draw a vertical line up to the left here, we have ourselves a right triangle. And from here we can do a little bit of angle chasing, where we follow around what that angle theta implies about the rest of the diagram. And this means the lower right angle in this triangle is precisely theta. So when we want to understand the size of this shadow in comparison to the original size of the piece, we can think about the cosine of that angle theta, which remember is the adjacent over the hypotenuse. It's literally the ratio between the size of the shadow and the size of the slice. So the factor by which the slice gets squished down in this direction is exactly cosine of theta. And if we broaden our view to the entire square all the slices in that direction get scaled by the same factor. But in the other direction, the one perpendicular to that slice, there is no stretching or squishing because the face is not at all tilted in that direction. So overall the two-dimensional shadow of our two-dimensional face should also be scaled down by this factor of a cosine of theta.
It lines up with what you might intuitively guess given the case where the angle is 0 degrees and the case where it's 90 degrees, but it's reassuring to see why it's true. Actually, as stated so far, this is not quite correct. There is a small problem with the formula that we've written. In the case where theta is bigger than 90 degrees, the cosine would actually come out to be negative, but of course we don't want to consider the shadow to have negative area. At least not in a problem like this. So there's two different ways you could solve this. You could say we only ever want to consider the normal vector that is pointing up, that has a positive z component. Or more simply we could say just take the absolute value of that cosine, and that gives us a valid formula. So Bob's happy because he has a precise formula describing the area of the shadow, but Alice starts to think about it a little bit differently. She says, okay we've got some shape, and then we apply a rotation that sort of situates it into 3d space in some way, and then we apply a flat projection that shoves that back into two-dimensional space. And what stands out to her is that both of these are linear transformations. That means that in principle you could describe each one of them with a matrix, and that the overall transformation would look like the product of those two matrices. What Alice knows from one of her favorite subjects, linear algebra, is that if you take some shape and you consider its area, then you apply some linear transformation, the area of that output looks like some constant times the original area of the shape. More specifically we have a name for that constant, it's called the determinant of the transformation. If you're not so comfortable with linear algebra, we could give a much more intuitive description and say if you uniformly stretch the original shape in some direction, the output will also uniformly get stretched in some direction.
So the area of each of them should scale in proportion to each other. Now, in principle Alice could compute this determinant. But it's not really her style to do that, at least not to do so immediately. Instead the thing that she writes down is how this proportionality constant between our original shape and its shadow does not depend on the original shape. We could be talking about the shadow of this cat outline, or anything else, and the size of it doesn't really matter, the only thing affecting that proportionality constant is what transformation we're applying, which in this context means we could write it down as some factor that depends on the rotation being applied to the shape. In the back of our mind because of Bob's calculation we know what that factor looks like, you know it's the absolute value of the cosine of the angle between the normal vector and the vertical. But Alice right now is just saying, "yeah, yeah, I can think about that eventually when I want to." But she knows we're about to average over all the different orientations anyway, so she holds out some hope that any specific formula about a specific orientation might get washed away in that average. Now it's easy to look at this and say, "Okay, well Alice isn't really doing anything then!" Of course the area of the shadow is proportional to the area of the original shape, they're both two-dimensional quantities, they should both scale like two-dimensional things. But keep in mind this would not at all be true if we were dealing with the harder case that has a closer light source. In that case the projection is not linear. So for example if I rotate this cat so that its tail ends up quite close to the light source, then if I stretch the original shape uniformly in the x direction, say by a factor of 1.5, it might have a very disproportionate effect on the ultimate shadow, because the tail gets very disproportionately blown up as it gets really close to the light.
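Bob's formula and Alice's determinant can be compared on a concrete example. The sketch below rests on assumptions: the face is a unit square lying in the xy-plane with normal (0, 0, 1), the rotation is built with Rodrigues' formula, and "the determinant" is read as that of the 2-by-2 block mapping the face's plane onto the ground. The helpers `rotation_about_axis` and `shadow_facts` are hypothetical names introduced for illustration.

```python
import math

def rotation_about_axis(axis, angle):
    """Rodrigues' formula: rotate by `angle` radians about the unit vector `axis`."""
    ux, uy, uz = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1 - c
    return [
        [c + ux * ux * t, ux * uy * t - uz * s, ux * uz * t + uy * s],
        [uy * ux * t + uz * s, c + uy * uy * t, uy * uz * t - ux * s],
        [uz * ux * t - uy * s, uz * uy * t + ux * s, c + uz * uz * t],
    ]

def shoelace(poly):
    """Area of a simple polygon given as a list of (x, y) vertices in order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def shadow_facts(rot):
    """Return (measured shadow area, |det of 2x2 block|, |cos theta|) for a unit square."""
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]  # the face sits in the plane z = 0
    # Rotate each corner, then drop the z coordinate (flat projection).
    shadow = [(rot[0][0] * x + rot[0][1] * y, rot[1][0] * x + rot[1][1] * y)
              for x, y in corners]
    area = shoelace(shadow)
    # Alice: the plane-to-ground map is the upper-left 2x2 block of the rotation.
    det = rot[0][0] * rot[1][1] - rot[0][1] * rot[1][0]
    # Bob: the rotated normal is the third column, so cos(theta) is its z entry.
    cos_theta = rot[2][2]
    return area, abs(det), abs(cos_theta)

if __name__ == "__main__":
    tilt = rotation_about_axis((1 / 3, 2 / 3, 2 / 3), 2.0)
    print(shadow_facts(tilt))  # the three numbers should agree
```

The three numbers agreeing, including for tilts past 90 degrees where the raw cosine is negative, is consistent with the absolute-value fix described above.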
Again, Alice is keeping an eye out for what properties of the problem are actually relevant, because that helps her know how much she can generalize things. Does the fact that we're thinking about a square face and not some other shape matter? No not really. Does the fact that the transformation is linear matter? Yes, absolutely. Alice can also apply a similar way of thinking about the average shadow for any shape like this. Say we have some sequence of rotations that we apply to our square face, and let's call them R1, R2, R3, and so on. Then the area of the shadow in each one of those cases looks like some factor times the area of the square, and that factor depends on the rotation. So if we take an empirical average for that shadow across the sample of rotations we're looking at right now, the way it looks is to add up all of those shadow areas and then divide by the total number that we have. Now, because of the linearity, this area of the original square can cleanly factor out of all of that, and it ends up on the left. This isn't the exact average that we're looking for, it's just an empirical mean of a sample of rotations. But in principle what we're looking for is what this approaches as the size of our sample approaches infinity. And all the parts that depend on the size of the sample sit cleanly away from the area itself. So whatever this approaches, in the limit it's just going to be some number. It might be a royal pain to compute, we're not sure about that yet, but the thing that Alice notes is that it's independent of the size and the shape of the particular 2d thing that we're looking at. It's a universal proportionality constant, and her hope is that that universality somehow lends itself to a more elegant way to deduce what it must be. Now Bob would be eager to compute this constant here and now, and in a few minutes I'll show you how he does it. 
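Written out symbolically, the factoring step looks something like the following. The symbols f (the rotation-dependent factor) and c (its limiting average) are labels introduced here for illustration, not notation from the video.

```latex
\frac{1}{n}\sum_{i=1}^{n} \operatorname{Area}\bigl(\operatorname{Shadow}(R_i(\text{Square}))\bigr)
  = \frac{1}{n}\sum_{i=1}^{n} f(R_i)\,\operatorname{Area}(\text{Square})
  = \operatorname{Area}(\text{Square}) \cdot
    \underbrace{\frac{1}{n}\sum_{i=1}^{n} f(R_i)}_{\to\; c \ \text{as}\ n \to \infty}
```

Nothing under the brace refers to the square, which is why the same constant would appear for any other flat shape.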
But before that, I do want to stay in Alice's world for a little bit more, because this is where things start to really get fun. In her desire to understand the overall structure of the question before diving into the details, she's curious now about how the area of the shadow of the cube relates to the area of its individual faces. If we can say something about the average area of a particular face, does that tell us anything about the average shadow area of the cube as a whole? For example, a simple thing we could say is that that area is definitely less than the sum of the areas across all the faces, because there's a meaningful amount of overlap between those shadows. But it's not entirely clear how to think about that overlap, because if we focus our attention just on two particular faces, in some orientations they don't overlap at all, but in other orientations they do have some overlap. The specific shape and area of that overlap seems a little bit tricky to think about, much less how on earth we would average that across all of the different orientations. But Alice has about three clever insights through this whole problem, and this is the first one of them. She says, actually, if we think about the whole cube, not just a pair of faces, we can conclude that the area of the shadow for a given orientation is exactly one-half the sum of the areas of all of the faces. Intuitively, you can maybe guess that half of them are bathed in the light, and half of them are not. But here's the way that she justifies it. She says for a particular ray of light that would go from the sky and eventually hit a point in the shadow, that ray passes through the cube at exactly two points. There's one moment when it enters, and one moment when it exits, so every point in that shadow corresponds to exactly two faces above it. Well, okay, that's not exactly true.
If that beam of light happened to go through the edge of one of the squares, there's a little bit of ambiguity on how many faces it's passing. But those account for zero area inside the shadow, so we're safe to ignore them if the thing we're trying to do is compute the area. If Alice is pressed and she needs to justify why exactly this is true, which is important for understanding how the problem might generalize, she can appeal to the idea of convexity. Convexity is one of those properties where a lot of us have an intuitive sense for what it should mean. You know, it's shapes that just bulge out, they never dent inward. But mathematicians have a pretty clever way of formalizing it that's helpful for actual proofs. They say that a set is "convex" if the line that connects any two points inside that set is entirely contained within the set itself. So a square is convex because no matter where you put two points inside that square, the line connecting them is entirely contained inside the square. But something like the symbol pi is not convex. I can easily find two different points so that the line connecting them has to peek outside of the set itself. None of the letters in the word "convex" are themselves convex. You can find two points so that the line connecting them has to pass outside of the set. It's a really clever way to formalize this idea of a shape that only bulges out, because any time that it dents inward, you can find these counter-example lines. For our cube, because it's convex, between the first point of entry and the last point of exit it has to stay entirely inside the cube, by the definition of convexity. But if we were dealing with some other non-convex shape, like a donut, you could find a ray of light that enters then exits then enters and exits again. So you wouldn't have a clean two-to-one cover from the shadows.
The shadows of all of its different parts, if you were to cover this in a bunch of faces, would not be precisely two times the area of the shadow itself. So that's the first key insight, the face shadows double-cover the cube shadow. And the next one is a little bit more symbolic, so let's start things off by abbreviating our notation a little to make room on the screen. Instead of writing Area(Shadow(Cube)), I'm just going to write S(Cube). And similarly instead of Area(Shadow(a particular face)), I'm just going to write S(F_j), where that subscript j indicates which face I'm talking about. But of course, we should really be talking about the shadow of a particular rotation applied to the cube, so I might write this as S of some rotation applied to the cube. And likewise on the right, it's the area of the shadow of that same rotation applied to a given one of the faces. With the more compact notation at hand, let's think about the average of this shadow area across many different rotations, some sample of R1, R2, R3, and so on. Again, that average just involves adding up all of those shadow areas and then dividing them by n, and in principle if we were to look at this for larger and larger samples, letting n approach infinity, that would give us the average area of the shadow of the cube. Some of you might be thinking, "yes, we know this, you've said this already." But it's beneficial to write it out so that we can understand why it is that expressing the shadow area for a particular rotation of the cube as a sum across all of its faces, or one half times that sum at least...why is that beneficial? What is that going to do for us? Well, let's just write it out, where for each one of these rotations of the cube we could break down that shadow as a sum across that same rotation applied across all of the faces. 
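The double-cover claim can be checked numerically for a single orientation. This is a minimal sketch under stated assumptions: the cube is a unit cube centred at the origin, each face shadow is computed with Bob's |cos theta| formula, and the cube shadow is measured independently as the convex hull of the projected corners. The function names are hypothetical and introduced for illustration.

```python
import math

CUBE = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]

def rotation_about_axis(axis, angle):
    """Rodrigues' formula: rotate by `angle` radians about the unit vector `axis`."""
    ux, uy, uz = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1 - c
    return [
        [c + ux * ux * t, ux * uy * t - uz * s, ux * uz * t + uy * s],
        [uy * ux * t + uz * s, c + uy * uy * t, uy * uz * t - ux * s],
        [uz * ux * t - uy * s, uz * uy * t + ux * s, c + uz * uz * t],
    ]

def hull_area(points):
    """Area of the convex hull of 2D points (monotone chain plus shoelace)."""
    pts = sorted(set(points))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    hull = half_hull(pts) + half_hull(pts[::-1])
    total = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def cube_shadow_area(rot):
    """Measure the shadow directly: rotate the corners, drop z, take the hull."""
    return hull_area([
        (rot[0][0] * x + rot[0][1] * y + rot[0][2] * z,
         rot[1][0] * x + rot[1][1] * y + rot[1][2] * z)
        for x, y, z in CUBE
    ])

def half_sum_of_face_shadows(rot):
    """Half the sum of the six face shadows, each via Bob's |cos theta| formula."""
    total = 0.0
    for k in range(3):           # the rotated face normals are +/- the columns of rot
        for sign in (+1, -1):
            normal_z = sign * rot[2][k]
            total += abs(normal_z)  # a unit-area face casts a shadow of area |cos theta|
    return total / 2

if __name__ == "__main__":
    rot = rotation_about_axis((1 / 3, 2 / 3, 2 / 3), 0.7)
    print(cube_shadow_area(rot), half_sum_of_face_shadows(rot))
```

As a sanity check against the special case mentioned earlier, rotating about the axis (1, -1, 0)/sqrt(2) by the angle arccos(1/sqrt(3)) points the long diagonal along the light, and both functions should then return roughly the square root of three, the hexagon area quoted above.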
And when it's written as a grid like this, we can get to Alice's second insight, which is to shift the way that we're thinking about the sum from going row-by-row to instead going column-by-column. For example if we focused our attention just on the first column, what it's telling us is to add up the area of the shadow of the first face across many different orientations. So if we were to take that sum and divide it by the size of our sample, that gives us an empirical average for the area of the shadow of this face. So if we take larger and larger samples, letting that size go to infinity, this will approach the average shadow area for a square. Likewise, the second column can be thought of as telling us the average area for the second face of the cube, which should of course be the same number. And same deal for any other column, it's telling us the average area for a particular face. So that gives us a very different way of thinking about our whole expression. Instead of saying add up the areas of the cubes at all the different orientations, we could say just add up the average shadows for the six different faces and multiply the total by one half. The term on the left here is thinking about adding up rows first, and the term on the right is thinking about adding up columns first. In short, the average of the sum of the face shadows is the same as the sum of the average of the face shadows. Maybe that swap seems simple, maybe it doesn't, but I can tell you that there is actually a little bit more than meets the eye to the step that we just took. But we'll get to that later. And remember, we know that the average area for a particular face looks like some universal proportionality constant times the area of that face, so if we're adding this up across all the faces of the cube, we could think of this as equaling some constant times the surface area of the cube. 
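In symbols, the row-versus-column swap and its consequence can be sketched as follows, using the S notation from above. The letter c for the universal proportionality constant is a label introduced here for illustration.

```latex
\frac{1}{n}\sum_{i=1}^{n} S\bigl(R_i(\text{Cube})\bigr)
  = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{2}\sum_{j=1}^{6} S\bigl(R_i(F_j)\bigr)
  = \frac{1}{2}\sum_{j=1}^{6} \Bigl(\frac{1}{n}\sum_{i=1}^{n} S\bigl(R_i(F_j)\bigr)\Bigr)
  \;\xrightarrow{\,n \to \infty\,}\;
  \frac{1}{2}\sum_{j=1}^{6} c\,\operatorname{Area}(F_j)
  = \frac{c}{2}\,\bigl(\text{surface area of the cube}\bigr)
```

The first equality is the double cover, the second is the swap of the two finite sums, and the limit uses the fact that each column converges to c times the area of its face.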
And that's pretty interesting, the average area for the shadow of this cube is going to be proportional to its surface area. But at the same time, you might complain, "Well Alice is just pushing around a bunch of symbols here, because none of this matters if we don't know what that proportionality constant is!" I mean it almost seems obvious. Like, of course the average shadow area should be proportional to the surface area, they're both two-dimensional quantities, so they should scale in lock step with each other. It's not obvious. After all, for a closer light source it simply wouldn't be true. And also this business where we added up the grid column-by-column versus row-by-row is a little more nuanced than it might look at first, there's a subtle hidden assumption underlying all of this which carries a special significance when we choose to revisit the question of what probability distribution is being taken across the space of all orientations. But more than anything, the reason that it's not obvious is that the significance of this result right here is not merely that these two values are proportional. It's that an analogous fact will hold true for any convex solids, and crucially, the actual content of what Alice has built up so far is that it'll be the same proportionality constant across all of them. Now if you really mull over that, some of you may be able to predict the way that Alice is able to finish things off from here. It's really delightful, it's honestly my main reason for covering this topic. But before we get into it, I think it's easy to under-appreciate her result unless we dig into the details of what it is that she manages to avoid. So let's take a moment to turn our attention back into Bob's world because while Alice has been doing all of this, he's been busy doing some computations. 
In fact, what he's been working on is finding exactly what Alice has yet to figure out, which is how to take the formula that he found for the area of a square's shadow, and taking the natural next step of trying to find the average of that square's shadow, averaged over all possible orientations. The way Bob starts, if he's thinking about all the different possible orientations for this square, is to ask what are all the different normal vectors that that square can have in all these orientations. Because everything about its shadow comes down to that normal vector. It's not too hard to see that all those possible normal vectors trace out the surface of a sphere. If we assume it's a unit normal vector, it's a sphere with radius 1. And furthermore, Bob figures that each point of the sphere should be just as likely to occur as any other, our probabilities should be uniform in that way. There's no reason to prefer one direction over another. But in the context of continuous probabilities it's not very helpful to talk about the likelihood of a particular individual point, because in the uncountable infinity of points on the sphere, that would be zero and unhelpful. So instead the more precise way to phrase this uniformity would be to say the probability that our normal vector lands in any given patch of area on the sphere should be proportional to that area itself. More specifically it should equal the area of that little patch divided by the total surface area of the sphere. If that's true no matter what patch of area we're considering, that's what we mean by a uniform distribution on the sphere. Now to be clear, points on the sphere are not the same thing as orientations in 3d space, because even if you know what normal vector the square is going to have that leaves us with another degree of freedom. The square could be rotated about that normal vector. 
But Bob doesn't actually have to care about that extra degree of freedom, because in all of those cases the area of the shadow is the same. It's only dependent on the cosine of the angle between that normal vector and the vertical, which is kind of neat. All those shadows are genuinely different shapes, they're not the same, but the area of each of them will be the same. What this means is that when Bob wants this average shadow area over all possible orientations, all he really needs to know is the average value of this absolute value of cosine of theta for all different possible normal vectors, all different possible points on the sphere. So how do you compute an average like this? Well if we lived in some kind of discrete, pixelated world, where there's only a finite number of possible angles theta that that normal vector could have, the average would be pretty straightforward. What you do is find the probability of landing on any particular value of theta, which will tell us something like how much of the sphere do normal vectors with that angle make up, and then you multiply it by the thing we want to take the average of, this formula for the area of the shadow. And then you would add that up over all of the different possible values of theta ranging from 0 up to 180 degrees, or pi radians. But of course in reality there is a continuum of possible values of theta, this uncountable infinity, and the probability of landing on any specific particular value of theta will actually be zero, and so a sum like this unfortunately doesn't really make any sense. Or if it does make sense, adding up infinitely many zeros should just give us a zero. The short answer for what we do instead is that we compute an integral. And I'll level with you, the hard part here is I'm not entirely sure what background I should be assuming from those of you watching right now. Maybe it's the case that you're quite comfortable with calculus, and you don't need me to belabor the point here. 
Maybe it's the case that you're not familiar with calculus, and I shouldn't just be throwing down integrals like that. Or maybe you...you know, you took a calculus class a while ago but you need a little bit of a refresher. I'm gonna go with the option of setting this up as if it's a calculus lesson, because to be honest even when you are quite comfortable with integrals setting them up can be kind of an error-prone process, and calling back to the underlying definition is a good way to sort of check yourself in the process. If we lived in a time before calculus existed and integrals weren't a thing and we wanted to approximate an answer to this question, one way we could go about it is to take a sample of values for theta that ranges from 0 up to 180 degrees. We might think of them as evenly spaced, with some sort of difference between each one, some delta-theta. And it's still the case that it would be unhelpful to ask about the probability of a particular value of theta occurring, even if it's one in our sample. That probability would still be zero, and it would be unhelpful. But what is helpful to ask is the probability of falling between two different values from our sample, in this little band of latitude with a width of delta theta. Based on our assumption that the distribution along the sphere should be uniform, that probability comes down to knowing the area of this band. More specifically, the chances that a randomly chosen vector lands in that band should be that area divided by the total surface area of the sphere. To figure out that area, let's first think of the radius of that band, which if the radius of our sphere is 1 is definitely going to be smaller than 1. And in fact, if we draw the appropriate little right triangle here, you can see that that little radius let's just say at the top of the band should be the sine of our angle, the sine of theta. This means that the circumference of the band should be 2 pi times the sine of that angle. 
And then the area of the band should be that circumference times its thickness, that little delta theta. Or rather, the area of our band is approximately this quantity. What's important is that for a finer sample of many more values of theta the accuracy of that approximation would get better and better. Now remember, the reason we wanted this area is to know the probability of falling into that band, which is this area divided by the surface area of the sphere, which we know to be 4 pi times its radius squared. That's a value that you could also compute with an integral similar to the one that we're setting up now, but for now we can take it as a given, as a standard well-known formula. And this probability itself is just a stepping stone in the direction of what we actually want, which is the average area for the shadow of a square. To get that we'll multiply this probability times the corresponding shadow area, which is this absolute value of cosine theta expression we've seen many times up to this point. And our estimate for this average would now come down to adding up this expression across all of the different bands, all of the different samples of theta that we've taken. This right here, by the way, is when Bob is just totally in his element. We've got a lot of exact formulas describing something very concrete, actually digging in on our way to a real answer. And again, if it feels like a lot of detail, I want you to appreciate that fact so that you can appreciate just how magical it is when Alice manages to somehow avoid all of this. Anyway, looking back at our expression, let's clean things up a little bit, like factoring out all of the terms that don't depend on theta itself. And we can simplify that 2 pi divided by 4 pi to simply be one half. And to make it a little more analogous to calculus with integrals, let me just swap the main terms inside the sum here. 
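The band-by-band sum described above can be sketched numerically. This is only an illustration under a few assumptions: the function name `average_abs_cos` is made up for this sketch, the sphere has radius 1, and each band's radius is measured at the middle of the band rather than at its top (either choice gives the same limit as the bands get thinner).

```python
import math


def average_abs_cos(num_bands: int) -> float:
    """Approximate the average of |cos(theta)| over a uniformly random
    point on the unit sphere, using thin bands of latitude."""
    d_theta = math.pi / num_bands  # width of each band
    total = 0.0
    for i in range(num_bands):
        theta = (i + 0.5) * d_theta  # angle at the middle of band i (a choice made here)
        # Band area is approximately circumference times thickness.
        band_area = 2 * math.pi * math.sin(theta) * d_theta
        # Uniform distribution: probability = band area / total sphere area (4 pi).
        probability = band_area / (4 * math.pi)
        # Weight the shadow factor |cos(theta)| by that probability.
        total += probability * abs(math.cos(theta))
    return total


if __name__ == "__main__":
    # Finer and finer samples of theta; the printed values should settle down.
    for n in (10, 100, 1000):
        print(n, average_abs_cos(n))
```

Running it with more and more bands shows the estimates settling toward a single value, which is the idea the integral makes precise.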
What we now have, this sum that's going to approximate the answer to our question, is almost what an integral is. Instead of writing the sigma for sum, we write the integral symbol, this kind of elongated Leibnizian S showing us that we're going from zero to pi. And instead of describing the step size as delta theta, a concrete finite amount, we instead describe it as "d" theta, which I like to think of as signaling the fact that some kind of limit is being taken. What that integral means, by definition, is whatever the sum on the bottom approaches for finer and finer subdivisions, more dense samples that we might take for theta itself. And at this point, for those of you who do know calculus, I'll just write down the details of how you would actually carry this out as you might see it written down in Bob's notebook. It's the usual anti-derivative stuff, but the one key step is to bring in a certain trig identity. In the end, what Bob finds after doing this is the surprisingly clean fact that the average area for a square's shadow is precisely one-half the area of that square. This is the mystery constant which Alice doesn't yet know. If Bob were to look over her shoulder and see the work that she's done he could finish out the problem right now. He plugs in the constant that he just found and he knows the final answer. And now, finally! With all of this as backdrop, what is it that Alice does to carry out the final solution? I introduced her as someone who really likes to generalize the results she finds. And usually those generalizations end up as interesting footnotes that aren't really material for solving particular problems. But this is a case where the generalization itself draws her to a quantitative result. Remember, the substance of what she's found so far is that if you look at any convex solid, then the average area for its shadow is going to be proportional to its surface area. 
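For reference, here is one way Bob's integral from a moment ago might be carried out. The notebook itself isn't reproduced in this transcript, so this is a sketch; it assumes the trig identity in question is the double-angle formula sin(2θ) = 2 sin θ cos θ, and it uses the symmetry of |cos θ| sin θ about θ = π/2 to drop the absolute value.

```latex
\begin{aligned}
\frac{1}{2}\int_0^{\pi} |\cos\theta|\,\sin\theta \, d\theta
  &= \int_0^{\pi/2} \cos\theta\,\sin\theta \, d\theta
     && \text{(symmetry about } \theta = \pi/2\text{)} \\
  &= \frac{1}{2}\int_0^{\pi/2} \sin(2\theta)\, d\theta
     && \text{(double-angle identity)} \\
  &= \frac{1}{2}\left[-\frac{1}{2}\cos(2\theta)\right]_0^{\pi/2}
   = \frac{1}{2}\left(\frac{1}{2} + \frac{1}{2}\right)
   = \frac{1}{2}.
\end{aligned}
```

So the average of |cos θ| over the sphere is one half, which matches the statement that a square's average shadow is half its area.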
And critically, it'll be the same proportionality constant across all of these solids. So all Alice needs to do is find just a single convex solid out there where she already knows the average area of its shadow. And some of you may see where this is going: the most symmetric solid available to us is a sphere. No matter what the orientation of that sphere, its shadow, the flat projection shadow, is always a circle with an area of pi r squared. So in particular that's its average shadow area. And the surface area of a sphere, like I mentioned before, is exactly 4 pi r squared. By the way, I did make a video talking all about that surface area formula, and how Archimedes proved it thousands of years before calculus existed. So you don't need integrals to find it. The magic of what Alice has done is that she can take this seemingly specific fact, that the shadow of a sphere has an area exactly one-fourth its surface area, and use it to conclude a much more general fact, that for any convex solid out there its shadow and surface area are related in the same way, in a certain sense. So with that she can go and fill in the details of the particular question about a cube and say that its average shadow area will be one-fourth times its surface area, 6s^2. But the much more memorable fact that she'll go to sleep thinking about is how it didn't really matter that we were talking about a cube at all. Now, that's all very pretty, but some of you might complain that this isn't really a valid argument, because spheres don't have flat faces. When I said Alice's argument generalizes to any convex solid, if we actually look at the argument itself, it definitely depends on the use of a finite number of flat faces. For example, if we were mapping it to a dodecahedron, you would start by saying that the area of a particular shadow of that dodecahedron looks like exactly one half times the sum of the areas of the shadows of all its faces. 
Once again you could use a certain ray-of-light-mixed-with-convexity argument to draw that conclusion. And remember the benefit of expressing that shadow area as a sum is that when we want to average over a bunch of different rotations, we can describe that sum as a big grid, where we can then go column-by-column and consider the average area for the shadow of each face. And also, a critical fact was the conclusion from much earlier that the average shadow for any 2d object (a flat 2d object, which is important) will equal some universal proportionality constant times its area. The significance was that that constant didn't depend on the shape itself, it could have been a square, or a cat, or the pentagonal faces of our dodecahedron, whatever. So after hastily carrying this over to a sphere that doesn't have a finite number of flat faces, you would be right to complain. But luckily, it's a pretty easy detail to fill in. What you can do is imagine a sequence of different polyhedra that successively approximate a sphere, in the sense that their faces hug tighter and tighter around the genuine surface of the sphere. For each one of those approximations, we can draw the same conclusion that its average shadow is going to be proportional to its surface area, with this universal proportionality constant. So then, if we say "okay, let's take the limit of the ratio between the average shadow area at each step and the surface area at each step..." Well, since that ratio is never changing, it's always equal to this constant, then in the limit it's also going to equal that constant. But on the other hand, by their definition, in the limit their average shadow area should be that of a circle which is pi r squared, and the limit of the surface areas would be the surface area of the sphere, 4 pi r squared. 
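As a numerical sanity check on the one-fourth ratio applied to the cube, here is a rough Monte Carlo sketch. It rests on assumptions that are not from the video: instead of rotating the cube, it keeps the cube axis-aligned and picks the light direction uniformly on the sphere (which should be equivalent for this purpose), it draws that direction by normalizing three Gaussian samples, and all function names are made up for this illustration. The shadow formula is the half-sum of the six face shadows, each face contributing s^2 times the absolute cosine of its angle to the light.

```python
import math
import random


def random_unit_vector(rng: random.Random) -> tuple[float, float, float]:
    """Draw a direction uniformly on the unit sphere by normalizing
    three independent Gaussian samples."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:  # guard against a (vanishingly unlikely) zero vector
            return x / norm, y / norm, z / norm


def cube_shadow_area(side: float, light_dir: tuple[float, float, float]) -> float:
    """Shadow area of an axis-aligned cube for light along the unit vector light_dir.

    Each pair of opposite faces casts shadows of area side^2 * |u_i|;
    by convexity the total shadow is half the sum over all six faces.
    """
    ux, uy, uz = light_dir
    return side ** 2 * (abs(ux) + abs(uy) + abs(uz))


def estimate_average_cube_shadow(side: float = 1.0,
                                 num_samples: int = 100_000,
                                 seed: int = 0) -> float:
    """Average the shadow area over many random light directions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        total += cube_shadow_area(side, random_unit_vector(rng))
    return total / num_samples


if __name__ == "__main__":
    side = 1.0
    print("Monte Carlo estimate:      ", estimate_average_cube_shadow(side))
    print("One-fourth of surface area:", 6 * side ** 2 / 4)
```

The two printed numbers should agree to within sampling noise, which is what the one-fourth relationship predicts for a cube.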
So we do genuinely get the conclusion that intuition would suggest, but as is so common with Alice's argument here, we do have to be a little delicate in how we justify that intuition. It's easy for this contrast of Alice and Bob to come across like a value judgment, as if I'm saying "Look how clever Alice has managed to be! She insightfully avoided all those computations that Bob had to do." But that would be a very...misguided conclusion. I think there's an important way that popularizations of math differ from the feeling of actually doing math. There's this bias towards showing the slick proofs, the arguments with some clever key insight that lets you avoid doing calculations. I could just be projecting, since I'm very guilty of this, but what I can tell you sitting on the other side of the screen here is that it feels a lot more attractive to make a video about Alice's approach than Bob's. For one thing, in Alice's approach the line of reasoning is fun. It has these nice aha moments. But also, crucially, the way that you explain it is more or less the same for a very wide range of mathematical backgrounds. It's much less enticing to do a video about Bob's approach, not because the computations are all that bad. I mean they're honestly not. But the pragmatic reality is that the appropriate pace to explain it looks very different depending on the different mathematical backgrounds in the audience. So you watching this right now clearly consume math videos online, and I think in doing so it's worth being aware of this bias. If the aim is to have a genuine lesson on problem solving, too much focus on the slick proofs runs the risk of being disingenuous. For example let's say we were to step up to challenge mode here and ask about the case with a closer light source. To my knowledge there is not a similarly slick solution to Alice's here, where you can just relate to a single shape like a sphere. 
The much more productive warm-up to have done would have been the calculus of Bob's approach. And if you look at the history of this problem, it was proved by Cauchy in 1832. And if we paw through his handwritten notes, they look a lot more similar to Bob's work than Alice's work. Right here at the top of page 11, you can see what is essentially the same integral that you and I set up in the middle. On the other hand, the whole framing of the paper is to find a general fact, not something specific like the case of a cube, so if we were asking the question which of these two mindsets correlates with the act of discovering new math, the right answer would almost certainly have to be a blend of both. But I would suggest that many people don't assign enough weight to the part of that blend where you're eager to dive into calculations. And I think there's some risk that the videos I make might contribute to that. In the podcast that I did with the mathematician Alex Kontorovich, he talked about the often underappreciated importance of just drilling on computations to build intuition, whether you're a student engaging with a new class, or a practicing research mathematician engaging with a new field of study. A listener actually wrote in to highlight what an impression that particular section made. They're a Ph.D. student, and described themselves as being worried that their mathematical abilities were starting to fade, which they attributed to becoming older and less sharp. But hearing a practicing mathematician talk about the importance of doing hundreds of concrete examples in order to learn something new, evidently that changed their perspective. In their own words, recognizing this completely reshaped their outlook and their results. And if you look at the famous mathematicians through history, you know, Newton, Euler, Gauss, all of them, they all have this seemingly infinite patience for doing tedious calculations. 
The irony of being biased to show insights that let us avoid calculations is that the way people often train up the intuitions to find those insights in the first place is by doing piles and piles of calculations. All that said, something would definitely be missing without the Alice mindset here. I mean think about it how sad would it be if we solved this problem for a cube, and we never stepped outside of the trees to see the forest and understand that this is a super general fact, it applies to a huge family of shapes. And if you consider that math is not just about answering the questions that are posed to you, but about introducing new ideas and constructs, one fun side note about Alice's approach here is that it suggests a fun way to quantify the idea of convexity. Rather than just having a yes/no answer, is it convex is it not, we could put a number to it by saying: Consider the average area of the shadow of some solid, multiply that by four, divide by the surface area, and if that number is 1 you've got a convex solid. But if it's less than 1, it's non-convex, and how close it is to 1 tells you how close it is to being convex. Also, one of the nice things about the Alice solution here is that it helps explain why it is that mathematicians have what can sometimes look like a bizarre infatuation with generality, and with abstraction. The more examples that you see where generalizing and abstracting actually helps you to solve a specific case, well the more you start to adopt the same infatuation. And as a final thought, for the stalwart viewers among you who've stuck through it this far, there is still one unanswered question about the very premise of our puzzle. What exactly does it mean to choose a random orientation? Now if that feels like a silly question, like, of course we know what it should mean, I would encourage you to watch a video that I just did with Numberphile on a conundrum from probability known as "Bertrand's Paradox". 
After you watch it, and if you appreciate some of the nuance at play here, homework for you is to reflect on where exactly Alice and Bob implicitly answered this question. The case with Bob is relatively straightforward, but the point at which Alice locks down some specific distribution on the space of all orientations...well it's not at all obvious, it's actually very subtle.

Info

Channel: 3Blue1Brown

Views: 266,168

Keywords: Mathematics, three blue one brown, 3 blue 1 brown, 3b1b, 3brown1blue, 3 brown 1 blue, three brown one blue

Id: ltLUadnCyi0

Length: 40min 5sec (2405 seconds)

Published: Mon Dec 20 2021