The BEST Way to Find a Random Point in a Circle | #SoME1 #3b1b

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments

Hey, I'm the creator of this video. Thank you so much for all the kind words! Also all the criticisms you've made are very helpful and informative!

πŸ‘οΈŽ︎ 114 πŸ‘€οΈŽ︎ u/SenorHoosteen πŸ“…οΈŽ︎ Oct 12 2021 πŸ—«︎ replies

Sometimes you can spot video games with "square" explosions presumably because particle velocities are chosen naively, something like:

particle.vx = v * random();
particle.vy = v * random();

And then there's the case of Williams Defender, with nice round explosions, and its sequel, Stargate, with inexplicable square explosions.

πŸ‘οΈŽ︎ 115 πŸ‘€οΈŽ︎ u/smcameron πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

Very interesting video! I didn't know about two of these techniques.

Regarding the ending: Benchmarking numerical Python is always feels pointless. The slowest function run in any performance-oriented language implementation is going to beat the snot out of the fastest function in CPython. If you cared about performance, you wouldn't be using Python in the first place.

So here's my own benchmark in C: circle.c. The relative results are similar (Clang 12.0.1) where rejection sampling is the clear winner:

rejection   122.337 M-ops/s
sqrt_dist    33.961 M-ops/s
sum_dist     33.685 M-ops/s
max_dist     33.037 M-ops/s

Those transcendental functions are just so expensive. The latter three are far friendlier to a SIMD implementation, though: They have a fixed number of steps and no loop. sum_dist and max_dist would require masking for the branch, and sqrt_dist is completely branchless. However, with rejection sampling at 4x faster in the straightforward case, a 4-lane wide SIMD of the others could only just match the speed of rejection sampling.

πŸ‘οΈŽ︎ 174 πŸ‘€οΈŽ︎ u/skeeto πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

Strange that randomly choosing a spoke length fails, but adding two random spoke lengths (and reflecting if necessary) gets past this problem. I definitely wouldn't have thought of this, and just used the cartesian method, or the (incorrect) spoke length and angle method.

πŸ‘οΈŽ︎ 20 πŸ‘€οΈŽ︎ u/rjcarr πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

Yep, rejection sampling is the easiest. When I worked on a raytracer, I needed a way to randomly sample from a unit sphere. Generalizing rejection sampling to 3d was the fastest and more correct way.

πŸ‘οΈŽ︎ 40 πŸ‘€οΈŽ︎ u/Lord_Zane πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

What a wonderful video, great use of Manim and great math being put to work.

πŸ‘οΈŽ︎ 19 πŸ‘€οΈŽ︎ u/FU_residue πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

Why are so many posters rushing to the comments to share the first half-formed thought that entered their brain after only reading the title.

This is a comment thread discussing a video. You should watch the video before commenting on it.

πŸ‘οΈŽ︎ 61 πŸ‘€οΈŽ︎ u/NeverComments πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

I really enjoyed the video and learned about inverse sampling, thanks!

Regarding the bit about random() + random() =/= 2 * random(). One thing that adds to the confusion is the way they are written. It sort reads like x + x = 2x when in reality this is not necessarily the case. If we would have written rand1 = random() and rand2 = random() then perhaps it would be easier to digest that in general rand1 + rand2 =/= 2 * rand1 or rand1 + rand2 =/= 2 * rand2.

πŸ‘οΈŽ︎ 15 πŸ‘€οΈŽ︎ u/swagbyte πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies

ITT "I didn't watch the video, but is this the solution?"

πŸ‘οΈŽ︎ 32 πŸ‘€οΈŽ︎ u/cassandra232 πŸ“…οΈŽ︎ Oct 11 2021 πŸ—«︎ replies
Captions
hey i'm justin and i need to find a uniform random point inside of a circle how do i do that when i was first looking into this problem i was working on a coding project that demanded something to be randomly placed inside of a circle but i quickly fell down a surprisingly deep rabbit hole filled with lots of interesting math so i decided to compile my findings into this video i hope you enjoy it before thinking about circles let's simplify the problem and select a random point inside of a square to do this just pick a random x and y coordinate within the bounds of the square and that's it now this can also be applied to circles in some cases like this one the point will fall inside of the circle but not always which is why we need an extra step if the selected point falls outside of the circle try again and again until the point falls inside of the circle this process is an example of rejection sampling where instead of transforming some distribution you reject any results that fall outside of your desired range and it works however something that didn't quite sit right with me was knowing that this algorithm might have to be executed multiple times for simplicity's sake the radius of our circle will be 1 for the rest of the video but remember that the math applies for any radius the probability of a point selected with this algorithm being inside of the circle is equal to the area of the circle divided by the area of the square or pi over 4 which is approximately 0.785 meaning that the chance of the algorithm failing n times is equal to 1 minus pi over 4 raised to the power of n where n is some positive integer this number gets really small really quickly as n increases in fact the odds of the algorithm failing four times in a row are less than a quarter of a percent then we can look at the average number of attempts needed or the expected value of n which is the reciprocal of our success rate or 4 over pi which is approximately 1.273 so on average we will only have to make one attempt all of this math is to say that we really don't have to worry about repeating this process too many times so let's implement it in python first i use a wild true loop to run the code over and over again until a valid point is selected then with random variables x and y between negative 1 and positive 1 i use the pythagorean theorem to see if the distance from the point to the center is less than the radius of the circle if it is then the point is inside of the circle and i can return x y if i run this function 3141 times i know cute you'll see that we get our desired result where the points are randomly yet uniformly distributed around the circle and we could stop here but it is possible to select a valid point the first time around guaranteed the issue is that we're using the wrong coordinate system the cartesian coordinate system that we all know and love limits us by making us select two values that represent distances on perpendicular axes which we call x and y what better suits our needs is the polar coordinate system where points are represented by a distance from a reference point and an angle for example consider the point two pi over three in cartesian coordinates this would represent a point at the intersection of x equals two and y equals pi over three but in polar coordinates this would represent a point that is two units away from the origin at a pi over 3 radians angle just by looking at the graphical representations of these coordinate systems you can probably tell why the polar coordinate system is better for our purposes of working with circles so all we need to do is select a random radius r between 0 and 1 and a random angle theta between 0 and 2 pi for drawing purposes these coordinates have to be converted back into cartesian coordinates which can be done by multiplying r by the cosine or sine of theta to get the x and y coordinates respectively the python implementation of this algorithm similarly has a random variable theta between 0 and 2 pi a random variable r between 0 and 1 and returns r times the cosine of theta r times the sine of theta if we again run this algorithm 3141 times the results should look similar to the rejection sampling algorithm but they don't as you can see the points are much more densely packed in the center of the circle let's figure out why that is we know that there is no problem with how we generate theta because the points are evenly distributed around the circle so the issue must lie in how we select a value of r the random number generator that we are using has a uniform distribution meaning that there is an equal probability of each possible radius being chosen it follows that each ring in the circle will on average have the same number of points maybe now you can see the problem as the circumference of a ring grows the number of points remains constant resulting in the outer rings with larger radii having lower densities than the inner ones we can represent how we select a radius or value of r using something called a probability density function or pdf which is a function whose integral tells us how likely it is for a continuous random variable x to fall inside of a given interval currently this graph is a horizontal line from 0 to 1 because each value of r is equally likely to be chosen so we know that this pdf doesn't work so let's figure out what it should look like such that the points will be uniformly distributed in the circle since the circumference of a circle grows linearly with its radius the number of points should also grow linearly with the radius to keep the density of points in the circle constant therefore our pdf should be a linear function which we'll denote as f of r equals m times r where m is some slope but how do we calculate the value of m well the probability of picking a value of r that is between 0 and 1 is 100 or 1. therefore the integral or the area under the curve of our pdf from 0 to 1 must also be 1. from here you can solve for the height of this triangle and then solve for the slope of its hypotenuse or you can set up an integral which is what i'll do solving this integral reveals that the slope of our pdf is 2 meaning that its equation is f of r equals 2r and what's interesting is that every pdf has an integral of 1. so this trick will always work now that we know what our distribution should look like how do we implement it using only our uniform random number generator we need to find some transformation that when applied to a uniform distribution gives us our desired linear distribution first let's create a cumulative distribution function or cdf which i'll denote as big f of r and it represents the integral of our pdf which in our case is r squared what a cdf actually represents is the probability that a random variable x will be less than or equal to its input using this cdf we can employ a strategy called inverse transform sampling which takes a uniform distribution like that of our random number generator and transforms it while our random variable x does not have a uniform distribution the probabilities themselves marked on the y-axis do so let's see what happens if we treat those probabilities as our input essentially taking the inverse of our cdf if we take some number of evenly spaced y values and look at their corresponding x values you can see that we get our desired result where the points are more densely packed around greater values of r and more sparse around the lesser ones and while this intuition is great i'll show you a more rigorous proof for why plugging a uniform random variable into the inverse of our cdf yields our desired distribution so far we know that a cdf represents the probability of a random variable x being less than or equal to its input r but let's take a look back at what the pdf of a uniform distribution looks like i'll denote a uniform random variable between 0 and 1 as u the graph is a horizontal line from 0 to 1 at y equals 1. it follows that the equation of the cdf the integral of the pdf would be big f r equals r and the definition of a cdf tells us that the probability of u being less than or equal to r is r if we substitute 0.5 in for r it makes sense that the probability of a uniform random variable being less than or equal to 0.5 is 0.5 since it is in the middle of 0 and 1. going back to inverse transform sampling we can now say that big f of r is equal to the probability of u being less than or equal to big f of r then we can apply the inverse of our cdf to both sides of the inequality giving us two inequalities that are both less than or equal to r therefore x the random variable with our desired distribution is equal to the inverse of our cdf being applied to a uniform random variable and we know that the inverse of our cdf is the square root of r so let's implement this in python the only difference between this function and the last one is that to calculate r i now take the square root of random and if we run this 3141 times you can see that we once again have points uniformly distributed around the circle we now have two perfectly valid ways of selecting a uniform random point inside of a circle but it doesn't stop there there's another really interesting method that achieves the same exact result a circle can be thought of as a collection of infinitely many infinitely thin isosceles triangles rotated about a point so if we can pick a random triangle and then pick a random point in that triangle we will have a uniform random point in the circle let's go back to how we can select a random point in a square this method can be applied to parallelograms as well since we are dealing with isosceles triangles we'll use a rhombus which we know to be two isosceles triangles joined at the base simply pick a random point on two adjacent sides then from each point extend a line that is parallel to an adjacent side and see where they intersect to constrict our range to just a triangle we can draw a diagonal and reflect any points in the wrong half of the rhombus over it now let's label the four vertices of the rhombus a b c and d the two random points on adjacent sides x1 and x2 and the selected point t since the triangles are infinitely thin angle c will approach zero as this happens the height of the rhombus will also approach zero which means that all of its sides will become effectively parallel therefore the distance from the center point c to the selected point t is equal to the sum of x1 and x2 and then we can reflect over the diagonal which is in the middle of points c and a if we need to this is ready to be put into python first theta is calculated as usual but this can also be thought of as selecting a random triangle since they are infinitely thin so each triangle gets its own angle then x is calculated by doing random plus random which is just x one plus x two if r is greater than or equal to one that means the point is on the wrong side of the rhombus and the value of its reflection is calculated by subtracting r from 2. finally the return statement remained unchanged and once more we get 3141 points uniformly distributed around the circle but a question that i had that you may have as well is why adding two random variables is not equal to multiplying one random variable by two an easy way to approach this question is by thinking of it in terms of dice if we take the output of one fair die and multiply it by two we can see that only even numbers are returned but also that each output is equally likely to occur that is the die has a uniform distribution however if we add the results of two fair dice together you may recall that seven is a more likely output than any other number because there are six different ways for two dice to add up to seven in contrast 2 and 12 each only have one way of occurring making them the least likely outputs if we graph the frequency of each possible output we can see that as the sum approaches 7 on either side which is the average of 2 and 12 the minimum and maximum sums the frequency increases so if we imagine reflecting the right half of this graph the result would look very similar to our desired pdf but it was at this point that i noticed something strange since they both work both of these methods of calculating r taking the square root of random and reflecting random plus random as needed must have the same distribution with a linear pdf of 2r and a quadratic cdf of r squared this doesn't seem right at first glance how can these two seemingly unrelated and completely different functions have the same distribution well we know that the square root method must have the right distribution because we derived it directly from our desired cdf so let's prove the distribution of the sum method luckily the erwin-hall distribution already exists which tells us the distribution of the sum of n uniform random numbers between 0 and 1. in our case n is equal to 2. here's the equation for the pdf and the cdf and you could use these but don't worry we don't actually need to look at them because there's a really elegant geometric derivation of this that works well with lower values of n to start it's helpful to plot the two variables x1 and x2 on the x and y axes to better visualize them a line can be drawn that represents all possible ways by which x1 and x2 can add up to some number t between 0 and 2. so the equation of the line would be x1 plus x2 equals t the length of this line is proportional to the pdf increasing from 0 to 1 and decreasing from 1 to 2. the area below the line and within the bounds of this unit square represents the cdf but remember any value of t between 1 and 2 is reflected to get the value of r this means that the cdf can always be represented as the area of a triangle with a base and height of r which is r squared over 2. but since all the points in this triangle are essentially doubled as a result of the reflection this area has to be multiplied by 2 to get the desired cdf of r squared but why stop here any function that has this distribution should in theory select a value of r that will generate a uniform random point inside of a circle i'll show you one more method but i really encourage you to try to find your own because this is by no means a complete list this last method selects a value of r by taking the maximum of two random variables i don't use the max function built into python because it performs slower than what you see here so for the final time 3141 points are distributed uniformly around the circle but this still needs to be proven and it can be done similarly to the erwin hall distribution by plotting x1 and x2 just like before since r is equal to their maximum we can split the square diagonally any point that falls in the red half of the square returns x1 because it is greater than x2 any point that falls in the blue half returns x2 for the opposite reason our cdf is represented by the area of a square of side length r in which for every point x1 and x2 are both less than or equal to r and the area of the square once again gives us a cdf of r squared with these four methods all that's left to do is to test them and see which runs the fastest after executing each function three million one hundred and forty one thousand five hundred and ninety two times yes i'm sticking with it these were their times and the one that is considerably faster than the others is rejection sampling this makes sense because the other three methods all have the added expense of performing a sine and cosine operation so does this mean that most of this work was for nothing that we should have stopped earlier i don't believe so because if it weren't for this problem i never would have been exposed to pdfs or cdfs or a whole array of interesting geometric proofs so i'm grateful to this problem for taking me on this mathematical journey albeit circuitous and i'm grateful to you for joining me i hope you enjoyed this video and have a wonderful day
Info
Channel: nubDotDev
Views: 86,383
Rating: 4.9321909 out of 5
Keywords: probability, pdf, cdf, random point, random point in a circle, random point in circle, summer of math exposition, SoME, SoME1, leetcode 478
Id: 4y_nmpv-9lI
Channel Id: undefined
Length: 18min 35sec (1115 seconds)
Published: Sat Aug 21 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.