R Beginner Monte Carlo Integration

Video Statistics and Information

Captions
[Music] Hi everyone. In our last video, we walked through a Monte Carlo method to estimate the value of pi. We did this by writing and optimizing an R program that simulates a game of darts. You can find a link to that video on the right. In this video, we will see that this approach is actually part of a much broader class of methods called Monte Carlo integration. Using Monte Carlo integration, we can estimate single or multiple definite integrals, which allows us to easily simulate areas or volumes. Along the way, we will also check out some approaches to define and plot functions in R. We use R because it allows us to implement the methods and generate all these plots with just a few lines of code.
So what is Monte Carlo integration? Imagine you have a function defined on the interval from a to b. You want to find the area under this curve, which means that you are looking to compute a definite integral. One Monte Carlo way to solve this definite integral is to first uniformly sample many random points and then compute the value of the function at each point. In this example, we randomly sample four points, x1 through x4. We then estimate the definite integral simply as the length of the line from a to b times the average value of the function at the four points.
Right now you might be wondering why this formula even works. Let's start by rearranging the terms a bit. We can rewrite our equation as follows. Now we see that the formula is actually breaking up the interval into four rectangles of equal width. The heights of these rectangles correspond to the values of the function at the four points we have chosen. So we are just summing up the area of these four rectangles to estimate the area under our function. This approach is not perfect. For example, we know that the function dips between x2 and x3, but this is not captured by our sum. To overcome this challenge, we can just increase the number of random points from 4 to potentially a few thousand. The more points we choose, the better
our estimate of the integral will generally be. And that's it for Monte Carlo integration. This example focused on a one-parameter function, but this approach extends into multiple dimensions as well. Our function can, for example, be defined on a 2D surface instead of a line, and the approach will work exactly the same. Many approaches, such as importance sampling, have also been designed to further improve our estimates.
For now, let's implement this basic Monte Carlo integration method for a few functions in R. To start, let's define the simple function x squared. For this example, we will focus on the region between -3 and 3. We can plot this function in R using the curve command, and then we can shade the area we wish to compute and also add a background grid to our plot. Notice that we first had to load a package in order to gain access to the shade function. If this package is not yet present on your machine, you can easily set it up using R's built-in package installer. For now, this is just a simple example of what R offers in terms of graphing functionality. In a few minutes, we will see that R can also generate interactive 3D plots that you can move around and resize with your mouse.
Now, from basic calculus we know that the shaded area equals 18. Let's estimate this area using Monte Carlo integration, which we will implement in two steps. First, we sample 2,000 random points uniformly between -3 and 3. Then, in a single line of R code, we apply our function to the random points, compute the corresponding mean, and finally multiply this value by the length of the interval from -3 to 3, which is 6. This is our first Monte Carlo estimate of the integral, and it is relatively close to the true value of 18.
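The two-step estimate described above, interval length times the mean of the function at uniformly sampled points, might look roughly like the following in R. This is a minimal sketch rather than the code shown in the video: the variable names and the fixed seed are assumptions, and `integrate()` is included only as a deterministic reference value.

```r
set.seed(42)                       # assumption: fixed seed for reproducibility

f <- function(x) x^2               # integrand
a <- -3                            # lower limit of integration
b <- 3                             # upper limit of integration
n <- 2000                          # number of random points

x <- runif(n, min = a, max = b)    # step 1: sample uniformly on [a, b]
estimate <- (b - a) * mean(f(x))   # step 2: interval length times mean height

exact <- integrate(f, a, b)$value  # numerical reference, equals 18 here
print(c(estimate = estimate, exact = exact))
```

Because the points are random, the estimate changes from run to run unless a seed is set; with 2,000 points it typically lands within a few tenths of 18.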
Let's see how the number of random points we sampled affects the accuracy of our estimate. To do this, we repeat the exact same estimation process with different numbers of points, starting with an estimate based on just one point and ending at an estimate that uses 10,000 points. In this plot, the horizontal axis represents the number of random points used in a given simulation, and the vertical axis shows the corresponding estimate. We see that the accuracy of our estimate improves as we add more points and finally stabilizes around the correct value of 18, which is marked by the horizontal red line. So far everything looks great.
Now let's see how this approach works in three dimensions. We define a two-variable function that just returns the sum of the squared inputs, and let's look at two ways to visualize this function. Again, we are interested in values between -3 and 3 for both x and y. This code assigns to x and y 100 values that increase in constant increments, starting at -3 and ending at 3. We now apply our function to all combinations of x and y values and finally plot this grid of results. This graph uses R's built-in plotting functionality and generates a static picture of our function. External packages have been developed over time to add even more powerful functionality. Let's use one of these packages to generate an interactive plot. [Music] We can now move this plot around and get a better view of our function. [Music]
Our goal is to estimate the volume under this surface. From basic calculus we know that this volume equals 216.
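The convergence check and the evaluation grid described above could be sketched as follows. The helper name `mc_estimate`, the particular sample sizes, and the viewing angles passed to `persp()` are illustrative assumptions, not taken from the video, and the interactive 3D package used there is not reproduced here.

```r
set.seed(1)                                   # assumption: seed only for reproducibility

f <- function(x) x^2

# Monte Carlo estimate of the integral of f over [a, b] from n uniform points
# (mc_estimate is a hypothetical helper name)
mc_estimate <- function(n, a = -3, b = 3) {
  (b - a) * mean(f(runif(n, a, b)))
}

# Assumed sample sizes: 1, 100, 200, ..., 10000
sizes <- c(1, seq(100, 10000, by = 100))
estimates <- sapply(sizes, mc_estimate)

# Two-variable function and a 100 x 100 grid of evenly spaced values
g <- function(x, y) x^2 + y^2
x <- seq(-3, 3, length.out = 100)
y <- seq(-3, 3, length.out = 100)
z <- outer(x, y, g)                           # g at every (x, y) combination

# Draw the plots only in an interactive session
if (interactive()) {
  plot(sizes, estimates, type = "l",
       xlab = "Number of random points", ylab = "Estimate")
  abline(h = 18, col = "red")                 # true value of the integral
  persp(x, y, z, theta = 30, phi = 30)        # static surface plot from base R
}
```

`outer()` works here because `g` is built from vectorized arithmetic; a function that only accepts single numbers would first need to be wrapped with `Vectorize()`.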
Again, we select uniform random points across the region, but this time we do it for both variables x and y. Finally, we perform the Monte Carlo integration in one line by first applying our function to the random points we selected, then taking the mean of these values, and finally multiplying the result by the area of the surface over which we want to compute the volume. This result is fairly close to 216, which is the true value of the integral. The number of random points we sampled affects the accuracy of our estimate, so let's repeat the process again, starting with an estimate based on just one point and ending with an estimate that uses 10,000 points. As in the previous example, we see that the accuracy of our integral estimate improves as we add more points and finally stabilizes around the true value of 216.
Now, at the beginning of this video I mentioned that Monte Carlo integration can be used to estimate pi. To accomplish this, we just need to define an appropriate function. Take, for example, the following definition. The value of this function is 0 except when both x and y are within a circle of radius 1 centered at the origin. In that case, the return value is 1. Let's define the function on x and y values between -1 and 1. Under our current definition, the function works great if both x and y are individual numbers, but it will not work as intended when x or y are vectors of numbers. To be able to query h with vector inputs, we use the handy Vectorize function in R. [Music] Now h works correctly under both settings. [Music]
h is a two-parameter function, so we can easily graph it in R, and we can move this plot around to get a better view. A simple geometric argument shows that the volume under this function equals pi. This is true since we are looking at a cylinder with a base surface area of pi and a height of 1. Let's repeat our usual workflow by first uniformly sampling random points between -1 and 1
and then computing the Monte Carlo estimate of the volume. So far, a decent guess at the true value of pi. Let's check the accuracy as we sample more points. As usual, the red horizontal line marks our target value, in this case pi, and we see that sampling more points helps stabilize our estimate around this target.
And that's about it for today. Thanks for watching this quick overview of Monte Carlo integration in R, and see you in the next video.
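The two-variable workflow from the last two examples (the volume of 216 and the estimate of pi) can be sketched in R along these lines. This is an illustration under stated assumptions, not the code from the video: `mc_volume` and `h_scalar` are hypothetical names, the seed and sample size are arbitrary, and treating points exactly on the circle as inside (`<=`) is a guess, since the captions do not say how the boundary is handled.

```r
set.seed(2020)   # assumption: fixed seed so the sketch is reproducible

# Estimate the volume under f over the square [lo, hi] x [lo, hi]
# (mc_volume is a hypothetical helper, not code from the video)
mc_volume <- function(f, n, lo, hi) {
  x <- runif(n, lo, hi)
  y <- runif(n, lo, hi)
  (hi - lo)^2 * mean(f(x, y))   # area of the square times mean height
}

g <- function(x, y) x^2 + y^2   # arithmetic is already elementwise on vectors

# Indicator of the unit disc, written for single numbers only,
# because `if` inspects just one condition value
h_scalar <- function(x, y) if (x^2 + y^2 <= 1) 1 else 0
h <- Vectorize(h_scalar)        # wrapper that accepts vector inputs

vol_g  <- mc_volume(g, 10000, -3, 3)   # target value: 216
pi_hat <- mc_volume(h, 10000, -1, 1)   # target value: pi
print(c(vol_g = vol_g, pi_hat = pi_hat))
```

Calling `h_scalar` directly on vectors fails with an error in recent R versions (4.2 and later), which is why the `Vectorize()` wrapper is needed. An alternative that avoids the wrapper is `as.numeric(x^2 + y^2 <= 1)`, which is vectorized from the start.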
Info
Channel: codeRtime
Views: 3,151
Rating: 5 out of 5
Keywords: Monte Carlo integration, R Monte Carlo simulation, Monte Carlo simulation in R, Introduction to Monte Carlo simulation, simulation in R
Id: 8xo4Lx9fiRc
Length: 9min 31sec (571 seconds)
Published: Wed Jun 17 2020