High-Performance Computing with Python: Numba Vectorize

Video Statistics and Information

Captions
Yesterday we talked a lot about ufuncs and about making NumPy fast by using ufuncs, that is, by taking advantage of routines that basically run as machine code, and we already introduced you, though not in detail, to numba.vectorize. This notebook goes into a little more detail on how to use numba.vectorize.

So what happens when we take a simple trig function from the math library, and we want to apply a function that doesn't exist as a ufunc yet, like sin(a)·cos(b)? It could also be sin(a)·cos(a), or maybe sin²(a) + cos²(a). You're right, actually: most compilers don't get that one yet. Compilers do wonderful optimizations, and they will reduce entire series to a closed expression, but they don't simplify sin² + cos² yet. Anyway, we have this function here: it takes two variables, a and b, and calculates the sine times the cosine.

What happens if we pass a NumPy array to this function? Any guesses? It won't work, exactly, let's try it. There we go: "only size-1 arrays can be converted to Python scalars". So you can actually pass a NumPy array as long as it is of size one, because I explicitly used math.sin and math.cos here, not the NumPy versions. If I had used the NumPy versions it would have worked, but that would have been boring and I wouldn't have had anything to show you.

So what can we do? We can use np.vectorize, like we learned yesterday (both versions are sketched below). The timing takes a moment, and we see it takes 436 milliseconds for the array I set up in the previous section. It works, though. I can also use numba.vectorize, with the same interface as the NumPy version, calling it as a function, and I get 7.09 milliseconds. That is a speedup of about sixty. Not bad.

Now, when we do it that way, it compiles whenever I call it. Creating the function like this only creates a callable; it doesn't compile at that time. Numba waits for the function to be called, looks at the arguments I'm passing, and creates machine code for the function with the argument types it was called with. This is somewhat similar to C++'s template mechanism.

But sometimes I don't want that to happen at call time; I want it to happen when I define the function, and then I want to be done with it. That's where eager compilation comes in. I can tell Numba which versions I want by passing a list of strings, giving first the return type, in this case a double-precision floating-point number, which of course has eight bytes, not four, and then the two integer arguments, which in this notation is the same as saying int64, because the numbers count bytes. So I create one version for f8(i8, i8), a second version for single-precision floating-point numbers, and a third version for double-precision floating-point numbers. I also set nopython=True, which means the function can't make calls into the Python runtime.

Why would I use this? One advantage is that I can also give it a target keyword. As for the types, Numba tries to detect them automatically, and if it finds something compatible, for example if I pass floating-point numbers and I already have a double-precision version, it will just use the double-precision version and I am fine calling it directly. The overhead of compiling, yes, it's there, but it's only there the first time I call the function, and if I have a simulation that runs often, then I just shouldn't profile that first call; beyond that it doesn't hurt much.
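A minimal sketch of the comparison just described; the array size is an assumption, since the video's test array is set up in an earlier section that is not shown here:

```python
import math
import numpy as np
import numba

def sin_cos(a, b):
    # math.sin / math.cos accept only scalars, hence the size-1 error for arrays
    return math.sin(a) * math.cos(b)

np_sin_cos = np.vectorize(sin_cos)     # convenience wrapper, still a Python-level loop
nb_sin_cos = numba.vectorize(sin_cos)  # lazy mode: compiles on the first call

a = np.random.random(10_000_000)       # assumed size
b = np.random.random(10_000_000)

# sin_cos(a, b)    # TypeError: only size-1 arrays can be converted to Python scalars
np_sin_cos(a, b)   # works, but slow (436 ms in the video)
nb_sin_cos(a, b)   # compiled ufunc (7.09 ms in the video)
```

np.vectorize only hides the Python-level loop, while numba.vectorize generates a real compiled ufunc, which is where the factor of roughly sixty comes from.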
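And a sketch of the eager-compilation variant, with the three signatures from the video; the function name is an assumption:

```python
import math
import numba

# Eager compilation: the signatures are given up front, so Numba compiles at
# definition time instead of at the first call.  'f8(i8, i8)' reads "return
# float64, take two int64 arguments"; the digits count bytes, so f8 is a
# double, f4 a single-precision float, and i8 a 64-bit integer.
@numba.vectorize(['f8(i8, i8)', 'f4(f4, f4)', 'f8(f8, f8)'], nopython=True)
def sin_cos_eager(a, b):
    return math.sin(a) * math.cos(b)
```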
But here is the real advantage of eager compilation: because I specify the types, I can give a target. The default target is 'cpu', which just means we do the same thing as before, single-threaded. But there is also target='parallel', and now it becomes interesting, because I'm saying "apply the same operation to a long vector of numbers", so Numba can take this operation, break it up, and run it over multiple threads. Another interesting target is 'cuda'. Piz Daint gets most of its compute power out of its GPUs, and with this target your ufuncs can use the GPU; we'll go into that in more detail when we talk about CUDA for Python.

So here would be a complete signature (sketched after this transcript). Notice that it basically has two sets of parentheses: one for the arguments to numba.vectorize, that is, my list of function signatures, my nopython=True and my target='parallel', and then one for the actual function. If I run the parallel sine-cosine, it takes 6.52 milliseconds. That's not much improvement: I'm now running this on 12 cores rather than one, and I got a speedup of, what, 10%? That's because the array was kind of small, and you have to push kind of hard to get better numbers. If the array gets large enough, it will work even better, because then each thread has enough work to do. With a larger array the improvement is indeed a little better.

Some of you have done this exercise already, but this time you are supposed to use numba.vectorize: we use the same escape-time algorithm, but now you should use numba.vectorize, either as a decorator or as a function, to build a vectorized version, a ufunc, of the Mandelbrot set. This is a ufunc because the input value has the same shape as the output value, so it's a regular ufunc. Once you're done with that, please visualize the result so that you know it is right.
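The complete signature described above would look like this; the function name is an assumption:

```python
import math
import numba

# Two sets of parentheses: the first holds the arguments to numba.vectorize
# (signature list, nopython=True, target='parallel'), the second belongs to
# the function definition itself.  With target='parallel' the input array is
# split across threads.
@numba.vectorize(['f8(i8, i8)', 'f4(f4, f4)', 'f8(f8, f8)'],
                 nopython=True, target='parallel')
def sin_cos_parallel(a, b):
    return math.sin(a) * math.cos(b)
```

On the array used in the video this only drops the runtime from 7.09 ms to 6.52 ms on 12 cores; the threading overhead is amortized only once each thread has enough work, that is, for larger arrays.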
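A minimal sketch of the exercise, not the course's reference solution; the iteration cap, grid extent, and names are assumptions:

```python
import numpy as np
import numba

# Escape-time kernel for one complex point: returns the iteration at which
# the orbit escapes, or the cap if it never does.  Because one input value
# maps to one output value, the vectorized version is a regular ufunc.
@numba.vectorize(['i8(c16)'], nopython=True)
def mandelbrot(c):
    z = 0j
    for n in range(100):                            # assumed iteration cap
        z = z * z + c
        if z.real * z.real + z.imag * z.imag > 4.0:
            return n                                # escaped after n iterations
    return 100                                      # treated as inside the set

# Build a grid of complex points and apply the ufunc element-wise,
# then visualize, e.g. with matplotlib's plt.imshow(image)
y, x = np.ogrid[-1.5:1.5:1000j, -2.0:1.0:1000j]    # assumed extent and resolution
image = mandelbrot(x + 1j * y)
```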
Info
Channel: cscsch
Views: 2,455
Keywords: CSCS, HPC, Python, Lugano, High-Performance Computing, Numba
Id: fxzzKUHs9bs
Length: 10min 3sec (603 seconds)
Published: Wed Jul 24 2019