Mind-bending new programming language for GPUs just dropped...

Video Statistics and Information

Captions
Yesterday the clouds opened up and a weird new programming language came down to earth with a promise of parallelism for all who writeth code. This is big if true, because parallel computing is a superpower. It allows a programmer to take a problem that could be solved in a week and instead solve it in seven days using seven different computers. Unfortunately, running code in parallel is like conducting a symphony: one wrong note and the entire thing becomes a total disaster. But luckily, Bend offers hope by making a bold promise: everything that can run in parallel will run in parallel. You don't need to know anything about CUDA, blocks, locks, mutexes, or regexes to write algorithms that take advantage of all 24 of your CPU cores, or even all 16,000 of your GPU cores. You just write some high-level, Python-looking code and the rest is magic. It is May 17th, 2024, and you're watching The Code Report.

When you write code in a language like Python, your code runs on a single thread. That means only one thing can happen at a time. It's like going to a KFC with only one employee, who takes the order, cleans the toilets, and cooks the food, in that order. Now, on a modern CPU you might have a clock speed of around 4 GHz, and if it's handling one instruction per cycle, you're only able to perform 4 billion instructions per second. If 4 GIPS is not enough, you can modify your Python code to take advantage of multiple threads, but it adds a lot of complexity to your code, and there are all kinds of gotchas like race conditions, deadlocks, and thread starvation, and it may even lead to conflicts with demons. Even if you do manage to get it working, you might find that your CPU just doesn't have enough juice, at which point you look into using the thousands of CUDA cores on your GPU. But now you'll need to write some C++ code and likely blow your leg off in the process.

Well, what if there were a language that just knew how to run things in parallel by default? That's the promise of Bend. Imagine we have a computation that adds two completely random numbers together. In Python, the interpreter is going to convert this into bytecode and then eventually run it on the Python virtual machine. Pretty simple. But in Bend, things are a little more complex. The elements of the computation are structured into a graph of what are called interaction combinators. You can think of it as a big network of all the computations that need to be done. When two nodes run into each other, the computation progresses by following a simple set of rules that rewrite the computation in a way that can be done in parallel. It continues this pattern until all computations are done. It then merges the result back into whatever expression was returned from the function.

This concept of interaction combinators goes all the way back to the 1990s and is implemented in a runtime called the Higher-order Virtual Machine. HVM is not meant to be used directly, and that's why they built Bend, a high-level language to interface with it. The language itself is implemented in Rust. Its syntax is very similar to Python, and we can write a hello world by defining a main function that returns a string. Now, to execute this code, we can pull up the terminal and use the bend run command. By default, this is going to use the Rust interpreter, which will execute it sequentially, just like any other boring language.

But now here's where things get interesting. Imagine we have an algorithm that needs to count a bunch of numbers and then add them together. The first thing that might blow your mind is that Bend does not have loops. We can't just write a for loop like we would in Python. Instead, Bend has something entirely different called a fold, which works like a search and replace for data types, and any algorithm that requires a loop can be replaced with a fold. Basically, a fold allows you to consume recursive data types, like a list or a tree, in parallel. But first we need to construct a recursive data type, and for that we have the bend keyword, which is like the opposite of fold. Now, if that's a little too mind-bending, maybe check out my back catalog for Recursion in 100 Seconds.

But now let's see what this looks like from a performance standpoint. When I try to run this algorithm on a single thread, it takes forever, like 10 minutes or more. However, I can run the same code without any modification whatsoever with the bend run-c command. When I do that, it's now utilizing all 24 threads on my CPU, and it only takes about 30 seconds to run the computation. That's a huge improvement, but I think we can still do better. Because I'm a baller, I have an Nvidia RTX 4090, and once again I can run this code without any modification on CUDA with bend run-cu. Now this code only takes one and a half seconds to run, and I'll just go ahead and drop the mic right there. This has been The Code Report. Thanks for watching, and I will see you in the next one.
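The captions describe building a recursive data type with the bend keyword and then consuming it with a fold. The video's own Bend source is not reproduced on this page, so what follows is only a rough Python analogy of that idea, not Bend syntax. The names `Leaf`, `Node`, `build`, and `fold_sum` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: int


@dataclass
class Node:
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def build(depth: int, value: int = 0) -> Tree:
    """Grow a perfect binary tree (a loose stand-in for what `bend` does)."""
    if depth == 0:
        return Leaf(value)
    # Each child gets a distinct number, so the leaves hold 0 .. 2**depth - 1.
    return Node(build(depth - 1, value * 2), build(depth - 1, value * 2 + 1))


def fold_sum(tree: Tree) -> int:
    """Consume the tree by swapping each Leaf for its value and each Node for '+'."""
    if isinstance(tree, Leaf):
        return tree.value
    # The two recursive calls share no state, so a parallel runtime
    # would be free to evaluate them at the same time.
    return fold_sum(tree.left) + fold_sum(tree.right)


total = fold_sum(build(4))
print(total)  # leaves are 0..15, so this prints 120
```

Python itself still evaluates the two branches one after the other. The sketch only shows why the shape of the computation is friendly to parallel execution: neither subtree's sum depends on the other.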
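The numbers quoted in the captions can be turned into rough throughput and speedup figures. This is back-of-the-envelope arithmetic only: the single-thread run is described as "10 minutes or more", so any ratio that uses it is a lower bound, and the variable names are illustrative.

```python
# Figures as quoted in the captions.
CLOCK_HZ = 4e9                 # "around 4 GHz"
INSTRUCTIONS_PER_CYCLE = 1     # "one instruction per cycle"

instructions_per_second = CLOCK_HZ * INSTRUCTIONS_PER_CYCLE

single_thread_s = 10 * 60      # "10 minutes or more", treated as a minimum
cpu_24_threads_s = 30          # "about 30 seconds"
gpu_s = 1.5                    # "one and a half seconds"

cpu_speedup = single_thread_s / cpu_24_threads_s   # at least 20x
gpu_vs_cpu = cpu_24_threads_s / gpu_s              # about 20x
gpu_vs_single = single_thread_s / gpu_s            # at least 400x

print(f"{instructions_per_second:.0e} instructions per second")
print(f"CPU, 24 threads vs single thread: {cpu_speedup:.0f}x")
print(f"GPU vs CPU, 24 threads: {gpu_vs_cpu:.0f}x")
print(f"GPU vs single thread: {gpu_vs_single:.0f}x")
```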
Info
Channel: Fireship
Views: 1,018,892
Keywords: webdev, app development, lesson, tutorial
Id: HCOQmKTFzYY
Length: 4min 1sec (241 seconds)
Published: Fri May 17 2024