What Is Apache Spark?

Captions
Have you ever been training a machine learning model and the training data that you get is bigger than the machine that you have? Or have you ever been running an SQL query and then you realize it's going to take all night to finish? Well, you could just buy a bigger machine and upgrade it. And you could just patiently wait for the SQL query to finish. But what about when the training data grows and grows and grows and your database starts to go into the millions and millions of rows? A great solution to this is Apache Spark.
Hey David, sorry to interrupt, man, this is great stuff. I just want to remind everyone at home to like and subscribe. It helps us grow the channel so we can bring you more great videos like this. And make sure you check out my video where I take you behind the scenes where we develop and test some of our most powerful servers. Alright man, I'll let you get back to it.
Thanks Ian. So Apache Spark takes your big data problem and gives you a much quicker and more affordable solution to it.
So let's break down your big data problem. Usually you're addressing it using some code, and then you have to run it on your hardware, which is where the problem usually arises. Your hardware is not big enough or powerful enough. And finally, you have to store that data. And very often the data that you come out with is much bigger than the data that you started with.
Spark addresses this through its stack. At the very top, we have Spark libraries like Spark SQL, MLlib for machine learning workloads, and SparkR. All these are supported by the Spark Core API. Underneath that, Spark takes the hardware problem, splits it across multiple computers using something like Kubernetes or EC2, and handles all the resource management. Finally, Spark has data stores that you can access to store all the data that's generated from your workload.
So next time you have a big data problem, spare your wallet and spare your stress levels. Use Apache Spark. Thanks so much.
If you like this video and want to see more like it, please like and subscribe. See you soon.
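Editor's sketch: the captions describe Spark splitting a workload across multiple computers and combining the results. The toy example below illustrates that split, process, combine idea using only Python's standard library. It is not Spark code and does not use any Spark API. The function names (split_into_partitions, count_words, distributed_word_count) are hypothetical, and threads stand in for the separate machines a real cluster would provide.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, num_partitions):
    """Deal the rows into num_partitions roughly equal chunks."""
    return [rows[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Work done independently on one partition: count its words."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(rows, num_partitions=4):
    """Split the rows, count each partition in parallel, merge the counts."""
    partitions = split_into_partitions(rows, num_partitions)
    # Assumption: worker threads play the role of cluster machines here.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    # Combine the per-partition results into one final answer.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


rows = [
    "spark splits the work",
    "spark combines the results",
    "the work is shared",
]
totals = distributed_word_count(rows, num_partitions=2)
print(totals["spark"], totals["the"])  # prints: 2 3
```

In a real deployment, a cluster manager would place the partitions on different machines, which is the resource-management layer the speaker mentions. The sketch only shows the shape of the computation.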
Info
Channel: IBM Technology
Views: 54,345
Keywords: IBM, IBM Cloud, Opensource, Apache Spark, Spark, Data Science
Id: VZ7EHLdrVo0
Length: 2min 39sec (159 seconds)
Published: Fri Oct 21 2022