Top 5 FREE Resources to 10X Your Data Engineering Skills

Video Statistics and Information

Captions
Hello everyone, this is Jash. I'm an AI engineer at Google, and previously I was a data engineer. In fact, out of the 5 years of experience I have in the data field, 3.5 have been in data engineering. I've created a data engineering roadmap on this channel before, which I'm going to link. It's super relevant, and if you have not gone through it, I definitely recommend doing so. But that one is slightly different: the courses I mentioned in that video are mostly paid. They're cheap, but still not completely free.

For this video I'll be focusing on completely free resources. No fluff, no upsells, just the good stuff. If you're at a stage where you feel you're not ready to invest in data engineering upskilling yet, or you don't have the budget yet but still want to be a good data engineer, then this is the perfect video for you. You can also go through these courses to learn about data engineering before finally deciding whether it's the right fit for you.

Here are the five free resources that will 10x your data engineering skills. By the way, I'm going to link all of these resources in the description, so after the video feel free to go through them. And before we proceed with the rest of this video, do not forget to leave a like and subscribe to the channel. That really goes a long way in helping us out.

Number one is Google Cloud training for data engineering and analysis. Now you might think, why do I have to learn Google Cloud? What if I'm interested in AWS or Azure? Well, I'm not here to start any cloud wars, but if you have worked in cloud, you would know that a lot of these services are very common across all three major cloud platforms. For example, an Azure virtual machine might be known as Compute Engine in GCP or EC2 within AWS, but all of them offer a similar type of service, just with three different names. The point is that these skills are easily transferable because of the overlap between
these services. The reason I went with GCP is that this is the only resource I found specifically on cloud computing plus data engineering that's completely free. It is super in-depth: it has videos, it has hands-on labs, and it also comes with the source code of the different hands-on labs. So let's jump right in.

When you open this link, you'll see a data analyst learning path, a data engineer learning path, and a database engineer learning path. I would like you to skip the data analyst learning path, simply because it focuses a lot on Looker. Looker is a dashboard-building tool, and as a data engineer you will not be working with a dashboard-building tool a lot. The data engineer learning path focuses on big data and machine learning fundamentals, data engineering on GCP, serverless processing, and much more. Similarly, the database engineer path focuses on enterprise database migration. As a data engineer, I cannot stress enough how important migration-related projects are, because a lot of projects are not just about creating a data pipeline from scratch. They're also about migrating something from an old tech stack to a new tech stack. Both of these learning paths are super important.

Now if I click on Start Learning, it takes you to the data engineer learning path, and you'll see it has multiple mini courses. Some courses are 8 hours, some are 5 hours, and some are even 12 hours. A lot of them are videos, and a lot of them are hands-on as well. For example, this is Building Batch Data Pipelines on Google Cloud, and it takes you through ETL, ELT, and EL, the differences between them, and quality considerations, and it also focuses on Spark. If you're more interested in solely hands-on work, then let's take a look at this one, Predict Visitor Purchases with a Classification Model in BigQuery ML. This will take you through a hands-on
lab. It will also share all the commands you have to execute, with UI snapshots, and all the code is also provided, either as a snippet that you can copy and paste or as a link to a GitHub repository. To be honest, it is remarkable that something like this is freely available, and that's your opportunity, so do not miss out on it. It's going to provide a really great end-to-end learning resource for data engineers to get started.

Now, number two. If you are going through a data engineering interview process, there are three main rounds you need to focus on, which are common across any company you apply to. Number one is SQL. For SQL you could just use W3Schools or LeetCode, as I've mentioned in my roadmap video as well, and both of these resources are completely free. But I'm focusing more on the other two rounds, which are system design and data modeling. A lot of people get confused in these two rounds, specifically for data engineering profiles, so I have just the place for you to get started for free.

I found this GitHub repository, and it contains a whole set of books, including The Data Warehouse Toolkit by Kimball. There are multiple books in this GitHub repository, and you obviously do not need to focus on everything. The first thing you can focus on is Designing Data-Intensive Applications; this will be used for system design. Next is The Definitive Guide to Dimensional Modeling, which is the perfect resource to help you get started with data modeling interviews. And lastly, if you have time, go through Spark: The Definitive Guide, which will take you through how Hadoop works, how Spark works, and all the internal workings.

You can see a little button here that says Download raw file, which will download the PDF for you. You can also take a look at the table of contents and see what the book focuses on. For example, it also talks about encoding and evolution: JSON, XML, and binary variants, language
specific formats, and Avro. Files like Avro and Parquet are used a lot in big data projects, so it goes through that as well. These three books are really good, and I found them for free in this GitHub repo, which I'm going to link down below.

Now, we have covered a lot in these resources: cloud-based resources, system design and data modeling resources, and distributed computing resources like Spark. The natural next step would be for you to get started with data engineering projects, and I recommend doing two to three of them. If you don't already have data engineering experience in your day job, these two to three projects will not only help you get started with learning, they will also be helpful if you host them on your personal GitHub profile and add a link to them in your resume. That way, hiring managers at whatever companies you apply to can go through them and see the value you provided with these projects.

I have a lot of projects that cover different difficulty levels. If you're a beginner, there is a list of projects for you along with the source code. If you're an intermediate or an advanced data engineer, there are different projects for that too on this website, and the source code is hosted completely open source on GitHub. It's really cool to see people contributing to something like this for free, and since it's my job to highlight some of these resources and point you to them, let's take a look at these projects. So I'm going to open this website, it's alpha.
a and it has top 12 data engineering projects, beginner to advanced. Now if you take a look at the beginner level, you can see a lot of different projects, and you can choose the one that resonates with your interests. If you are interested in finance data, pick a project that deals with finance data. If you are interested in social media data, pick a project that deals with social media data. That way, along with learning, you're also fully interested in it, so you don't get bored while implementing the project.

One thing I would say is that implementing a project is not very easy. You'll probably have to Google a lot of things, you'll have to look things up on Stack Overflow, and you might use AI LLMs like ChatGPT or Gemini. But the whole process you'll go through, and the learning you'll get out of implementing an end-to-end project, is tremendous.

For example, if you're interested in social media data, let's look at this project: Stock and Twitter Data Extraction Using Python, Kafka, and Spark. They've put it in the beginner bracket, but I don't feel that it's completely beginner-level. Python is, but services like Kafka and Spark might not be for complete beginners. It's a good project nonetheless, so let's go through it. They've mentioned what it does: stock market analysis, Twitter sentiment analysis, real-time data integration, extraction and transformation, how Kafka and Spark are being used, and also predictive analytics and visualization. So that's great, right?

Let's click on the source code here. This is the whole GitHub repository with all the code. First of all, let's get started with the README. It has some snapshots of the final visualizations that you'll end up creating, so this will be the final product. It also shows the architecture of the whole project, along with what's happening within each and every layer and what are
the transformations applied. So it has really good documentation as well, which is always a bonus point. One thing I would recommend is that once you have implemented the base project listed here with that source code, add something more. Focus on adding more value than what's already present. Otherwise, what's the point of just creating a copy of this in your GitHub profile? It has to be something you did that enhanced the existing project, and that can be included in the summary when you list this project in your resume.

Now I'm also going to give you a couple of additional bonus resources, in case you're interested in learning, let's say, Databricks or Snowflake, which are cloud data platforms. Both of them provide an end-to-end business data processing solution on the cloud of your choice. For Databricks, you could go through this Customer Academy, which is for customers of Databricks or potential customers who are interested in Databricks. You can go through Get Started with Databricks for Data Engineering. You'll see there are two such courses: some might prefer self-paced e-learning courses, and some might prefer instructor-led courses, which happen at fixed times. Pick whichever suits you. Then you can click on the event, see what the session time is in your local time, and enroll in the session.

For learning Snowflake basics, I found a really good course on LinkedIn Learning. If you don't know, LinkedIn comes with a one-month free trial, and you can go through the course I found within 7 or 8 days, or 15 days at most. So you can easily complete this course within your one-month free trial, which makes it technically free as well. You can see the button, Start my 1-month free trial. This course covers Snowflake DB overview, Snowflake data storage and files, query processing, data services, partners and
architectures. You can see a lot of these videos are not very long. For example, Create and use user-defined function is barely 5 minutes long. I would definitely recommend that when you're going through any learning course, you do hands-on work. Just create free-tier accounts, maybe on GCP, Snowflake, Databricks, AWS, whatever you're learning, and do a lot of hands-on, because the learning you get out of doing hands-on work is really unparalleled. You cannot get that just by going through some videos or articles.

So what are your thoughts? These are completely free resources, so go through them, and let's discuss your points of view in the comment section below. If you have any questions, feel free to drop them down below. That's it. I hope you find these resources useful. Do not forget to leave a like and subscribe to the channel, and don't forget to share this video with anybody who is interested in learning data engineering and wants to get started for free. See you next time. [Music]
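
The batch pipelines course mentioned in the captions contrasts ETL with ELT. As a rough illustration only (it is not taken from the course or the video), the difference can be sketched with Python's built-in sqlite3 module standing in for a data warehouse. The sample rows, table names, and helper functions below are all made up for the example.

```python
import sqlite3

# Hypothetical raw rows: (order_id, amount as text, lowercase country code).
RAW_ORDERS = [
    (1, "19.99", "us"),
    (2, "5.00", "de"),
    (3, "12.50", "us"),
]


def run_etl(conn):
    """ETL: transform in application code first, then load the cleaned rows."""
    cleaned = [
        (order_id, float(amount), country.upper())
        for order_id, amount, country in RAW_ORDERS
    ]
    conn.execute(
        "CREATE TABLE orders_etl (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw rows untouched, then transform with SQL in the database."""
    conn.execute(
        "CREATE TABLE orders_raw (order_id INTEGER, amount TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    conn.execute(
        """
        CREATE TABLE orders_elt AS
        SELECT order_id,
               CAST(amount AS REAL) AS amount,
               UPPER(country) AS country
        FROM orders_raw
        """
    )


def revenue_by_country(conn, table):
    """Sum revenue per country from one of the two final tables."""
    # The table name is interpolated only because it comes from this script,
    # never from user input.
    rows = conn.execute(
        f"SELECT country, ROUND(SUM(amount), 2) FROM {table} "
        "GROUP BY country ORDER BY country"
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_result = revenue_by_country(conn, "orders_etl")
elt_result = revenue_by_country(conn, "orders_elt")
print(etl_result == elt_result)
```

Both paths end with the same cleaned table. What differs is where the transformation runs: in the pipeline code before loading (ETL) or inside the database after loading (ELT).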
Info
Channel: Jash Radia
Views: 47,105
Keywords: free data engineer course, free data engineer roadmap, how to become a data engineer, data engineer roadmap, data engineering roadmap, learn data engineering, data engineer guide, data engineering guide, how to learn data engineering, learn cloud, learn gcp, learn snowflake, learn python, data engineering, google data engineer
Id: Hq5FDA2JPss
Length: 11min 49sec (709 seconds)
Published: Sat May 18 2024