If I Started Learning Data Engineering in 2024, I'd Do This. #dataengineering

Captions
[Music]

These are the questions we have asked so many people and searched again and again on Google and YouTube. But why, even after watching several tutorial videos and reading blogs, do we still have this question in mind? Three reasons. One, you haven't found the right video or blog. Two, yes, they have told me to learn these things, but where do I learn them, and what resources can I make use of? Three, you're not consistent. So let's put a full stop to this question.

Before making this video, I did some deep research on data engineering job descriptions to understand what skills companies are expecting from entry-level, 1+ year, and 2+ year candidates. After analyzing the results, I understood one thing: Hadoop, Hive, Spark, SQL, and any one programming language are not enough, even for entry-level candidates. You need to know more, because companies are expecting more from us.

By the end of this video you will know everything: what skills you need to learn, what projects you can do, what resources you can use, and how many days you can spend on each topic. These will address the first two reasons. To tackle the third reason, I've got something for you that I'll be sharing at the end of this video. It will help you be consistent and keep track of your learning journey. If you follow the schedule and resources which I'm going to share in this video, I can assure you that by the end of the third month you will have acquired a good amount of the skills you need to become a data engineer.

By the way, before getting into the video, I just want to share one thing. You don't have to pause the video and take note of every single resource which I'm going to share in a moment. I'll let you know at the end how you can access the free resources. Just sit back and watch the whole video. So, you ready? Let's get started.

Day 1 and 2: basics. If you're a complete beginner, I recommend watching this video. Even if you are aware of what data engineering is and why big data came into the picture, it's okay to give it a watch; you'll get a clearer understanding. Alternatively, if you prefer reading, you can go to Google and type "what is data engineering" or "what is big data", open a few articles, and start understanding the basics. While watching the videos or reading blogs, you may see suggestions related to data engineering concepts, like Hadoop or Hive or how to become a data engineer. Please don't jump around. Just stick with the schedule. For the first two days, focus only on basics.

Day 3 to 12: programming. For the first 10 days, pick any one programming language from the following: Java, Python, or Scala. I recommend choosing Python or Scala. For example, if you have picked Python, first cover the basics: data types, lists, tuples, sets, dictionaries, loops, conditions, functions, and the built-in functions. You can learn it from W3Schools, or if you prefer watching videos, you can watch this video or this video. Then move on to libraries. For data engineering, NumPy and pandas are more than enough. I learned NumPy and pandas using only W3Schools. If you get comfortable using W3Schools, you don't actually need any other resource. W3Schools is just amazing.

You may have this doubt: do we have to learn OOP concepts? I would say yes, you have to learn OOP concepts, such as what they are, what inheritance is, the types of inheritance, polymorphism, what a class is, and what objects are. But you don't have to master them. The same applies to NumPy and pandas, because there are so many functions in both. You can go through everything, but you don't have to remember every single function. Just practice the commonly used functions; that level of understanding is more than enough. Most important thing: every day, spend an hour, or more if you need it, practicing coding. If you want to be good at coding, the only thing you have to do is practice, practice, practice. That's the only solution; you don't have any other options. For practicing Python you can use HackerRank, LeetCode, or CodeChef. If you're a complete beginner, I recommend starting off with HackerRank.

Day 13 to 22: now we are entering the SQL world. First cover DBMS concepts, like what a DBMS is, what an RDBMS is, the types of keys, ACID properties, and normalization. Then you can learn about SQL commands (DML, DDL, DCL, TCL), aggregate functions, subqueries, common table expressions, and, most important of all, window functions. Please don't skip window functions. Initially you may find the concept a bit difficult to understand, but window functions are actually super easy. If you know how to use window functions, it will be easier for you to solve certain problems; after learning them you will get to know why I'm saying this. And guys, I bet you'll get at least one question on window functions in your interview. You can learn SQL and DBMS from Javatpoint and W3Schools. Also, don't forget to check out this YouTube channel, techTFQ. On his channel you can find loads of good SQL content. You can practice SQL on LeetCode or HackerRank. Guys, please don't underestimate the power of SQL. When it comes to data engineering, you have to be 90% strong in SQL, because it's extremely important. So practice every day to become strong in SQL.

Day 23 to 26: Linux. Learn some basic concepts in Linux. You don't have to dive deep into Linux; focus more on the commonly used commands. You can learn by watching this video or from GeeksforGeeks.

Day 27 to 30: now we are entering the data world. For the next 4 days, learn about data warehousing concepts: what data warehousing is, the difference between a data lake and a data warehouse, OLAP, OLTP, star schema, and snowflake schema. You can learn by watching these videos.

Day 31 and 32: the ETL process. ETL is nothing but extract, transform, and load: extracting data from some particular location, doing some transformation that is completely based on the business use case, and then loading the data to some other location. You can learn about ETL, how ETL works, and what a pipeline is by watching these videos.

Day 33 to 40: Apache Hadoop. In Hadoop, you have to focus on the Hadoop architecture, the HDFS architecture, how data is stored in HDFS, and the HDFS commands. Cover these concepts in detail. For MapReduce, just understand what MapReduce is and why it was used, because these days no one uses MapReduce; it has been replaced by Spark. You can learn completely about Hadoop by watching this playlist.

Day 41 to 44: Hive. In Hive, learn these things: why Hive is used, Hive commands, types of partitions, internal and external tables, how to create internal and external tables, and the main difference between them. Use this playlist to learn about Hive. Also, if you want, you can refer to Javatpoint, but that playlist is more than enough.

Day 45 to 55: Apache Spark. In Spark, learn these things: what Spark is, why it is used, the difference between MapReduce and Spark, what batch processing and stream processing are, and RDD, DataFrame, and Dataset. Focus more on DataFrame: how to read and write data, and what different options you have while reading and writing data. Explore all the functions available in Spark, but you don't have to remember everything. Go through everything, understand it, and practice the commonly used ones. When learning Spark you may encounter these two things: one is PySpark, the other is Scala Spark. PySpark is nothing but writing Spark code in Python, and Scala Spark is writing Spark code in Scala. There's also Spark SQL, which allows us to write SQL queries inside Spark. You can choose either Scala Spark or PySpark; that's completely up to you. If you are a Java person, you can choose Scala Spark, because Scala and Java are kind of similar, and Scala is even easier than Java. So it's completely up to you; pick whichever you prefer. If you're learning PySpark, freeCodeCamp has this amazing tutorial, so you can make use of it. And one more thing: my favorite website, Spark By Examples. There you can find literally everything you can learn about Spark: RDD, DataFrame, Dataset, and Spark SQL. And the beauty is that you'll have an example for every single thing.

After finishing Spark, SQL, Hadoop, Hive, and any one programming language, you can pat yourself on the shoulder and congratulate yourself, because now we have completed the mandatory skills required to become a data engineer. But we're not stopping here. There is more to explore, so it's time to level up our skills.

Learn the basics of Airflow: what Airflow is, what a DAG is, how to create a DAG, the trigger rules available in Airflow, what dependencies are, and the different types of operators in Airflow. You can learn about Airflow from DataCamp, the official Airflow documentation, or this Medium blog post.

Day 62 to 71: now it's time to dive into cloud platforms. Pick any one from AWS, Azure, or GCP and start learning. You don't need to get deep into cloud; just understand the basics: how data is stored in the cloud and what different services are available for data engineering. I learned Azure, but if you ask me, I recommend choosing either AWS or Azure, because when I was digging through the job descriptions, the most repeated ones were Azure and AWS. So it would be better if you choose one of these. You can learn Azure from the official documentation and also from this YouTube channel; on his channel you can find everything related to Azure. For AWS you can refer to K21 Academy.

Now we have completed 50% of our journey. "What, just 50? I thought it would be 80 or 90%." That's what you're thinking, right? Think for a moment, guys. So far we have learned about everything, but we haven't implemented our learning, right? So it's time to get our hands dirty. Do as many projects as you want to get a deeper and clearer understanding, but I recommend doing at least two projects: one using Airflow, another using cloud. You can refer to Darshil Parmar's YouTube channel, where he has uploaded a lot of project-related videos, or this playlist, or this GitHub repo. And guys, please don't just copy everything from the project. Understand the architecture, understand where the data is read and stored, how the data is processed, and what different services they are using. Try to use a different dataset; you can freely download one from Kaggle. Also use a different use case. And if you have no idea which use case to use, why worry when a friend is here? You can ask ChatGPT; it will definitely give you some good suggestions.

You may have this doubt. Let's say the project video I've shared is, for example, 6 hours long. You may think, "The video is only 6 hours; I can complete it within 2 or 3 days. Why do I have to spend 15-plus days?" Guys, are we doing projects just to complete them, tell everyone "Hey, I've completed this project", and add it to our resume? That's not the whole point, right? Please keep this in mind: we do projects to get hands-on experience and deeper understanding. That's the reason we are doing projects, and that's why I've allocated 15-plus days only for projects. You get a chance to explore more and understand more when you're doing projects than when you're only learning. Once you're done with your project, you can push it to GitHub.

Okay, so let's be honest: how many of you have a LinkedIn account? If yes, good. If not, no problem; go ahead and create one after watching this video. Post regularly on LinkedIn. Before posting, please keep these two points in mind. One, your post should add value for someone. Two, your post should attract HR's attention. Please don't share everything before learning, like "I'm going to learn this" or "I'm going to learn SQL or Python". Instead, share after learning. For example, if today you learned about SQL aggregate functions, you can share on LinkedIn what an aggregate function is and what the types of aggregate functions are, along with an example. That will be good.

Let me answer this frequently asked question: do I have to learn DSA to become a data engineer? I would say no, unless you're aiming for product-based companies, because DSA is not a mandatory skill you need to acquire to become a data engineer. But if you're interested, you can learn it after covering everything which I've shared before.

So, as I said, I've got something for you to stay consistent and track your learning journey. I've created a Google Doc, like a goal tracker. In it I've attached all the resources and the time frame you need to spend on each topic. After finishing each topic you can mark it as done, or if you've just started, you can mark it as in progress. This will really help you: whenever you open this doc to learn, you can clearly see where you are and where you want to go.

Before wrapping up this video, I just want to add a few more points. In the middle of the journey you may feel like you don't understand anything, which can make you doubt yourself, and you'll have these negative thoughts popping up in your mind. It's okay, it's completely okay to feel that way. You'll understand everything as you progress. If you think "I can't do this" or "I can't do that", you're only limiting yourself. If you think you can, you can do it; if you think you can't, you can't do it. So all the best, guys. You've got this.

That's all for today, guys. I hope you enjoyed watching this video, and if you found it helpful, you know what to do. And by the way, if you want to know about my complete journey, how I became a data engineer as a fresher, what techniques I used to approach HR, and how I actually got a job, I've shared everything in this video; you can check it out if you want. So yeah, I'll catch you next time. Have a great day, take care, bye-bye.
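The transcript stresses window functions in the SQL block (day 13 to 22) but does not show one. As a minimal sketch, not taken from the video, the example below ranks employees by salary within each department using `RANK() OVER (PARTITION BY ...)`. The `employees` table, the sample rows, and the helper name `rank_salaries` are all hypothetical, and it assumes the SQLite bundled with your Python is version 3.25 or newer, since older versions lack window functions.

```python
import sqlite3


def rank_salaries(rows):
    """Rank employees by salary within each department (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    # PARTITION BY restarts the ranking for each department;
    # ORDER BY inside OVER decides who gets rank 1 in that department.
    query = """
        SELECT name, dept, salary,
               RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dept_rank
        FROM employees
        ORDER BY dept, dept_rank, name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample data for illustration only.
SAMPLE_ROWS = [
    ("Asha", "data", 90),
    ("Ben", "data", 120),
    ("Chen", "data", 90),
    ("Dina", "web", 70),
    ("Eli", "web", 85),
]

if __name__ == "__main__":
    for row in rank_salaries(SAMPLE_ROWS):
        print(row)
```

Unlike `GROUP BY`, the window function keeps every input row and adds the rank alongside it, which is why tied salaries (Asha and Chen here) both appear with the same rank.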
Info
Channel: Asvitha VS
Views: 41,119
Keywords: Data engineering complete roadmap, how to become a data engineer, How to become a data engineer in 3 months, become data engineer in 3 months, data engineer roadmap in 2024, how to become a data engineer as a fresher, how to become a data engineer with non IT background, coding needed for data engineer, dsa needed for data engineer, how to become data engineer in 2024, data engineering jobs, data engineering complete guide, skills needed to become data engineer, apache hadoop, hive
Id: o8KGOVQa_q0
Length: 15min 39sec (939 seconds)
Published: Fri May 24 2024