Fastest way to Start Your Data Engineer Journey in 2024 - 100 Days Of Data Engineering Crash Course

Video Statistics and Information

Captions
Is there a way you can become a data engineer quickly? I think the short answer is probably no. There are a ton of skills and concepts you need to understand in order to become a rock-solid, rock-star data engineer: everything from SQL and programming to data warehousing, and sometimes other skills like cloud and Snowflake, which is obviously a separate skill. There are just so many tools in this whole crazy data landscape that learning to become a data engineer quickly is hard. What we can do in 100 days, and that's what we're going to focus on here, is build up a baseline of skills that can get you further. So in this video we're going to cover 100 days, and what you can learn in those 100 days, to make sure you are on the right path to becoming a data engineer in 2024.

I wanted to switch up this video. We always do the cliche video of "here's a data engineer roadmap for 2024, here are all the best courses you can take." Instead, let's set an easy time constraint, 100 days, so you can go through this checklist that we built and track your progress through it all. The goal for this video is, first, to explain how you should actually use this checklist most effectively, because just going through and watching the videos is not sufficient. That's the problem we all have: we watch a bunch of videos and then never grow from them. We'll also go over the different sections and why they cover different concepts, and hopefully you'll learn how to go through this effectively and amplify your learning process.

Let's first talk about how you should go through this checklist. This is definitely more my method of how I learn, and you might have your own, but I think you need a conscious, meaningful approach to learning new concepts. You can watch a bunch of videos on economics, or learning a new language, or
how to play guitar, but if you don't either practice your guitar every day (I have a guitar somewhere that I haven't practiced) or at least do some level of reflection on what you've learned, you're not going to learn much. So here's what I'd recommend: at the end of each day, after you've gone through the videos and content we've recommended, take a moment to actually answer a few questions. What have you learned? Actually write down what you've learned; don't just let it fly out of your head. What are two or three interesting points? Just write those out. What are two points you maybe don't fully understand? Then go and look up why you don't understand them. Draw a little chart, draw a little picture, make it really interactive for yourself. Again, don't just passively consume content; make a conscious choice to actually try to ingest it.

Then try figuring out how you could apply that skill. If you learned how to write a class, where does that class fit? Where would you use it? Is there somewhere in your day-to-day job you could use it? If you've learned to automate a script on a Lambda, what's a way you could use that? I think this is great because one of the things we're going to recommend is obviously projects, and if you're not trying to think creatively about what you can actually do with these skills, it's going to be hard to come up with your own project ideas, which is what you should do.

Finally, find a way to be kept accountable. We're going to create a "100 days of data engineering" space in our Discord channel where you can go through it, but there are tons of ways to stay accountable. You don't have to do it through this Discord channel; it's just one way. There are tons of other groups you can find, or you can just post about it on
LinkedIn and keep yourself accountable that way. You don't have to post every day, but maybe every other day, every four days, or every five days, just post your update. If you do it every five days, just say, "Hey, here's what I learned on days one through five," and summarize all of it. I think it's a great way of keeping yourself accountable and challenging yourself, and it does a lot more than passively consuming content, which is what we all do on TikTok or Instagram way too much anyway. So let's be a little more active.

We've gone through how you should approach this; now let's go through what the plan actually is. For the first 10 days, our plan is just to review the basics: SQL, programming, some data modeling, and some concepts about data pipelines. That's really the focus here. We're going to try to make sure you understand where you fit. Are you comfortable with SQL? Are you comfortable with programming? Is this your first time programming? Do you even know what a for loop is? It's okay if you don't. It's just good to assess where you're at, or what you've forgotten because the last time you learned a programming language was five years ago in college and you haven't touched it since. That's the point here: a quick refresher. It's like the first few days of a college course, when it's all meant to be easy. 2 + 2 = 4. Yes, it feels like review, but it's a great way to set the tone and start building the habit of going through the checklist, because the habit you form now is what will keep you going forward.

You don't need to speed through things. Sometimes I've seen people go through courses and feel like "if I get through it, that's what teaches me the concepts." But going through
and reading a book without mindfully reflecting, or going through a course without thinking about what you're learning, isn't effective. I'm going to keep yelling that because I do it all the time. We all watch a video and come away wondering what we actually took from it. So make sure you actually think it through. Don't just watch an hour of content and then watch the next hour because you're trying to get through the checklist faster. That's not going to help you learn. The goal is to have this stuff stay with you for a long period of time.

Once we've gone through the basics, we're going to dive deeper into these concepts. Now you understand the for loop, you know if statements, you can write some basic conditional statements and some basic scripts. Great. Let's take that to the next level. In SQL, let's answer some difficult problems. We're going to have you look at some datasets, we're going to have questions, and you're going to answer those questions. There isn't always a perfect answer; we'll have some answers, but not everything has one. You're going to go through that dataset, understand what it is, and write some queries. How do you write a rank function? How do you write all these different clauses, and how do you actually apply them?

Same thing with Python: we're going to go over some basics of data structures and algorithms. You might not always use them, but I really do believe there's value in understanding them. Even if ChatGPT can write them out faster than most of us can even try, I don't think it's bad to understand some of these basics.

Same thing with data modeling. In the beginning part, the first 10 days, we went over high-level concepts such as dimension tables and normalization.
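As a rough illustration of two ideas mentioned above, a rank function in SQL and a dimension table in a data model, here is a minimal sketch. Everything in it is an assumption made for the example: the table names `dim_customer` and `fact_orders`, the sample rows, and the use of Python's built-in `sqlite3` module as a stand-in for a real warehouse. It is not taken from the course checklist.

```python
import sqlite3

# Hypothetical toy star schema: one dimension table and one fact table.
# SQLite (in memory) stands in for whatever database you practice on.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT,
        region        TEXT
    );
    CREATE TABLE fact_orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        amount      REAL
    );
    INSERT INTO dim_customer VALUES
        (1, 'Ada', 'West'), (2, 'Grace', 'West'), (3, 'Linus', 'East');
    INSERT INTO fact_orders VALUES
        (10, 1, 50.0), (11, 1, 25.0), (12, 2, 90.0), (13, 3, 40.0);
""")

# Aggregate spend per customer first, then rank customers within each region.
RANK_QUERY = """
    WITH customer_spend AS (
        SELECT c.region, c.customer_name, SUM(f.amount) AS total_spend
        FROM fact_orders AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        GROUP BY c.region, c.customer_name
    )
    SELECT region,
           customer_name,
           total_spend,
           RANK() OVER (PARTITION BY region ORDER BY total_spend DESC) AS spend_rank
    FROM customer_spend
    ORDER BY region, spend_rank
"""

rows = conn.execute(RANK_QUERY).fetchall()
for row in rows:
    print(row)
```

The fact table holds the measurable events (orders) and the dimension table holds the descriptive attributes (who the customer is and where), which is why the ranking query has to join the two before it can partition by region.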
Now let's actually try applying that. I think everyone gets upset when I recommend Kimball, because every time I recommend it people want more examples. But if you go through Kimball's book, he has a lot of examples of how you can actually build your data model, so I would definitely check that out. We'll go through it as part of this, and you'll build your own data models along the way. Again, this is really focused on making sure you build that next layer. We're building a pyramid: first the baseline, now the next level of skills, which is going deeper. That'll be your next 30 days.

I also forgot to mention that you're going to be going through the cloud as well. We didn't cover the cloud as much last time, so we're going to go through it here. You're going to be going through IAM and understanding what it is and how it works on AWS (AWS is just an easy one to work with), spinning up an S3 bucket, spinning up RDS, interacting with them in code, and ingesting data so you can analyze it from a Postgres instance. These are things that aren't that difficult but that get you familiar with going through those motions.

Now that you've done all that, you've got all this data and code and you've gotten comfortable, it's time to build a mini project. It's time to show off what you're learning. It's great to do this every 30 to 50 days or so, to make sure you're solidifying whatever you're learning into projects that are easy to understand. For this first project, I'll give you a bunch of data sources you can go to, a few questions to answer with each data source, and some ideas of what you can do with it. You can scrape that data from sites or pull it from an API, ingest it into S3 or into Postgres automatically, using something like a Lambda triggered by EventBridge, and then, from there, write some SQL
queries, build a Tableau dashboard, build something on top of it. You're going to ask some questions of that data and then build a basic dashboard, because that's probably the easiest thing you can get through in 10 days to really solidify your understanding.

Once we've done that, the focus for days 51 through 70 will be a survey of tools and concepts. There are so many tools and concepts in the data world. There's Spark, there's Snowflake, there's Databricks, which is obviously somewhat connected to Spark. There are various open table formats, data lineage, data catalogs, things like Docker, and all these different solutions. We're going to make sure you have a good understanding of a few of them. Again, we don't want to overwhelm you. At the end of these 100 days, the goal isn't to be an expert data engineer with 10 years of experience, because you'll have 100 days of learning experience, not work experience, and that's okay. But it is good to at least be dangerous with these tools: to understand what they are, to be able to run some Docker commands, to know how to use things like Databricks and set up your own instances of various solutions, and to answer questions like "What is data governance?" This is just to make sure that when you start getting asked about these things in your job, in your day-to-day workflows, you're ready. You don't have to go look everything up; you have a good, quick answer and a baseline understanding of what something does. It's always good to be ready. I've had plenty of times where I didn't understand a baseline concept, and it looks a little silly, especially early in your career, if you can't talk through even some of these basic concepts. So that's what we'll cover here.
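Going back to the mini project described above (pull data from a source, load it into a database, then ask questions of it with SQL), the overall shape might look something like the sketch below. It makes several loud assumptions: a hard-coded JSON string stands in for a real API response, an in-memory SQLite database stands in for Postgres or S3, and nothing here involves Lambda or EventBridge. The `readings` table and the function names are hypothetical.

```python
import json
import sqlite3

# Hypothetical payload standing in for an API response; a real project
# would fetch this from whichever data source you picked.
RAW_RESPONSE = json.dumps([
    {"city": "Seattle", "date": "2024-01-01", "temp_c": 6.5},
    {"city": "Seattle", "date": "2024-01-02", "temp_c": 7.5},
    {"city": "Denver", "date": "2024-01-01", "temp_c": -2.0},
])


def extract(raw):
    """Parse the raw JSON text into a list of record dictionaries."""
    return json.loads(raw)


def load(conn, records):
    """Upsert records so that re-running the job does not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "city TEXT, date TEXT, temp_c REAL, PRIMARY KEY (city, date))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO readings VALUES (:city, :date, :temp_c)",
        records,
    )
    conn.commit()
    return len(records)


def average_temp_by_city(conn):
    """One example question to ask of the data once it is loaded."""
    return conn.execute(
        "SELECT city, AVG(temp_c) FROM readings GROUP BY city ORDER BY city"
    ).fetchall()


conn = sqlite3.connect(":memory:")
loaded = load(conn, extract(RAW_RESPONSE))
summary = average_temp_by_city(conn)
print(loaded, summary)
```

The load step is written as an upsert on purpose: if a scheduler triggers the same job more than once, the table ends up in the same state instead of collecting duplicates, which keeps whatever dashboard sits on top of it honest.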
Throughout all of this, you'll still be doing a little bit of Python, some SQL, and some cloud work, messing around with different solutions. Great.

Finally, for the next 30 days, the goal is to deliver a project. And here's the thing: I want you to deliver the project, and I want you to actually think about what that project should be. Yes, we could build a project from start to finish (I have a bad habit of not actually finishing all of my part ones, which we can maybe flash up here), but the point is that you need to come up with a project idea, and it's not as hard as you think. I think a lot of people fear that they won't come up with a good idea, or they're just not sure how to start. So here's what I'd recommend; we're going to put some bullet points up here.

First, pick a dataset. Again, we'll list some datasets for you; pick one that you'd like to work on, where the topic is interesting to you. Then start figuring out some questions or problems you'd like to solve. Part of this is deciding how you would actually show this data. What you build doesn't have to be a dashboard. You can build a dashboard, or you could build some sort of output based on AI or ML, or maybe you scrape data from some site, ingest it, summarize it, and put it on your own site. I once saw someone do that: they took a bunch of YouTube and podcast transcripts, summarized them using ChatGPT, posted them somewhere else, and then charged something like $1.99 for access. I think they're making around 10 grand a month off that. So there are really simple things you can do that I'd consider somewhat data engineering that aren't dashboards, and you can totally build those, and I think
that's what's fun here. This exercise is somewhat similar to one you might have heard before: think of as many uses as you can in a minute for a certain thing. A brick is usually the example. It's the same thing here. You need to sit there, be creative, and ask, "What could I do with this data?" It doesn't have to be a dashboard. I keep sticking on this point because I really want you to think about what you can do with this data. It doesn't have to be told to you; I believe you can figure out your own ideas.

Next, pick some sort of tool or framework, whether that's Airflow or Lambda. I'll give a shout-out to Mage here, which I am partnered with, so obviously I have some incentive to say that, but there are plenty of solutions you can pick. I'll put a few more up here. There are so many, so just pick a few, and make sure you're picking something that you see in job descriptions. If you see Airflow a lot, or SSIS, or Azure Data Factory in the jobs you're interested in, then pick that solution and go with it.

Next, pick a data solution that you'd like to work with: Snowflake, Postgres, S3, Databricks. There are so many. As long as it's being used at a company, I think it's a good choice. From there, you can do your ingestion and store that data. You should have some sort of end goal in mind, whether it's your visualization, your app, or whatever you're building. Do the transforms required, and then finish. And if you really want to take it to the next level, write an article, post about it, share it. It's really great to hear other people's opinions about what you're working on: whether it could be better, whether it could be changed, what other
things you could add to it. Getting that initial MVP out there is, one, exciting, so it'll hopefully get you doing this more and more often, and two, a great way to have other people give their perspective.

As you work through each of these sections and days, don't feel like you have to keep it to yourself. Share what you're learning with other people. Remember how we said to write up some things about what you're learning every day? Share that. It's a great way for you to learn more, because writing about it makes it stick in your brain in different ways, and you're also helping other people learn, which I think is a great way for the whole community to grow. So it's not just about you. You're helping other people grow, you're keeping yourself accountable, and you're solving a lot of problems besides just learning, because learning is hard. Most of us passively consume content and feel like we're doing something, but the goal of this checklist is to really get you ingrained in it and keep you going.

It's 100 days out of 2024, and by the end, hopefully you've got a good baseline and can plan out your next 100 days. Are you going to keep learning about data engineering? Are you going to learn a new topic because you've decided you don't like data engineering? Do you need to focus specifically on SQL because you were weak there? Or can you start applying for jobs? That's another thing I hope you can do at the end of these 100 days. There are so many things you can plan out as you go forward.

With that, guys, again, I really didn't want to build just a basic roadmap or a "best courses of 2024" list. This is meant for you to go through, and a lot of it should be mostly free in terms of the content that you can go
through, so that your skill set by the end of the next 100 days is at that next level as a data engineer. Hopefully you'll continue through the rest of 2024 learning more about data engineering. With that, guys, I want to say thank you so much for watching this video. I really appreciate your time, and I'll see you in the next one. Thanks, all. [Music] Goodbye.
Info
Channel: Seattle Data Guy
Views: 66,179
Keywords: data engineering, 100 days of data engineering, how to become a data engineer in 2024, data engineering roadmap, data engineering checklist, learning python, how to learn python, data warehousing, sql, how to learn sql, should i become a data engineer in 2024, fast way to become a data engineer in 2024, programming, seattle data guy, ben rogojan, data science, data science vs data engineer
Id: 9FVchWw3EbU
Length: 15min 56sec (956 seconds)
Published: Tue Jan 16 2024