Microsoft Azure Data Engineering Complete Roadmap 2024 (Top 10 Services To Focus)

Video Statistics and Information

Captions
According to the latest report published by Gartner, Microsoft Azure is a leading cloud provider. Given Microsoft's growth over the last few years, this might be one of the best times to learn Microsoft Azure and upskill yourself. So, in this video, I will give you a complete guide on how you can master Microsoft Azure, get certified, and do one complete end-to-end project in data engineering.

In the first part of this video, we will talk about some of the important services you need to focus on as a data engineer. Do not skip this part, because if you want to learn the different services on Microsoft Azure, I will give you a clear explanation of each of them.

In the second part of this video, I will give you the complete roadmap you can follow to learn all of these things, get certified, and do the project.

I started my career by learning AWS, so AWS was the first cloud platform I learned. Then I learned GCP and Azure. One thing I've learned is that once you learn one cloud, it is much easier to learn any other cloud in the market, because fundamentally all of these cloud services are the same. In this video, I just want to cover Azure services, but if you want me to make videos on AWS and GCP, I will definitely do that.

Remember, all of these cloud platforms have hundreds of services available. The good thing is that you don't need to learn all of them; you just need to focus on the 15 to 20 data services that you will use as a data engineer. So, first, I will give you an understanding of the most important services you need to focus on in Azure, and then we can talk about the roadmap and everything else.

Here are the 10 services you need to focus on as an Azure data engineer. Number 10 on the list is Azure Storage.
It is an object storage service where you can store any type of file. Whether you have images, audio, video, or text files, you can store them on Azure Storage and integrate them with other services. We will talk about those other services later. For now, just understand that Azure Storage is where you can store any type of file and access it from anywhere in the world.

Number nine on the list is Azure SQL Database. As a data engineer, one of the skills you need is an understanding of SQL, because it is the way you communicate with databases. Azure SQL Database is the service Azure provides when you want to store data in a structured format. If you have been following me for a while, you know that we talk about OLTP and OLAP databases. This is an OLTP database, designed for CRUD operations (create, read, update, delete). So, if you want to connect your application to a relational database, this is the service you use: Azure SQL Database.

Number eight on the list is Azure Cosmos DB. This is at the center of modern engineering, because this database is generally used for high-scale applications. Cosmos DB supports different database APIs such as PostgreSQL, Cassandra, and MongoDB, so it covers both SQL and NoSQL workloads. If you want to develop high-performance, high-scale applications, you can do that using Cosmos DB.

Number seven on the list is Azure HDInsight. When you enter the Big Data world, you generally hear about Hadoop, Spark, Hive, MapReduce, and so on. These are open-source frameworks generally used for processing data.
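To make the processing model behind one of these frameworks a bit more concrete, here is a minimal sketch of the MapReduce idea in plain Python. It is only an illustration on a single machine, not how Hadoop or HDInsight is actually invoked; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are made up for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, the step a framework runs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read files from distributed storage.
    sample = ["big data on azure", "big data at scale"]
    print(word_count(sample))
```

A real cluster runs the map and reduce steps in parallel across many machines, which is exactly the part that takes effort to set up by hand.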
Now, if you want to install all of these frameworks on your own PC, you have to do a lot of configuration, and you have to size the hard disk, RAM, and everything else correctly to process large-scale data.

HDInsight is a managed service provided by Azure. You don't have to manage anything; with a few clicks, you can have the entire framework configured automatically. So, you can focus on processing your data and not worry about the underlying infrastructure. This is a very good service if you want to process your data at scale.

Number six on the list is Azure Data Lake Storage. A data lake is basically a centralized repository where you can store all of your data. In data engineering, we have data coming from multiple places: RDBMS, web analytics, third-party data, and so on. What we really need to do is bring all of this data together at a centralized location. This is what we call a data lake, where you store all of the data as it is, and as per the requirement, business people, data analysts, or data engineers can choose the data they need. This is what we generally call schema on read: based on the requirement, you choose the data that you want, and then you do the analytics on top of it. Azure Data Lake Storage Gen2, or ADLS, is a service built for this purpose. It helps you build a data lake on Azure where you can easily organize your data and retrieve it whenever you want.

Number five on the list is Azure Stream Analytics. In today's businesses, we see a lot of applications of real-time data streaming, right? Google Maps, Uber, all of these applications give you real-time notifications about your events.
If you order something from, let's say, Zomato or Amazon, you get notifications at every step, such as your order is confirmed, your order is shipped, your order is returned. All of this information comes to you in real time. So, if you want to build an application like this and process your data in real time, Azure Stream Analytics is the service you should go for.

Number four on the list is Azure Machine Learning. This service is mainly used by data scientists and machine learning engineers, but as a data engineer, it is really important for you to understand all of these different services, because you will be working on the business side and also the engineering side. So, it is important that you understand the use cases of Azure Machine Learning, and this topic is also important for your certification exam. Azure Machine Learning is generally used for building, training, and testing machine learning models at scale.

Those were seven services that are important, but the next three services I will talk about are some of the most important from a real-world point of view and also from an exam point of view.

Number three on the list is Azure Data Factory. As we discussed, a data engineer has data coming from multiple sources; we want to process all of this data and store it in some target location. Azure Data Factory is built for exactly that purpose. If you want to create a data pipeline at scale, where you extract data from some sources, do some transformation, apply some logic, and combine different steps along the way, you can do that using Azure Data Factory. It has a drag-and-drop environment, so you can choose the actions based on your work and create the complete data pipeline.
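The extract, transform, and load flow described above can be sketched in a few lines of plain Python. This is only a toy illustration of the pattern, not Azure Data Factory itself, where pipelines are assembled in the visual designer rather than written this way. The sample data (`ORDERS_CSV`, `API_RECORDS`) and the function names are hypothetical.

```python
import csv
import io

# Hypothetical raw inputs standing in for two source systems:
# a CSV export from a database and a list of records returned by an API.
ORDERS_CSV = "order_id,customer,amount\n1,asha,120.50\n2,ravi,80.00\n3,asha,40.25\n"
API_RECORDS = [{"order_id": "4", "customer": "Ravi", "amount": "19.75"}]


def extract():
    """Read rows from every source into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(ORDERS_CSV)))
    rows.extend(API_RECORDS)
    return rows


def transform(rows):
    """Normalize customer names, cast amounts to float, and total spend per customer."""
    totals = {}
    for row in rows:
        customer = row["customer"].strip().lower()
        totals[customer] = totals.get(customer, 0.0) + float(row["amount"])
    return totals


def load(totals, target):
    """Write the aggregated result into the target store (a plain dict here)."""
    target.update(totals)
    return target


def run_pipeline():
    """Chain the three steps, the way a pipeline chains its activities."""
    return load(transform(extract()), {})


if __name__ == "__main__":
    print(run_pipeline())
```

In a real pipeline, each of these functions would correspond to an activity that reads from or writes to an actual data store, and the orchestration service would handle scheduling, retries, and monitoring.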
The best part about this service is that you can integrate it with other services available on Azure or even outside Azure. So, if you want to retrieve data from, let's say, some APIs, you can easily do that in Data Factory.

Number two on the list is Azure Synapse Analytics. At the end of our data pipeline, we want to aggregate all of the data, put it in a structured format, and load it somewhere in Azure. You can load your data into Synapse Analytics, which is the data warehouse service provided by Azure. You can also do a lot of other things in Synapse: you can build your Data Factory pipelines, do the transformation, write SQL code, and write Spark code, all provided by Azure in one single service. So, you don't have to worry about jumping between different services. But the core idea behind Synapse Analytics is to build a data warehouse so that you can analyze your data efficiently. If you want to understand, let's say in the case of Amazon, how the last six years of revenue compare to this year, or how many products were sold this month compared to the previous 6 months, you can do all of these analytics using SQL and also PySpark, the Python API for Spark, both available in Azure Synapse Analytics.

And number one on the list is Azure Databricks. Databricks is a managed environment for Apache Spark. Apache Spark is an open-source framework used for processing Big Data. But again, installing all of these tools yourself is difficult; you have to configure a lot of different things. If you use Databricks, with a few clicks you can configure your entire environment, focus on writing your logic, and not worry about the underlying environment. Microsoft partnered with Databricks and integrated this service into Azure.
So, you can get the most out of all of these different Azure services. If you want to connect Azure Databricks with, let's say, Synapse Analytics or Azure Data Lake Storage, you can easily do that, because Microsoft provides all of these integrations in the backend. Within a few lines of code, you can access data from ADLS, do the transformation, and load your data into Synapse Analytics using Azure Databricks.

These three services that I talked about, Azure Data Factory, Synapse Analytics, and Azure Databricks, are some of the most important services you need to focus on if you want to learn Azure from a data engineering point of view.

Those were the important services, but there are also newer services coming into the market. One of them is Microsoft Fabric. This service is very new; it was launched 6 months back. So, for now, all you can do is learn about the service and its use cases. As we move forward, we will understand how companies are using it in their business, and then we can start talking about whether this service is important or not.

If I missed any service that you have in mind, and you're working on Azure, you can mention it in the comments so that I get to know about it and other people in the comment section can learn about it too.

So, we talked about some of the important services; now let's focus on how you can learn all of them in the first place. What is the roadmap to become an Azure data engineer?

The first step I always suggest is to not chase theory, courses, and everything else right away. The first thing you need to do is go to portal.azure.com and create your free account. Azure provides $200 in free credit and 12 months of free access to some of the services.
So, the first thing I want you to do is go create your account. It is completely free; all you have to do is fill in the basic information. You won't get charged for any of these things. Once you do that, you can access the Azure portal and start exploring the different services provided by Azure. This first step is very important because you will get used to the UI of Azure.

Whenever we start learning something new, there is a little bit of discomfort in our brain, because new information is coming in and we don't understand it yet. So, the first step is simply to create your account. Once you create your account, you will have the confidence that you took the first step toward learning something new.

Once you have created your Azure account, the second thing you can do is go through the free training provided by Microsoft. It is freely available; all you need to do is go through it step by step and learn everything. I will provide links to all of these resources in my Notion document, so you can check the link in the description to learn more.

After doing this training, you will understand most of the things you need to know as an Azure data engineer. But doing a good course will give you more confidence that you actually understood all of these things through hands-on practice. One of the best courses available in the market is on Coursera: Microsoft Azure Data Engineering Associate. This course will teach you everything you need to know to become an Azure data engineer. It starts with the basics, so first you will get a clear understanding of data engineering, why we do data engineering in the first place, and the impact of it.
Then, as you move forward, you will learn more about the different services and how they come together, and you will also get hands-on labs.

This course is freely available if you just want to audit it, but if you want to get the certificate, you have to pay for it. So, it's up to you. If you just want to learn from it, go through the videos; you will get most of the understanding, and then you can move on to the next step.

Once you complete the course on Coursera, the next thing I suggest is to understand how to combine all of these services and build a final project. For that, I have one free end-to-end project available on my YouTube channel called "Olympics Data Analysis - End-to-End Azure Data Engineering Project." This project will give you an understanding of the important services we talked about, such as Data Factory, Databricks, Synapse Analytics, and ADLS, and you will build a final project. I highly encourage you to do this project after completing the courses.

After completing this project, your confidence will skyrocket, and you can appear for the certification exam. If you want to validate your knowledge and stand out from the crowd when applying for jobs, you can get certified by Microsoft. They offer the Azure Data Engineer Associate certification. Having this certification on your resume makes you stand out and validates your knowledge as an Azure data engineer. When a recruiter sees that you have obtained this certification, your chances of getting interview calls increase. I cleared this exam in just 7 days and have created a complete guide on it as well.
So, if you want, you can watch that video to gain a complete understanding of how to get the certification, including information about the cost, the time it takes, and the different resources I used to clear the certification.

That's all for this video. If you've learned something new and found this video helpful, don't forget to hit the like button. This helps the channel grow and reach more people. If you're new here, don't forget to hit the subscribe button. Thank you for watching, and I'll see you in the next one.
Info
Channel: Darshil Parmar
Views: 92,150
Keywords: darshil parmar data engineer project, data engineer, data engineering, microsoft auzre data engineer, become microsoft data engineer, azure data engineer roadmap, microsoft data engineer roadmap, learn azure for data engineering, how to become microsoft data engineer, learn data engineering for free, azure data engineering certification guide, become microsoft data engineering, azure data engineering for free in 2024, Microsoft Azure Data Engineering Complete Roadmap 2024
Id: JFCiH6ZALls
Length: 12min 55sec (775 seconds)
Published: Sun Dec 03 2023