As per the latest report published
by Gartner, Microsoft Azure is one of the leading cloud providers. Given the
growth of Microsoft in the last few years, this might be one of the best times to learn
Microsoft Azure Cloud and upskill yourself. So, in this video, I will give you the complete
guide on how you can master Microsoft Azure Cloud, get certified, and do one complete
end-to-end project in data engineering. In the first part of this video, we will talk
about some of the important services you need to focus on as a data engineer. So, do not
skip this part because if you want to learn different services on Microsoft Azure, I will
give you a clear explanation of these services. In the second part of this video, I will give
you the complete roadmap so you can learn all of these different things, get certified, and do
the project.

I started my career by learning AWS, okay? AWS was the first cloud
platform that I learned. Then I learned GCP and Azure. One thing that I've
learned is that once you learn one cloud, it is much easier to learn any other
cloud in the market because fundamentally all of these platforms offer similar services. So, in
this video, I just want to cover Azure services, but if you want me to make videos on
AWS and GCP, I will definitely do that. Remember, all of these cloud platforms have
hundreds of services available. The good thing is that you don't need to
learn all of them; you just need to focus on the 15 to 20 data services that
you will use as a data engineer. So, first, I will give you an understanding of some
of the most important services you need to focus on in Azure Cloud, and then we can talk
about the roadmap and all of the other things.

So, here are the 10 different services
you need to focus on as an Azure data engineer. Number 10 on the list is Azure
Storage. It provides object storage (Blob Storage) where you can store all types of files that you
want. So, whether you have images, audio, videos, or text files, you can easily store
them on Azure Storage and integrate them with other services. We will talk about
the services it integrates with later. But for now,
just understand that Azure Storage is where you can store any type of file that you want and
easily access it from anywhere in the world.

Number nine on the list is Azure SQL
Database. As a data engineer, one of the skills you need to have is an
understanding of SQL because it is the way you communicate with databases, and
Azure SQL Database is the service Azure provides when you want to store
your data in a structured format. If you have been following me for a while,
you know that we talk about OLTP and OLAP databases. This is an OLTP database,
designed for CRUD operations (create, read, update, delete). So, if you want to connect
your application to a relational database, this is the service that
you use: Azure SQL Database.

Number eight on the list is Azure Cosmos DB.
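
Before getting into Cosmos DB, here is a quick sketch of the four CRUD operations mentioned for Azure SQL Database. This is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in, not Azure SQL Database itself, and the `customers` table and its columns are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for Azure SQL Database here; the table and
# column names are hypothetical and exist only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Create: insert a new row.
conn.execute("INSERT INTO customers (name, city) VALUES (?, ?)", ("Asha", "Pune"))

# Read: fetch the row back.
row = conn.execute(
    "SELECT id, name, city FROM customers WHERE name = ?", ("Asha",)
).fetchone()

# Update: change one column on that row.
conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Mumbai", row[0]))
updated_city = conn.execute(
    "SELECT city FROM customers WHERE id = ?", (row[0],)
).fetchone()[0]

# Delete: remove the row and confirm the table is empty again.
conn.execute("DELETE FROM customers WHERE id = ?", (row[0],))
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

conn.close()
```

The same four SQL statements are what an application would typically send to a relational OLTP database; only the connection setup would differ.
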
Now, this database sits at the center of many modern applications because it is generally
used for high-scale workloads. Cosmos DB offers APIs compatible with different databases such
as PostgreSQL, Cassandra, and MongoDB. So, it supports both relational and NoSQL data models.
So, if you want to develop high-performance and high-scale applications, then you
can do that using Cosmos DB.

Number seven on the list is Azure HDInsight. Now, when
you enter the Big Data world, you generally hear about Hadoop, Spark, Hive, MapReduce, and all of
the other things. These are open-source frameworks generally used for processing data. Now, if
you want to install all of these frameworks on your own machines, you will have to do a lot of
configuration, and you will have to provision enough disk, RAM, and all of the other
resources to process large-scale data. Now, HDInsight is a managed service provided by
Azure. You don't have to set up the cluster yourself; with a few clicks you can have this
entire framework configured automatically. So, you can just focus on processing your data and
not worry about the underlying infrastructure. So, this is a very good service if you
want to process your data at scale.

Number six on the list is Azure Data Lake
Storage. A data lake is basically a centralized repository where you can store
all of your data. In data engineering, we have data coming from multiple places:
the RDBMS, web analytics, some third-party data, and so on.
What we really need to do is collect all of this data in a centralized
location. This is what we call a data lake, where you just store all of this data as
it is, and as per the requirement, business people, data analysts, or data engineers can
pick the data they need. This is what we generally call schema on read: the structure is
applied when you read the data, not when you store it. Based on the
requirement, you choose the data that you want, and then you can do the analytics on top of
it. So, Azure Data Lake Storage Gen2, or ADLS, is a service built for this purpose.
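
To make schema on read a bit more concrete, here is a minimal sketch in plain Python. It does not use ADLS at all: a list of JSON strings stands in for raw files in the lake, and the field names (`order_id`, `amount`, and so on) and the `read_orders` helper are assumptions made up for this example.

```python
import json

# Raw records land in the "lake" exactly as each source produced them.
# The sources and field names below are hypothetical.
raw_lines = [
    '{"source": "rdbms", "order_id": 1, "amount": "250.50", "country": "IN"}',
    '{"source": "web", "page": "/home", "user": "u42"}',
    '{"source": "rdbms", "order_id": 2, "amount": "99", "country": "US"}',
]


def read_orders(lines):
    """Apply an 'orders' schema at read time; skip records that do not fit it."""
    for line in lines:
        record = json.loads(line)
        if "order_id" in record and "amount" in record:
            yield {
                "order_id": int(record["order_id"]),
                "amount": float(record["amount"]),
            }


# Nothing was validated or reshaped when the data was stored; the schema
# is only imposed here, by the reader that needs order data.
orders = list(read_orders(raw_lines))
total = sum(order["amount"] for order in orders)
```

A different consumer, say someone analyzing web traffic, could read the same raw lines with a completely different schema, which is the point of storing the data as it is.
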
ADLS helps you build a data lake on Azure Cloud where you can easily organize your
data and retrieve it whenever you want.

Number five on the list is Azure Stream Analytics.
So, in today's business, we see a lot of applications of real-time data streaming, right?
Google Maps, Uber, all of these applications give you real-time updates about your
events. If you order something from, let's say, Zomato or Amazon, you get notifications at each
and every step, such as your order is confirmed, your order is shipped, your order is returned.
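
Status updates like these can be thought of as a stream of events. Here is a rough sketch in plain Python of one common streaming idea, counting events per fixed time window (similar in spirit to a tumbling window). It is not the Stream Analytics query language, the events and the `count_by_window` helper are made up, and a plain list stands in for a live stream that would normally arrive continuously.

```python
from collections import Counter, defaultdict

# Hypothetical order events: (seconds since start, order status).
events = [
    (5, "confirmed"),
    (20, "confirmed"),
    (61, "shipped"),
    (75, "confirmed"),
    (130, "returned"),
]


def count_by_window(stream, window_seconds=60):
    """Group events into fixed, non-overlapping windows and count each status."""
    windows = defaultdict(Counter)
    for timestamp, status in stream:
        # Every event belongs to exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start][status] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


window_counts = count_by_window(events)
```

With a 60-second window, the first two events fall into the window starting at 0, the next two into the window starting at 60, and the last one into the window starting at 120.
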
All of this information reaches you in real time. So, if you want
to build an application like this and process your data in real time, Azure
Stream Analytics is a service you should go for.

Number four on the list is Azure
Machine Learning. Now, this service is mainly used by data scientists and machine
learning engineers, but as a data engineer, it is really important for you to understand
all of these different services because you will be working on the business side
and also the engineering side. So, it is really important that you understand the use
cases of Azure Machine Learning, and this topic is also important for your certification
exam. Azure Machine Learning is generally used for building, training, and testing
machine learning models at scale.

Now, these were the seven services that are
important, but the next three services that I will be talking about are some of the most
important from a real-world point of view and also from an exam point of view. So, number
three on the list is Azure Data Factory. As we discussed, as a data engineer,
we have data coming from multiple sources; we want to process all of this data and store
it in some target location. Azure Data Factory is built for exactly that
purpose. So, if you want to create a data pipeline at scale where you extract
data from some sources, apply some transformations and business logic, and
chain those steps together, you can do that using Azure Data Factory. It has a
drag-and-drop environment, so you can choose the activities you need and create the complete
data pipeline. The best part about this service is that you can integrate it with many other
services available on Azure or even outside Azure. So, if you want to retrieve data from, let's say,
some APIs, you can do that in Data Factory.

Number two on the list is Azure Synapse
Analytics. So, at the end of our data pipeline, we want to aggregate all of
the data, put it in a structured format, and load it somewhere in Azure Cloud. You
can load your data into Synapse Analytics, which is the data warehouse service
provided by Azure. You can also do a lot of other things in Synapse: you can build Data Factory-style
pipelines, do the transformation, write SQL code, and write Spark
code, all provided by Azure in one single service. So, you don't have
to worry about jumping between different services. But the core idea behind Synapse Analytics
is to build a data warehouse so that you can analyze your data efficiently.
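
Here is a small sketch of the kind of aggregate query a data warehouse is built to answer. It is only an illustration: Python's built-in sqlite3 module with an in-memory database stands in for a Synapse SQL pool, and the `sales` table, its columns, and the numbers are all made up.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse table; the schema and the
# figures are hypothetical and exist only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_year INTEGER, product TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2022, "laptop", 1200.0),
        (2022, "phone", 800.0),
        (2023, "laptop", 1500.0),
        (2023, "phone", 700.0),
        (2023, "tablet", 300.0),
    ],
)

# Aggregate revenue per year so that years can be compared side by side.
revenue_by_year = conn.execute(
    "SELECT order_year, SUM(revenue) FROM sales "
    "GROUP BY order_year ORDER BY order_year"
).fetchall()

conn.close()
```

In a real warehouse the table would hold millions of rows spread across many nodes, but the shape of the question, a GROUP BY over a large fact table, stays the same.
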
So, if you want to understand, let's say, in the case of Amazon, how this year's revenue
compares to the last six years, or how many products we sold this month compared
to the previous six months, you can do all of this analysis using SQL and also PySpark, the Python
API for Spark that Azure Synapse Analytics supports.

And number one on the list is Azure
Databricks. Now, Databricks is a managed environment for Apache Spark. Apache Spark
is an open-source framework used for processing Big Data. But again, installing all of
these tools yourself is difficult; you have to configure a lot of different things. So, if
you use Databricks, with a few clicks you can set up your entire environment and just
focus on writing your logic without worrying about the underlying infrastructure. Now, Microsoft
partnered with Databricks and integrated this service into Azure Cloud. So, you can get the most
out of all of these different Azure services. So, if you want to connect Azure Databricks with,
let's say, Synapse Analytics or Azure Data Lake Storage, you can easily do that because Microsoft
provides these integrations out of the box. So, within a few lines of
code, you can read data from ADLS, do the transformation, and load your data into
Synapse Analytics using Azure Databricks.

So, these three services that I
talked about, Azure Data Factory, Synapse Analytics, and Azure Databricks,
are some of the most important services you need to focus on if you want to learn Azure
Cloud from a data engineering point of view.

Now, these were the important services, but there
are also some newer services coming onto the market. One of
them is Microsoft Fabric. Now, this service is very new in the market; it
was launched 6 months back. So, for now, all you can do is learn about this service and what
its use cases are. As we move forward, we will understand how companies are
using this service in their business, and then we can start talking about
whether this service is important or not. And if I missed any services that you
have in mind, and if you're working on Azure Cloud, then you can definitely mention
them in the comments so that I get to know about them, and other people in the comment
section can learn about them too.

So, we talked about some of the important
services; now let's focus on how you can learn all of these services in the first place. So, what
is the roadmap to become an Azure data engineer? The first step I always suggest is to not
chase theory, courses, and all of the other things right away. The first thing you
need to do is go to portal.azure.com and create your free account. Azure provides
$200 of free credit and 12 months of free access to some of the services. So, the first thing I
want you to do is go create your account. Okay, signing up is free; all you have to do is
fill in the basic information. You won't get charged for signing up. And once you
do that, you can access the Azure portal and start exploring the
different services provided by Azure. This first step is very important because you
will get used to the UI of Azure Cloud. Whenever we start learning something
new, there is a little bit of discomfort because new information is coming
at you and you don't understand it yet. So, all I want you to do as the first step is create
your account. Once you create your account, you will get the confidence that you took
the first step to learn something new.

Now, once you create your Azure account, the
second thing you can do is go through this amazing free training provided by Microsoft. It is
freely available; all you have to do is go through it step by step. I will
provide links to all of these resources in my Notion document, so you can just check the link
in the description to find out more. After doing this training, you will understand
most of the things you need to know as an Azure data engineer. But doing a good course on top of it
will give you more confidence that you actually understood all of these different things through
hands-on practice. So, this is one of the best courses available in the market on Coursera:
Microsoft Azure Data Engineering Associate. Now, this course covers what you need
to know to become an Azure data engineer. It starts with the basics, so first, you will
get a clear understanding of data engineering, why we do data engineering in the first place, and
its impact. Then, as you move forward, you will learn more
about the different services and how they come together, and you
will also get hands-on labs. This course is freely available
if you just want to audit it, but if you want to get the certificate,
then you have to pay for it. So, it's up to you. If you just want to learn
from it, just go through the videos; you will get most of the understanding,
and then you can move on to the next step.

So, once you complete your course on Coursera,
then the next thing I suggest you do is understand how to combine all of these services and build
a final project. And for that, I have one free end-to-end project available on my YouTube channel
called "Olympics Data Analysis - End-to-End Azure Data Engineering Project." This project will give
you hands-on experience with the important services that we talked about, such
as Data Factory, Databricks, Synapse Analytics, and ADLS, and
you will build a final project. So, I highly encourage you to do this project after
completing the courses.

After completing this project, your confidence
will skyrocket, and you will be better prepared to appear for the certification exam. If you want to validate
your knowledge and stand out from the crowd when applying for jobs, you can get certified
by Microsoft. They offer the Azure Data Engineer Associate certification.
Having this certification on your resume makes you stand out and validates your knowledge as
an Azure Data Engineer. When a recruiter sees that you have obtained this certification,
the chances of getting calls for interviews increase. I cleared this exam in just 7 days and
have created a complete guide on it as well. So, if you want, you can watch this video
to gain a complete understanding of how to get the certification, including
information about the cost, the time it takes, and the different resources I used to clear the
certification.

That's all for this video. If you've learned something new and found this
video helpful, don't forget to hit the like button. This helps the channel grow and reach
more people. If you're new here, don't forget to hit the subscribe button. Thank you for
watching, and I'll see you in the next one.