Learn Snowflake in 10 Minutes | High Paying Skills | Step by Step Hands-On Guide

Captions
You use applications like Instagram, Facebook, and Netflix. You even know that these companies collect your data. Everyone knows about it; everyone talks about it. These companies collect the data because they want to make better decisions, understand their customers, and improve the overall business process. But do you know how all of these things happen from a technological standpoint? In this video, I will give you a complete understanding of it.

The growth of data is massive and exponential. As per a recent report, 90% of the world's data was generated in just the last 2 years, and this report is just 2 years old. If you look at the bigger picture, you use your phones and computers on a daily basis. The amount of data that is getting generated and processed on a daily basis is massive. Now, this data generally gets stored in a relational database like PostgreSQL or as text files on some file-based storage system. But, in the end, all of this data gets stored in one place: something called a data warehouse.

Data warehouses are systems built especially for analytical workloads. So when you want to store a huge volume of data and read it in bulk, you can easily do that using a data warehouse. You can also find answers to questions such as: How much time did this user spend on this page today compared to yesterday? How did this particular product sell this year compared to last year? All of these questions can be answered within seconds using a data warehouse.

So, when companies like Google and Facebook started growing, they started collecting massive volumes of data, and they wanted to process this data and find valuable insights from it. Those were the days when very few people had access to the internet and all of these things, so the amount of data that used to get generated was very small, and it was very easy to process it using traditional technologies.
But as more and more people started getting access to the internet, the data started growing at an exponential rate. At this point, the traditional data warehouse technologies were not capable of handling the huge volume and the speed at which the data was getting generated. They had an architecture like this: this is the shared-disk architecture, where you have one disk and multiple users trying to connect via some network. The other architecture was like this, where you have shared databases, but to run a query, you have to use a distributed query across different nodes.

In a nutshell, all of these technologies were not able to handle the speed and the size of the data that was getting generated at that time. Traditional data warehouse systems started to struggle. They also required significant time and resources to scale. Database performance was a big issue: if you wanted to process a huge volume of data, it might take days or even weeks. And more than that, managing all of these things was very expensive. Businesses were losing out on valuable insights because they couldn't process data on time or in an efficient manner.

And the last thing is that all of these data warehouse technologies supported only limited data types. So if you wanted to store the data, you had to convert it into a structured format. So we have something called ETL (Extract, Transform, Load), where you extract data from multiple places, do some transformation, and then load the structured data into the data warehouse. So you have to do this entire processing before you even load your data into the data warehouse.

So all of these technologies couldn't handle the new age of data, and we needed something modern. And this is where the Snowflake database comes into the picture.

Now, before we go forward, I just want to say that I'm not sponsored by Snowflake at all.
This is the modern database that is gaining popularity, and you will understand why as we go forward in this video. Snowflake is a new type of data warehouse available in the market that runs entirely on the cloud. The cloud means you are using someone else's computer.

So, with traditional data warehouses, you had to buy your hardware, make sure everything scales properly, and update the software. Even if you rented a server from someone else, you had to manage most things yourself. But the concept of the data cloud changed the entire game. Now, you don't have to worry about all of these things. You just need to focus on the business side and on how you process your data.

What's cool about Snowflake is how it processes and stores your data. It keeps your data storage and your compute layer separate so that businesses can store more and more data and also process this data in an efficient manner. So, this is the architecture of Snowflake. To understand it better: at the bottom, we have the data storage layer, where all the data actually gets stored. Then, we have the compute layer, where we can allocate resources to process our data and run queries. And we have the cloud services layer, where you can access different features like authentication and security, and manage the overall infrastructure.

So, for example, if you have multiple teams working within the organization, you can create something called virtual warehouses, where you can allocate different amounts of CPU and RAM to different teams. Based on their requirements, they will use only the allocated resources. This way, you will not face performance challenges. And even if you need more resources, it will scale the system for you. This makes Snowflake one of the most powerful database technologies available in the market.

That is not all. Snowflake is adapting to support the next wave of enterprise workloads.
There are so many new things that Snowflake has added based on the problems they see in the market. For example, if you want to store structured or unstructured data, you can easily do that. If you want to query data that is stored at some other location, you can create tables on top of it and start querying it. You don't have to move your data from that location to yours. Also, you don't really require the ETL part. You can load your data directly into Snowflake and then do the transformation using Python code, SQL code, or even Spark code.

One of my favorite features of Snowflake is something called Snowpipe. Say you want to create a data pipeline based on some event; let's say your data is arriving on Amazon S3. Whenever any new file gets uploaded, Snowpipe will get triggered, and it will load your data directly into a Snowflake table. So, this is one of the most powerful features I have seen.

What Snowflake does is work with large enterprises and try to understand what the real problem is. Based on that, they try to solve these problems by building the right features. And they have hundreds of features that you can explore.

So, let's look at an end-to-end example to understand Snowflake in action and see what you can do with it.

Now, before we move forward, I just want to plug my course here. If you want to learn data warehouse technology, which is one of the most important skills you need as a data engineer, I have created an in-depth course where you will learn everything about data warehouse fundamentals. You will do multiple projects, and you will understand the Snowflake database in depth. It took me 3 to 4 months to build this entire course, so I highly recommend that you at least check it out. You will find the link in the description.

Let's continue with our example.
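Before the walkthrough, here is a rough sketch of the load-first, transform-later idea described above: raw data is loaded as-is, and the transformation happens afterwards with SQL. This is only an illustration under stated assumptions. Python's built-in sqlite3 module stands in for Snowflake, and the raw_events and page_time tables and their rows are hypothetical, invented for this example.

```python
import sqlite3

# In-memory SQLite database used as a local stand-in for a warehouse
# (assumption: the real target in the video is Snowflake).
conn = sqlite3.connect(":memory:")

# Load step: land the raw records unchanged, every field stored as text.
conn.execute("CREATE TABLE raw_events (user_id TEXT, page TEXT, seconds TEXT)")
raw_records = [
    ("u1", "home", "30"),
    ("u1", "home", "45"),
    ("u2", "pricing", "12"),
]
conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", raw_records)

# Transform step: run after loading, in SQL, inside the same database.
# It casts the text durations to integers and totals them per user and page.
conn.execute(
    """
    CREATE TABLE page_time AS
    SELECT user_id, page, SUM(CAST(seconds AS INTEGER)) AS total_seconds
    FROM raw_events
    GROUP BY user_id, page
    """
)

page_time = conn.execute(
    "SELECT user_id, page, total_seconds FROM page_time ORDER BY user_id, page"
).fetchall()
print(page_time)
```

The point of the sketch is the ordering: nothing is cleaned before the insert, and the structured table is derived later with a query.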
Whenever you want to learn anything, the first step is not to find courses, resources, or books. The first step is to go to the website and create your account. Okay, just get started. All you have to do is type "Snowflake" in your browser, and you will be taken to the Snowflake page. Here, you can read about what Snowflake is and everything about it.

The first thing we want to do here is create an account for free. They provide a free trial for 30 days, and you also get $400 worth of credits. All you have to do is fill in your information over here, so I'll do that. Once you fill in your basic information, they will ask you to choose the Snowflake edition. You have three options: Standard, Enterprise, and Business Critical. We will go with the Business Critical edition because that provides most of the features that we want.

After that, you can choose your cloud provider. If you're working in a company that has its existing infrastructure on a particular cloud, you can choose that one over here. I'll just go with AWS. For the region, you can select the one nearest to your location. For me, the nearest region is Singapore. Click "Get started" over here. You can fill in all of this basic information, or you can just skip it.

Once you do that, you will get an activation mail. Go to your registered email, and there you will find all of the information. Just click on "Click Activate." Once you do that, you just have to create your username, and then you will be redirected to the Snowflake UI.

So, this is the basic tutorial. The important thing over here is to bookmark your URL. You can also follow the basic tutorial, but for now, I'll just skip this. So, this is the UI of Snowflake. As you can see, you get the worksheets. A worksheet is basically where you write all of your SQL queries.
Then you have the dashboards, so whatever dashboards you create will be displayed here. We also have Streamlit if you want to create interactive visualizations and applications. They also provide third-party applications that you may want to integrate. Then here you will find all of the information about your data.

All of the databases that you have, whatever sharing of data you do, the overall management of the data can be done over here. They also have the marketplace. If you want to get some data for trial purposes, or if you want to practice SQL on reliable datasets, you can find all of these datasets already available for free. So, you can play with the Snowflake UI and get a better understanding of it. If you just play with it, you will get used to this UI, and you will get more comfortable learning Snowflake.

So, all you have to do is click on this plus icon and click on SQL worksheet. Then you will see, this is my Snowflake. This is all of my data, and this is the sample data that you can use to write queries. So, this is pretty simple. First, make sure that you have selected the warehouse at the top. Then, over here, you select your database: just click on the database that you want to use. You can also select a specific schema.

In this case, I will go with this particular schema that has the tables and data that we want to query. The first query that you can write is something like this, which is a select query. Over here, the query I'm writing is a select star from this particular database, then there's a schema name, and then a table name. If I run this, you will see it start running this particular query on this sample table, and you will get the output.

So, this is scanning the entire table and giving you the final output. As you can see, this is the table information.
You can see it over here. Currently, it is displaying only 10,000 rows, but this table has millions of rows; it has about 15 million rows available. You can also learn more about the query, like how much time it took. If you want to debug it, you can just click on this query ID, and you will get information about everything this query did in the backend. It did the table scan and gave you the result.

You can also limit the result set just by putting the limit as 10. We can also run more queries. Let's say you want to run something like this, where I'm aggregating the entire quantity as the total quantity sold from this particular table. If I run this, we'll get the total quantity sold. Now, I can do more analytics on top of it. It is pretty simple; you just have to write the right query. If you have a basic understanding of SQL, this should be pretty simple to follow.

What I'm doing here is querying this particular table and joining it to another table based on the customer key that is available in both of these tables, the order table and the customer table, and then selecting these four columns. It will take about 30 to 40 seconds to run this query, and once it finishes, you will get the final result set.

If you want to look at your compute warehouse resources, you can just go to Admin. You will find the warehouses over here. As you can see, we have the compute warehouses that are currently running. If you want to create your own warehouse, you can click on this plus and give the warehouse name as "test." Here, you have all of these options available. You can just click on Create Warehouse.

You also have the advanced options.
So, if you want it to auto-resume or to suspend after some minutes, you can set that there. Snowflake has a lot of different features: time travel, Snowpipe, and all of these other things. I can't teach all of these in just an overview video. As I already told you, I have a detailed course on data warehouses for data engineers using Snowflake, so I highly recommend that you at least check it out.

This was the complete overview of Snowflake. Again, the intention of this video was not to make you a master of Snowflake but to give you a quick overview of this technology so that you gain the confidence to learn it. This is pretty easy. All you have to do is understand the UI and the bits and pieces of it. Once you do that, you will get the confidence.

Okay, so this was all for this video. I hope you gained clarity. If you did, then don't forget to hit the like button and subscribe to the channel if you're new here. Thank you for watching. I'll see you in the next video.
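The three queries from the walkthrough (a select capped with a limit, a total-quantity aggregate, and a join between the order and customer tables on the customer key) can be sketched roughly as follows. This is only an illustration, not the queries shown on screen. It uses Python's built-in sqlite3 module as a local stand-in for a Snowflake worksheet, and the table names, column names, and rows are all invented for the example rather than taken from the sample database in the video.

```python
import sqlite3

# Local in-memory stand-in for a Snowflake worksheet. The schema and rows
# below are hypothetical; the video queries Snowflake's sample data instead.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        c_custkey INTEGER PRIMARY KEY,
        c_name    TEXT
    );
    CREATE TABLE orders (
        o_orderkey  INTEGER PRIMARY KEY,
        o_custkey   INTEGER REFERENCES customer (c_custkey),
        o_quantity  INTEGER,
        o_orderdate TEXT
    );
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES
        (100, 1, 5, '2024-01-10'),
        (101, 1, 3, '2024-01-12'),
        (102, 2, 7, '2024-01-15');
    """
)

# 1. Select every column, but cap the result set with LIMIT.
limited_rows = conn.execute("SELECT * FROM orders LIMIT 2").fetchall()

# 2. Aggregate the quantity column into a single total.
total_quantity = conn.execute(
    "SELECT SUM(o_quantity) AS total_quantity_sold FROM orders"
).fetchone()[0]

# 3. Join orders to customers on the shared customer key and pick four columns.
joined_rows = conn.execute(
    """
    SELECT c.c_name, o.o_orderkey, o.o_quantity, o.o_orderdate
    FROM orders AS o
    JOIN customer AS c ON c.c_custkey = o.o_custkey
    ORDER BY o.o_orderkey
    """
).fetchall()

print(limited_rows)
print(total_quantity)
print(joined_rows)
```

In Snowflake itself, the table in the FROM clause would be written with the database and schema in front of it, as the walkthrough describes, and the query would run on whichever warehouse is selected in the worksheet.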
Info
Channel: Darshil Parmar
Views: 179,496
Keywords: darshil parmar, darshil parmar data engineer project, data engineering, data engineer, snowflake data engineering, snowflake datab, data warehouse, snowflake database engineering, snowflakedb, learn snowflake, learn data warehouse, learn big data, Learn Snowflake Database in 10 Minutes | Step by Step Hands-On Guide, learn snowflake database, step by step guide
Id: VIJH7TZXkaA
Length: 11min 17sec (677 seconds)
Published: Sun Feb 04 2024