SQL Portfolio Project - Part 1 - Zomato | Analytics | Ashutosh Kumar

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
everyone I am again back with a video onto the SQL portfolio project now this particular video will be more interesting than all the previous portfolio projects which I have covered under the sequel because into this particular video we'll be covering a lot of different advanced level Concepts so we'll be doing the data cleaning task the data exploration tasks so I have accumulated lot of different problems from medium level to advanced level and these kind of problems will help you to clear the interviews and also you will get an idea like what kind of problems you can expect into the real interviews so it is important from that perspective also you can also consider this to be a practice set so I have a set of problems you can try your hands on to all of these different problems to solve all of them also you can add this particular project onto your portfolio and that will add a lot of value to your resume as well thank you so much for showering your love and support to all the previous portfolio projects and I believe this particular video will also be helpful into this same aspect as those videos were so I would request you to please like this particular video and also subscribe to my channel if you haven't to receive such kind of useful information because into the future I'll be bringing much more different portfolio projects onto different topics onto sequel friends onto the previous video onto the portfolio projects I have been getting a lot of different frequently asked questions so I'll be covering almost all the questions into my future videos so do regularly check out my channel for all of those questions so this particular video I'll be exploring the data set of zomatoes I've gathered a lot of different problems onto the data set we'll be doing different data exploration tasks here on to all of these problems so these problems they range from medium level to advanced level of problems and we'll be using different concepts to solve these problems so I have taught all of these Concepts into my Advanced SQL playlist so I would highly recommend you to check out all those videos into that particular playlist and those videos are more than enough to clear your Concepts so all of these problems into this particular data set are quite different from each other with each problem you will be moving a level up and learning different concepts and you will be applying your Concepts and you'll be forced to think to solve these kind of problems because a lot of different kind of concepts are being applied into one single problem to solve a problem you need to have all your Concepts cleared since as we know zomato is a company where in the application we can just order our food whatever we want and so we have a similar kind of data set into our sheet as well we have the data around the customers and their order behavior and what type of food they are ordering also you know glomato has this gold membership sign up where the user any user let's say I'll be just be subscribing to the gold membership and I'll be availing certain extra benefits such as there is no delivery fee I'll be getting extra items on my order also I'll be availing certain extra discounts so these are some of the advantage of the gold membership so I also have included some dummy data sets onto the gold membership customers and I've created multiple problems onto the same data sets so these will help you to cover a lot of different range of Concepts so friends all the data set related to this video all the files be attaching to the description box of this particular video so do check out the description box I've attached all the files and also don't forget to hit the like button and also subscribe to my channel if you haven't and I give you the full authority to copy each and every line of this particular code but I would request you to understand before copying or grabbing or writing this particular project into your resume understand each and every concept each and every part whatever I am doing into this particular video will be helpful for you only so you can also from your end explore different data points onto the same data set also you can create power bi dashboards that can add a lot of value to your resume as well because you are showing your skills on SQL as onto the power bi so friends let us quickly jump over to our data set and let us start exploring the zomato data set friends before starting the portfolio project let us first of all understand the data set in a complete Manner and then we'll be looking forward onto which on points we'll be exploring the data or we'll be calculating the different data points we'll be seeing that so the data which I've taken for this particular portfolio project is kind of similar to the zomato data set so for example you can take this zomato gold membership with paying a certain amount and then you will be having certain benefits so the customers they Avail these kind of benefits by taking the zomato premier membership or the gold membership so similar to that kind of data I have here so all those customers who have taken the gold membership they will be availing certain extra discounts or certain extra offers so you can see onto my screen I have four different data points or data sets so these data sets are quite simple in nature into the very first data set goal users underscore sign up them into the very first column I have the user ID adding to the second column I have the gold sign update I'll be coming to each of the data set one by one so the second data set I have the sales data set into the very first column I have the user ID into the second column I have the created at and into the product ID like which is the product which was purchased by the particular user ID created at is basically the order date on which the order was placed by the particular customer for that particular product ID for example if I need to explain a certain row user ID S3 has purchased the product ID as one onto this particular dates so into all of these rows it's just the same data data regarding the orders or the product which was bought by the particular customer and on to which state the customer has bought that particular product into the third data set you can see I have the product ID and the product name and the price of the that particular product so into the product ID all the different product IDs are mentioned here into the product name are mentioned here and the price of all of these different products are mentioned the very last data set which I am having is the users into the very first column I have the user ID into the second column I have the signed update now you will see I have two different data sets just for the users it is very much clear that not all customers were signed on zomato will become a gold customer will take a gold membership only those customers who are interested they might take so that's why we are having two different data sets here so into one of these data sets I have all the users information let's say I'm a customer I have just signed up so I'm having the customer ID as one and this my sign update is mentioned here similarly let's say I am taking a premium membership so in that case I will be counted into the gold users underscore sign up data set also because I am a board member now so that is why my data would come up into this particular data set also so we'll be solving around 10 to 15 different kind of metrics which will be useful for let's say zomato for their business purpose let's say if I am into this strategy on the business team I will be tracking on to these 15 different metrics that can be useful for the business so this is a simple data exploration onto the zomato data set here I have imported all of these data sets into my SQL Server management studio so it is very much useful for me to explore all of these different data points so you can see here I have just imported exactly the same data set as I am having onto my Excel sheet let us start the data exploration journey and you can also explore various data points from your end also and please feel free to use each and every line of the code which I am writing here into your project for your resume that can help you to break into the analytics journey I would request you to understand each and every query which I'm writing here so that this can be useful into solving multiple level of complex problems which you can encounter into your AWS so friends let's start with the very first metric and which is what is the total amount each customer spent on zomato now I'll be covering from a very basic level from the very simple queries I will be covering multiple level of advanced level of concept into this particular portfolio project also so this can be considered as a basic data exploration onto the somatic data set and including certain data points which can be useful for the business purpose so here we need to calculate what is the total amount each customer spend on zomato so this is very much simple so so for solving this particular problem I am concerned with two of the data sets the very first data set is the sales the second data set is the product now why is that so if you will see here I am having the sales table here and I am having the product table here both of these tables I want to get the total amount which has been spent by each of the customer on somando so first of all you can see I'm having the amount column here like the amount of each of the product and I am having the sales information here like the user ID which is the product the user has purchased so I need to join first of all both of these tables and what is the common column on which I will be joining both of these tables so that is the product ID so let us do this very first step let us do a select a DOT user ID comma a DOT uh let's see product ID comma D dot price from sales inner join product on a DOT so what is the common column on which we need to join both of these tables so that is the product ID column dot product ID is equal to B Dot product ID so sales is a product is B let us execute this particular query and we want to know what is the total amount of order which has been purchased by each of the customer so into this particular query only we'll be using the group by function and we'll be grouping by Under the basis of the user ID so I'll just remove this product ID Group by user ID a DOT user ID and this should be some over price because we want to get the total amount which has been spent by each of the customer that's it let us execute the query here so you can see here user ID has one spend some mode rupees similarly we are reading the data for each of the customer so we should rename this column as total let's say total amount spend and say that all is required I'll just simply execute the query so you can see here I'm getting the desired data what is the total amount which has been spent by the customers on zomato the on their history so this is the very first data point let us move to the second data point which we need to explore and that is how many days each customer visited zomato so we need to calculate let's say for customer ID as one what are the distinct dates on which the customer visited zomato website or purchased from zomato so we want to get this particular data so we will be running our queries onto the sales table here and for huge and for each user we need to calculate the distinct created date which is mentioned here so let us do so we'll be using select user ID comma account of distinct created date com sales here also we need to group by Under the basis of the user ID column let us execute the query and this query would give the total number of distinct days on which each of the customer has come onto this zomato website to buy any product so this is the solution for the problem number as to let us rename this as distinct days this executing query again and you can see the column has been renamed it's been filed and the third number problem here is was the first product purchased by each of the customer so for each of the customer we want to know like which is the first product which is being purchased by every customer who is signing up now this is a very important data point for example I am running any e-commerce website and for every customer who is joining onto my website I want to know which is the very first product which is being purchased by each of the customer and if I know like okay 75 percent of the times a single product is being purchased by all the new users which are coming onto my website let's say I am having three different products here product ID but all the customers were joining onto my website their very first purchase is a certain product this gives a simple indication that the customers are getting attracted to that particular product and because of that they are joining our website and also buying the product so this is a very good metric here with let's say tomato if they are interested into they can look into this particular metric so for any business this particular thing is very much Essential for any case so let us get this particular data I want to know the very first product which is being purchased by each of the customer let us do this we will just do a selling star comma we'll be using a rank function and we'll be ranking over each of the customer purchases and we'll be having the partition so on to what basis partitions they need to be created they need to be created onto the basis of the user ID so for example I'll just copy paste this data here to help you explain this into much more better way so for each of the user here a partition will be created for example if I just filter out the data for the user ID as one so this is the data or the partition which would be created for the user IDs one similarly I'll just filter out the data for the user ID as 2 and the same thing I'll do for the user ID as three so all these partitions need to be created when I am using the rank function and within each of these partitions I'll be ranking my purchases which have been done by each of the customer so I need or I'll be producing a extra polymer which will be termed as Rank and for each of the columns or entries or rows for each of the partitions which have been created here I'll be giving a rank based onto details so this is my very first partition this is my partition number as 2 for the customer ID as to and the last one is the partition for the customer ID S3 partition number S3 is being created so I'll be using the rank function to do this and within each of the partitions I'll be providing the rank onto the basis of the this particular created ad and the Very listed would be given the rank as one and later on we'll be just filtering out the data where the rank is equal to one so rank over Partition by user ID order buys on the what this is the rank needs to be provided so the rank needs to be provided onto the basis of the created and this should bring to the ascending order because the rank 1 should be provided to the less State the very previous state the rank one should be provided from us rank from sales write this into a bracket before proceeding and letters execute the code so you can see here I'm getting product ID rank created date and user ID so for each users I'm getting a rank so you can see here now after this I'll be just filtering out the data select star from this give this an alias where rank is equal to one and this will give me the data of the very first purchase which has been done by each of the customer so you can see here the thing which I was telling upon is the product ID as one so this is the very first product which is bought by each of the customer who is joining so Mando this is a simple indication that all the customers they are being attracted to this particular product and let's say the business needs to invest a lot into this particular product because a lot of different customers would be attracted into the future so these are the simple Matrix or the final conclusions which you can derive from the analysis which you are doing here the exploration portfolio project task is to generate simple analysis for all of these cases so let us move to the fourth problem or the fourth data point which we need to solve what is the most purchased item on the menu and how many times was it purchased by all the customers now this is a different kind of problem earlier we needed to find what is the very first product which is being bought by the customer into this particular point we need to find what is the most purchased item onto the menu by any customer here we are not concerned by the customer who has purchased customer ID as one as purchased to as purchased we're just concerned with the product which has been bought the most number of the time and we'll be seeing like how many times each of the customers are buying that particular product which is bought a lot of many times this will give us a lot of money indications for the business purpose obviously so let us write the query to find the product which has been bought the most so here also I guess we'll be using the sales table only first of all I'll try to find the product which has been bought the most number of times so I'll just use the product ID column and count on product ID column so that I get to know like how many times a certain product was bought into the history so product ID comma count of product ID from sales obviously this is what is the output which is being generated we have three different and distinct products into our data set so we for three different products data would be generated so we'll be grouping by onto the basis of the product ID let us order by also under the basis of the count of product ID and into the descending order because we are concerned with the product which was bought the most number of times into the history let us simply read from this particular execute this particular query so you can see here product ids2 was bought the most the seven number of times into this stream so this is calculated on the entire level we are not calculating on the basis of the customer level so you can see to the column one please don't get confused so this is a product ID column product and DS2 was what seven number of time product IDs one was bought five number of times so this will give a simple indication like product id1 was though people are getting attracted and they are coming onto the website they are buying this particular product ideas one this is the very first purchase but people they are not interested in buying it regularly they are interested shifting or finding product idea is too as much more better to purchase for their purpose so that is why because you can see the customers are getting attracted to product ideas one but they are much more interested and regularly they are buying the product IDs too because that is the most bought product into the history of this amount of purchase as you can see here the product ID S2 has been what seven number of times so this is a very good signal for any business we'll just rename this column as something as CNT obviously and from here I want to get the product ID so to get the product ID I would have used the limit but limit does not work somehow in the ssms so we need to use the top function so you can see it is throwing in error here so instead of limit I will be using the top function which works perfectly fine and does the same thing here into the ssms so I'll just write the top one product and you can see here I'm getting the product now I'll be removing to just get the product ID which was purchased the most number of times I'll be removing this count from here select and execute so you can see I'm getting the product ID so this is the product which is being bought by the most by any customer and the second point which I need to know is how many times was this particular product which is the product ID S2 is being purchased by any customer so the very first problem is sorted so this particular problem it has to it had two parts the very first part was to calculate or to find which is the most purchased item on the menu so we have found that so after this I want to know how many times was this particular product is being purchased by any customer so I'll just do a select star from sales again where product ID is equal to this particular product ID so I just want to figure out the data for the product ID has to let us see this I'll just execute the code so you can see here I'm just getting the information for the product ID S2 now after this I've gotten this particular thing I'll be doing a user ID comma count of product ID so that I had to know how many times each of the customers have bought this particular product from sales web product ID is this and at the very last I'll be using the group 5 function and grouping bionic basis of the user ID let us execute the code here you can see here user ID has one has purchased the product ID as2 which is the most purchased product into the history of the battle three number of times similarly user ID S2 has purchased the product only one number of time and the user IDs 3 has purchased the product three number of times which is the most purchased product that is the product ID S2 so I hope this particular problem is quite clear and simultaneously my aim is also to explain why we are calculating these data points how we can use them so obviously after this particular data which is being generated you can create beautiful dashboards onto the results which are being generated here and present into your portfolio project you can add a lot of value into your resume as well and it can help you to crack interviews also this portfolio project videos can be taken into account to learn the different SQL concept to solve different problems so these kind of different problems they are being asked into the interview so I get a question a lot of many times like send me different kind of problems which are asking today interviews so similar to these kind of problems which I'm solving right now are generally asking to the analytics interviews so friends let us move forward to the next question which item was the most popular for each of the customer so we want to know for example for each of the customer customer ID is one two and three which is the favorite product for each of the customer in other terms because most popular product is simply the product the which the customer is buying a lot of many times so the customer is buying a lot of many things that simply means that the product is favorite so we want to Simply calculate or want to know which is the product which is being purchased by each of the customers a lot of money times so I'll just do a select start from sales just to look at columns this problem is also quite simple and you would know if you think like why this particular metric is very essential for any business to track knowing about all the customers very important so that is why we need to calculate this particular data point I'll just write here select user ID for each of the product like I want to know for each customer for each product how many times the customer has bought that particular product so we need to mention this here so user ID comma product ID comma count of the product ID I need to mention this from sales and last I need to group by Under the basis of the user ID and also to the basis of the product ID so this is my final query which will give me the answer which item was the most popular for each of the customer so you can see here I am getting for example user ID as one as only purchased the product ID as one two number of times so we are getting this information but what is the final thing which we required we require which item was the most popular for each of the customers we are getting a lot of many different rows so I'll just put this into a sub query and I'll use this to provide a rank I'll be using the same concept so for each of the partitions which will be created I'll be giving a rank so I'll be creating a partition under the basis of the user ID I'll be ordering onto the basis of the count column definitely going to a descending order because the product which is being bought most number of times is being loved by the specific customer and that is the most popular product for that particular customer so I'll be mentioning this into a descending order I'll be renaming this as a rank from this particular data set I'll just switch on the query I'll just execute the query okay this should be pi you can see here we have given a rank for each of the data point which is mentioned here and after this I just want to filter out the rank one because I want to know which is the most popular product for each of my customers so I'll be ranking or filtering out where rank is equal to one so this data point I need I'll just execute the code finally so you can see here I am getting the value so user ID as one his favorite product is two user IDs two pro favorite product is product and DS3 and the user ID S3 the favorite product of user IDs three is two so you can see here I am getting all of these different data points so by this particular data point I am getting to know which is the specific product which is being liked by each of my customer so I can find a lot of many different things from this so for let's say in a particular state or any particular reason a product is being liked by almost all the customers this gives a very good signal about the product or our customers that they are liking that particular product so so we can definitely run offers give different offers for that particular product so that our sales get increased so friends these are the different conclusions which we can derive from these different Metals friends this was the part one of the SQL portfolio project I'll be releasing the part 2 of this project next week or I'll be attaching the link into the description box of this video so if you have any queries or suggestions any doubts you can comment them down into the comment box of this video I hope you've liked this video don't forget to hit the Subscribe button meet you in the part 2 till then keep practicing bye
Info
Channel: Ashutosh Kumar
Views: 145,168
Rating: undefined out of 5
Keywords: sql, sql for analytics, sql tutorials for beginners, portfolio project sql, sql portfolio project, portfolio project on sql for resume, sql interview portfolio project, sql interview questions and answers, sql for data analytics, sql data science, interview questions on sql, sql hard questions, sql problem solving, ashutosh, ashutosh kumar, ashutosh kumar analytics, sql portfolio project alex the analyst, sql portfolio project shashank kalanathi, sql portfolio project krish naik
Id: BlN4U7GF704
Channel Id: undefined
Length: 29min 46sec (1786 seconds)
Published: Sat Nov 12 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.