Business Analyst Full Course [2024] | Business Analyst Tutorial For Beginners | Edureka

Captions
[Music] As organizations increasingly rely on data and technology to drive their operations and make informed decisions, the need for professionals who can help bridge the gap between technology and business has grown. That is where business analysts come in, playing a crucial role in helping the organization. As a result, the demand for business analysts with the right mix of technical and business skills is high and is likely to remain so in the future.

Hello everyone, and welcome to this session. You are currently watching an Edureka business analytics full course video. By the end of this video, you will have a thorough understanding of business analytics, from theory to practical applications. If you love watching videos like these, then subscribe to Edureka's YouTube channel and click the bell button to never miss any updates from us. Also, if you wish to obtain Edureka's certification course, please see the link in the description below.

Now it's time to begin with our agenda, a brief overview of what we will cover in this business analytics full course video. We will start with understanding the current trends and how to become a business analyst. Next, we will see what a business analyst does. Then it's time to delve deep into the technical concepts of business analytics: we'll start with predictive analysis using Python, after which we will work on data analytics using Excel. Next, we will go ahead with understanding the agile methodology, followed by the Scrum methodology. We will also see how to become a BI analyst. We truly hope that this session assists you in getting jobs in the industry; to accomplish this, we will look at some of the important business analytics interview questions with answers, so stick around till the end.

Now let's start the session with how to become a business analyst, getting started with why to become a business analyst. The first and foremost aspect of becoming a business analyst is looking for job
opportunities. When it comes to job opportunities, it comes as no surprise that there are ample job opportunities all across the world, but having numbers always helps, right? The first number we are going to consider here is for India, which has around 13,000 vacant jobs, while in the U.S. there are about 100,000 vacant jobs. This includes jobs for both freshers and experienced candidates; this data is taken from LinkedIn and Glassdoor. Similarly, Bangalore, which is known as the Silicon Hub of India, itself has around 2,000 vacant jobs just for business analysts, including experienced and entry-level jobs, and California, perhaps the Silicon Hub of the world, has around 4,000 vacant jobs, again per data from LinkedIn and Glassdoor.

Your next question about why you should become a business analyst should be salaries, and we have got you covered on that as well. On average, from entry level to experienced, a business analyst in India will earn up to 7 lakh per year, while in the U.S. one will earn up to 77,000 per year. This again is taken from LinkedIn and Glassdoor. With this kind of data in your hand, why wait? Just go ahead and start your preparation. Before that, you'll have hundreds of questions to be answered. Do not worry, because we will resolve all your doubts in this video itself.

The next thing is companies hiring. There are ample companies hiring, from basic entry level to experienced level, and they are both product-based and service-based. They could be high-end companies such as IBM, Deloitte, Capgemini, Apple, or Accenture, or any of the startups as well. These are some of the companies I've mentioned here; there are many more adding on to this list.

With that said, how do you become a business analyst? There are many aspects to consider, the first being the job description. In a job
description you can see what exactly a company or a recruiter is looking for in you as a business analyst. First, I've considered an Amazon job description here. Amazon is looking for an almost entry-level business analyst, with at least one year of experience. You can see in the highlighted portions that Amazon is basically looking for highly advanced proficiency in SQL, Microsoft Excel, and statistical analysis. They might also be looking for a person who is well versed in data visualization tools such as Tableau, Power BI, or, here, QuickSight, and who is good at identifying and accessing relevant data, with powerful insights from analysis to draw conclusions. From this we can conclude that to become a business analyst at Amazon, as per this job description, one needs to be good at SQL, know at least a few Microsoft tools, be good at data analysis, use Tableau or similar data analytics tools, and know statistical analysis as well.

Similarly, why don't we look at another description to understand better? Here I've considered IBM, and again, as you can see, SQL is predominantly asked for. The next thing is basic programming languages, along with, again, Microsoft tools. Data modeling and modeling techniques are something else that's asked for at IBM. The conclusions here are again SQL, Microsoft tools, data modeling, and programming languages. If you look at the first statement, daily scrum handling is also an important job of a business analyst; we'll come to that later in this session.

Next, we'll look at one final job description to draw another conclusion, so that we have a clear-cut idea of what a business analyst should be well versed in. Here I've considered Oracle, where they are looking for project accounting, project setup, maintenance, analyzing data, support, and working closely with the project team. Everything is related to a project. You can see here that a business analyst
is a business analyst with respect to a project as well as with respect to a business; that depends on which company is hiring you. Considering a project, the skills one needs here are project management, Microsoft tools, and data analysis. Again, Microsoft tools and data analysis are common.

With all these skills in mind, we can draw some conclusions about the skills required to become a business analyst, and that's what we are going to focus on in this slide. From the job descriptions that we just went through, we saw that SQL, or any other database concepts, are very important. Knowing SQL is very important, as SQL helps a business analyst in understanding the data and writing queries. The next thing is Microsoft tools. If you are an experienced candidate switching careers towards business analysis, this is an added advantage for you, as we IT professionals would have come across one, two, or even three Microsoft tools on a regular basis, so this is an easier aspect for us. Knowing Excel in an advanced manner will always help you in approaching business analysis. The next thing is analytical skills, which is a very important aspect for a business analyst.

We saw in one of the job descriptions that programming skills are also needed. Of course, you don't have to write code here; it is not a core technical job, so knowing the basics of a programming language will help. If you are from a technical background, a little coding knowledge will also come in handy. The next thing is statistical analysis. Knowing statistical analysis is quite important, as a business analyst needs to go through all the data-related documents or any kind of information that involves statistical decision making. The next thing is project
management skills. All the skills that I'm discussing here are skills we offer training for on our website. If you go to edureka.com, you will find courses for Microsoft Excel, analytical skills, any kind of programming language, and even project management; there is a certification that you can get in PMP as well. As a business analyst, like I was saying, project management is one of the basic skills: you need to know how to manage a project, what the problem statement is, how to satisfy the customer, and what exactly the stakeholders have in mind, everything from end to end.

The next thing is Scrum management. When a particular project is started, the developer, the investor, the client, and the end user all need to be on the same page, and for this Scrum is very important. You need to manage what is going on today, what is going to happen tomorrow, everything, and the person who does this is the business analyst. So Scrum management is an important skill needed by a business analyst. The next thing is data visualization tools such as Tableau or Power BI.

With that, we should understand the trends we have right now in business analysis. The first thing: as of 2020, agile became extremely unavoidable. The agile technique was not even implemented until probably 2020.
Of course, there were companies that were using it, but predominantly it was not implemented. After 2020, companies realized that agile is something every business employing business analysts should implement. Then our prime focus is customers: as the well-known saying goes, the customer is king, and hence our focus should be on customers. The next thing is visual thinking superpowers for product owners. A product owner has to estimate everything related to the product, so it is important to have that visual power with respect to the product, as even the client needs to understand what your product is all about. Mind you, we are also among the product owners; by "we" I mean business analysts. Next is taking a product-centric approach: like I said, the customer is king and the product is queen, and with customer and product at the center, any kind of business can reach its goal.

Next is data-driven forward planning. Everything, from who the owners are to who the client or the end user is, should be put into data; every day's work and all planning should be based on the data that we have in hand. This is where our data-driven planning comes into the picture. Finally, we need to understand that cybersecurity is something every business, big or small, must have. A business analyst, who keeps everything in account, will always help you with cybersecurity and maintaining the security of your data. Data security is very important whether you are in a sensitive business or an open business.

That being understood, some of the tools a business analyst uses to perform analysis are Modern Requirements4DevOps, ClickUp, Smartsheet, Adobe Acrobat, Blueprint, Jira Core, SmartDraw, and many others. These are the tools that are mainly and predominantly
used in most businesses, the top core businesses per se. There are many more tools; if you look them up on the Internet, you can go on and learn about any of them that interest you. Learning at least one will come in handy as a business analyst.

[Music]

Who is a business analyst? A business analyst is somebody who bridges the gap between the internal IT team and the business, using data analytics to assess processes, determine requirements, deliver data-driven recommendations, and finally present reports to executives and stakeholders. Let's break that down a little, shall we? First, a business analyst is a person who stands between the internal IT team and the client that has certain requirements. A business analyst understands those requirements, whatever they may be: to build an app, to get an insight on what direction their business should take, and so on. They understand all of the client's requirements, act as a proxy for the client, and help the IT team develop results that are going to be satisfactory for the client and that will help the client make data-driven decisions. That is exactly what a business analyst is.

Now you might ask yourself: that is all nice and dandy, but what do they actually do to accomplish that? Let's use the example of Stuart, a business analyst working for an application development firm. So that's the business analyst; now let's take a look at the client. The client in this example is Martha, an IT manager at a modern hospital, who is tasked with making a mobile application that allows the hospital's customers to make appointments, consult doctors, view their medical reports and receipts, and pay hospital bills through their online account, all without having to wait in line, which is such a discomfort and inconvenience for patients. Martha thinks that instead of hiring a whole team to make this application, she's just going to
approach a company that is going to help her with this. So Martha approaches Stuart's company to see if they can help her make the app. That's where she meets Stuart, who has been assigned to work on Martha's project from start to finish. Martha has the vision behind the app, and Stuart probes Martha to form a picture of what the final product should be like. He assures Martha that he will take it from here and deliver the product as per the requirements, budget, and time frame. But how does he do all of that? Let's find out.

In this section, we will attempt to understand the roles and responsibilities of a business analyst by using the same example of Stuart and Martha. So let's get started. The first thing Stuart has to do is understand the vision behind the app: what Martha and the hospital are trying to achieve with it. Here he properly understands the motives behind making this application. Next, he gathers all the requirements from Martha by asking her the appropriate questions about the app. He does this so that he can understand what the app should look like and how it should function. So he coordinates with Martha to gather all the functional requirements, like the features and how they function, customer data security, the payment system, an automatic assistant in the app, and so on, as well as non-functional requirements, like the look and feel of the app.

After he documents all of this, he allots resources. What does that mean? He assigns the various parts of the project to the respective teams, such as the development team, the user experience and user interface team, and so on, and finalizes all the tools and software they're going to need for the project, keeping in mind the client's budget and the time frame, of course. So the project has been delegated to the appropriate parties, and they are all off to the races to finish the project. His next duty is to track the progress and offer recommendations. What do I mean by that? As the project
develops, he monitors its progress to make sure that the client's needs are addressed. He also makes useful suggestions to improve the project and points out corrections to make, if needed, to ensure the client's needs are met. Not only that, he also conducts meetings between Martha and the assigned teams to discuss and decide how to rectify issues quickly. Making an application is a complex process, so mistakes are bound to happen and issues are bound to crop up; to iron them out, he conducts meetings so that everybody is on the same page. This also ensures that Martha is always in the loop and knows how the project is coming along.

Let's say they've worked hard and a prototype is ready. Once it is ready, he begins user testing and, after thorough testing, collects the users' feedback. This helps with improvement, correction, and finalization of the product, and with deciding whether the app has met the client's expectations. It gives an idea of the improvements and corrections to make, which are communicated to the development team, and this cycle continues until a deliverable product is ready.

So the prototype has been through a few iterations of corrections and improvements and is finally looking ready. Now Stuart analyzes the gathered data of the app and uses data visualization tools like Tableau and Power BI to make reports that provide insight into the app's performance. These reports could be as simple as charts in a document explaining the data, or they could take the form of a dashboard with multi-dimensional visual reports based on key performance indicators of the app. Outside this example, they could be used to portray business sales figures, employee performance, and endless other possibilities, but stay with me, let's stick to this example. After analyzing the data, generating reports, and putting them in a nice visual form, Stuart can move on to the last part of the project.

Finally, the project is complete. Stuart documents all pertinent information about the project, such as app documentation (how to use it, maintenance processes, and so on), along with the findings and reports after testing. Stuart presents all of this to Martha and other stakeholders from the hospital, gives them a walkthrough of the whole app, and explains the reports. And now it is product delivery time: Stuart has carried out all his roles and responsibilities in Martha's project and is ready to deliver the hospital health app to her on time and within budget. Martha, her CTO, and the hospital are all happy with the app; it will help their customers and staff immensely.

[Music]

Let us understand what exactly predictive analysis is. Predictive analytics, or predictive analysis, encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events. This is the basic definition from Wikipedia. We basically use previously collected data to predict an outcome or an event. Typically, historical data is used to build a mathematical model (in our case we can call it a classifier, a predictive model, or a regressor) that captures the important trends, and then current data is run through that model to predict what will happen next or to suggest actions to take for optimal outcomes.

Let us take a look at various applications where we can use predictive analysis. First of all, we have campaign management. Let's say we have a campaign and we have to figure out what our target audience is. We can analyze the data from campaigns we have managed previously and, according to that, figure out some suggestions or the course of action we have to take. So this is one kind of campaign management that we can do
using predictive analysis. For a recent example, take an election campaign: a lot of people gather data on previous elections, such as how they went and what major factors led to a particular candidate winning. This is how we can use predictive analysis in campaign management. Then there is customer acquisition: we can analyze the whole business and figure out what kinds of tasks or events we can run to make our business better, so that we can improve customer acquisition. Then we have budgeting and forecasting as well: similarly, by taking a look at previous data, we can finalize a budget and forecast a few related pointers; for example, we have stock prediction using Python or another language such as R. Then we have fraud detection: credit card companies, for example, analyze the data of hundreds and hundreds of users to detect fraudulent transactions. Then there are promotions as well: we can analyze the target audience and follow the trends they are following, such as the types of content they're going for, and then make promotions accordingly. There is pricing also: let's say you have a supermarket, somewhat like Walmart, with all the pricing data. What you can do is figure out the price of a product after some time based on the recent purchases, the recent scenario, and the previous data on which the prices were set, and you can also plan for demand using predictive analysis. These are only a few applications where you can use predictive analysis that I can think of right now. For example, I'll talk
about football, guys. Let's say you have a favorite player and you want to see how much he might go for to other clubs next season. You can make use of the data at your disposal and, depending on the purchases that happened in previous seasons or the transfer windows, figure out roughly what kind of price your favorite player is going to go for. That is one example I can think of right now. So these are a few applications of predictive analysis.

Now let's move on to the next topic of the session, guys, which is the steps involved in predictive analysis. This is a very important concept in this session, so you have to fully understand what steps go into a predictive analysis. The first step is data exploration. You gather data, load it into your program, and then take a look at your data from a perspective that will clear up certain things for you: what kind of data you're dealing with, what the columns are, what features you have, how many numerical values there are, what data types are in your data, whether it is a CSV file or not, and so on. You have to figure out a lot of things during data exploration. After that, you can figure out how to clean your data. By cleaning, I mean you have to figure out the redundancies that might hinder your model. For that, you check for null values and missing values, and then you figure out which columns are actually worth putting into your model and which are redundant variables, that is, columns you can remove without making a difference to your model. That covers the data cleaning part. Then there is modeling, where you select your predictive model, guys. There are a lot of models that you can go for, but in this session I'm going
to use the linear regression model, because it's a very simple, basic one that beginners will also be able to learn properly. After modeling, you have to do the evaluation, that is, check the accuracy: how your model is actually performing.

Let's talk about these steps in a little more detail, starting with data exploration. As I've already told you, data exploration is gathering your data and then taking a look at it from a perspective that will clear up a lot of things. For example, you will be able to see the number of columns and rows; you will have a description of all the data types and what kinds of variables there are; you will have the mean values, the minimum values, and so on; and you can also check for unique values in your columns. This all comes under data exploration.

After this, the second step is data cleaning. As I've already told you guys, data cleaning is basically getting rid of redundancies in your data, which include the missing values that may hinder your model. You have to make sure that your model is not going to overfit or underfit due to noise, and noise is basically irrelevant data, which may be in the form of null values. So you make sure you get rid of them or replace them with the average value of the column. Then there are redundancies like outliers, which are not necessarily required in your model, so you can remove them as well. This is all about data cleaning.

Then we have the third step, which is modeling. For data modeling, first of all you have to understand the relationship between the variables so that you can figure out what kind of model you're going to go for. For example, let's say you have a target variable, which in our case will be the price of certain goods, and you have to figure out the relationship between the variables. If you're going for linear regression, you have to make sure the relationship is
continuous. If you are going for logistic regression, it is important that you go for continuous variables, although the target variable has to be dichotomous, or what you call categorical. That is, if I am trying to predict something using logistic regression, the answer would probably be yes or no, or one or zero, something like that. But in the case of linear regression, we have to make sure that there is some continuous relationship between my target variable and my independent variables.

The fourth and final step is performance analysis. After you are done making a model, you have to perform certain analysis, which is checking the accuracy of the model and making sure that it's above 70 percent. I mean, if you are a beginner and you are trying to make your first prediction model, any accuracy score above 70 percent is very good, guys. But I would suggest that if you want your model to be good, the accuracy should be around 0.9, that is, 90 percent or more. If you get it the first time, well and good, but it depends solely on your data and on the model selection that you do.

Let's take a look at the next topic in our session, guys. This is where I'm going to perform predictive analysis using Python on a data set. I have a problem statement with a data set that has columns like how many bedrooms a house has, what its square footage is, its grading, and all these things that I'll show you in the data, and using that data I am going to predict the price of a house. So let's take it up in Jupyter Notebook, guys, and I'll show you what I'm going to do over there.

I have a Jupyter notebook over here, guys, and if you're not familiar with Jupyter Notebook, I suggest you check out our tutorial on YouTube. We have a
Jupyter Notebook tutorial where you will be able to learn it properly. There's not really much to learn in Jupyter Notebook; it's quite easy, which is why I'm using it, and we have a cheat sheet as well, so you can go for that.

First of all, I'm going to import some dependencies. For the first step, data exploration, I have to get the data, so I'm going to use the pandas library. I'll import a few other libraries as well: I'm going to use seaborn to check the relationship between the variables, basically for EDA, exploratory data analysis. If you guys don't know what EDA is, I suggest you check out another tutorial, on exploratory data analysis, that we have on our YouTube channel. Then I'm going to import NumPy as well, just in case. You can see, guys, I just have to press Shift and Enter. This is why I'm using a Jupyter notebook: the implementation is very easy, and I can segregate my data or code into different cells. So I'm importing this, and I'll make it a little bigger so that it's visible to everyone. What I can also do is add a comment, let's say "installing dependencies", and it is a separate cell, which makes things quite descriptive when you're coding; when you are trying to figure out what's wrong in your code, it actually helps.

After this, I have to import the data. For that I'm going to use the read_csv function, which is basically going to go to the file and read my data, guys. The name of the file is house.csv. We have a "truncated" error. All right, guys, I have to show you something. Usually when you copy the file location like this, you get that Unicode error. But let's see what happens if I change these backslashes to forward slashes: do I still get the error or not? Okay, so this is one exercise for you guys: earlier, when I was using the backslashes, I
was getting a Unicode error, but when I change them to forward slashes, I'm not getting that error. So this is one question for you guys: tell me why you think it happened in the comments section below.

Now, moving on, I'll take the first look at my data, guys. I'll just use the head method, and these are the first five rows in my data. I have id, date, price, bedrooms, bathrooms, square feet living and square feet lot; we have floors as well; waterfront is zero (okay, it has to be zero or one, I guess); there is view, then grade, square feet above. So these are the columns that I have in my data, guys. I'll check the last five rows as well; for that I'm using the tail method. As you can see, you can get a first look at your data using data.head and data.tail. After that, I checked the columns of my data, and let's check the shape as well, guys, so that we know what we're dealing with. Okay, shape is not callable. All right, so we have 21,613 entries with 21 columns. It's quite a big data set. Let me tell you guys, this is a data set that I found on Kaggle. I am using this house data set example because it's very common and very easy to find: you go to Kaggle and look for house prediction data sets, and it will show you a lot of data sets that you can download from there.

Now that we have checked the shape, I'll use one more method, data.describe. All right, this one is callable. So we have all these numerical values: using the describe method we can get the 50th percentile, the minimum, the maximum, the standard deviation, the mean value, and the count as well. For bedrooms the mean value is about three, so a typical entry is a three-bedroom house; for bathrooms it is a two-bathroom house; and the square footage is almost 2,079 square feet. For maximum values, we even have a 33-bedroom house, a house with eight bathrooms, and a square footage of 13,540. For minimum values, we have a zero-bedroom house (okay, that's going to be something else) and a square footage of 290. So this is how you use the describe method, and this is the first step, guys: data exploration.

After this, I'm pretty sure about what kind of data I'm dealing with. Now I'm going to move on to the next step, which is checking the relationship between these variables. For that I'm going to use data visualization, with a few plots from the seaborn library. If you don't know about seaborn, we have a YouTube tutorial on the seaborn library as well, where you can find out about the different kinds of plots that you can use for data visualization. Data visualization is nothing but a process where you visualize your data and try to figure out the relationship between the variables.

Before that, I want to check for null values or missing values, because I don't want any hindrance in my data set when I'm modeling. So first of all you check for null values, and let's get a sum as well. We have zero everywhere, so there are no null values in this data set. Usually, if you find null values in a big data set, let's say there are 2,000 values and 10 are missing, you can just remove those 10 values. But if there are more null values, I suggest you replace them with the mean value. To find the mean value, you can just go over here: let's say you're checking bedrooms, and there are 500 null values out of 21,000, so you can just replace the nulls with this value, which is three. Similarly, you can do the same for any other column using its mean value.
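
The null-handling rule of thumb just described (drop the rows when only a handful are missing, otherwise fill with the column mean) could be sketched roughly as follows. This is only an illustration under stated assumptions: the small DataFrame is a made-up stand-in for the house data, and the `handle_nulls` helper and its 1 percent cutoff are hypothetical, not something shown in the video.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the house data: the column names mirror the
# video, but the values are invented for illustration.
data = pd.DataFrame({
    "price": [221900.0, 538000.0, 180000.0, 604000.0, 510000.0, 257500.0],
    "bedrooms": [3.0, 3.0, 2.0, 4.0, np.nan, 3.0],
    "sqft_living": [1180.0, 2570.0, 770.0, np.nan, 1680.0, 1715.0],
})

# Count the missing values in each column (check for nulls, then sum).
null_counts = data.isnull().sum()
print(null_counts)

# Assumed cutoff: if at most this share of rows has a gap, just drop them.
DROP_THRESHOLD = 0.01


def handle_nulls(df, threshold=DROP_THRESHOLD):
    """Drop incomplete rows when they are rare, else fill with column means."""
    missing_share = df.isnull().any(axis=1).mean()  # share of rows with a gap
    if missing_share <= threshold:
        return df.dropna()
    # Otherwise replace each gap with the mean of its own column.
    return df.fillna(df.mean(numeric_only=True))


cleaned = handle_nulls(data)
print(cleaned)
```

Here two of the six rows have a gap, which is well above the assumed cutoff, so the gaps are filled with the column means instead of being dropped. Filling with the mean keeps every row, at the cost of pulling the filled values toward the centre of the column.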
there are no null values inside this data set, because it's a very clean data set that I downloaded from Kaggle, we're going to move on to the next step, which is visualization. And mind you guys, this is a step in my data exploration and data cleaning part, not one of the other steps that we use for predictive analysis. All right, so I have no null values, but there are a few redundancies that I want to get rid of. I'll talk about that later, guys.

First of all, I'm going to use a relational plot. My basic aim is to predict the price of the house, so for x I'll just take price, and let's check the relationship with the other variables. I'll check bedrooms first. Kind is equal to... okay, we'll not use kind. Data is equal to data. All right, so this is one relationship that I'm getting over here. The price trend is not very clear, but we can see the most common bedroom counts, which are around here, from zero to five. Similarly I can check other variables as well, like bathrooms. See, guys, the price is generally increasing with the number of bathrooms, but not consistently, so there have to be some other dependencies as well: as you can see, when the bathrooms are rising, the price is not rising that much. Okay, I'll just copy this. For bedrooms, the price increases somewhat with each bedroom, but not really: if we take a look at 10 bedrooms, the price is pretty much the same. So there's not one decisive factor I can think of for this.

All right, so we'll check some others as well. We'll check square feet living. This is one linear relationship that I'm seeing over here, guys. Most of the prices are in this area only, from 0 to around 400,000 (or 40 million, actually), but we can see that it's a linear relationship: with each
increasing square foot, the price is rising. So this is one thing that has to be there inside our train set. I'll tell you what a train set actually is. Then there are floors as well, so we can check floors. All right, floors is pretty descriptive over here too: most of the values are in the two-floor area. Then we can check waterfront as well. Okay, I'll tell you one trick: we'll add the hue over here, and this is going to be waterfront. The houses that have a waterfront are in orange and the other ones are in blue, so you can see the relationship between them. Similarly I can use others, let's say latitude and longitude. So you can figure out the relationships between the variables using visualization.

For me, I think that in this data set, to get the price out, we're going to have to use bedrooms and bathrooms; square feet living and square feet lot have to be there; floors we can take as well, and then waterfront and view. The other square-feet columns we have to use. Year built and year renovated we can leave out of the train set, and zip code we don't actually need either. Latitude and longitude are also not decisive in the prediction; we can use them for visualization and get the picture over there. Square feet living is really important. So these are the redundancies inside your model that I was talking about.

Now we'll move on to the modeling part, guys. What I'll do now is import a few dependencies from sklearn. First of all, from sklearn.linear_model I'm going to import LinearRegression. Then from sklearn.model_selection I'll import train_test_split. All right, so the first thing that I have to do is segregate my data into a training set and a test set. I'll do one thing, guys: I'll get my data in this cell. There are a couple of things that I don't actually need inside my data, such as date, so I'll get the
training data, or I'll write it as train, which is going to be data with a few columns dropped. First of all I have to drop price, because it cannot be in my training set since I'm going to predict it. Then there is ID, which we don't actually need, and then I'm going to drop date as well, since it's not really going to prove a lot over here. So for now I'll just remove all these columns. For the dependent variable, let's say I'm going to call the variable test (this is really the target, not a test set), and for this I'm going to use data, where we actually need just one column, which is price. "'method' object is not subscriptable": I'm sorry, I made a syntax error over here, guys. I have to add the axis as well. Now it's fine, guys.

So now I'm going to segregate my data into X_train, X_test, y_train and y_test, and I'm going to use the train_test_split method. First we pass train, then test, then the test size, which is, let's say, 0.3, and the random state, which is, let's say, 2.
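To make the splitting step concrete, here is a minimal standard-library sketch of the idea behind it: drop the target and the unwanted columns from the features, shuffle the rows reproducibly, and hold out 30% for testing. This is an illustration only and not scikit-learn's implementation; the function name `holdout_split`, the toy rows and the column name `sqft_living` are assumptions made for the example, not taken from the video.

```python
import random


def holdout_split(rows, target, drop=(), test_size=0.3, seed=2):
    """Split a list of dict rows into train/test features and targets."""
    # Features: every column except the target and the explicitly dropped ones.
    excluded = {target, *drop}
    X = [{k: v for k, v in row.items() if k not in excluded} for row in rows]
    y = [row[target] for row in rows]

    # Shuffle the row indices reproducibly, then hold out the first n_test of them.
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical toy rows standing in for the house data (all values are made up).
rows = [
    {"id": i, "date": "2014-01-01", "price": 100_000 + 10_000 * i,
     "bedrooms": 2 + i % 3, "sqft_living": 900 + 150 * i}
    for i in range(10)
]

X_train, X_test, y_train, y_test = holdout_split(
    rows, target="price", drop=("id", "date"), test_size=0.3, seed=2
)
print(len(X_train), len(X_test))  # 7 3
```

Fixing the seed plays the same role as the random state in the walkthrough: rerunning the split gives the same training and test rows, so scores are comparable between runs.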
Right, so we have made our X_train and X_test. Now I'll use one variable, let's say regr, for regressor, and I'm going to call my linear regression model. It's made, guys. Now I'm going to use the fit method to fit my X_train and y_train data; it's the training data that we have to fit. Right, we have no errors here, guys. After this I'll take one variable, let's say predict, and call regr.predict with X_test and y_test... all right, wait a second guys, I'm sorry, predict only needs X_test. So now the modeling part is done, guys.

I'll explain what I've done over here. I took the linear model, which is linear regression, and for segregating my data into training and testing sets I'm using train_test_split. Before that I segregated the data for my model: the training features have all the values from the data set except price, ID and date. I removed these three columns because I thought they were redundant for my model right now, and the variable that I'm going to predict is the price, so I'm taking that alone. After that I used the train_test_split method to separate the data into a training and a test set, and then I called the linear regression model. Using the regression model, I am fitting the training data, and after that I am using it to predict the values.

Now comes the part where we have to check the efficiency of the model. For regression models it is very easy, guys: you can just check the score. For this you have to provide X_test and y_test, and we have a score of 0.70 (for a regression model this score is R squared), which is not bad, guys. With a data set this big, it is quite expected to get this kind of score, but you can do something else to improve it. I mean, you can look at the data and remove all the columns whose removal you find helps to improve the score: you can remove latitude and longitude, you can remove zip code, year renovated and year built, and you can actually remove waterfront and view as well, or
keep only a few values that you actually need, which I think are bedrooms, bathrooms and everything related to square feet. If you keep those in your training set, you're likely to get a higher score, guys.

All right, now that we are done with the session, guys, I want to give you an exercise of sorts that you can do for practice. I have used the linear regression model over here, so I want you to do one thing: check out other classifier and regressor models that you can use to predict a value. We have tutorials on all of them on our YouTube channel. See if you can use the same data to make a prediction model using others, like a random forest or a decision tree, and you can use logistic regression for this as well. I mean, if you have continuous data then you can go for linear regression, but if you find categorical data, let's say waterfront or not, you can check for that as well.

To start with, we are going to look at the statistical functions. The first one is the AVERAGE function. What does the AVERAGE function do? It's an inbuilt function in Excel that is categorized as a statistical function. AVERAGE does exactly what it says and works similarly to SUM: it returns the average value of the given series of numbers. Let's say we need to identify the average salary paid to the employees in the organization. Our salary data is in the H column, so I'll enter the AVERAGE function followed by an open bracket in the cell where we need the result, which in our case is J2. I start with an equals sign, enter AVERAGE and open the bracket. Now I have to select the range of cells for which I need the average, and in our case it is the salary column, which starts at H2. When I close the bracket and press Enter, I get the average salary as 15,538. This is the average amount paid to each of the employees in the organization.

The next function in our list is the MEDIAN function. The MEDIAN function is
again a statistical function, which returns the median of the given numbers. What is the median? It is the middle number in a set of numbers. Say I have 10, 12 and 15, so there are three numbers and I have to identify the middle one. I type an equals sign, use the MEDIAN function, select the range of cells that has the numbers, close the bracket and press Enter. It gives me the middle number, which is 12 in our case, because it is neither the highest nor the lowest but the middle value. The same goes if I want to know the middle value in the salary column: in cell L2 I'll enter equals MEDIAN, open the bracket, select our salary column, which starts at H2, close the bracket and press Enter. 15,750 is the middle value of the salary column.

Next we are going to see the MODE function. The Excel MODE function returns the most frequently occurring number in a numeric data set. This function only works with numbers; it identifies the amount that occurs the maximum number of times in the range of cells. For example, here we have the basic salary, where I can see the highest number is 49,000, and there are a few employees earning the same salary, but I want to identify which salary is earned by the most employees. So I can use the MODE function: equals MODE (choose the plain MODE option), open the bracket, then select the salary column starting at H2, because I want to know it for the salary column, and close the bracket. 17,500 is the amount that occurs the maximum number of times in this cell range.

Moving on, the next one is the standard deviation function. What is the standard deviation function? It's a formula to identify the standard deviation of a set of numbers. Standard deviation is a measure of how much the numbers in a set vary compared to the average of the numbers. The STDEV.S function is meant to
estimate the standard deviation from a sample. If the data represents an entire population, use the STDEV.P function. So let's look at our standard deviation sheet. Here I have sample data: a sample of a few students has been picked, with the marks they scored in different subjects. We have taken a sample and have not picked up all the student data, so since we are using only a sample, we will use STDEV.S. It estimates the standard deviation based on the sample, and it ignores logical values and text in the sample, so only the numbers are used. So it's STDEV.S when you have sample data and you want to identify how much variation there is compared to the average. But if you have the full population, that means all the marks of all the students in the school or the college, you have to use STDEV.P, which is also listed here: it calculates the standard deviation based on the entire population given as arguments. Here we are using the standard deviation for sample data, so I select STDEV.S, then select the range for which I need the standard deviation, close the bracket and press Enter. It gives me 4.89, which is the standard deviation of those scores. So this is how your standard deviation works.

The next function in our list is the LARGE function. What does the LARGE function do? LARGE returns the nth largest value from the data. For example, if I go back to my employee salary list, I may have to identify the highest salary that I am paying. I can do it by using the sort function, but suppose that as of now I have sorted this data, and later on, say, Asha Trivedi has moved to a different department: director Sheetal Desai
has left, and Asha Trivedi has moved onto the board of directors list. Not only Asha: we have also identified Bharat to move onto the board of directors, so now we have two board members, but their salaries will be a little different. Say I will change the board of directors salary to 51,000 for Bharat, and for Asha Trivedi it is about 60,000. So these are the salaries that we have decided for the two board members.

Now, in this case, let's first find out what the largest value in the salary section was, without changing those two values for now. So: equals LARGE, open the bracket, and I'll select the array, which is our salary column, and then K. What does K mean? With K I have to say which highest number I want. If I want to know the first largest value from the list, I can put the number as 1, but if I don't want the first largest and want the second largest salary paid to an employee, I enter 2, close the bracket and press Enter. Now, our largest value is actually 49,000; however, the answer that you get is 24,500. Why? Because we are looking at the second largest value in the range of cells. If I am looking for the fifth largest value, I'll put K as 5 and press Enter, and it will give me the fifth largest value in this set of cells, which is 22,750.

Back to our example, where I mentioned that Asha Trivedi is now moving to the board of directors. Let's not remove Sheetal Desai for now; she's still on the board of directors. We also move Asha Trivedi to the board of directors list, and this time the board of directors salary that I'm moving her to is 60,000.
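The behaviour of LARGE described above can be sketched in a few lines of Python. This is only an analogy for the spreadsheet function, not Excel's implementation; the helper name `nth_largest` and the salary list are assumptions made for illustration (a few of the figures are mentioned in the walkthrough, but the list itself is hypothetical, so positions other than the first two will not match the sheet).

```python
def nth_largest(values, k):
    """Return the k-th largest value (k=1 is the maximum), similar to LARGE(array, k)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    # Sort in descending order and pick position k (1-based).
    return sorted(values, reverse=True)[k - 1]


# Hypothetical salary list, not the actual sheet from the video.
salaries = [49000, 24500, 22750, 17500, 15750, 7000, 5950]

print(nth_largest(salaries, 1))  # 49000
print(nth_largest(salaries, 2))  # 24500

# Adding a new, higher salary shifts every rank down by one.
salaries_after_change = salaries + [60000]
print(nth_largest(salaries_after_change, 1))  # 60000
print(nth_largest(salaries_after_change, 2))  # 49000
```

Sorting in ascending order instead of descending would give the mirror-image helper for the k-th smallest value.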
So when I set the board of directors salary to 60,000 and then ask which is the first largest salary, the first largest salary will be 60,000, and the second largest salary will be 49,000 instead of 24,500. Why? Because now there is one more, larger salary: the first largest salary has become Asha's salary, which is 60,000, whereas earlier 49,000 was the largest. In the same way, if you keep changing anybody else's salary, say I make one 65,000 just to get a gist of how it works, you will see that 60,000, which was the largest, has now become the second largest salary.

The same goes for SMALL. Here I am trying to identify the smallest, or lowest, salary that I am paying to the employees. Again I have to give the K value, and as I said, K tells it which smallest number you want: is it the first smallest, the second smallest, or the third that you want to know? According to your requirement, you put the number there. Say I put 2: it will give me the second lowest salary paid to the employees in the organization, which is 7,000. Now let's make the lowest salary even lower. Our lowest salary is 5,950; let's change one of them to 5,500. What happens is that if I go back and look at the second lowest salary, it is now 5,950, because the first has changed to 5,500. So depending on what your lowest values are, it automatically recalculates and gives you that number.

Moving on to the next one, which is CORREL. What is CORREL? It is the correlation function, you can say. We can use the CORREL function in Excel to find the correlation coefficient between two variables. I'm sure we have all learned correlation coefficients in our college days,
when we were doing maths in college. If you had opted for maths in college, you would have done the correlation coefficient; however, if you have not, it is not a big deal, because Excel already does it for you. The correlation coefficient is used to find out how strong a relationship is between two sets of data, like here in our case, the relationship between the age and the glucose level of a person. Is it that as the age increases the glucose level increases, or that as the age decreases the glucose level decreases, or that as the age increases the glucose level decreases, and vice versa? That is the relationship we are trying to understand.

The formula always returns a value between -1 and 1. What do -1 and 1 mean, and how is the range split? If the result is between 0 and 1, the relationship is positive, and the closer it is to 1, the stronger the positive relationship. A correlation coefficient of 1 means that for every positive increase in one variable, there is a positive increase of a fixed proportion in the other. For example, shoe sizes go up in perfect correlation with foot length: when your foot length increases, your shoe size increases. That is a correlation of 1: your shoe size increases exactly in step with your foot length.

If the correlation coefficient is -1, it indicates a strong negative relationship. A correlation coefficient of -1 means that for every positive increase in one variable, there is a decrease of a fixed proportion in the other. For example, the amount of gas in the tank of your car decreases in perfect correlation with the speed. When you're running the car and increasing the speed, your gas keeps on decreasing. Why? Because you're increasing the speed, so the gas, or your fuel, is getting used, and that's the reason the
fuel is decreasing while your speed is increasing. So there is an opposite reaction: fuel decreases but your speed increases. That's how the correlation coefficient works in the case of -1.

If the result is zero, it indicates no relationship at all. Zero means that for every increase in one variable, there isn't a consistent increase or decrease in the other; the two just aren't related. So, to recap: zero means no relation at all; 1 means a positive relation, where with an increase in one the other also increases; and with -1 it is the opposite, where with an increase in one the other decreases. Those are the three cases that you will see. Of course, you will usually not get the result as exactly 1, -1 or 0; it will be somewhere in between. So we read it like this: if it is between 0 and 1, the relationship is positive; if it is between -1 and 0, it is negative; and if it is 0, there is no relation at all.

Let's see an example in our sheet, where I have the age of the person as the x value and the glucose level of that person as y. We are just trying to identify the relationship between human age and the glucose level in the body. So if you look at the x value, where the age is 43, the glucose level goes to 99.
At 21, the glucose level is 65. But can we make sense of it by just looking at it? Not really, because we can't tell what the relationship is: is it close to -1, to 1, or to zero? To identify that, I can use the correlation function, which is CORREL in Excel, in the cell where I need the result. So: equals CORREL, open the bracket, and then I select the two arrays for which I need to identify the relationship. I want to see age against glucose level. The age is sitting in B2 to B7, so I'll select that as the first array. The second array is C2 to C7, so I select that, close the bracket and press Enter. It gives me a value of 0.53 if I remove the extra decimal places. So our answer is 0.53, which is a positive result, meaning there is a moderate positive relationship between age and glucose level: in this data, higher age tends to go with higher glucose levels in the body. If this were a negative value, it would be the other way around: higher age would go with lower glucose levels. So this is how your correlation function works.

Let's see another example of the correlation function on our sheet. We have the weekly percentage change in the price of stock A, and we also have the weekly percentage change of the S&P 500. The S&P 500 is a stock market index that tracks the stocks of 500 large-cap US companies. It represents the stock market's performance by reporting the risks and returns of the biggest companies. What we are doing here is identifying whether there is any relationship between the S&P 500 change and the stock A change. If there is a relationship, investors will be able to take a proper decision in the future when they are making investments. So to start with, I am going to find the correlation: equals CORREL, open the bracket. Now we
have two sets of data, that is, two arrays. I'll select the first array, which is C5 to C24, then a comma, and the second array will be D5 to D24. I close the bracket and press Enter, and you will see that the result is very close to 1: it is 0.89, or rather 0.9. This means that stock A moves very closely with the S&P 500: when the S&P 500 goes up, the price of stock A usually goes up as well, and when the S&P 500 decreases, the price of stock A usually decreases too. Keep in mind that the coefficient tells you how consistently the two move together, not by how much, so 0.9 does not mean that a one percent change in the index gives a one percent change in the stock. In the previous example the correlation was about 0.5, which means age and glucose level move together less consistently; it does not mean a 0.5 percent increase in glucose per year of age. So that's how the correlation is identified using the CORREL function in MS Excel.

Next we are going to see charts in Excel. What are charts? They are used when you want to represent some data graphically. It is very difficult to interpret Excel workbooks that contain a lot of data. Charts allow you to illustrate your workbook data graphically, which makes it easy to visualize comparisons and trends. Excel has several different types of charts, allowing you to choose the one that best fits your data. In order to use charts effectively, you will need to understand how the different charts are used, and you'll see all these different types of charts that we are going to learn today.

We will start with column charts. We'll go back to our example sheet, where I have the sheet for the column chart. The data is already available: it is the sales data for each region, split into two years, 2016 and 2017. This data is in millions, so you can see that Mumbai has done sales of 65 million in 2016 and 70 million in 2017.
The same goes for London: it has done 55 million in 2016, while in 2017 it has done 65 million. For USA you can see 45 million and 52 million, while for Nagpur you will see that the sales have reduced from 2016 to 2017. This is a small data set, so you're able to see where it is going wrong and how the years compare just by reading the table. But instead of doing that, if you want to send it to your management, you can just select this data and create a chart that shows the exact comparison graphically on your screen.

So how do I do that? I go to Insert. In Excel 2016 there is a new option called Recommended Charts: whenever you select any data and then click on Insert, Excel will recommend the types of chart that will look good with this kind of data, or rather that will properly represent the data you are looking at. So I select the data that I want the graph for and then I go to Recommended Charts. When I do that, you will see that the first chart it recommends is the column chart. But since we are going through each type of chart in turn, we will start with the column chart for now and come back to the recommended charts later. In Excel 2016, the column chart is the first icon, and the first 2D chart that you can see is the clustered column chart. I click on that, and the bars for the two years appear next to one another.

When is the column chart really used? The column chart is useful for the kind of data where a comparison is required. Column charts use vertical bars to represent data. They can work with many different types of data, but they are most frequently used for comparing information, like 2016 and 2017 in our case. So I have got this data as a column chart, and it is comparing 2016 and 2017.
Now, looking at this chart, it looks very nice; however, you will see there is something missing. When I'm looking at the data for Mumbai, if I don't see the table, I will not be able to identify the real sales done by Mumbai. I'll have to mouse over the bar to get the number. When I mouse over it, it says Series 2016, Point Mumbai, Value 65, so it gives you all the data. However, without hovering I will not be able to see the information in one go; the chart just shows which bar is smaller. Well, in Gujarat I can see that in 2017 they have done fewer sales compared to 2016, but I need the exact number.

For the exact number, one way is to look at it with a mouse-over. Another way is to add data labels so that I can just see it. How do I add a data label? There are different ways. One way is to first select where you want to add the data labels. Say I want to add data labels for both the 2016 and 2017 bars. I cannot do them together, so I select one series at a time with a left click, then right-click on that same series and click on Add Data Labels. Once I do that, it adds the numbers on that series. Then I can do the same for 2017 by selecting it.

However, there is another way of doing it. If I now want to delete the data labels, I just select them and press Delete on my keyboard, and it works. Now, if I do not have the data labels here and I want to add them, I can use the plus sign, which is only available in Excel 2016.
When I select the chart, there is an option shown as a plus sign. I click on the plus sign and then on Data Labels, and it automatically adds data labels for both sets of bars; I don't have to select each of them. This is how your data labels are added.

Now, you can see there is a y-axis given here, which has some numbers. Why are those numbers there? Because earlier we did not have data labels, and those numbers were how we could see where our sales sit: just above 60, somewhere between 60 and 70, so I would assume that it is 65. However, now that I have the data labels, I do not need this axis on the left-hand side. So how do I remove that axis? There is again more than one way. One way is going to the Design tab: after you select the chart, go to Design, and under Add Chart Element I will see Axes. One option is the primary horizontal axis, which is the x-axis, and the other is the primary vertical axis, which is the one on the left-hand side that we are talking about. As of now both are highlighted, which means both are shown. I want to remove the primary vertical axis, so I'll select that, and it just goes away. That's one way of doing it. The easier or quicker way is to just select the axis. Be very careful that only the axis is selected: you will see four small handles that appear on the four corners of that axis. Once I select it, I can press Delete on my keyboard and it will go away.

The third thing that I can see on this chart is the chart title. I have to add a title to this chart, so how do I do that? The placeholder is already there on the chart; I just have to fill it in. I select the text there and add the title as Sales Comparison, or Sales for Two Years, and in brackets I'll put "in millions" so that
they don't get confused about why it is showing only 50, 40 or whatever. So this is how I can show my data.

One more thing that I would recommend doing is removing the grid lines. The grid lines are more useful when you don't have data labels, because then, together with the y-axis, they show you where exactly your data sits. But when I have the labels I don't need the grid lines at all, so I'll just select the grid lines and press Delete on my keyboard. That's one way. Another way is to go to the plus sign and uncheck Gridlines; grid lines are there by default. Now if you look at the chart, it looks even cleaner and more presentable, because there are no grid lines at the back. This is how you can tweak your chart as per your requirement. This is how your column chart works, and you can do the same things with the other charts as well.

There are other quick things that you can do with the chart, like changing the chart type if you want to, or changing the data that you have selected. You selected this data in order to get this chart; however, suppose I want to change the data. I can click on Design, and under Design, in the Data group, you will see Select Data. Once I click on Select Data, it gives me an option to change the data range for the chart, so I can click on this data range to change what the chart shows.

A very important option that you will see here is Switch Row/Column. As of now, my y-axis contains only the numbers, the x-axis contains Mumbai, London and the other regions, and the colors of the bars stand for the years, so 2016 and 2017 are showing as two different bars. When I click on Switch Row/Column, all the regions will become the colors of the bars, that is, the different bars, while 2016 and 2017 will come onto the x-axis. Let's see how that happens by just clicking on switch rows and
columns. Once I do that and click OK, you will see that 2016 and 2017 are showing on the x-axis, and the bars are now showing the regions. This can also be done; however, in this layout it is a little difficult to compare the two years. What you can compare is the sales done by each region within a year. For example, you can see that Gujarat has done the lowest sales in 2016, while in 2017 Hyderabad has done the lowest sales. So this kind of comparison can also be done. Depending on what kind of comparison you're looking for, you can always click on Switch Row/Column. You can use this button on the ribbon to switch rows and columns, or you can click on Select Data and do it from there. This is how your charts work in Excel, how you can add data labels, and how you can format your chart using the different options under Design.

Moving on, our next chart in the list is the line chart. Line charts are ideal for showing trends. In our sheet I have the number of days and the temperature changing from day to day. Let's say it is December in the USA: starting with day one of December, what was the temperature, and is it going up or is it going down? If that is the kind of information I'm trying to identify, I can use the line chart, because the line chart gives me a trend across the different days. So how do I create a line chart? When we have data where we have to show a trend over a particular time frame, that is when we show it as a line chart. For example, in our sheet we have the days from day one to day six, with the temperature on each of these days. So we have a trend: the first day is 43, the next day is 53.
third day is 50 then the fourth day again goes up to 57 then 58 going up 59 and then 60 and again it is 67 which is showing a trend in the data in that case I'm going to use the line chart so I'll use the line chart by selecting the data it is giving me the line and it's showing you how it is going up so it starts with day one it is lower then goes higher and then keeps on going higher higher higher only on day three it dips a little down but it keeps on going up after that this is how your line chart works so whenever you have this kind of data where you have to show a trend we are going to use the line chart what is the next one the next one is a pie chart pie charts make it easy to compare proportions each value is shown as a slice of a pie so it's easy to see which values make up what percentage of the whole so for example I have this book type which are the book types that have been sold by the bookstore which are classic mystery romance science fiction and spiritual the revenue that is generated out of these books is also listed here for classic it is eighteen thousand five hundred dollars for mystery it is 78 970 and so on and so forth I am trying to identify what is the total revenue that I have generated which book type is giving me the maximum revenue and which book type is giving me the lowest revenue to do that I can use a pie chart because it's going to give me each part of the pie like here you will see that the orange part is actually the mystery part mystery is being sold the maximum part of the time and that is where most of the revenue is generated now similarly like we did for the column chart you can also use the data labels in the pie chart as well how do we use the data labels in the pie chart by again using that plus sign on the right hand side or just right clicking and selecting add data labels as of now we'll use the plus sign and click on data labels once I do that the data labels will be added on the chart now in this case the
numbers have been added however I also want to add the chart title the chart title is already mentioned as revenue so we can leave it as it is if you need to change that you can always double click inside and start changing the chart title in the data labels that we have added you will see that only the numbers have been added I have to check which color belongs to which in order to see where the highest number lands here there are only five different items so that is fine but if there are about 10 to 12 items and you want to see which one belongs to which it is a little difficult to identify that so what I can do is I can add the category labels on each of them as a data label how can I do that by going to data labels in add chart elements and then clicking on more data label options as soon as I do that a new pane opens up on the right hand side I can select the category name what will happen is as soon as I select the category name it automatically gives me the category name on the chart now the category name you can see is separated by a comma and it is coming next to the number if you want to change the separator you can use any of these as a separator I can use new line so what will happen is the number will go on a new line so it looks a little better again like the earlier one where we removed the x axis in this case now we have added the names on the chart itself so I can remove the legend from the bottom so it looks a little cleaner now I can even make this a little bigger because we have enough space to make it bigger as I have removed the legend from the bottom so this is how your pie chart will look moving on to the next one the next one is a bar chart a bar chart works just like the column chart but it uses horizontal bars instead of vertical bars where is
it useful it is useful for data where you have longer names in the headings like in this case the inbound marketing demonstration a demonstrator Roi if I insert the normal column chart like this here the data is smaller so it is just giving you the headings in that same line however if you have bigger data it will look better if you have the bar chart which is shown horizontally like this because you can clearly read the headings here on the left hand side and your bars go to the right again removing the grid lines and adding the data labels will remain the same moving on to the next one that is the surface chart what is the surface chart and how is it useful surface charts are useful when you want to find the optimum combination between two sets of data like in our list we have marketing finance and effort and against them we have details like recruitment how much you have to recruit in these three departments which are finance marketing and effort how much contribution there is of the environment in finance marketing and effort how many assets there are in finance marketing and effort and the building how much is there and the expenses that have formed a part of these three departments so if I want to know these I'm now trying to identify the combination of marketing finance and effort how much recruitment has been done in the combination of all three in this kind of situation I'll be using the surface chart so how do I use the surface chart where is it located again if I go to recommended charts I'll be able to see the surface chart under recommended charts there is a tab called all charts I can select that go to surface here I'll get the option of 3D surface I click on OK and you will be able to see the surface chart here so it gives you effort marketing and finance so you
will see that the combination of recruitment here which is this part is showing your total recruitment for all the three departments same goes with environment how much environment is affected due to finance marketing and effort the assets that have been used by these three departments and so on and so forth so that is how we are adding the values to this so this is how it shows you can again remove the grid lines if you want from this chart so that it looks a little better and it also gives a number range as 0 to 200 in blue which is this part when I say 0 to 200 this is effort which is 159 that's the reason it is showing as 0 to 200 the second one is marketing 200 to 400 which is 345 in case of finance that's a little bigger it is showing here as the orange one the next one is actually your gray which is 400 to 600 which has got even bigger so this is how your data is identified by using the surface chart moving on to the next one so we have learned how the charts work in Excel and how you can make a few changes in order to identify the best chart that fits your data you can also use the change chart type option to change a particular chart so if you think that a surface chart is not fitting this data you can select the chart and click on change chart type and then use one of these recommended charts from the list this is how we learned the charts today let's move on to the next one which is the pivot tables what is the pivot table and how is it useful in your day-to-day life many people have the idea that building a pivot table is complicated and time consuming but it's simply not true compared to the time it would take you to build an equivalent report manually pivot tables are incredibly fast if you have well organized source data you can create a pivot table in less than a minute so let's see how we can do that first of all I have a sheet with all the data there like in our sheet I have the list of employee code their names
Department region branch and their salary information I am going to create a pivot table in order to show the salary by department how much salary are we paying to different departments however before I do that I'll have to show you how to create a pivot table to create a pivot table I first will go to the database for which I need to create the pivot table so I go to the database I click on insert after I click on insert there is an option called pivot table I click on pivot table a new window will pop up in the pivot table wizard I have to select the table or the range for which the pivot table is to be created so I select range A1 to h101 the next one that I have to select is either new worksheet or existing worksheet new worksheet means when I select new worksheet and click on OK it will automatically create a new sheet in the existing workbook and the pivot table will be placed in that new sheet however if I click on existing worksheet I have to give a location that means from the sheets that I already have in the workbook in which cell do I need the pivot to be placed hence I will click on the location and select the location that I need the pivot table on I can select from any of the existing sheets that I have in the workbook however this time I'm going to use a new worksheet and click on OK this will create a new Excel sheet for me just before the Excel sheet that I was working on currently so it is now creating that Excel sheet once the Excel sheet is created on that sheet you will see the pivot table area is already available a blank space where the pivot table will show and on the right hand side you can already see the pivot table fields under the pivot table fields all the headings which were there in the database have already appeared here like employee code first name last name department region branch hire date and basic salary so everything is already showing here
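A note for readers who prefer to see the idea in code: the "salary by department" pivot described here boils down to grouping rows by one field and summing another. The following is only a minimal sketch in plain Python, not what Excel does internally. The column names follow the headings mentioned in the session; the sample rows and the helper name `pivot_sum` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: the column names follow the headings mentioned
# in the session, the values are invented purely for illustration.
employees = [
    {"Department": "Admin", "Region": "North", "Basic Salary": 30000},
    {"Department": "Sales", "Region": "East", "Basic Salary": 45000},
    {"Department": "Admin", "Region": "South", "Basic Salary": 28000},
    {"Department": "Finance", "Region": "West", "Basic Salary": 52000},
    {"Department": "Sales", "Region": "North", "Basic Salary": 41000},
]


def pivot_sum(rows, row_field, value_field):
    """Group rows by row_field and sum value_field for each unique value.

    This mirrors dragging one field to Rows and another to Values
    (with the default 'Sum of' calculation) in a pivot table.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row[row_field]] += row[value_field]
    return dict(totals)


# Each department appears once, with its salaries summed.
salary_by_department = pivot_sum(employees, "Department", "Basic Salary")
for department, total in sorted(salary_by_department.items()):
    print(f"{department}: {total}")
```

Calling `pivot_sum(employees, "Region", "Basic Salary")` instead gives the same kind of summary split by region, which is the equivalent of swapping the field placed in the row area.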
before we start with the pivot table we need to understand these four areas that are there on the pivot table every area has its importance so let's start with the values area which is this one the values area is a large rectangular area below and to the right of the column and row headings the values area calculates and counts data that means for anything that you enter here it will give you the calculation of that value for example if I pick salary and drag and put it in the values area it will automatically give me the sum of the salary you can see it says sum of basic salary that means the total salary that has been paid you can change this to any other calculation for example if I click on this and click on value field settings I can then pick other calculations like count average Max Min standard deviation anything that I need in this field I can select if I don't want the sum of the salary and I just need to know the count I can click on count and it will give me the count of the cells which have salaries so I click on sum for now the data fields that you drag and drop here are typically those that you want to measure fields such as sum of revenue count of units or average of price next is the row area which is this part placing a data field in the row area displays the unique values from that field down the rows on the left side of the pivot table that means because I said that I want to do it for the department when I drag it to the rows it will pick up all the unique values so even if admin is coming 10 times in our database like admin comes a lot of times here it will only show once in the pivot table same goes with CCD same goes with director what does that mean it will group all the admin together then it will sum the salary of admin in the next cell where I have put the values and it will give me the answer so it will give me the salary only for admin then for CCD then for director Finance marketing Personnel attendee and sales this
is how your pivot table will look if you put the data into the rows area next is the column area the column area is composed of headings that stretch across the top of the columns which is this so if you want to give a heading other than this if you want things like region to be shown in the columns if I put region in columns what will happen is now it will show me East West North South the salary will be split as per East West North South in the columns and it is a kind of heading that is given on the top like for East and admin this is the total salary this is how it is giving you the split next one is the filter area the filter area is an optional set of one or more drop down lists at the top of the pivot table here the filter area contains the region field and the pivot table is set to show all the regions like this if I put it on the top here it is actually an optional area and it will only show me the region on the top then it will not show in the columns the filter area allows you to easily apply filters to the pivot table for example if I leave my region here and put branch in the filters area I can select which branch I need the data for for example if I need only Ambala I can click on Ambala and click OK it will only give me data for Ambala same way if I select any other branch it will give me the data for that one if I want more than one branch I'll just click on select multiple items and I can select two or three at a time and then it will be filtered as per the branches and give me the rest of the data this is how your pivot table actually works like I said you can use the value field settings to get other calculations done this is how your pivot table works now this was the simple part of the pivot table if you ever want to sort the pivot table let me first remove the region to make it easier to understand I'm removing the branch from the filters as well I'm
just leaving the department and the sum of the basic salary now what I'm trying to do is I'm going to sort as per the salary so I want to understand to which department I am paying the maximum salary I select the salary column or I just keep my cursor in the salary column I go to data and the same way as we do the sort I'll just click on sort there once I do that it will give me a dialog where I have to select whether to sort smallest to largest or largest to smallest I will pick largest to smallest and I can see that in the sales department I am paying the maximum salary which is 3 lakh 45 075 so this is how your sort works in the pivot table there are other things that you can also use in the pivot table which are really really important that we are going to see here there is something called a slicer in the pivot table what is a slicer in the pivot table and how does that work when you go to analyze there is an option called insert slicer if you are anywhere outside the pivot if you're not keeping your cursor in the pivot you will not be able to see the analyze tab to get to the analyze tab you have to click on the pivot table and then click on analyze once you click on analyze there will be an option called insert slicer now insert slicer is a kind of filtering of data so similarly like I did by putting the branch on the top or the region on the top where I could filter the data I can use the slicer to filter this data so if I click on insert slicer and select which field I want to filter with like if I want to filter with the hire date I can select hire date and click ok now it will give me a slicer or the list of hire dates which I can select from so if I select only this hire date it will automatically give me data for only the employees who have been hired on this particular date so this is how your slicer works however now you can see that in the slicer I can only
select one date at a time but if I want to select more dates then there is something called multi select which is on the top right hand corner which is available from the 2016 version onwards so when I click on multi select it will give me the option to select multiple dates at a time in the slicer it is not necessary that you only use the dates you can also add some more fields other than just the dates I click on analyze I go to insert slicer again and now I can select anything else that I want now department is already there I can select maybe branch this time so it will give me another slicer so you can see the other slicer which is here so there are two slicers now so it is not necessary that you only work with one slicer you can work with two slicers or multiple slicers I select Calcutta and Darjeeling and it gives me data for Calcutta and Darjeeling for these two dates only if I select one more date it will give me data for Calcutta and Darjeeling for these three dates so there can be multiple slicers and they will give you information as per your requirement next is how to create a chart from the pivot table and it also looks the same as a normal chart so when I click on the pivot table from which I want to create the chart and I then click on pivot chart by going to the analyze tab and then pivot chart it will automatically create the chart that I want similar to the column chart in this case so I click OK once I do that it will automatically create your chart in the chart similar to the normal chart you can give the name as a chart title and add the data labels by clicking on this plus sign and clicking on data labels or by right clicking and choosing add data labels once you do that the data labels will be added all these field buttons which you can see sum of basic salary department are shown because it is a pivot table's chart so it is giving you this information also if you
want to just hide that you can right click and click on hide all field buttons on chart it will automatically hide those fields now if I add any additional information like in this one you can only see that the sum of the basic salary is given there is nothing under row columns so what I can do is I can also add data in the row once I do that it will automatically update the pivot chart as well and give me the information as per the row column also now again like where we had changed the chart by selecting the data and switching the row to column you can use a similar function here to switch from row to column so here the data will show in a different way altogether so as you update your pivot table your pivot chart also keeps updating and all the same functionalities that are available in the normal chart are also available in your pivot chart the only difference is that it is linked to your pivot table while the normal chart is linked to your database that is the only difference but the pivot chart is very very useful because your pivot chart keeps updating as per the new data that is updated there also you are only requesting the information that you want in the chart rather than everything coming into the chart and then you selecting which parts you want to display so that's how your pivot table and the pivot charts work they are really really useful in the day-to-day reporting that you do for your companies and it will really help you to do that now we have also learned how to do the pivot table and how to use the slicers in the pivot table there is another important part of the slicer which is that you can link two pivot tables to one slicer for example here I have one pivot table which is splitting by department and giving you the total salaries as per each department now I'm creating a new pivot table from the same database say I just copy and paste that will also give me another pivot
table but I do not want department this time I want to know the salaries given to the employees in different regions so this is split as per the region to add the slicer I can click on any pivot table then click on analyze and then click on insert slicer the slicer can be as per the region or the date or the last name whichever you want as of now I'm selecting the branch and clicking on OK once I click on branch it will give me the list of branches from which I can select and analyze the data but what I need now is I want to link this slicer which is on pivot table 2 to pivot table 1 as well that means when I select Ambala you will see that only the second pivot table has given me the data for Ambala which is in the North region the first pivot table stays as it is because this slicer is only linked to pivot table 2 I want to link this slicer to both the pivot tables how can I do that I will have to click on the slicer right click on it then click on report connections once I click on report connections it will give me the list of pivot tables available in the workbook now we know that in sheet 1 we have both the pivot tables sheet 10 is not our sheet so both the pivot tables are in sheet 1 so I'll select both the pivot tables from sheet 1 and click on OK once I do that both are linked and now when I select Bangalore or one branch it will automatically give me data for that one branch in both just like that for another branch it will be giving me data for that one branch and if I select more than one it will give me data for the two or three branches that I had selected so basically what I am trying to say is you can connect two or three pivot tables to one slicer so that you will be able to filter both the pivot tables with one slicer so this is how the slicers are useful in case of trying to get the filters for your
data the next topic that we're going to see today is data analysis using Excel so how do you do data analysis using Excel there's something called the Analysis ToolPak in Excel through which you can do the data analysis the Analysis ToolPak is an Excel add-in program that provides data analysis tools for financial statistical and engineering data analysis to load the Analysis ToolPak you have to execute some steps in Excel so how do you do that let's go to our Excel on the file tab I click on options once I click on options I go to add-ins under add-ins where you see the name column select Analysis ToolPak and click on the go button once I click on go it will open a new window for you with the options Analysis ToolPak Analysis ToolPak VBA Euro currency tools and the solver add-in I will check Analysis ToolPak and click on OK once I do that I will be able to see that under the data tab there is something called data analysis added to my ribbon so you can now click on data analysis to create the different analyses that you want to do from the list now let's see how we can do data analysis using the Excel data that we have so what do we have here on our Excel sheet I have a list of data I have the quantity sold the price of the product and the advertising now this example teaches you how to perform a regression analysis in Excel and how to interpret the summary output so in the below data the big question is is there a relation between the quantity sold and the price and the advertising so is there a relation between this and this and the output so what are they trying to
understand is if this is the price and this is the amount that we are spending on the advertisement is the quantity sold increasing or decreasing or is there any relation that we are able to see so can we predict the quantity sold if we know the price and advertising so if I know only these two things can I predict what will be my quantity sold to do such kind of analysis we can use this data analysis ToolPak so I go to the data tab in the analysis group and click on data analysis now select regression from this list and click on ok now there are two things that you have to give the input y range and the input X range these will always be blank by default and we have to give the ranges to Excel so how do you give the range and how do you decide what should be the range so the first thing that we'll give to Excel is the Y range select the Y range A1 to A8 this is the response variable also called the dependent variable so this is something that we have to predict it is dependent on these two things we don't know the quantity sold we are trying to understand what will be the quantity sold if we have the price and the advertising so your dependent variable will be your y range which is your A1 to A8 second one that you have is the X range these are the explanatory variables these are also called the independent variables these columns must be adjacent to each other so whichever you're selecting so I am selecting these two and they should be adjacent to each other now why am I taking these two because these are my independent variables that means these two I know and the quantity sold is dependent on these two that's the reason the quantity sold is called the dependent variable while the price and the advertising are called the independent variables now we have to
check labels and click on the output range box and select where we need the output to be shown so I click on the output range box and then I click on wherever I need the output say I need the output in A11 so I select that then check residuals the residuals box will not be checked by default you will have to check residuals and click on OK once I click on OK the answer will be displayed in the summary output here what does that say so if you see the results here you can see that there is a range that supplies some basic regression statistics including the R square so what is it doing it's giving you the basic regression statistics the standard error and the number of observations below that information the regression tool supplies the analysis of variance which is ANOVA including information about the degrees of freedom the sum of squares the mean square value the F value and the significance of F now beneath the ANOVA information the regression tool supplies information about the regression line calculated from the data including the coefficient the standard error the t stat and the probability value for the intercept as well as the same information for the independent variables so this was for the intercept and this is for the independent variables that you can see Excel also plots out some of the regression data using simple scatter charts so you can also add the scatter chart that you want according to the detail that you need from this so this is how your regression analysis works so this is more of the statistical analysis or the financial analysis that you would be doing when you are in the financial
department using Excel this is how Excel helps you to do your work when it comes to working on data analysis so back in the day there was something called waterfall development or the waterfall model of development so when I say waterfall you can think about something like you know a banking application or insurance application or some police department application so the moment I say waterfall model you can think of a really huge application which is you know made up of small chunks of code for example this application might have a front end this application would obviously have a back end and then it has some DNS routes and a few services that it's dependent on so it doesn't really matter how many services are part of this application but this entire application was shipped as one whole application that's how things used to happen back in the day and an application built with this method of development the waterfall method was referred to as a monolithic application now it's called monolithic if I can write it properly monolithic application now you know everything was fine and dandy until like 2000 and 2002 and that's when things started to become a bit extreme because the clients would have 10 different requirements that would change almost every day now if you have one single application you have a single point of failure so if you were changing even this little part of this application and if this part was to fail this entire application would stop working that's when people started brainstorming about better approaches to software development and how you can meet the requirements of the Millennium of today's world without actually affecting how the software is written or any application is written in the first place that's where the ideas of Agile development came into the picture so that was one part of the problem the second part of the problem was because it was a monolithic application the time that it
required to actually push the changes for example let's say you have an application up and running in your production environment and your development team actually created a new feature or they modified an existing feature now that feature is supposed to go to production right but it actually used to take days and months back in the day because you would never really know what's going to happen if you put it to production today there are so many moving parts that you're never sure if it's going to break the existing application that you have so people used to actually schedule maintenance I'm pretty sure you would have received those emails right like we will be unavailable during this weekend because there is a scheduled downtime now that's what used to happen now that cannot really fly anymore right think about companies like Netflix Uber Amazon they can't really be down even for a moment because you never really know how many people are accessing the service now that is the second part of the problem that people were trying to solve that's when agile came into the picture now agile put in simple terms is a philosophy to rapidly deploy an application in a much more organized way now obviously there are a lot more details than meet the eye but in a simple sentence that's what you mean by agile you want rapid deployment of the software or the code that you're writing without having to wait for a long time at the same time you want to make sure that you have small chunks of code that can be shipped to the client or whichever application you are working with that's the reason agile exists today and now we're going to look at what you mean by agile and how you can actually implement it so I hope we are pretty clear about why there was a requirement for agile to begin with this is what you mean by the waterfall model now this is the traditional software building practice right so you would gather the requirements you would design or architect how you imagine your
software to be and then there would be the actual coding that would be executed there would be verification and there would be maintenance post deployment so this is what has been happening for the past four decades but this won't really fly in today's world because there are multiple changes being pushed every single day so you can't really just go through months of planning for a little change so the way applications are developed has changed and so has the way applications are deployed so that's what you mean by the waterfall model now I think I have explained it pretty well there are a lot of companies that still follow this model but at the end of the day all of them are trying to you know migrate to a more Agile development it's not so easy depending on the size of the company but you will still see a few companies using waterfall and they are in the process of moving away from this model then comes agile to the rescue so what is agile we've talked about the requirement or why agile even exists so far now let's look at what you mean by agile so agile is nothing but a chain of rapid development and deployment meaning the first section of your software development is always the planning part so you obviously know what you're about to build but you kind of break that entire application down into small chunks of code and then you work on those small services one service at a time ensuring that first of all you kind of follow the microservices model and at the same time you don't really affect the entire application in general so you plan you design you architect and you actually develop the application you test you deploy it and you review it if you notice launch is actually outside of this entire circle meaning every time you make a change it could be something as simple as just one line of code change just a variable being renamed so it doesn't matter how small or how big the change is the idea is the moment the change is made it has to be deployed even in a
dev environment so that you can get a constant feedback over what's happening with the code that you have imagine if you actually had to wait for one month or two weeks just to get the feedback on if you really want that change or not now that could be a little annoying or frustrating from the developer's point of view so that's the first aspect of agile you plan you design you develop you test you deploy the changes and then you review the changes before you actually launch them into production the other major aspect of agile is now instead of working with like a huge chunk of application you work in iterations so when I say iterations what I mean is you have a specific set of tasks that have to be completed in specific priority so that you already know what you're supposed to be working on and you are not really worried about 10 different microservices at the same time you have a specific requirement where you should be focused so you have the first iteration which might be the first part of your application second iteration third iteration so think about it this way if you have amazon.in if you're working for you know amazon.com or amazon.in which is a shopping website you might visit amazon.com and you might think yeah it's one website it's not really one website it's a website that's broken down into several other services for example the website itself might be called a front end now the moment you reach amazon.com you can click on a product right and then you can view the details of the product so that service it's not really a part of the front end anymore that is being called from something else called as catalog now once you decide that this is a product that I want to buy you click on buy now and then it's going to take you to something called a shopping cart and after you made the payments and all of those things you have email notifications and you have text notifications the point here is even though all of these things are working in synergy
they are actually completely separate services completely separate tasks in the underlying architecture so if I am working on something on the front end I don't really have to be worried about the catalogs and the shopping because first I'm getting a constant feedback even before the launch I'm getting a feedback about what's going to happen once we launch our code to the production at the same time I don't really have to be afraid that it's going to break my entire application because all of them are developed as separate microservices your one service will never affect another service of course the dependent services might be affected but the idea is you never really want a single point of failure that's the idea of agile now let's move forward now what are the terms and the values of agile so the first value is people over processes and tools working software over comprehensive documentation customer collaboration over rigid contracts and responding to change rather than following a plan so people over process and tools this kind of gives you you know a development-centric and client-centric environment meaning just because you've been doing something in a traditional way for the past eight years it does not mean you don't really explore the options that you have right now for example whatever you did with PHP MySQL yesterday can also be done with python and flask today right I'm not saying change your entire application what I'm saying is the model is pretty much people-centric the people like the development team and the customers and the end users are given more importance and working software over comprehensive documentation now this is something that all of us would have noticed at some point in time right so every application would have an internal document of how long 100 pages 150 pages about all the classes all the methods how the application is being built then why the application is being built who's the owner and like plethora of other details that
you as an individual are not even concerned about you are concerned about what you have to build and how far along are you in that development task so in agile the functional application is given much more importance than the documentation because if you think about it the code itself is a documentation right if you knew how to interpret the code you could look at the code and that can also act as documentation so I'm not saying there won't be any documentation all I'm saying is the development is given more importance over the documentation part and then customer collaboration over rigid contracts and responding to change rather than following the plan so agile is really feedback dependent meaning back in the day you know the managers and the product owners will have multiple meetings they'll come up with the kind of software that they want everything would be discussed over like three four months and then people would want to stick to the plan because you already spent four months planning this thing now if you wanted to change even a little part of this the entire meetings and planning would have to be done all over again now agile changes that agile works more on the feedback just because the plan has been made does not mean there cannot be any changes because you have broken things down into smaller chunks of tasks any one of the tasks can be modified according to the requirements at any point in time so these are the values that agile brings to the table right so there are two parts of the puzzle you have a benefit and then you have a value so benefit is what you get like right off the bat and value is what you derive out of it so what we saw before were the values where everyone you know on the table receives some or the other kind of benefit because of agile principles of agile satisfied customer welcome changing requirements deliver working software frequently frequent interaction with stakeholders motivated individuals face-to-face communications
measured by working software maintain constant pace sustain technical excellence and good design keep it simple empower self-organizing team reflect and adjust continuously now this might look more like a textbook thing like here are the 10 benefits you know just go with us but that's not really the case we are actually going to look at how all of this materializes like you know over the future slides when we actually talk about how can you implement agile in the working or the team that you're working with so let's move on now advantages of agile we pretty much kind of you know touched upon all of these things for now persistent software delivery increased stakeholder satisfaction inspect and adapt welcome to changes at any stage design is important and daily interactions now comes the meat of the entire presentation now you have a basic idea of agile right but question that everybody has at any point in time is what's in it for me okay you told me what's agile and how can it help but how can it help me as an individual or how can I actually implement agile so there are multiple frameworks or philosophies when it comes to agile so scrum extreme programming lean kanban crystal are some of the examples the most popular one out there it's called scrum now again these philosophies are not really set in stone it's not like if you're following scrum that you know it's 100% what scrum dictates and you have to do it that way that's not really the case in majority of your cases what people do is they primarily implement scrum and then they have some ideas from kanban extreme and lean and then they kind of have their own philosophy that works for their organization but scrum is the one that's used by the majority of the people so even before looking at the slide you know before we go through what we see in the slide I can just kind of explain scrum to you the way I know it right because I've worked with multiple development teams I've seen most of these being implemented
and I know how each one of them works in a real world example so what is scrum so scrum is basically an iterative philosophy meaning you iterate over the changes you iterate over the deployments and software development one at a time so if you wanted to talk about scrum scrum is an iteration of plan then build then test and then review now you would constantly be iterating over all of these aspects now what do I even mean by this so let's first look at how or what does a scrum implemented team look like so in scrum implemented team you have the very first person that I want to talk about someone called as product owner now when I say product owner if you're coming from you know more of a traditional software development environment you can think of a product owner as a manager he is the guy who holds the responsibility to make sure that the application is deployed as and when committed at the same time the application is built exactly as the way it has to be built so product owner is the guy with the ideas he might not necessarily be a technical person he might as well be a guy from the management he does not necessarily have to know the development or the technicalities in detail he's the guy with the idea and the owner of the application that would be developed so pretty much all the accountability lies on him and then there is someone called a scrum master now scrum master is someone that you would have traditionally referred to as a team leader or a project owner now you can think about scrum master as a team leader in the hierarchical sense this is the person that's right below the product owner and this is the person who actually handles the day-to-day operations like you know running the meetings or planning the tasks that have to be done and then you have the team itself which will consist of your developers and testers and you know depending on your requirement it might have a few more roles but then you have the actual team who will
execute the tasks so these are the three roles that you have but now that we know the people that are involved how exactly this works I mean this looks pretty much similar to what you do at your office it's just fancy names right you have a manager you have a team leader and you have a team so how is it any different from what you do at your office so that's what we want to look at now I hope the roles are kind of clear to you now that you have the roles defined let's look at the first thing about the development so the first part of the development that we want to look at it's called product backlogs now here's where things start to get a bit different from how you might have been working at a traditional environment now in a traditional environment you have an application that has been already planned for months and you along with others have been working on deploying the application and you know the project usually goes on for a few months or even a year or two depending on the size of the project now in product backlogs you actually have the same application iterated over in smaller tasks so when I say smaller tasks you can think about the same amazon.in example that we talked about so the first iteration will have plan it will have build it will have test and it will have review now in this case I'm not really building the whole application I obviously have an idea as to what the application is supposed to look like but for now let's say I'm only concerned about the front end or what the main or the primary website would look like then I have a second iteration where I would you know the same cycle plan build test and review but this is where I'm actually working with email notification I'm actually coding how would my email notifications be sent out and you know how do you manage the email queues and the rest of the things and then you have a third iteration which might just be your payment processing so in this case again the same cycle you plan you build you
test and you review but the benefit here is that once you have the product backlog these are all your product backlogs these are the things that are supposed to be done over a period of time so the first thing you do is you define the product backlogs not you as a developer but the product owner and the scrum master these are the people that would actually come up with all the backlogs instead of you know just one single application that says I want this thing to work they actually break it down into small chunks of code so that is the job of the product owner and the scrum master because as I said product owner might not be a technical guy necessarily so scrum master is the one that would always be your technical guy right so both of them would come up with the product backlogs once you have a product backlog there's something called as user stories so each one of these would now be referred to as user stories and your scrum master actually ends up prioritizing them meaning if you have front-end email payment processors and 10 other backlogs that have to be developed let's say over next five or seven months in that case the scrum master and the product owner would kind of prioritize now obviously payment processor is of no use if you don't even have a product so logically I would want to prioritize my front end over my payment processor so scrum master and product owner will prioritize the user stories that you have and depending on the priorities that have been set they come up with something called a sprint backlog now sprint backlog is when your development team actually gets involved in this because now you already have organized and prioritized user stories that you are supposed to be working on so 10 different things are not just dumped on you at the same time you are given logical and reasonable tasks that have to be executed one at a time and once you have the sprint backlog you can actually start working on it as a developer now I'll get rid of this
beautiful drawing that I've made for now and let's just look at now sprint backlog I'm sorry I can't really you know write with my mouse but I hope you realize it's called sprint now there are different you could call them ceremonies or rituals but there is something called a sprint planning now the sprint planning again it's just a fancy name for the meetings and discussions that you have during the sprint planning the product owner will actually explain how he imagines the end goal or the product or the application to look like so you have something called a sprint planning you have something called as daily scrum now daily scrum is nothing more than you know the 15 minutes meeting that happens every day where the developers and the testers and any other role that you have in the team can actually discuss what happened and where you stand if you need any help if there are any blockades and what do you plan to do today or tomorrow and then there is something called a sprint review the sprint review actually occurs at the end of the user story or the backlog that you've been working on so each and every one of these user stories right they usually are designed with the timeline of two weeks in mind now in some companies the sprints may vary like it could be two weeks to four weeks but in majority of the cases each sprint will last two weeks so that you know exactly what you're supposed to do for the next two weeks now at the end of the two weeks along with you know your planning your daily meetings once your sprint is completed you have a sprint review where you actually demo the code that you have or you know just some kind of verification to make sure that sprint is actually completed and then you move on to a new sprint or you move on to a new user story that you have to work on so that's the idea that's how things work in general now with that in mind if we move on to the next slide this is what the scrum looks like so you have a product backlog and then you have a
sprint planning now as I mentioned before each one of these sprints the timeline is usually a couple of weeks depending on the size of your organization it might last up to a month but for all the companies that I've worked with it's always been between one to three weeks so you plan for what has to be done during the next two weeks and then you have the user story or the backlog that you're supposed to work on and then you have your team that actually works on it along with the daily scrums so you have the daily meetings at the same time and at the end of the sprint you have the review and then you ship the part that you coded now when I say you ship the part I don't mean you necessarily put it in production but you know that the part is ready to be assembled into the application that you have so the idea is at the end of every two weeks you have a shippable part of the application that is ready to be deployed so instead of working on a huge application that would have taken a year anyway now you kind of break it down into things that can be actually shipped in two weeks depending on the priorities that have been set by the product owner and the scrum master that you have that's the idea of scrum to break everything down into smaller chunks of code smaller chunks of tasks so that everyone knows exactly what they're supposed to do that's like the methodical part of it right you have a method there's a specific set of best practices that you're following along with the technical side because you have a rapid deployment the moment you write a code you can actually test it in the dev environment now that's where people like me devops come into the picture but the idea is you don't really have to wait for a month just to see what you coded right now if you push the code right now in matter of minutes you will actually see that working in the dev environment so that's the technical side you have the instant feedback to figure out if you have to move on or you know if you
have to make some changes to the code that you have right now now that's scrum and agile in general then there is a second method so scrum was one of the philosophies or framework then you have something called as extreme programming now this was one of the first ones a group of developers came up with it back in 2001 I think the guy was called Kent so they kind of came up with the idea of agile development they came up with a set of best practices and then they even signed a manifesto so they actually came up with a manifesto that you know these are the things that we should be following in the industry these are the best practices and these are the principles and they even signed it so extreme programming has been around for almost a couple of decades and scrum is kind of the next iteration of extreme programming it's a bit different but as I said before majority of the organizations use scrum so in extreme programming they came up with you know the basic sets of principles like people-centric environment discipline and then you have rapid deployment so project requirements stories test cases tasks completion customer input iteration planning now both of these things are happening in parallel so you have project requirements you have stories test cases tasks and completion and at the same time you have customer input and at some point you have customer iterations in the meeting for example you developed 20% of the application and your end user or your client came back with a better idea or if they need some modifications so those are the changes then you have your UAT testing you have client-side UAT testing and acceptance at the end of it so extreme programming the ideas are somewhat similar to scrum but at the end of the day all of these philosophies are trying to make the lives of developers and the end users better and not compromising on technicalities rather making the shippable product better and faster is the idea all right let's move
on then you have lean programming so lean principles even this has been around for a while so eliminate waste amplify learning decide as late as possible decide as fast as possible empower the team build integrity and see the whole yeah so you could call it a framework you could call it a philosophy or methodology now it really depends on you know the word that you want to use but at the end of the day it's again trying to become developer-centric and people-centric and setting best practices to make sure everyone in the team knows exactly what they're supposed to do and you have a cross functionality when I say cross functional essentially I'm pretty sure if you're watching this video it's because you have some or the other development experience and if you do then you would have come across this point right when you talk to someone hey have you seen that feature have we looked at the code and that guy would be like you know that code doesn't concern me it's none of my concern I'm working on something completely different now we are used to that sort of development right where people individually know what they're supposed to do and they're not even worried about what the other person is doing now it's time we actually break the silos just because you're not coding that part it does not mean that the code does not concern the part of the code that you are writing so everybody has to come together and work on the same application which is what you'll call a cross-functional team in the scrum example once you have the user stories and the backlogs that you're supposed to work on in the next two weeks it doesn't really matter what role do you play in the team it's your team's responsibility to make sure that the task has been completed and the task has also been designed with the timeline in mind so it's not like you're expected to do six weeks worth of development in two weeks so the task itself is designed with the timeline in mind and then you should have
something like kanban so kanban is similar to scrum the difference here is in case of scrum you have you know smaller chunks of backlogs that you're supposed to work on for the next two weeks or three weeks in case of kanban it's a continuous process so there is no such thing as sprint in kanban what you do in kanban is you have a list of tasks that are supposed to be done and for example you have something like you know a whiteboard you have a build queue you have a test queue and you have a ship queue now this is a hypothetical example you would obviously have plan and the rest of the picks but the point is you might have four things that have to be built or let's say three services that have to be built and once the first service has been built it actually moves on to the testing queue and then this place is occupied by another service and once this is done this is moved on to testing and this is occupied by another service meanwhile if this testing is done it will move to the ship queue and it would eventually be shipped so the idea is whatever tasks have been achieved will be replaced by a new item in the queue so if you have a build queue test queue and ship queue all of these would be moving parts so if your job is to build this and let's say your colleague's job is to test this the moment you push this first item into the testing queue another item will replace this first item so that you know what is the next thing that you are supposed to be working on so kanban is more like a continuous implementation of the software development so that's what you mean by agility in general even if you think about it the English word agility means to be really rapid right agility would mean whatever you're doing is happening in rapid succession right that's what you mean by agility in general so in case of development by bringing agility to your team you are ensuring that everyone's happy at the end of the day and you still have a technically smarter team that's able to get
instant feedback now I'll give you a quick example of this now I'm pretty sure all of us or at least most of us are aware about Netflix now you would be surprised to know that Netflix is pushing more than 1,000 changes every day into their production if you actually worked at a Netflix development team you would know that these guys are pushing a thousand changes in production every day now how do you think that's possible obviously they are not pushing it to production without reviewing it or without testing it even with all of those things in place how are they able to deploy more than 1,000 changes every day now these could be very little changes like you know some UI fixes some database fixes some payment processing fixes so we are not really concerned about what the changes are but I know for a fact that that's the number that's the amount of changes that they actually push every day that's possible because you mean rapid deployment by agility or by becoming agile so that's the level of agility you can actually obtain the moment you have an organized team that is working on the principles of agile now obviously there are external factors like you know how's your infrastructure how's your devops team but it is possible at the end of the day and then there is one more framework that's called crystal so these are the ideas of agile now I hope I was clear on how can agile help your team in becoming a better development team so there are three aspects of it right philosophical technical and the way software is built so philosophical being the best practices like how do you define your teams who's the scrum master who's the product owner what's the team what is a sprint what are the tasks that you're supposed to do then you have the technical side of it like if you build a code how exactly can you deploy the code like automatically how can you review the code how can you test the code automatically and then there is a software development aspect of it that you're
moving away from a monolithic application towards the idea of microservices so these are the three aspects that move in parallel and at the end of the day it gives you a peace of mind it gives your product manager a peace of mind and the end user a peace of mind with better ideas of deployment so that you don't really have to run around on 10 different desks confirming if your changes are actually deployed or not [Music] thank you so what is scrum now to understand the importance of scrum let's look at how scrum compares to the older alternative which is the waterfall development the waterfall model consists of a long time which is involved in planning then again by long time I mean a couple of weeks maybe a couple of months then it might take about the same amount of time to build the product say a couple of months more then testing followed by reviewing and then finally deploying the product at this point you might end up bringing the wrong product according to the market demands to the market because when you started out with planning this product it was almost a year ago and now the market demand might have changed now I'm sure a lot of you might already have guessed what the problem is with such a model firstly the entire plan should be complete before you start building or testing your product now obviously if you're doing this in one whole iteration it means that most of the planning is being done without understanding the project or the difficulties that you might face while building the project the drawbacks the loopholes and most times after you start building the project it is sent back to the planning phase again which brings you back to square one in which case your project has to start over or the developers are just criticized for not understanding the plan properly this process can be repeated many times which consumes a lot of time followed by the product getting thrown over the fence to test where in case you encounter problems the project is
thrown back to the building process or sometimes again back to the planning process and the same thing can happen over the next few steps including a lot of backstepping and doing over this can lead to lag times and many months or several years in order to get a product out the door scrum solves this problem by breaking the project into smaller parts first you start out with just enough planning to get the project started next you build the project with the minimum amount of features which is like your base features and then you test for the same and review for the same and once the cycle is complete you have in your hands a potentially shippable product now this process takes around two to four weeks and it is repeated time and time again reducing the time taken for the cycle of planning to testing to reviewing a certain product this way you end up with several different versions or several different incremental releases each taking less time than the previous or just about the same time and by the time you have five iterations of these you already have five vividly improved versions of the same product each keeping up with the market development that's going on outside of your company and this is called scrum so scrum is a framework within which people can address complex adaptive problems while productively and creatively delivering products of the highest possible value now you need to understand three things about scrum firstly scrum is lightweight second it is simple to understand and third it is very difficult to master scrum itself is a simple framework for effective team collaboration based on agile and it is the opposite of a big collection of interwoven mandatory components which are a part of the previously discussed waterfall alternative now scrum promotes developing products through processes techniques and practices with various iterations and increments to deliver maximum value as I had mentioned before so that way the first cycle of your planning
building testing reviewing shall be called your first iteration the second one will be your second iteration the third time you plan build test and review will be your third iteration followed by the fourth and finally the fifth now these iterations are called sprints and at the end of each sprint you have the launch of a potentially deliverable software each software version much better than the previous one each sprint takes up about one to four weeks and you keep repeating these sprints until your product is feature complete sometimes you might even end up shipping your product after just the second sprint but sometimes it may also take 3 4 or 5 or even more sprints but at the end of it you have a good product and that is what matters now scrum follows an agile approach to tackle complex problems now agile is not a methodology or framework or process now agile promotes self-learning and self-organization as opposed to a lot of planning and a big collection of interwoven mandatory components it implements a scientific method which replaces a programmed algorithmic approach with a more heuristic one with respect for people and self-organization to deal with the unpredictability of complex problems this approach basically means not planning the whole entire process to create a potentially shippable product but to only plan enough to start out with and then react to the responses that you get with each sprint or each iteration now there are three roles in scrum first of all you have the person with the bright ideas who is the product owner next you have the scrum master who is the one basically implementing agile and preparing the team to follow the agile approach and making them more efficient and finally you have the team mostly the team consists of developers testers writers it can be anything each of them can be replaceable by any one of them so on certain days you might find that developers are testing testers are writing so on and so forth the main objective
here is to create the product in the best possible way there are a lot of reversible roles that are being played in the team now to make you understand the approach of the scrum master or the agile approach let me take a problem statement so take for example there are a number of people in the room and they have to queue up according to their respective heights taking as little time as possible so there are these many people in the room now to this problem there can be two solutions first of all we have the supervisor approach here we have a traditional supervisor or a manager and basically what they do is they arrange people one by one taking up the time for each person to align themselves which basically takes up a lot of time and also as a team you learn nothing about organization but then you have what you call the agile approach here let's bring about the same number of people and we have a scrum master now what the scrum master does is he allows the team to organize themselves and in the end he brings about whatever changes he deems necessary and that my friends is the agile approach now the agile approach which is mostly used for software development is an approach under which requirements and solutions evolve through the collaborative effort of self-organizing and cross-functional teams so basically people here are self-organizing which consumes less time under the continuous guidance of their scrum master so what it is is that it's a set of practices that promotes continuous iteration of development and testing throughout the development cycle of the project as I had mentioned before it promotes self-organization and assists teams in responding to the unpredictability of the problem at hand and the person that promotes such an approach is called the scrum master he or she is a person who allows the team to self-organize and manages the process for how information is exchanged now there are a few anti-patterns when it comes to the scrum master first the
scrum Master is not a supervisor now this problem is quite common with people that are usually former project managers being a scrum Master is a completely different role than a project manager and therefore the way the individuals act must be completely different typical project managers tend to organize the work of the people instead of allowing them to self-organize they tend to say how by whom and by when the work must be done not giving the team much space to think on their own the result of this is basically a team full of robots instead of people who have the mind to think on their own a scrum Master must be aware of his behavior to be able to correct it it's not easy usually these people have worked for several years in a typical project management position and simply deleting this behavior will take up some time but being aware is the first step let people grow let people have their own breathing space and do not try to be a supervisor it is not a control situation here second the scrum Master is not just a secretary now many people think this way in such a situation I get extremely worried because it means that people do not know what a scrum Master should do when people think the scrum Master is just a guy booking meetings facilitating dailies and serving coffee to their colleagues this is where things get very dark and unfortunately this is very common the scrum master has dozens of different roles and hats but this is far away from the secretary job that most companies associate scrum masters with scrum Masters teach recruit and coach people to grow into true leaders and work cohesively as a team and finally self-organization does not mean the absence of management like I had mentioned before a scrum Master is there to constantly monitor the growth of the team letting a team self-organize does not mean that there is nobody to supervise them with that let's move on to the scrum framework now scrum is a simple lightweight framework it is not a
methodology scrum implements the scientific method of empiricism which replaces a programmed algorithmic approach with a more heuristic or a self-learning one with respect for people and self-organization to deal with unpredictability and solving complex problems as you can see in the graphic in front of you it starts with the inputs from stakeholders and then the product owner with the bright ideas all the way till the finished product which is then reviewed in the Sprint retrospective now most of you might wonder what empiricism is now empiricism basically means you're working in a fact-based experience-based and evidence-based manner scrum implements an empirical process where progress is based on observations of reality and not fictitious plans scrum also places great emphasis on the mindset and cultural shift to achieve business and organizational agility so basically there are three pillars to empiricism first of all you have transparency now this means that you are presenting facts as is all people involved the customer the CEO the individual contributors are transparent in their day-to-day dealings with others they all trust each other and they have the courage to keep each other abreast of good news as well as bad news everyone strives and collectively collaborates for the common organizational objective and no one has any hidden agenda the next pillar of empiricism is inspection now inspection in this context is not an inspection by the inspector or auditor or supervisor but it is an inspection by everyone on the scrum team the inspection can be done for the product processes the people's aspects practices and continuous improvements for example the team openly and transparently shows the product at the end of each Sprint to the customer in order to gather valuable feedback if the customer changes the requirements during inspection the team does not complain but rather adapts by using this as an opportunity to collaborate with the customer to clarify the
requirements and test out the new hypothesis and finally as I had just mentioned adaptation is the third and final pillar of empiricism adaptation in this context is about continuous improvement the ability to adapt based on results of the inspection everyone in the organization must ask this question regularly are we better off than yesterday for profit-based organizations the value is represented in terms of profit the adaptation should eventually relay back to one of the reasons for adopting agile for example faster time to the market increased return on investment through value-based delivery reduced total cost of ownership through enhanced software quality and improved customer and employee satisfaction so these were the three pillars of empiricism now let's go ahead and look at the scrum life cycle now this slide is a graphical representation of an agile project using scrum starting on the extreme left you can see the product owner owning the backlog and developing user stories with the team or requirements for the project or the product the product backlog is prioritized with the higher priority items on the top of the backlog the team with the product owner then decides to group the user stories into releases based on the product roadmap once the release plans have been completed the user stories are then selected for a Sprint the Sprint can be from two to four weeks or sometimes one to three weeks the team then disaggregates each user story into tasks and then in each Sprint the product is developed and as the code is written it is integrated into the system and daily scrums are held at the end of a Sprint there is a Sprint review where the working software is demonstrated and then it is presented to the customer for acceptance the team then conducts a Sprint retrospective to ask themselves the question what could be done better the stats on the team are then updated as are the information radiators which transparently display the
status of the team which again make their way back to the user stories and then the whole cycle begins again next let's discuss what is a Sprint now as I had mentioned earlier a Sprint is basically an iteration of planning building testing and reviewing a Sprint has a consistent duration throughout the development phase and this is mostly between two to four weeks it cannot be more than four weeks long as described in the scrum guide a Sprint is an iteration as I had mentioned before it has a time box of two to four weeks during which a done usable or potentially releasable product is created and by product I mean an increment of a product like a version of a product now Sprints have consistent durations which are usually limited to one calendar month now during the Sprint no changes are made that would endanger the Sprint goal quality goals do not decrease and the scope may have to be clarified now each Sprint may be considered a project with no more than a month's horizon in which you have to accomplish something each Sprint has its own goal a design and a flexible plan that would guide in building it with the resultant product increment now the thing with the Sprint is if the Sprint's time box is too long then the definition of what is being built may change complexities may arise and the risks may increase hence Sprints enable predictability by ensuring inspection and adaptation of the progress towards a Sprint goal at least every calendar month now there are a few factors which affect the duration of the Sprint first of all the stability of the backlog now obviously if the backlog keeps increasing that may push out the date of the Sprint and secondly you have overhead costs now Sprints also limit risk to one calendar month of cost if the cost increases then it may affect the duration of a Sprint now overhead costs now each Sprint is going to have a Sprint meeting the testing phase a review and a retrospective now these are overhead costs now if the overhead cost can be
reduced by planning and automation integration etc these costs can be absorbed now by reducing these overhead costs you can reduce the duration of the Sprint now it is the scrum Master's job to make sure that the development of the product goes according to the Sprint planning now there are a few other Sprint terms which I want to make you aware of first of all you have your Sprint goal now as the name suggests what you want to achieve at the end of the Sprint your product with the feature sets that you have decided with your product owner and your customer is your Sprint goal then you have your Sprint cancellation now once the Sprint duration has been determined and the user stories are selected neither the duration of the Sprint nor any user story can be altered however a Sprint cancellation occurs if there is a significant change in priorities or a mid-course correction in between a Sprint considering we are only talking about two to four weeks of work the cancellation of a Sprint is highly unlikely to occur next let's discuss the scrum artifacts now scrum has three artifacts product backlog Sprint backlog and product increment firstly you have product backlog now product backlog as it is described in the scrum guide is an ordered list of everything that is known to be needed in a product it is the single source of requirements for any changes to be made in the product and the product owner is responsible for the product backlog including its content availability and ordering now a product backlog is never complete the earliest development of it lays out the initial known and best understood requirements the product backlog evolves as the product and the environment in which you're going to use it evolve continuously it is dynamic and constantly changes to identify what the product needs to be appropriate competitive and useful if a product exists the product backlog also exists now product backlog refinement is the act of adding detail estimates and
order to items in the backlog this is an ongoing process in which the product owner and the development team collaborate on the details of the product backlog items during this product backlog refinement items are reviewed and revised multiple scrum teams often work together on the same product one product backlog is used to describe the upcoming work on this particular product next you have something known as the Sprint backlog now the Sprint backlog is a set of product backlog items that are selected for the Sprint plus a plan for delivering the product increment and realizing the Sprint goal the Sprint backlog is a forecast by the development team about what functionality will be in the next increment and the work needed to deliver the functionality into a complete increment now the Sprint backlog makes visible all the work that the development team identifies as necessary to meet the Sprint goal to ensure continuous improvement it includes at least one high priority process improvement identified in the previous retrospective meeting the Sprint backlog is a plan with enough detail that changes in progress can be understood in the daily scrum the development team modifies the Sprint backlog throughout the Sprint only the development team can change its Sprint backlog during a Sprint the Sprint backlog is a highly visible real-time picture of the work that the development team plans to accomplish during a Sprint next you have the increment now the increment is the sum of all product backlog items completed during a Sprint and the value of the increments of all previous Sprints at the end of a Sprint the new increment must be done and done in scrum has a different meaning done in scrum or agile basically means that the product must be in a usable condition and meet the scrum team's definition of done now you might think your product is done but if that differs from my definition of done and we both are in the same scrum team then your product is not considered
done an increment is a body of inspectable done work that supports empiricism at the end of the Sprint now the increment is a step towards a vision or a goal and it must be in usable condition regardless of whether the product owner decides to release it with that let's move on to the ceremonies or events in scrum now scrum has four ceremonies first you have Sprint planning then you have the daily scrum then you have Sprint review and finally you have Sprint retrospective now I'm going to touch upon each of these one by one first you have Sprint planning now Sprint planning is basically the plan of work to be performed during the Sprint it is time boxed to a maximum of eight hours for a one-month Sprint or two hours a week for shorter Sprints the event is usually shorter and the scrum Master ensures that the event takes place and attendees understand its purpose the scrum Master teaches the scrum team to keep it within the time box now Sprint planning answers three major questions first what can be delivered in the increment resulting from the upcoming Sprint this basically discusses the advanced feature set that you are going to deliver in your upcoming version of your product next how will the work needed to deliver the increment be achieved this is where you plan what you do in the week and the week after it is the scrum master that teaches the scrum team to keep their work within the time box and three when is your work considered done as we discussed the definition of done is very important to the entire scrum team everybody has to agree upon a checklist on completing which your work should be considered as done next we have the daily scrum now the daily scrum is an internal meeting for the development team it is a 15-minute time boxed event to synchronize your activities and create a plan for the next 24 hours it is held every day of the Sprint and the development team plans work for today and tomorrow this optimizes team collaboration and performance by
inspecting the work since the last daily scrum and forecasting upcoming Sprint work the daily scrum is held at the same time and place each day to reduce complexity and answers fundamentally three questions what did I do yesterday what am I going to do today and what are my impediments by impediments I mean challenges and requirements that you have while completing your work it is very important that your scrum Master know the challenges that you're facing and why you couldn't complete your work yesterday or why it might be difficult to complete your work today the scrum Master ensures that the development team has the meeting but the development team is responsible for conducting the daily scrum the scrum Master teaches the team to keep the daily scrum within the 15 minute time box and he or she also ensures that if there are others present they do not disrupt the meeting next we have the Sprint review now Sprint review is an event to inspect the increment and adapt the product backlog if needed there could have been a single deployment or many deployments during a Sprint which lead up to that increment to be inspected during a review the scrum team and the stakeholders collaborate about what was done in the Sprint and based on that and any changes to the product backlog during the Sprint attendees collaborate on the next things that could be done to optimize the value and this comes in the form of an informal meeting and not a status meeting the presentation of the increment is intended to elicit feedback and foster collaboration it is time boxed at one hour a week and the Sprint review includes first the product owner explaining what product backlog items have been done and what has not and two the development team demonstrating the work that it has done and finally the entire group collaborates on what to do next this is at most a four hour meeting for one-month Sprints and for shorter Sprints as I mentioned it is one hour a week finally the aim here is so that a
Sprint review provides valuable input to subsequent Sprint planning the review of how the marketplace or the potential use of the product might have changed will obviously affect what is the most valuable thing to do next and review of the timeline budget potential capabilities and marketplace and competition for the next anticipated releases of the functionality and capability of the product is also discussed the result of the Sprint review is a revised product backlog that defines the probable product backlog items for the next Sprint now this backlog may also be adjusted overall to meet new opportunities finally you have the Sprint retrospective now this is basically an opportunity for the scrum team to inspect itself and create a plan for improvements to be enacted during the next Sprint this occurs after the Sprint review process and prior to the next Sprint planning this is at most a three hour meeting for one-month Sprints and it is time boxed at 45 minutes a week this is an opportunity for the scrum team to improve and all members should be in attendance now each team member has to answer three questions firstly what went well this basically helps you tell your scrum Master what went well in your previous Sprint and those practices should be enforced in your coming Sprints as well next what didn't work well what were the impediments difficulties and requirements which weren't fulfilled in the previous Sprint and finally what should be done differently here you introduce the changes that you want in your working process or in terms of software that you're using and these are the differences that you need to commit to to improve in the next Sprint the scrum Master encourages the scrum team to improve its development process and practices to make it more effective and enjoyable in the next Sprint during each Sprint retrospective the scrum team plans ways to increase product quality by improving work processes or adapting the definition of done if appropriate and not
in conflict with the product or organizational standards by the end of the retrospective the scrum team should have identified improvements that it will implement in the next Sprint and implementing these improvements in the next Sprint is the adaptation to the inspection of the scrum team itself although improvements may be implemented at any time the Sprint retrospective provides a formal opportunity to focus on inspection and adaptation with that we've come to the end of all the scrum ceremonies that I had to discuss to understand why one should become a business analyst there are several aspects that you should be looking into first being the job opportunities well speaking of job opportunities in India there are up to 6,000 vacant jobs as of now and this includes both for experienced as well as freshers while in the US there are up to 43,000 vacant jobs now this data is based on data available on LinkedIn Glassdoor or even Indeed similarly we need to take a look at the Silicon Valleys because Silicon Valleys always offer an ample number of jobs now speaking of the Silicon Valley of India Bangalore offers around 2,000 vacant jobs which includes freshers and experienced while California offers up to 4,300 jobs now again this includes data collected through LinkedIn and Glassdoor the next aspect which comes into your picture when you are starting your preparation to become a BI analyst would be salaries definitely we've got you covered here an average salary in India comes up to 6 lakh per year while in the US it comes up to 91,000 dollars per year now again this data is collected through LinkedIn and Glassdoor now through this you can understand this is an average salary which covers everyone from freshers to experienced so the salary can go up to 10 lakh if you are experienced or even beyond that with that said I hope this might have satisfied you to start your preparations for BI analyst well if not yet we'll go ahead in today's session and understand who
is a BI analyst why one should become a BI analyst and further questions well the next thing that you will have in your head is who is hiring which are the companies where should I apply well these are the companies that you should be applying for now as you can see in the data collected here most of the product based companies are hiring for BI analysts now you can say by this that every company which is product based predominantly is looking for a business intelligence head now what does business intelligence mean like I said at the start of today's session BI is basically applying intelligence to one's business now already established service based companies definitely they would need BI analysts but they would not need them to the same extent because they are already established so they need someone who can manage and take things forward while product based companies always need to think in an innovative way think outside the box this is where a business intelligence analyst comes into picture and hence applying for a product based company is always a go here hope that was clear with that let's move on to our next section which covers who is a business analyst now basically like I said at the start of today's session a business intelligence analyst is someone who applies their knowledge or their experience or perhaps their intelligence into business making into project making into establishing new aspects in their business so a business intelligence analyst who is also known as a BI analyst like I said uses data and other information to help an organization make sound business decisions now of course there are several aspects one has to consider in order to become a business analyst now some of these aspects include data mining reporting descriptive analysis statistical analysis data visualization visual analysis querying data preparation now through all these processes you realize that these are the tasks that have to be covered as an analyst so mining of
data is very important as and when you get reports or as and when you get raw data from your clients perhaps or even your end users customers you have to clean it you have to analyze it you have to make reports of it give a descriptive analysis a statistical analysis plot graphs and give a visualization out of it now we will approach as to how we can visualize this data further ahead in the session today so querying is important visual analysis is important so these are the common tasks a BI analyst has to carry out so I hope that was clear and you got to know what a BI analyst does in a skeleton with that we'll understand how exactly an organization benefits from a BI analyst now there are several ways an organization can benefit some of the points I've covered here the first is to identify ways to increase profit like I said applying intelligence to any job will always benefit them now imagine you have a protocol that has been followed what if something comes out of the box what if something comes which you have not anticipated that is where a business intelligence analyst comes into picture and he will think as to how this particular situation can be handled this is where you can increase your profits the next thing is to analyze customer behavior just like I said a customer might feel unsatisfied or dissatisfied at any given point of time there he might just change his requirement and these things should be analyzed in advance by someone because these things are the ones which consume a lot of time now if it has been analyzed in advance it will save time and money and also effort perhaps so this is where a BI analyst will come into picture the next thing is to compare data with competitors now who will do that if there is a developer and tester he will just work on that particular product or delivering that particular product to the team but how do you know if this product is doing good in the market how do you know if this is beating someone else in the market this is
where a BI analyst will collect all the data and make a report out of it the next thing is to track performance it's not just important to maintain your records or to meet the customer's requirements sometimes the performance will break down from time to time it is important for us to keep track and maintain that balance this is where a BI analyst will help you do that the next thing is to optimize operations of course the end operations will be maintained by the person who has created them now to optimize or to give the requirements of optimization will be done by a BI analyst you need to predict success without seeing what the outcome is or at least drafting the outcome no one can actually work on an end product you need to spot market trends now without analyzing the market values like we just discussed about competitors you don't know where you stand perhaps what product would you make now without knowing the market trends you will not know which product you should be making with which customization the next thing is to discover issues and problems definitely with every business every project that you encounter there will be certain problems be it big or small now this has to be documented in advance with good intelligence with that being understood we will move on to our next section which says how to become a BI analyst now we have well understood who is a BI analyst what does he do how this will benefit an organization so we need to understand how should we become one because all these things can be answered only when you know how you can become a BI analyst this is where our job description will come into picture now if you go through several job descriptions you will know and understand as to what we should be learning so going through these job descriptions we will come to a conclusion of skills that we should be learning or even tools that we should be learning as a BI analyst so let's go through the job description here again a very popular online
platform which is Disney+ Hotstar they are hiring for a BI analyst now what are the requirements that they see so I've taken a job description here where I have highlighted a few points which summarize the requirements of the person who has to be working as a BI analyst now it says that he should be well versed in Power BI Tableau prepare and analyze data data analytics perhaps there is R SQL Python good communication skills analytical skills well through this I have summarized all the points which go like Power BI and Tableau is a must and should tool you should have good analysis of data good communication skill which is a general one to everything there is SQL there's Python or R now if this is not enough we'll go ahead and see one more job description which is Dell now Dell is a popular service based and product based company as well here I have summarized to project management yes project management is also one important skill of course this will not be asked in every company but in most of the companies or several companies this could be asked why be unprepared right let's be prepared on every possible skill that will be asked in every kind of company as a BI analyst there's project management again Power BI Python and SQL as you can see as we go ahead in several job descriptions here Power BI Python and SQL stay constant let's see one more which is Puma again another product based company and here again we have Power BI Excel data analysis SQL and Python so from this we can conclude some of the skills that we need to become a BI analyst so as we have come across we need some database concepts SQL predominantly SQL is a must and should be known by everybody who is applying for BI analyst Excel now Excel is also one more important tool that is to be known by a BI analyst if you have good Excel skills you can actually analyze data now sometimes there are cases where people do not know any kind of data visualization tools did you know that these kind of data
visualization can be done even in Excel so knowing Excel is an added advantage analytical skills of course being an analyst analytical skill is a must and should Power BI now here I can go ahead and add Tableau also the next thing is Python Python is a programming language which you use for your data visualization now to plot graphs or even to make your analysis of data easy you can use Python so usage of Python is very important so make sure you know Python now project management skills here I must mention most of the skills that I mentioned here or perhaps even every skill that I just mentioned here we have courses for everything on our channel I'll speak about it towards the end of today's session but here I had to mention that be it SQL or Microsoft Excel or even analytical skills Power BI Python project management also we have a course on our website now if you think that you don't want to do the course you are good enough and you just want to brush up you can go ahead on our YouTube channel also like I said I'll be speaking about it in the end of today's session now moving forward we have the R language also R again is used predominantly when you have to clean data you have to process data or even plot some visualizations you need to be using R Tableau again one of the data visualization tools just like Power BI at this point I would have to mention that here it is not just these skills you need to be aware of certain tools as a BI analyst now in the job descriptions if you are an experienced person there are chances that you will be asked for tool based experience now speaking of that we will address some of the tools here companies use different tools now here I have spoken about SAP BusinessObjects datapine MicroStrategy SAS business intelligence then Qlik Sense there's Zoho Analytics Sisense and even Power BI is a tool now these are some of the tools that companies predominantly use some of them are free some of them have trial
basis so you can go ahead and pick any one of them or even you can search on the web and you will find several BI tools so with that we will move on to our roles and responsibilities now as a BI analyst it is very important that you know certain roles and responsibilities prior to starting your preparation so the first thing you need to know is you need to be learning and fully understanding the data landscape in databases and applications so data is your king now everything that you have to work on is in and around data so make sure you understand and study data completely the next thing is using and developing data collection processes again it revolves around data itself so it will be around understanding data the next thing will be around collecting data so everything will revolve around data itself the next thing is reviewing and validating here you can understand there is a sequence or a step of responsibility that a BI analyst has to follow the next thing is gathering end user reporting and dashboard requirements finally you have to review validate and maintain customer data as collected also maintaining is very very important make sure you will always maintain good data and also your projects [Music] now let us discuss BI analyst interview questions and answers as already mentioned they are divided into two categories general questions and technical questions first let's have a look at the general questions that you may be asked for example you may be asked about your experience in business intelligence here you should talk about your educational specialization if it is related to BI your expertise in it your career experience with it and mention the field of business intelligence in which you specialize then describe your career journey also remember to show a keen interest in the company's business sphere then you may be asked about your opinion on agile software development for BI projects firstly you should define agile agile software
development refers to the software development methodologies that involve iterative development where requirements and solutions evolve through collaboration between self-organizing cross-functional teams it advocates adaptive planning evolutionary development early delivery and continual improvement the benefits of agile are that it is more collaborative saves on time and since the teams directly work with the client they provide the clear outcome in an incremental way but many companies still prefer using waterfall methodology especially for projects with concrete timelines and well-defined deliverables if you have worked in agile talk about your experience then you may be asked about your experience in SDLC well SDLC refers to software development life cycle you should define SDLC software development life cycle is a process used by the software industry to design develop and test high quality software it consists of a detailed plan describing how to develop maintain replace and alter or enhance specific software you should talk about your experience in SDLC if you don't have experience in it you can talk about your interest in working more with it now let's have a look at some examples of technical questions that you may be asked first you may be asked to define business intelligence business intelligence is the field that aims to extract meaningful and useful insights from data and represent those insights in the form of reports graphs maps summary cards and other such formats they help make strategic and tactical business decisions next you may be asked about uses of business intelligence the uses of business intelligence include that it helps improve decision making by gathering insights it helps in studying market trends it helps spot weak points in business operations and it helps improve the internal efficiency of the organization next question what is data flow data flow consists of the sources from where we extract the data and the
destinations where we load the data. It also consists of transformations that modify and extend the data, and the paths that link sources, transformations, and destinations. The three important categories in a data flow are data sources, transformations, and destinations. Next question: what is control flow? A control flow consists of the tasks and containers that execute when the package runs. To control the order, or define the conditions for running the next task or container in the package control flow, we use precedence constraints, which connect the tasks and the containers in a package. SQL Server Integration Services provides three different types of control flow elements: containers that provide structure in packages, tasks that provide functionality, and precedence constraints that connect the executables, containers, and tasks into an ordered control flow. What is OLAP? OLAP is online analytical processing. It is a computing method that enables users to easily and selectively extract and query data in order to analyze it from different points of view. OLAP business intelligence queries often aid in trend analysis, financial reporting, sales forecasting, budgeting, and other planning purposes. Next, you may be asked to differentiate between OLAP and ETL tools. ETL can extract, transform using the various transformations that are available in the tool, and aggregate the data; the output can be used as an input for the OLAP tool. ETL is the initial part of data warehousing, while OLAP supports online reports after performing certain join operations and creating some cubes. Reporting, dashboards, consolidation, planning, analysis, master data management, workspace, and foundation. You may be asked about the tools used in MSBI. There are three tools. SSIS, or SQL Server Integration Services, is useful for integrating the data coming from different data sources into a data warehouse, and it is generally responsible for [unclear]. SSAS, or SQL Server Analysis Services, is an online analytical processing tool;
it provides online analytical processing, data mining, and reporting tools used in business intelligence, while SSRS stands for SQL Server Reporting Services. Once the data is in its final state, SSRS provides the features necessary to create reports to better understand the data. Next: what is a workflow in SSIS? A workflow is a set of instructions that tells the program how to execute tasks and containers within SSIS packages. Next: what is the difference between the WHERE and HAVING clauses in SQL Server? There are three differences. First, the HAVING clause is used with the GROUP BY clause, while the WHERE clause is used with constructs like SELECT, UPDATE, DELETE, etc. Second, the HAVING clause is applied as a filter on the groups produced by the GROUP BY clause, whereas the WHERE clause is applied to every row in the SELECT, UPDATE, DELETE, etc. construct. Third, when both HAVING and WHERE clauses are used, the HAVING clause is applied after the WHERE clause to further filter the output of the WHERE clause, while the WHERE clause is applied before the HAVING clause to every row in the SELECT statement. Next: what is the difference between a view and a materialized view? First, the definition: a view displays a query's output as a table, while a materialized view is a schema object which can be used to summarize, pre-compute, replicate, and distribute data, for example to construct a data warehouse. All operations performed on a view will affect data in the base table and are subject to the integrity constraints and triggers of the base table, while a materialized view provides indirect access to table data by storing the results of the query in a separate schema object. A view can be used to simplify SQL statements for the user or to isolate an application from any future change to the base table definition. A view can also be used to improve security by restricting access to a predetermined set of rows or columns. Next: the difference between TRUNCATE and DELETE in SQL Server. Firstly, TRUNCATE is a DDL, that is, data definition language, command, while DELETE is a DML, that is, data manipulation language, command. TRUNCATE removes all the records from a table without making a log
entry for individual row deletion, while DELETE removes all or selected records from a table by making a log entry for each individual row deletion. TRUNCATE is faster, while DELETE is slower. TRUNCATE removes all the records from a table, and a WHERE clause or filter condition cannot be used, while DELETE can remove selected records or all records from the table; the WHERE clause, which is optional, is used or not accordingly. TRUNCATE cannot be used on a table if it is referenced by one or more foreign key constraints or if it is marked as enabled for replication, while DELETE can be used in these conditions. Finally, TRUNCATE resets the identity in any of the columns in the table, while DELETE does not reset the identity in any of the columns in a table. Next we have the intermediate-level questions. First: name the components of SSAS. SSAS has five components. First is the OLAP engine, which is used to provide fast ad hoc queries by users. Then drilling, which is the process of exploring details of the data. Pivoting is the process of switching categories of data between rows and columns. Slicing is the process of placing data in rows and columns. The fifth component is dimensional databases, which are used in OLAP. Next, you may be asked to describe the SSAS architecture. The architectural view of SQL Server Analysis Services is based on a three-tier architecture. First is the RDBMS: data from different sources like Excel, databases, text, and others can be pulled with the help of ETL tools into the RDBMS. Second is SSAS: aggregated data from the RDBMS is pushed into SSAS cubes by using Analysis Services projects. The SSAS cubes create an analysis database, and once the analysis database is ready, it can be used for many purposes [unclear] portals, etc. Coming to the next question: describe errors in SSIS. Errors generally occur due to unexpected data values when a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations. Types of errors
are: connection errors, which occur when the connection manager cannot be initialized with the connection string; this applies to data sources and data destinations, along with control flows that use the connection string. Data transformation errors occur while data is being transformed over a data pipeline from source to destination, while expression evaluation errors occur if expressions that are evaluated at run time perform invalid operations. What languages are used in SSAS? The languages used in SSAS are as follows. SQL, which is Structured Query Language, lets you access and manipulate databases. MDX, or Multidimensional Expressions, is a query language for online analytical processing using a database management system; like SQL, it is a query language, in this case for OLAP cubes, and it is also a calculation language with syntax similar to spreadsheet formulas. Next, DMX, or Data Mining Extensions, is a language that you can use to create and work with data mining models in SSAS. You can use DMX to create the structure of new data mining models, to train these models, and to browse, manage, and predict against them. Next we have the Analysis Services Scripting Language, ASSL, which is an extension to XMLA that adds an object definition language and command language for creating and managing Analysis Services structures directly on the server. You can use ASSL in a custom application to communicate with Analysis Services over the XMLA protocol. Next: what is write-back? Write-back is the ability to update source systems, such as databases, to maintain systems of record while staying within the context of a BI application. This is essential to the concept of embedded BI. A good BI application will enable these data updates in a secure and managed way. The business analysis enhancements that are available in Microsoft SQL Server Analysis Services are time intelligence, account intelligence, dimension intelligence, custom aggregation, semi-additive behavior, custom member formulas, custom sorting and uniqueness settings, and dimension write-back. You may also be asked
to write a query. Let's take an example: write a query to extract the first tuple from a set. The query for this will be: SELECT {[Date].[Calendar].[Calendar Year].Members}.Item(0) ON 0 FROM [Adventure Works], where Adventure Works is the database. It will give as output the first tuple from the set, as you can see. Now, under normal circumstances, each level in a hierarchy in Microsoft SQL Server Analysis Services has the same number of members above it as any other member at the same level. In a ragged hierarchy, the logical parent member of at least one member is not in the level immediately above the member. Now let us come to the advanced-level questions. Firstly: what is the difference between DTS and SSIS? DTS stands for Data Transformation Services, while SSIS stands for SQL Server Integration Services. In DTS there is limited error handling, while SSIS provides more complex error handling. For message boxes, DTS uses ActiveX scripts, while in SSIS [unclear]. In DTS there is no deployment wizard, while in SSIS we are provided with an interactive deployment wizard. There are more transformations in SSIS. Finally, in DTS there is no BI functionality, whereas in SSIS there is complete integration with BI. Which lookup cache modes are available in SSIS? There are three lookup cache modes in SSIS. The first is the full cache mode, which is the default cache mode. It will cause the lookup transformation to retrieve and store in the SSIS cache the entire set of data from the specified lookup location. As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS. Next we have the partial cache mode. When using this, lookup values will still be cached, but only as each distinct value is encountered in the data flow. Initially, each distinct value will be retrieved individually from the specified source and then cached. To be clear, this is a row-by-row lookup for each distinct key value. Next, the no
cache mode will not add any values to the lookup cache in SSIS. As a result, every single row in the pipeline data set will require a query against the lookup source. Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused, the pipeline data set is small and is not expected to grow, and there are expected to be very few or no duplicates of the key values in the pipeline data. How do you log SSIS executions? SSIS includes logging features that write log entries when run-time events occur, and we can also write custom messages. This is not enabled by default. Integration Services supports a diverse set of log providers and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, the Windows Event Log, or XML files. Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not. How do you deploy SSIS packages? There are two ways in which SSIS packages can be deployed. The first is that when the SSIS project is built, it also creates a deployment manifest file; after running this file, you can deploy to either the file system or SQL Server. The second method is to import the package from SSMS, from the file system or SQL Server. Now, in MDX you can create a calculated member, which is a member based on other members. You can create two different kinds of calculated members: ones that are measures and ones that are not. Remember that a measure is considered to be a member of the Measures dimension. To create a calculated member that intersects all measures, you would put it in a dimension other than Measures, because a member in a dimension cannot intersect its own relatives in that dimension. Next: what is the use of the property called
non-empty behavior while creating a new member in a cube? Non-empty behavior is used for ratio calculations. An MDX expression will return an error if the denominator is empty, just as it would if the denominator were equal to zero. By selecting one or more measures for the non-empty behavior property, we are establishing the requirement that each selected measure first be evaluated before the calculated expression is evaluated. If each selected measure is empty, then the expression is also treated as empty and no error is returned. How are cubes implemented in SSAS? An OLAP cube, also called a multidimensional cube or hypercube, is a data structure in SQL Server Analysis Services which is built using OLAP databases to allow near-instantaneous analysis of data. The useful feature of an OLAP cube is that the data in the cube can be in an aggregated form. SSAS cubes are created in a step-by-step process using the cube builder. This brings us to the end of today's session. I hope you now have a better understanding of the business intelligence analyst role and the interview questions.
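The WHERE versus HAVING distinction from the interview questions above can be sketched in a few lines of Python. This is only a minimal illustration and was not shown in the session: it assumes SQLite through Python's built-in sqlite3 module rather than SQL Server, and the sales table, its columns, and the where_vs_having function are made-up names for the example.

```python
import sqlite3


def where_vs_having():
    """Run three queries on a tiny in-memory table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 50), ("east", 200), ("west", 300), ("west", 20), ("north", 40)],
    )

    # WHERE filters individual rows before any grouping happens.
    where_rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "WHERE amount >= 50 GROUP BY region ORDER BY region"
    ).fetchall()

    # HAVING filters the groups produced by GROUP BY.
    having_rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "GROUP BY region HAVING SUM(amount) >= 100 ORDER BY region"
    ).fetchall()

    # With both clauses, WHERE runs first and HAVING filters what is left.
    both_rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "WHERE amount >= 50 GROUP BY region "
        "HAVING SUM(amount) >= 260 ORDER BY region"
    ).fetchall()

    conn.close()
    return where_rows, having_rows, both_rows


if __name__ == "__main__":
    for label, rows in zip(("where", "having", "both"), where_vs_having()):
        print(label, rows)
```

In the first query the west total is 300 because the 20 row was dropped before grouping, while in the second query it is 320 because every row was grouped first and only the group totals were filtered.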
Info
Channel: edureka!
Views: 39,785
Keywords: yt:cc=on, Business Analyst Full Course, Business Analyst Training For Beginners, Business Analyst Course For Beginners, business analyst training, business analyst course, business analyst, Business Analyst Training For Beginners 2023, Buisness analyst tutorial for beginners, business analyst tutorial, how to become a business analyst, business intelligence analyst, buisness analyst edureka, edureka
Id: czymrnQV2p4
Length: 203min 32sec (12212 seconds)
Published: Tue Aug 01 2023