Hierarchical Cluster Analysis using SPSS with Example

Video Statistics and Information

Captions
Multivariate data analysis: cluster analysis using SPSS. Cluster analysis is an exploratory data analysis tool for organizing observed data into meaningful clusters based on a combination of variables.

For this cluster analysis I'm using an example of job satisfaction. In this example I have five variables: age, work experience, love towards the job, expertise in the job, and job satisfaction. The purpose of cluster analysis is to group the cases into different clusters. In this example I have data from a total of 384 respondents, and I want to divide these 384 respondents into different clusters based on these five variables. Of the five variables, two are continuous and the remaining three are Likert-scale items. The first is age, the second is work experience, and the third, fourth, and fifth are Likert-scale items. I asked statements like "I love my job", "I am expert in my job", and "I am satisfied with the job", and the respondents answered these three Likert-scale items on a scale from strongly agree to strongly disagree.

How we divide the respondents into clusters based on these questions is the main purpose of cluster analysis. The researcher does not know in advance how many clusters will be formed or which variables are important in sorting the respondents into groups.

When we want to use cluster analysis, there are two methods: the first is the hierarchical cluster method and the second is the K-means cluster method. We can see this under Analyze: when we go to Classify, we have K-Means Cluster and Hierarchical Cluster. Basically, hierarchical clustering is used when the researcher wants to explore how many clusters will form and has no idea of the number in advance. Once the researcher has an idea that the data will form two clusters or three clusters, K-means clustering can be used. So first I'll use hierarchical clustering to divide the 384 respondents into clusters, and then check the same thing with K-means clustering.

Before I go on, I'll just show the data. This is the data I have, and the total number of cases, that is respondents, is 384. I want to do this cluster analysis with the minimum number of steps and in a way that makes it easy to understand how to conclude how many clusters will be formed. With that objective I am doing the analysis.

I go to Analyze, then Classify, then Hierarchical Cluster. If needed, reset the dialog, then bring all the variables to the Variables side. Next, go to Statistics. In Statistics I'll select only one check box, Agglomeration schedule; I'll explain what the agglomeration schedule is during the analysis. That's it, Continue. Don't go for Plots. In Plots we have the dendrogram, but instead of using the dendrogram I'll explain the result with the agglomeration schedule, which is the easy way to understand how many clusters will form. So I am not going to use any plot, and I remove the check mark for the plot.

For Method, the cluster method I'll use is Ward's method. This is a general method used for making clusters; we can use other methods also. For Interval, choose squared Euclidean distance. So these are the two important settings you must remember: Ward's method, and squared Euclidean distance for the interval. Continue. I am not going to save the cluster membership once the clusters are formed, because initially I only check how many clusters will form; later I can go for K-means clustering and save the assignment of each respondent to a cluster. So I did just two important things: in Statistics, the agglomeration schedule, and in Method, these two settings. Then I click OK, and the analysis starts.

Here we get the case processing summary: there are 384 respondents in total, 100 percent of the data is used, and nothing is missing. The important thing is the agglomeration schedule. If you look at this schedule, there is a column called Stage, which runs from 1 to 383. The total number of cases is 384 and the number of stages is 383. It means we will start from the bottom and come up to see how many clusters should be considered. The system starts by taking each respondent as a cluster, and the schedule shows how many clusters should be considered in the analysis. For that purpose we concentrate on the column called Coefficients. So observe only these two columns, Stage and Coefficients.

I move to the bottom of the table. There are 383 stages, and this is the coefficient. At the last stage, 383, the coefficient is about 8,100, and at stage 382 it is about 4,200, a difference of nearly 4,000. If we move from 382 to 381, the coefficient drops from about 4,200 to 2,386, a difference of nearly 2,000. But when I move from 381 to 380, the difference has reduced to only about 300. So between the first two stages the difference is about 4,000, between the next two it is nearly 2,000, and then it suddenly comes down to about 300. I am saying "nearly" because the point is that the difference is reducing drastically. So we look for where the last major difference happens. This major jump happens between about 4,200 and 2,386; if I go further, the jump is small. So stage 381 is taken as the stage of major difference in the coefficient. This is very important to understand.

I'll show the same thing in the form of a diagram, which I drew separately from SPSS. This diagram is not generated automatically in SPSS; you have to draw it yourself. In this diagram the x-axis is the stage and the y-axis is the coefficient. From stage 383 to 382 the coefficient comes down from about 8,000 to around 4,000, and from 382 to 381 it drops steeply to around 2,000. From there on the slope is small. We call this an elbow, and you must look for where the elbow forms. Here the elbow has formed at stage 381; I have drawn a red line to show where. Remember this stage: 381 is where the steep slope ends, and from here on the differences are very small.

I have shown this as a small calculation. In my case the total number of cases is 384 and the stage where the elbow has formed is 381. The number of clusters is given by this calculation: number of cases minus the stage number of the elbow. So the number of clusters in my example is 384 - 381 = 3. Using hierarchical cluster analysis, the 384 cases are divided into three clusters. That is the conclusion in this example: in total we get three clusters. Now I am going to move to the next stage, where I use K-means clustering to assign each case to a cluster. I am going to make each case into
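The elbow reading and the "cases minus stage" calculation described in the captions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients are the rounded values read aloud in the video, the value for stage 380 is inferred from the "difference of about 300" remark, and the ratio rule in `elbow_stage` is a hypothetical stand-in for the speaker's visual reading of the plot, not something SPSS computes.

```python
# Rounded coefficients from the last rows of the agglomeration schedule, as
# spoken in the video. Stage 380 is an assumption (2,386 minus roughly 300).
N_CASES = 384
coefficients = {380: 2086.0, 381: 2386.0, 382: 4200.0, 383: 8100.0}


def coefficient_jumps(coeffs):
    """Map each stage to the increase in coefficient when moving to the next stage."""
    stages = sorted(coeffs)
    return {s: coeffs[s + 1] - coeffs[s] for s in stages[:-1]}


def elbow_stage(coeffs):
    """Illustrative heuristic: the stage whose outgoing jump is largest
    relative to the jump leading into it (assumes consecutive stages)."""
    jumps = coefficient_jumps(coeffs)
    stages = sorted(jumps)
    best_stage, best_ratio = None, 0.0
    for prev, cur in zip(stages, stages[1:]):
        if jumps[prev] <= 0:
            continue  # skip flat segments to avoid dividing by zero
        ratio = jumps[cur] / jumps[prev]
        if ratio > best_ratio:
            best_stage, best_ratio = cur, ratio
    return best_stage


def number_of_clusters(n_cases, stage):
    """Rule from the video: clusters = number of cases - elbow stage."""
    return n_cases - stage


stage = elbow_stage(coefficients)
print("jumps:", coefficient_jumps(coefficients))
print("elbow stage:", stage)
print("clusters:", number_of_clusters(N_CASES, stage))
```

With these approximate values the sketch picks stage 381 and reports three clusters, which agrees with the conclusion reached in the video. On a real schedule the full Coefficients column should be used, and the elbow is still best confirmed by looking at the plot.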
Info
Channel: My Easy Statistics
Views: 25,211
Rating: 4.687151 out of 5
Keywords: Multiple variate Analysisi, Multivariate Data Analysis, Multivariate Analysis, Cluster Analysis, Hierarchical Cluster Analysis, K means Cluster analysis, multivariate data analysis lecture, multivariate data analysis in r, multivariate data analysis spss, multivariate data analysis youtube, multivariate data analysis examples, hierarchical cluster analysis using spss with example, multivariate data analysis in hindi, data analysis, hierarchical clustering, visualize clusters
Id: z6RjlBAtSGQ
Length: 13min 24sec (804 seconds)
Published: Sat Jul 16 2016