Cluster Analysis In SPSS (Hierarchical, Non-hierarchical & Two-step)

Video Statistics and Information

Reddit Comments

I don't know what either of these two things are.

👍︎︎ 2 👤︎︎ u/DrDerekBones 📅︎︎ Mar 26 2013 🗫︎ replies

Great video on Cluster Analysis In SPSS

👍︎︎ 1 👤︎︎ u/milehigh899 📅︎︎ Mar 26 2013 🗫︎ replies
Captions
Salaam alaikum, peace be on all of you. This is Musa Mel, and in this video I'll talk about how to conduct cluster analysis in SPSS. But before we start doing that, let's try to understand what cluster analysis is. Cluster analysis is a statistical technique that's used to group, let's say, people together, or respondents together, or firms together, so it's a grouping technique. Now, there are two main types of cluster analysis: one is hierarchical cluster analysis and the other is non-hierarchical cluster analysis. Within hierarchical cluster analysis we have different methods, and within non-hierarchical cluster analysis we again have different methods.

Let's take a look at where the options for hierarchical and non-hierarchical cluster analysis are in SPSS. We'll click on Analyze, then we'll go to Classify, and here we have three clustering techniques. The first one is the hierarchical clustering technique, which I spoke about, but we can see that there is no entry called non-hierarchical clustering here. That's because k-means is a non-hierarchical clustering technique, and the reason it's shown here is that it's the most commonly used, most popular non-hierarchical clustering technique, so in SPSS it stands in for non-hierarchical clustering.

The third one here is the two-step clustering technique. It's kind of a combination of hierarchical and non-hierarchical clustering, in that pre-clusters are formed first from the respondents or from the firms, and later on those pre-clusters are used as cases in the second step, which is based on the hierarchical clustering technique. So the first step is kind of non-hierarchical clustering and the second step is hierarchical clustering. Further, in the two-step clustering technique, and this is the only technique like that, you can use both categorical as well as metric variables. So that's the two-step clustering technique.

Let's take a look at k-means first. In the k-means clustering technique, we assign a predetermined number of clusters that we want. So let's say, given the literature support we have, we want a three-cluster solution. After that, we tell SPSS to divide, let's say, respondents on the basis of five criteria. Let's say those five criteria are these: one, two, three, four and five. So I'm telling SPSS: divide my respondents, label them by case ID, on the basis of these five criteria, and I'll put them here under Variables. The next thing I do is click on Iterate and ensure that 10 is the maximum number of iterations here; that's the normal one. Under Save, if you want to save the cluster memberships, which means saving them as a variable in SPSS, you click on this one, and then we click on Continue. We'll click on Options, and if we want we can have an ANOVA table, and if we want we can have cluster information for each case, so that we find out which respondent, which firm or which person is put in which cluster. But that's only if you really want that information; it'll be a long table. We'll click on Continue here, then on OK, and we have the result here.

We can see that we have a three-cluster solution here, and these are the means. So this is how we have obtained three clusters. We can see here that in the first cluster we have 72 respondents, in the second cluster 163 respondents and in the third cluster 104 respondents; this is the total number of respondents. Now, if we want to see those clusters, we can also see them in Variable View here. This is the cluster variable that was created just now, and within this variable there will be three clusters. We can go to Analyze, Descriptive Statistics, Frequencies, and we can check those clusters here. See, we got three clusters, again the same information: 72 respondents in the first cluster, 163 in the second and 104 in the third. So that's briefly what k-means clustering is: you predetermine a number of clusters and then you divide your data, on some parameters, into that number of clusters.

The second clustering technique is the hierarchical clustering technique, which is a very popular clustering technique. So we click on Hierarchical Cluster, and here again we tell SPSS to divide our respondents and label them by case, and the criteria we tell SPSS to divide those respondents on are again these five; I'll put them here. I'll click on Statistics and ensure that Agglomeration schedule is selected, then Continue. Under Plots, I'll ensure that Dendrogram is selected; it's kind of a visual way of looking at your clusters. Then Method: like I said before, under the hierarchical clustering technique there are different methods. Most commonly Ward's method is used, and here we choose Euclidean distance. Under Transform Values, under Standardize, we normally choose either z-scores or range -1 to 1, and then we click on Continue.

Again, if we want to save the cluster membership, we can choose the options here, and the beauty of hierarchical clustering is that you can have a list of different clusterings. For example, you can ask for a single solution. That doesn't mean a one-cluster solution, which wouldn't make much sense; when I say single solution, I mean I can type here that I want three clusters. Or I can ask for, let's say, a range from two-cluster to five-cluster solutions, which means I'll have two-cluster, three-cluster, four-cluster and five-cluster solutions. But here, let's say we want only a three-cluster solution. We'll click on Continue and then OK. It's still running the clustering, and here is the agglomeration schedule, telling us which cases or which respondents are put in which clusters; it's a long table. And this is the dendrogram, like I said, a visual way of looking at your clusters.

Now, we can of course go about interpreting it, but here are the clusters that were created; it says we have created three clusters. Let's look at those clusters. I'll run a frequencies test here, and there we go: we have three clusters here. By using the hierarchical clustering technique and Ward's method, 136 respondents were put in the first cluster, 130 in the second cluster and 73 in the third cluster. So this is the technique that we used for clustering our data.

The third and last type of clustering I would like to mention is the two-step clustering technique. It's a very interesting technique and a very useful one. Like I said before, you can put categorical variables in here, and you can put continuous variables in here as well. So again I'll take the same five criteria and put them under continuous variables. I don't want to put any categorical variables here; normally categorical variables would be like gender or type of industry, and so on and so forth. Here I tell SPSS that the maximum number of clusters I want is five. I'll click on Options; I'll not change anything here. I'll click on Output, and if I want to see how the clusters differ across industries, then I can put that under the evaluation fields here. If I want to create cluster membership, meaning I want to save the variable in SPSS, I'll click on Create cluster membership variable. Then I can click Continue and OK.

This is the output I have. It tells me here that I divided my data on the basis of five parameters (remember those five continuous variables), it tells me that it was able to generate two clusters, and it tells me that the cluster quality is fair, close to good, so it's okay. To get further information out of this, I'll double-click on this chart here. When I do that, this window pops up on the right-hand side, and it tells me the clusters that I obtained: 63.1 percent of the respondents were put in cluster 1 and 36.9 percent were put in cluster 2.

I can analyze it further. I can click on Predictor Importance and see which predictors were important, so OC dot ha is the most important criterion in creating these clusters. I can remove type of industry here by sliding it, and then I have a better view of the importance of the predictors. Now I can go to the left-hand side here, click on Model Summary and choose Clusters, and this one gets interesting. Here it tells me that there are two clusters, cluster 1 and cluster 2. In cluster 1, this is the mean of the first parameter that I chose, this is the mean of the second parameter, and so on and so forth. In the second cluster, we can see that the means are low in comparison to the means of the parameters in cluster 1.

We can of course click on Display here, and when we click on Evaluation Fields, up comes type of industry; remember, this is for when we want to see the differences across industries. So I'll click on OK, and then it gives me a new chart. Here I can see that the most frequently occurring category in cluster 2 is 4, and the most frequently occurring category in cluster 1 is 2, which means industry 2. We need to go back to our Data View and Variable View and find out what 2 means, which industry it signifies, because this is a categorical variable. Further, we can of course explore these options; they are quite interesting options and easy to understand. So this is what two-step clustering is about.

To recap the different clustering techniques: again, there are two types of clustering techniques, hierarchical and non-hierarchical. In hierarchical clustering techniques, what happens is that the algorithm initially creates a number of clusters equal to the number of cases or respondents, and then after that it goes on grouping them together on the basis of similarity. In the case of non-hierarchical clustering techniques, what happens is that seeds are created, which means the categories, or clusters, are created first, and then the cases or respondents are put in those clusters. So these are two different techniques. The third technique we spoke about was two-step clustering, which involves using both categorical as well as non-categorical variables to classify our data. So that was a brief overview of cluster analysis and different clustering techniques. I hope this video has been useful. Thanks for watching, bye-bye.
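
As a rough illustration of the k-means idea described in the captions (fix the number of clusters in advance, then assign cases to the nearest cluster centre), here is a minimal Python sketch. It is not SPSS's own procedure: the function name `kmeans`, the toy data (two criteria instead of five) and the farthest-point way of picking starting centres are all assumptions made for the example. Only the three-cluster target and the cap of 10 iterations mirror the walkthrough.

```python
import math

def kmeans(points, k, max_iter=10):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    # Assumed initialisation: start from the first case, then repeatedly add
    # the case that lies farthest from every centre chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )

    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each case joins the cluster with the nearest centre.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # memberships stopped changing, so the solution is stable
        labels = new_labels
        # Update step: move each centre to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical respondents scored on two criteria (not the video's data).
data = [
    (1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
    (9.0, 1.0), (9.2, 1.1), (8.9, 0.8),
]
labels, centers = kmeans(data, k=3)

# Equivalent of the frequencies check: how many cases landed in each cluster.
sizes = [labels.count(j) for j in range(3)]
print(labels)
print(sizes)
```

The `sizes` list plays the role of the frequencies table in the video: it shows how many cases were placed in each cluster.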
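
The hierarchical procedure from the captions (standardise the variables, start with every case as its own cluster, then keep merging the most similar pair) can be sketched in Python as follows. This is an illustration under stated assumptions, not the SPSS implementation: the function names `zscores` and `ward` and the toy data are made up for the example. The merge cost used is the standard Ward criterion, the increase in within-cluster sum of squares, which for two clusters of sizes n_a and n_b equals n_a * n_b / (n_a + n_b) times the squared Euclidean distance between their centroids.

```python
import statistics

def zscores(rows):
    """Standardise each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(row, means, sds)) for row in rows]

def ward(rows, n_clusters):
    """Agglomerative clustering with Ward's criterion.

    Returns (labels, schedule); schedule lists each merge as
    (kept cluster id, absorbed cluster id, merge cost).
    """
    # Every case starts as its own cluster: id -> (member indices, centroid).
    clusters = {i: ([i], row) for i, row in enumerate(rows)}
    schedule = []
    while len(clusters) > n_clusters:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                members_a, cen_a = clusters[a]
                members_b, cen_b = clusters[b]
                na, nb = len(members_a), len(members_b)
                sq_dist = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
                cost = na * nb / (na + nb) * sq_dist  # Ward merge cost
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        members_a, cen_a = clusters[a]
        members_b, cen_b = clusters.pop(b)
        na, nb = len(members_a), len(members_b)
        # The merged centroid is the size-weighted mean of the two centroids.
        merged = tuple((na * x + nb * y) / (na + nb) for x, y in zip(cen_a, cen_b))
        clusters[a] = (members_a + members_b, merged)
        schedule.append((a, b, cost))

    # Number the surviving clusters 0, 1, 2, ... in order of their lowest case.
    labels = [None] * len(rows)
    for label, cid in enumerate(sorted(clusters)):
        for i in clusters[cid][0]:
            labels[i] = label
    return labels, schedule

# Hypothetical respondents scored on two criteria (not the video's data).
data = [
    (1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
    (9.0, 1.0), (9.2, 1.1), (8.9, 0.8),
]
standardised = zscores(data)
labels, schedule = ward(standardised, n_clusters=3)
print(labels)
for kept, absorbed, cost in schedule:
    print(kept, absorbed, round(cost, 4))
```

The `schedule` list is a simplified stand-in for the agglomeration schedule: one row per merge, in the order the merges happened. Stopping at a different `n_clusters` gives the other solutions in a range (two clusters, four clusters, and so on).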
Info
Channel: StatArena
Views: 131,440
Rating: 4.8309231 out of 5
Keywords: spss, cluster analysis, clustering techniques, hierarchical, non-hierarchical, data analysis
Id: eW9KnSRDyTk
Length: 14min 10sec (850 seconds)
Published: Sun Jul 15 2012