Why does cluster analysis sometimes give different results?

 

Cluster analysis is a purely statistical technique. It makes no assumptions about the data, and has no knowledge of it, in terms of marketing implications; it simply groups (clusters) data points based on their proximity to one another.

The approach to cluster analysis used in the free template on this website is known as k-means clustering. While there are many variations (algorithms) that can be used in cluster analysis, k-means generally requires the initial center of each proposed market segment to be randomly selected (for more information, please review the articles: a simple guide to cluster analysis and how the template calculation works).

Random starting points may influence results

Because of this randomized start to the cluster analysis process, slightly different market segment centers (averages/means) may be formed from one run to the next. K-means refines the centers step by step from wherever it starts, so different starting points can settle on different final groupings. This can occur even when the same data set is re-run in the same statistical package with the same number of market segments.
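To make this concrete, here is a minimal sketch of k-means in plain Python. It is an illustration only, not the calculation used by the template or by any particular statistical package. The function name `kmeans`, the `seed` parameter, and the four-customer data set are all assumptions made up for this example.

```python
import random

def kmeans(points, k, seed, max_iter=100):
    """Basic k-means on 2-D points, starting from k randomly chosen data points."""
    rng = random.Random(seed)          # the seed controls the random start
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        # Step 1: assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centers[i][0]) ** 2 + (y - centers[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Step 2: move each center to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:     # nothing moved, so the process has settled
            break
        centers = new_centers
    return sorted(centers)

# Hypothetical data: four customers measured on two variables.
customers = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Re-run the same data into two segments with 50 different random starts.
solutions = {tuple(kmeans(customers, k=2, seed=s)) for s in range(50)}
for centers in sorted(solutions):
    print(centers)
```

On this toy data there are two stable outcomes: a left/right split with centers (0, 0.5) and (4, 0.5), and a top/bottom split with centers (2, 0) and (2, 1). Which one a given run settles on depends only on the starting centers, while re-using the same seed always reproduces the same result.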

A structured random approach for the free Excel template

However, please note that because this free template is essentially designed as a teaching and learning tool, it uses a “structured random” start for each of the market segments (clusters). This should mean that the same outputs and graphs are generated each time the same data set is used. Of course, if you change the variables in the data set, the cluster analysis outputs will be revised accordingly.
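The exact rule the template uses for its structured start is not described here, so the following Python sketch shows just one possible way to make the starting centers repeatable. It assumes a made-up rule: sort the records and take evenly spaced ones as the initial centers. The names `structured_start` and `kmeans_structured` and the six-customer data set are hypothetical.

```python
def structured_start(points, k):
    """Pick k starting centers with no randomness: sort the records, take evenly spaced ones.
    (Assumed rule for illustration; not necessarily what the template does.)"""
    ordered = sorted(points)
    n = len(ordered)
    return [ordered[(i * (n - 1)) // max(k - 1, 1)] for i in range(k)]

def kmeans_structured(points, k, max_iter=100):
    """k-means on 2-D points that always begins from the structured start above."""
    centers = structured_start(points, k)
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centers[i][0]) ** 2 + (y - centers[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Move each center to the mean of its points (empty clusters stay put).
        new_centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Hypothetical data: six customers measured on two variables.
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

first_run = kmeans_structured(customers, k=2)
second_run = kmeans_structured(list(reversed(customers)), k=2)
print(first_run == second_run)  # True: same data, same segments, every time
```

Because the starting centers are derived from the data itself rather than drawn at random, re-running the same data set (even with the rows in a different order) produces the same segment centers.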