Concerns with Cluster Analysis


Limitations of cluster analysis for forming market segments

Cluster analysis is simply a statistical process for grouping “related” data points into sets. Note the word “related” – the data may be genuinely related, or may only appear related by chance. Either way, keep in mind that there is no underlying marketing logic in the statistical calculation itself. This means that there are potential pitfalls in using cluster analysis for market segmentation purposes.

Cluster analysis relies upon suitable consumer data

The first and most significant limitation of cluster analysis for a marketer is that you need to have access to appropriate consumer information. If you work for a service firm, for example, you may have a reasonable customer database which you then can utilize to run cluster analysis and identify market segments. And larger firms will typically have access to suitable marketing research survey data.

However, most firms, especially smaller and newer ones, will not have access to appropriate data and will not be able to utilize cluster analysis at all.

It’s just numbers – not marketing logic

Another limitation is that cluster analysis is simply a statistical technique – it assumes no underlying knowledge of the market or of how consumers may behave. In other words, it just clusters the data around a series of central points – and the resulting groups may or may not make sense once the analysis has been undertaken. The skill in using the technique lies not in running the analysis, but in interpreting and using the output to determine suitable market segments and then a successful marketing strategy targeting one or more of these segments.

Different results – same data?

Some forms of cluster analysis will generate slightly different results each time the statistical analysis is run. This can happen because there is often no set way to approach the data. At the start, especially when there are many variables to consider, some form of random or arbitrary approach is used to “guess” the possible locations (means) of the various cluster centers (that is, market segments).

Therefore, depending upon the initial starting point (seed) chosen for each of the segments, the outcome can be somewhat different. This can happen even if you run the same data set using the same statistical software – because the underlying statistical approach may use a random starting point.
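The effect of the starting point can be illustrated with a minimal sketch in Python. It assumes a plain k-means style of clustering, which is one common form of cluster analysis and not necessarily the method used by any particular software package. The four data points are made up for illustration, and the function names kmeans and random_start are hypothetical.

```python
import random

def kmeans(points, initial_centers, max_iterations=100):
    """Plain k-means on 2-D points, starting from the given centers."""
    centers = list(initial_centers)
    for _ in range(max_iterations):
        # Assign every point to its nearest center (squared distance).
        clusters = [[] for _ in centers]
        for x, y in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: (x - centers[i][0]) ** 2 + (y - centers[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Move each center to the mean of the points assigned to it.
        new_centers = []
        for old_center, members in zip(centers, clusters):
            if members:
                new_centers.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centers.append(old_center)  # keep a center that attracted no points
        if new_centers == centers:
            break  # nothing moved, so the result is stable
        centers = new_centers
    return sorted(centers)

def random_start(points, k, seed):
    """Pick k starting centers at random; the same seed gives the same pick."""
    return random.Random(seed).sample(points, k)

# Made-up illustrative data: four "consumers" measured on two variables.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Same data, two different starting points, two different stable outcomes.
result_a = kmeans(points, [(0, 0), (4, 0)])
result_b = kmeans(points, [(0, 0), (0, 1)])
print(result_a)  # [(0.0, 0.5), (4.0, 0.5)]
print(result_b)  # [(2.0, 0.0), (2.0, 1.0)]

# A random start with a fixed seed is repeatable from run to run.
print(kmeans(points, random_start(points, 2, seed=7)) ==
      kmeans(points, random_start(points, 2, seed=7)))  # True
```

In this toy case the first start splits the points into a left group and a right group, while the second splits them into a bottom group and a top group. Both outcomes are stable, so the calculation has no way of knowing which is the better set of segments. Fixing the seed of the random start is one simple way to make repeated runs agree.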

But this spreadsheet will deliver consistent results

For students new to the concept of cluster analysis, changing outcomes from the same data can be quite confusing and somewhat frustrating. Therefore, the Excel cluster analysis template available on the website uses a random, but somewhat structured, approach to selecting the initial data points – which means that consistent outcomes are provided, making it a more valuable teaching and learning tool. Please see how the Excel template works for more information.