 Cluster analysis (in marketing)

Cluster analysis is a class of statistical techniques that can be applied to data that exhibit “natural” groupings. Cluster analysis sorts through the raw data and groups them into clusters. A cluster is a group of relatively homogeneous cases or observations. Objects in a cluster are similar to each other. They are also dissimilar to objects outside the cluster, particularly objects in other clusters.
The diagram below illustrates the results of a survey that studied drinkers’ perceptions of spirits (alcohol). Each point represents the results from one respondent. The research indicates there are four clusters in this market.
[Figure: Illustration of clusters]
Another example is the vacation travel market. Recent research has identified three clusters or market segments: 1) the demanders, who want exceptional service and expect to be pampered; 2) the escapists, who want to get away and just relax; 3) the educationalists, who want to see new things, go to museums, go on a safari, or experience new cultures.
Cluster analysis, like factor analysis and multidimensional scaling, is an interdependence technique: it makes no distinction between dependent and independent variables. The entire set of interdependent relationships is examined. It is similar to multidimensional scaling in that both examine interobject similarity by examining the complete set of interdependent relationships. The difference is that multidimensional scaling identifies underlying dimensions, while cluster analysis identifies clusters. Cluster analysis is the obverse of factor analysis. Whereas factor analysis reduces the number of variables by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or cases by grouping them into a smaller set of clusters.
In marketing, cluster analysis is used for:
 Segmenting the market and determining target markets
 Product positioning and new product development
 Selecting test markets (see experimental techniques)
Basic procedure
 Formulate the problem: select the variables to which you wish to apply the clustering technique
 Select a distance measure. There are various ways of computing distance:
   Squared Euclidean distance: the sum of the squared differences in value for each variable
   Manhattan distance: the sum of the absolute differences in value for each variable
   Chebyshev distance: the maximum absolute difference in value for any variable
   Mahalanobis distance: a distance that accounts for the correlations between the variables (through their covariance matrix). This is an important measure since it is unit invariant (it can figuratively compare apples to oranges)
 Select a clustering procedure (see below)
 Decide on the number of clusters
 Map and interpret the clusters and draw conclusions. Illustrative techniques like perceptual maps, icicle plots, and dendrograms are useful
 Assess reliability and validity. Various methods can be used:
   repeat the analysis but use a different distance measure
   repeat the analysis but use a different clustering technique
   split the data randomly into two halves and analyze each part separately
   repeat the analysis several times, deleting one variable each time
   repeat the analysis several times, using a different order of cases each time
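As a rough illustration of the distance measures named in the procedure, the following Python sketch computes the squared Euclidean, Manhattan, and Chebyshev distances between two cases. The function names and the sample scores are assumptions made for this example and do not come from any particular statistics package. The Mahalanobis distance is left out because it also needs the covariance matrix of the variables.

```python
def squared_euclidean(a, b):
    """Sum of the squared differences in value for each variable."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    """Sum of the absolute differences in value for each variable."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute difference in value over all variables."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical scores of two respondents on three rating variables.
respondent_a = (1.0, 2.0, 3.0)
respondent_b = (4.0, 0.0, 3.0)

print(squared_euclidean(respondent_a, respondent_b))  # 13.0
print(manhattan(respondent_a, respondent_b))          # 5.0
print(chebyshev(respondent_a, respondent_b))          # 3.0
```

The three measures can rank pairs of cases differently, which is one reason the reliability checks suggest repeating the analysis with a different distance measure.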
Clustering procedures
There are several types of clustering methods:
 Non-hierarchical clustering (also called k-means clustering)
   first determine a cluster center, then group all objects that are within a certain distance
   examples:
     Sequential threshold method: first determine a cluster center, then group all objects that are within a predetermined threshold from the center. One cluster is created at a time
     Parallel threshold method: several cluster centers are determined simultaneously, then objects that are within a predetermined threshold from the centers are grouped
     Optimizing partitioning method: first a non-hierarchical procedure is run, then objects are reassigned so as to optimize an overall criterion
 Hierarchical clustering
   objects are organized into a hierarchical structure as part of the procedure
   examples:
     Divisive clustering: start by treating all objects as if they are part of a single large cluster, then divide the cluster into smaller and smaller clusters
     Agglomerative clustering: start by treating each object as a separate cluster, then group them into bigger and bigger clusters
       examples:
         Centroid methods: the distance between two clusters is the distance between their centers, and the closest clusters are merged at each step (a centroid is the mean value for all the objects in the cluster)
         Variance methods: clusters are generated that minimize the within-cluster variance
           example:
             Ward's procedure: clusters are generated that minimize the squared Euclidean distance to the cluster means
         Linkage methods: cluster objects based on the distance between them
           examples:
             Single linkage method: cluster objects based on the minimum distance between them (also called the nearest neighbour rule)
             Complete linkage method: cluster objects based on the maximum distance between them (also called the furthest neighbour rule)
             Average linkage method: cluster objects based on the average distance between all pairs of objects, where each pair consists of one object from each of the two clusters
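The two families of procedures can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: `k_means` stands in for a non-hierarchical procedure that reassigns objects to reduce the squared distance to the cluster means, and it expects the starting centers to be supplied by the caller. `agglomerate` is a naive agglomerative procedure that supports the single, complete, and average linkage rules. All function names, parameters, and sample cases are hypothetical.

```python
from itertools import combinations


def squared_euclidean(a, b):
    """Sum of the squared differences in value for each variable."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centers, max_iter=100):
    """Assign each case to its nearest center, recompute the centers
    as cluster means, and repeat until the centers stop changing."""
    centers = [tuple(c) for c in centers]
    groups = []
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: squared_euclidean(p, centers[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return groups, centers


def linkage_distance(c1, c2, linkage):
    """Distance between two clusters under the chosen linkage rule."""
    dists = [squared_euclidean(p, q) ** 0.5 for p in c1 for q in c2]
    if linkage == "single":      # nearest neighbour rule
        return min(dists)
    if linkage == "complete":    # furthest neighbour rule
        return max(dists)
    if linkage == "average":     # mean over all between-cluster pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, n_clusters, linkage="single"):
    """Start with one cluster per case and repeatedly merge the two
    closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > max(n_clusters, 1):
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage_distance(
                clusters[pair[0]], clusters[pair[1]], linkage),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Hypothetical cases measured on two variables.
cases = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(k_means(cases[:4], centers=[(0, 0), (5, 5)]))
for method in ("single", "complete", "average"):
    print(method, agglomerate(cases, 3, method))
```

On well-separated data such as these sample cases the three linkage rules give the same grouping. On real survey data they can disagree, which is why repeating the analysis with a different clustering technique is a useful reliability check.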
See also
 marketing
 marketing research
 factor analysis
 multidimensional scaling
 quantitative marketing research
 positioning
 perceptual mapping