Cluster Analysis, Fifth Edition
eBook not available for this title
Comment from the Stata technical group

Cluster Analysis, Fifth Edition by Brian S. Everitt, Sabine Landau, Morven Leese, and Daniel Stahl is a popular, well-written introduction and reference for cluster analysis. The book introduces the topic and discusses a variety of cluster-analysis methods. It also offers a wealth of practical advice—for example, how best to visualize clusters, how (and whether) to select and transform variables, how to choose among the clustering methods, and how to compare the results of different cluster analyses. Several examples illustrate the discussion. Among the many updates in the fifth edition is a complete rewrite of the chapter on cluster analysis using mixture models, and a new chapter has been added on analyzing structured data using mixture models.
Table of contents

Preface
Acknowledgement
1. An introduction to classification and clustering
1.1 Introduction
1.2 Reasons for classifying
1.3 Numerical methods of classification—cluster analysis
1.4 What is a cluster?
1.5 Examples of the use of clustering
1.5.1 Market research
1.5.2 Astronomy
1.5.3 Psychiatry
1.5.4 Weather classification
1.5.5 Archaeology
1.5.6 Bioinformatics and genetics
1.6 Summary
2. Detecting clusters graphically
2.1 Introduction
2.2 Detecting clusters with univariate and bivariate plots of data
2.2.1 Histograms
2.2.2 Scatterplots
2.2.3 Density estimation
2.2.4 Scatterplot matrices
2.3 Using lower-dimensional projections of multivariate data for graphical representations
2.3.1 Principal components analysis of multivariate data
2.3.2 Exploratory projection pursuit
2.3.3 Multidimensional scaling
2.4 Three-dimensional plots and trellis graphics
2.5 Summary
3. Measurement of proximity
3.1 Introduction
3.2 Similarity measures for categorical data
3.2.1 Similarity measures for binary data
3.2.2 Similarity measures for categorical data with more than two levels
3.3 Dissimilarity and distance measures for continuous data
3.4 Similarity measures for data containing both continuous and categorical variables
3.5 Proximity measures for structured data
3.6 Inter-group proximity measures
3.6.1 Inter-group proximity derived from the proximity matrix
3.6.2 Inter-group proximity based on group summaries for continuous data
3.6.3 Inter-group proximity based on group summaries for categorical data
3.7 Weighting variables
3.8 Standardization
3.9 Choice of proximity measure
3.10 Summary
4. Hierarchical clustering
4.1 Introduction
4.2 Agglomerative methods
4.2.1 Illustrative examples of agglomerative methods
4.2.2 The standard agglomerative methods
4.2.3 Recurrence formula for agglomerative methods
4.2.4 Problems of agglomerative hierarchical methods
4.2.5 Empirical studies of hierarchical agglomerative methods
4.3 Divisive methods
4.3.1 Monothetic divisive methods
4.3.2 Polythetic divisive methods
4.4 Applying the hierarchical clustering process
4.4.1 Dendrograms and other tree representations
4.4.2 Comparing dendrograms and measuring their distortion
4.4.3 Mathematical properties of hierarchical methods
4.4.4 Choice of partition—the problem of the number of groups
4.4.5 Hierarchical algorithms
4.4.6 Methods for large data sets
4.5 Applications of hierarchical methods
4.5.1 Dolphin whistles—agglomerative clustering
4.5.2 Needs of psychiatric patients—monothetic divisive clustering
4.5.3 Globalization of cities—polythetic divisive method
4.5.4 Women’s life histories—divisive clustering of sequence data
4.5.5 Composition of mammals’ milk—exemplars, dendrogram seriation and choice of partition
4.6 Summary
5. Optimization clustering techniques
5.1 Introduction
5.2 Clustering criteria derived from the dissimilarity matrix
5.3 Clustering criteria derived from continuous data
5.3.1 Minimization of trace(W)
5.3.2 Minimization of det(W)
5.3.3 Maximization of trace(BW⁻¹)
5.3.4 Properties of the clustering criteria
5.3.5 Alternative criteria for clusters of different shapes and sizes
5.4 Optimization algorithms
5.4.1 Numerical example
5.4.2 More on k-means
5.4.3 Software implementations of optimization clustering
5.5 Choosing the number of clusters
5.6 Applications of optimization methods
5.6.1 Survey of student attitudes towards video games
5.6.2 Air pollution indicators for US cities
5.6.3 Aesthetic judgement of painters
5.6.4 Classification of ‘nonspecific’ back pain
5.7 Summary
6. Finite mixture densities as models for cluster analysis
6.1 Introduction
6.2 Finite mixture densities
6.2.1 Maximum likelihood estimation
6.2.2 Maximum likelihood estimation of mixtures of multivariate normal densities
6.2.3 Problems with maximum likelihood estimation of finite mixture models using the EM algorithm
6.3 Other finite mixture densities
6.3.1 Mixtures of multivariate t-distributions
6.3.2 Mixtures for categorical data—latent class analysis
6.3.3 Mixture models for mixed-mode data
6.4 Bayesian analysis of mixtures
6.4.1 Choosing a prior distribution
6.4.2 Label switching
6.4.3 Markov chain Monte Carlo samplers
6.5 Inference for mixture models with unknown number of components and model structure
6.5.1 Log-likelihood ratio test statistics
6.5.2 Information criteria
6.5.3 Bayes factors
6.5.4 Markov chain Monte Carlo methods
6.6 Dimension reduction—variable selection in finite mixture modelling
6.7 Finite regression mixtures
6.8 Software for finite mixture modelling
6.9 Some examples of the application of finite mixture densities
6.9.1 Finite mixture densities with univariate Gaussian components
6.9.2 Finite mixture densities with multivariate Gaussian components
6.9.3 Applications of latent class analysis
6.9.4 Application of a mixture model with different component densities
6.10 Summary
7. Model-based cluster analysis for structured data
7.1 Introduction
7.2 Finite mixture models for structured data
7.3 Finite mixtures of factor models
7.4 Finite mixtures of longitudinal models
7.5 Applications of finite mixture models for structured data
7.5.1 Application of finite mixture factor analysis to the ‘categorical versus dimensional representation’ debate
7.5.2 Application of finite mixture confirmatory factor analysis to cluster genes using replicated microarray experiments
7.5.3 Application of finite mixture exploratory factor analysis to cluster Italian wines
7.5.4 Application of growth mixture modelling to identify distinct developmental trajectories
7.5.5 Application of growth mixture modelling to identify trajectories of perinatal depressive symptomatology
7.6 Summary
8. Miscellaneous clustering methods
8.1 Introduction
8.2 Density search clustering techniques
8.2.1 Mode analysis
8.2.2 Nearest-neighbour clustering procedures
8.3 Density-based spatial clustering of applications with noise
8.4 Techniques which allow overlapping clusters
8.4.1 Clumping and related techniques
8.4.2 Additive clustering
8.4.3 Application of MAPCLUS to data on social relations in a monastery
8.4.4 Pyramids
8.4.5 Application of pyramid clustering to gene sequences of yeasts
8.5 Simultaneous clustering of objects and variables
8.5.1 Hierarchical classes
8.5.2 Application of hierarchical classes to psychiatric symptoms
8.5.3 The error variance technique
8.5.4 Application of the error variance technique to appropriateness of behaviour data
8.6 Clustering with constraints
8.6.1 Contiguity constraints
8.6.2 Application of contiguity-constrained clustering
8.7 Fuzzy clustering
8.7.1 Methods for fuzzy cluster analysis
8.7.2 The assessment of fuzzy clustering
8.7.3 Application of fuzzy cluster analysis to Roman glass composition
8.8 Clustering and artificial neural networks
8.8.1 Components of a neural network
8.8.2 The Kohonen self-organizing map
8.8.3 Application of neural nets to brainstorming sessions
8.9 Summary
9. Some final comments and guidelines
9.1 Introduction
9.2 Using clustering techniques in practice
9.3 Testing for absence of structure
9.4 Methods for comparing cluster solutions
9.4.1 Comparing partitions
9.4.2 Comparing dendrograms
9.4.3 Comparing proximity matrices
9.5 Internal cluster quality, influence and robustness
9.5.1 Internal cluster quality
9.5.2 Robustness—split-sample validation and consensus trees
9.5.3 Influence of individual points
9.6 Displaying cluster solutions graphically
9.7 Illustrative examples
9.7.1 Indo-European languages—a consensus tree in linguistics
9.7.2 Scotch whisky tasting—cophenetic matrices for comparing clusterings
9.7.3 Chemical compounds in the pharmaceutical industry
9.7.4 Evaluating clustering algorithms for gene expression data
Bibliography
Index
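To give a flavour of the optimization methods covered in Chapter 5, the sketch below shows a minimal k-means iteration, the algorithm behind "Minimization of trace(W)" (the within-cluster sum of squares). This is an illustrative plain-Python sketch, not code from the book: the function name, the toy data, and the simple equality-based stopping rule are all invented for the example.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal k-means sketch: locally minimizes trace(W), the
    within-cluster sum of squared Euclidean distances."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers at k data points
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, centers[j])))
            clusters[j].append(p)
        # Update step: each center moves to its cluster's mean
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, clusters

# Two well-separated groups in the plane:
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
centers, clusters = kmeans(data, k=2)
```

On such well-separated data the two recovered centers settle at the group means; as Chapter 5 discusses, on harder data the criterion has local optima, so results depend on initialization and multiple restarts are advisable.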
© Copyright 1996–2024 StataCorp LLC. All rights reserved.