Re: st: "testing" a cluster analysis

From	Ronán Conroy <[email protected]>
To	[email protected]
Subject	Re: st: "testing" a cluster analysis
Date	Wed, 7 Feb 2007 10:34:58 +0000

On 7 Feb 2007, at 00:06, Adam Seth Litwin wrote:

> Hello. I just ran a cluster analysis, not a technique I use frequently. I have seven binary variables forming, at the moment, five clusters. I thought a useful exercise would be the following:
>
> For each of the seven variables, examine its mean in all five clusters. Then, run an F-test to show that the means are not equal across all five clusters.
>
> So, for example, I type
> - tabstat var1, by(CLUSTER) stat(n mean)
>
> But, I'm not sure how to run the F-test.

Careful. An analysis of variance is a hypothesis test: the model is specified in advance, and the ANOVA estimates the model's parameters from the data.

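In your case, the model was generated from the data. The clusters were formed precisely so that they differ on these seven variables, so the F ratio will tend to be large whether or not the clusters mean anything, and its usual interpretation does not apply.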
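To see the problem concretely, here is a minimal sketch, not part of the original exchange, that clusters pure noise and then runs the F-test on one of the clustering variables. The simulated data, the seed, the sample size, and the default Euclidean (L2) measure are all assumptions chosen for illustration; only var1 and CLUSTER follow the names in the question.

```stata
* Sketch under stated assumptions: simulated noise, not real data
clear
set seed 2007
set obs 500

* Seven independent binary variables with no group structure at all
forvalues i = 1/7 {
    generate byte var`i' = runiform() < 0.5
}

* k-means with five groups, using the default L2 (Euclidean) measure
cluster kmeans var1-var7, k(5) generate(CLUSTER)

* Cluster sizes and means of var1, as in the original question
tabstat var1, by(CLUSTER) stat(n mean)

* One-way ANOVA of a clustering variable on cluster membership
oneway var1 CLUSTER
```

Because the groups were built from var1 to var7, the means of var1 would typically differ sharply across clusters, and the F ratio would typically be large, even though the data contain no real structure. A small p-value here says nothing about whether the clusters are meaningful.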

Cluster analysis is an exploratory technique. You need to think about validating the clustering by showing that the clusters differ on variables which were not used in the clustering but which are theoretically related to the cluster process.

For example, if you use clustering to define five clusters of people based on the type and frequency of their social interactions, then you would expect that the clusters would differ on things like loneliness and perceived social support, and you would hope that they differed in dimensions like mood or (headline from this month's Archives of General Psychiatry) risk of Alzheimer's disease.

So I'd forget the F-test and start validating the clusters. Your hypothesis is that the clusters are different from each other in some respect other than the variables you clustered on.
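A validation check along these lines might be sketched as follows. This is an illustration under assumptions, not a prescribed method: the data are simulated, the latent grouping and the external variable loneliness (borrowed from the example above) are hypothetical, and the key point is only that loneliness is left out of the clustering.

```stata
* Sketch under stated assumptions: simulated data, hypothetical variables
clear
set seed 2007
set obs 500

* Hypothetical latent grouping driving both the clustering variables
* and the external variable
generate byte latent = 1 + floor(5*runiform())
forvalues i = 1/7 {
    generate byte var`i' = runiform() < (0.15 + 0.14*mod(latent + `i', 5))
}

* External variable: theoretically related, NOT used in the clustering
generate loneliness = 10 + 2*latent + rnormal(0, 3)

* Cluster on the seven binary variables only
cluster kmeans var1-var7, k(5) generate(CLUSTER)

* Do the clusters differ on the variable they were not built from?
tabstat loneliness, by(CLUSTER) stat(n mean sd)
oneway loneliness CLUSTER
```

Here the F-test is a legitimate hypothesis test, because the hypothesis (clusters differ on loneliness) was not generated from the variable being tested. For a binary external variable, a cross-tabulation such as tabulate with the chi2 option would play the same role.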

=========
Ronán Conroy
Royal College of Surgeons in Ireland
[email protected]
+353 (0) 1 402 2431
+353 (0) 87 799 97 95
http://www.flickr.com/photos/ronanconroy
