
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: cluster analysis

From	Nick Cox <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: RE: cluster analysis
Date	Tue, 24 Jan 2012 13:43:06 +0000

As I understand it, you want to cluster data for two variables into two groups. 

Any clustering that makes sense will be evident on a scatter plot and allow scientific interpretation. 
K-means sounds to me overkill for such a problem, but tastes differ. 
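
A minimal sketch of that plotting step, using the variable names from the question quoted below (X1, X2, id_X3) and assuming the group variable takes the default name ca2 that -cluster kmeans- creates when name(ca2) is given without generate():

```stata
* Sketch only: X1, X2, id_X3 and ca2 are the names used in the quoted question.
* Look at the raw data first, before any clustering.
twoway scatter X2 X1 if id_X3 == 1

* After -cluster kmeans ..., name(ca2)-, ca2 holds the group membership,
* so the two groups can be overlaid with different markers.
twoway (scatter X2 X1 if id_X3 == 1 & ca2 == 1) ///
       (scatter X2 X1 if id_X3 == 1 & ca2 == 2), ///
       legend(order(1 "Group 1" 2 "Group 2"))
```

If the two clouds of points are not visibly separate in the first plot, the second plot will mostly show where k-means chose to cut a single cloud.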

I know that many economists don't believe anything without a P-value attached. 

A more formal approach to such data would presumably start with a discriminant analysis. 
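
One way such a start might look, as a sketch rather than a recommendation: it assumes ca2 is the group variable produced by the -cluster kmeans- call quoted below, and it uses -candisc- for the discriminant analysis and -manova- to obtain Wilks' lambda.

```stata
* Sketch only: assumes ca2 is the group variable from -cluster kmeans-.
* Canonical linear discriminant analysis of the two groups on X1 and X2.
candisc X1 X2 if id_X3 == 1, group(ca2)

* MANOVA reports Wilks' lambda (alongside Pillai's trace and others)
* for the same grouping.
manova X1 X2 = ca2 if id_X3 == 1
```

A caution on interpretation: the groups were formed from X1 and X2 precisely so as to be well separated on those variables, so any P-value computed this way will be too optimistic and should be read as descriptive at best.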

Nick 

Gianluca Cafiso

I have run this cluster analysis:

cluster kmeans X1 X2 if id_X3==1, k(2) name(ca2) s(prandom) keepcen 
cluster list ca2 
cluster query ca2
return list 
sreturn list

However, I cannot obtain the following information about the cluster analysis:

1 - the initial mean values used as group centers
(I specify how they are chosen with "prandom", but I want to see the values too)
2 - the value of the dissimilarity measure (L2, Euclidean)
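
For item 1, one possible workaround is to build the random starting partition by hand and pass it through start(group()), so that the starting centers can be listed directly. This is a sketch under assumptions: g0 is a hypothetical variable name, the seed is arbitrary, and the partition imitates but does not exactly reproduce what start(prandom) generates internally. Note also that keepcenters appends the final group means to the data, not the initial ones.

```stata
* Sketch only: g0 is a hypothetical variable name; the seed is arbitrary.
set seed 12345

* Random partition of the selected observations into two groups.
generate byte g0 = 1 + (runiform() < 0.5) if id_X3 == 1

* The group means of this partition are the starting centers.
tabstat X1 X2 if id_X3 == 1, by(g0) statistics(mean)

* Same analysis, started from that partition; L2 (Euclidean) is the
* default measure and is spelled out here only for clarity.
* If a cluster analysis named ca2 already exists, -cluster drop ca2- first.
cluster kmeans X1 X2 if id_X3 == 1, k(2) name(ca2) start(group(g0)) ///
    measure(L2) keepcenters
```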

Furthermore:

- Is there a way to test statistically whether my partition makes sense?
(I mean: do the data really fall into 2 groups?)
A statistician friend of mine suggested looking at Wilks' lambda.
Does anybody know whether that makes sense with Stata's cluster algorithm and, if so,
how to get it?

