
Re: st: Cluster analysis - cluster kmeans-

From	[email protected]
To	[email protected]
Subject	Re: st: Cluster analysis - cluster kmeans-
Date	Fri, 11 May 2007 16:36:16 -0500

Herve STOLOWY <[email protected]> asks:

> I have a group of 21 observations with one variable (a score) and
> would like to create three "homogeneous" groups.
> 
> I found the -cluster kmeans- command. Here are my command lines:
> 
> gsort - finance_aggregate
> cluster kmeans finance_aggregate, k(3)
> 
> Each time I run these commands, I get a different result (i.e., a
> different clustering: the three groups are different). I looked
> at the help file but don't understand. (It might be related to
> the start option but I am not sure).
> 
> Is there a way to obtain the same result every time?

You can -set seed 183289- (or any other number you like) before
each call of -cluster kmeans- so that the same set of random
starting values is selected each time.  Or, as you guessed, you
can use the -start()- option, whose suboptions control how the k
starting groups are chosen, to do the same thing; see
-help cluster kmeans- for details.
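
As a minimal sketch of both approaches, the lines below use the
auto dataset shipped with Stata and its mpg variable as stand-ins
for your data (an assumption for illustration; run1, run2, and
run3 are hypothetical names).  Two runs with the same seed should
give identical groups, and -start(krandom(#))- passes the seed
inside the command itself.

    . sysuse auto, clear
    . * same seed before each call, so the same k starting observations are drawn
    . set seed 183289
    . cluster kmeans mpg, k(3) name(run1)
    . set seed 183289
    . cluster kmeans mpg, k(3) name(run2)
    . * -assert- stops with an error if any observation is grouped differently
    . assert run1 == run2
    . * alternatively, give the seed through the -start()- option
    . cluster kmeans mpg, k(3) name(run3) start(krandom(183289))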

SR Millis <[email protected]> said:

> You're going to need more than 1 variable. Cluster
> analysis is a multivariable technique.  In addition, a
> sample size of only 21 is often too small for cluster
> analysis.

While cluster analysis is a multivariate technique, it also works
with a single variable.  Whether having only 21 observations is a
problem depends on the data.  After you do your cluster analysis
you might want to look at some summaries or graphs of the
resulting three groups.

    . set seed 12345
    . cluster kmeans myvar, k(3) name(myclus)
    . bysort myclus: summarize myvar
    . twoway dot myvar myclus

and possibly also

    . cluster stop

(or similarly -anova myvar myclus-) to get a feel for how
distinct the groups are.
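
To see whether three groups suit the data better than some other
number, one rough check is to repeat the clustering for a few
values of k and compare the Calinski-Harabasz pseudo-F that
-cluster stop- reports by default; larger values point to more
distinct groups.  A sketch, again assuming the auto data and mpg
as stand-ins, with km2, km3, and km4 as hypothetical names:

    . sysuse auto, clear
    . * one k-means run per candidate k, each with a fixed seed
    . cluster kmeans mpg, k(2) name(km2) start(krandom(12345))
    . cluster stop km2
    . cluster kmeans mpg, k(3) name(km3) start(krandom(12345))
    . cluster stop km3
    . cluster kmeans mpg, k(4) name(km4) start(krandom(12345))
    . cluster stop km4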

Ken Higbee    [email protected]
StataCorp     1-800-STATAPC

