Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: cluster analysis with foreach
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: cluster analysis with foreach
Date: Wed, 9 Jun 2010 17:47:19 +0100
I have comments on three levels.
First, your example implies that you have two variables, -state- and
-power_city-. I don't know what real scientific purpose clustering would
serve here. Clustering on virtually any criterion will just tend to
group together lots of smaller values of consumption, necessarily more
similar than others, as larger cities will typically be more spread out
in what for most if not all states will be very skewed distributions.
Second, you set yourself up for the task of collating 50 or so cluster
analyses. ("or so" depending on DC, Puerto Rico, etc.)
Third, should you persist, your syntax might be something like
    egen group = group(state), label
    su group, meanonly
    forval i = 1/`r(max)' {
        cluster ... if group == `i'
    }
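A slightly fuller sketch of such a loop follows, run on made-up data so that it is self-contained. The toy dataset, the choice of k(3), and the variable names kgroup and km1, km2, ... are assumptions for illustration only; just -state- and -power_city- come from the original question.

```stata
* Hypothetical toy data: 4 "states" with 25 "cities" each
clear
set seed 2010
set obs 100
gen str2 state = word("AA BB CC DD", 1 + mod(_n, 4))
* skewed values, roughly mimicking city-level consumption
gen power_city = exp(rnormal(5, 1))

* map the string variable to integers 1, 2, ... so that we can loop
egen group = group(state), label
su group, meanonly
local G = r(max)

* kgroup will hold the within-state cluster membership
gen byte kgroup = .

forval i = 1/`G' {
    * skip states with too few cities for 3 clusters
    count if group == `i' & !missing(power_city)
    if r(N) < 3 {
        continue
    }
    * k(3) is an arbitrary choice; one grouping variable per state
    cluster kmeans power_city if group == `i', k(3) generate(km`i')
    replace kgroup = km`i' if group == `i'
}

tabulate state kgroup
```

Cluster numbers in kgroup are only meaningful within a state: cluster 1 in one state has nothing to do with cluster 1 in another. States with fewer than three usable cities are skipped, since three groups cannot be formed from fewer observations.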
Nick
[email protected]
Maximiliano Manuel Silva Correa
I'm stuck trying to run a cluster analysis routine through different
sections of my data. Suppose we have power consumption data about
different cities of the US. What I'd like to do is to run a cluster
analysis routine (cluster kmeans, for example) by state, because I
would like to see in every state which cities have similar power
consumption.
It would be something like
foreach s in states{
cluster(kmeans, power_city)
}
(states is a string variable)
Could someone show me the syntax here, or send similar examples?