David Reimer <[email protected]> asks:
> I am trying to reproduce a variable-based cluster analysis
> I ran in SPSS using Stata.
> The SPSS syntax is the following:
>
> proximities var1 to var8
> /view=variables
> /measure=euclid
> /matrix=out(*). /* This command produces a matrix, based on
> Euclidean distances */
> cluster var1 to var8
> /method=single
> /matrix=in(*) /*now I can use the matrix produced before
> for the cluster analysis */
>
> A relatively simple solution would be to just switch rows and columns of
> my dataset. Since my dataset contains 10.000 observations my version of
> Stata doesn't allow this.
> Any suggestions?
If I understand your question correctly (which I may not, since I
am not familiar with SPSS syntax), you wish to interchange the
roles of observations and variables and then perform a single
linkage cluster analysis using Euclidean distance.
The only way I know of doing that in Stata is to use the -xpose-
command and actually interchange the variables and observations
and then run the -cluster- command.
With 10,000 observations being changed into 10,000 variables, you
would need to use Stata/SE (Special Edition of Stata) that works
for large datasets. Intercooled Stata has a limit of 2,047
variables. Stata/SE will allow up to 32,766 variables. Within
Stata you can read about Stata/SE by viewing the help for
SpecialEdition.
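
The steps above might look roughly like the following in a do-file.
This is only a sketch under some assumptions: the dataset name
(mydata) and the cluster name (varclus) are made up for illustration,
the variables are assumed to be named var1 through var8 as in the
SPSS syntax, and the -set maxvar- line applies to Stata/SE only.
Older releases call the dendrogram command -cluster tree-.

```stata
* Sketch only: mydata and varclus are hypothetical names.
clear
* Stata/SE only: allow more variables than there are observations.
set maxvar 10100
use mydata, clear
* Keep only the variables that are to be clustered.
keep var1-var8
* Transpose: each old variable becomes an observation, and each old
* observation becomes a variable (v1, v2, ...). The varname option
* stores the original variable names in _varname.
xpose, clear varname
* Single linkage on Euclidean (L2) distance between the old variables.
cluster singlelinkage v*, measure(L2) name(varclus)
* Label the leaves of the dendrogram with the original variable names.
cluster dendrogram varclus, labels(_varname)
```

Note that -xpose- replaces the data in memory, so save the original
dataset first if it has not been saved already.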
Ken Higbee [email protected]
StataCorp 1-800-STATAPC