
st: Cluster analyis on hand made distance matrix

From	Ulrich Kohler <[email protected]>
To	[email protected]
Subject	st: Cluster analyis on hand made distance matrix
Date	Mon, 10 Mar 2008 16:12:30 +0100

I have two "hand-made" distance matrices, SQdist1 and SQdist2. The two
matrices are essentially identical, except that they are ordered
differently.
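
To make "differently ordered" concrete, here is a minimal sketch with a
hypothetical 3x3 matrix. The names D1, D2 and p are made up for
illustration and are not my real matrices, and the sketch assumes that
"reordering" means applying the same permutation to rows and columns:

```stata
* Hypothetical toy example; D1, D2 and p are illustrative names only
matrix D1 = (0, 1, 2 \ 1, 0, 3 \ 2, 3, 0)

mata:
D1 = st_matrix("D1")
p  = (3, 1, 2)              // assumed reordering of the three observations
D2 = D1[p, p]               // same distances, rows and columns permuted together
st_matrix("D2", D2)         // copy the reordered matrix back to Stata

// 0 when both matrices hold the same set of pairwise distances
mreldif(sort(vech(D1), 1), sort(vech(D2), 1))
end

matrix list D1
matrix list D2
```

Comparing the sorted lower triangles only checks that both matrices
contain the same distances; it does not by itself prove that one is an
exact reordering of the other.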

If I run a cluster analysis with single linkage on the two distance
matrices, I get identical results:

. clustermat single SQdist1, name(cluster1) add
. clustermat single SQdist2, name(cluster2) add
. sum *_hgt


    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
cluster1_hgt |        53    .2232704     .108128   .1666667   .6666667
cluster2_hgt |        53    .2232704     .108128   .1666667   .6666667

(The same is true for median linkage and centroid linkage.)

However, if I use Ward's linkage, I get different results for the two
distance matrices:

. clustermat wards SQdist1, name(cluster1) add
. clustermat wards SQdist2, name(cluster2) add
. sum *_hgt

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
cluster1_hgt |        53    .7051013     .861406   .1666667   4.414418
cluster2_hgt |        53    .7051013    .8751653   .1666667   4.645984

Although the difference doesn't seem large, it has led to quite
different groupings in a practical application. Unfortunately, I am not
an expert in cluster analysis. So, please, can anybody explain to me why
this happens? If the order of the distance matrix matters for cluster
analysis, what is the "correct" order of the distance matrix, then?

Many regards

Uli

