Re: st: Clustering with Sequence analysis/optimal matching
From: Ulrich Kohler <[email protected]>
To: [email protected]
Subject: Re: st: Clustering with Sequence analysis/optimal matching
Date: Thu, 22 Mar 2012 09:22:22 +0100
I am the author of the sq package that is involved here. Note, however,
that -clustermat- and -cluster tree- are official Stata commands and,
frankly, I do not know much about cluster analysis. So I can only add
that while verifying -sqclusterdat- I quite regularly encounter the
"currently can't handle dendrogram reversals" error from -cluster tree-
(and I don't remember whether it arises only with methods other than
Ward's). I usually work around the problem by simply changing the
cutnumber to something very close.
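A minimal sketch of that workaround, assuming the cluster solution is named "wards" as in the commands quoted below. The variable name g20 is hypothetical, and I have not checked whether -cluster generate- is subject to the same limitation as -cluster tree-:

```stata
* Sketch only: assumes a cluster solution named "wards" already exists,
* created by -clustermat wardslinkage SQdist, name(wards) add-.

* 1. Retry the dendrogram with a cut number close to the one that failed
cluster tree wards, cutnumber(19)

* 2. Or skip the plot and store the 20-group solution in a new variable
*    (g20 is a hypothetical name)
cluster generate g20 = groups(20), name(wards)
tabulate g20
```

The second route may be enough if only the group memberships are needed and not the plot itself.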
This is really the first time I have heard that dendrogram reversals
should not happen with Ward's method. If the problem arises from some
feature of the distance matrix created by -sqom-, I would be happy to
hear what I can do about it.
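One way to look at the distance matrix itself, prompted by the question in the quoted message about whether the distances might be non-metric, is to test it in Mata. This is only a sketch: it assumes the full distance matrix is available as the Stata matrix SQdist (as in the quoted -clustermat- call), and the tolerance 1e-8 is an arbitrary choice.

```stata
* Sketch only: assumes the Stata matrix SQdist holds the full distance matrix.
mata:
D = st_matrix("SQdist")
n = rows(D)
printf("symmetric: %g\n", issymmetric(D))
printf("largest diagonal entry: %g\n", max(abs(diagonal(D))))
viol = 0
for (i = 1; i <= n; i++) {
    for (j = 1; j <= n; j++) {
        // shortest two-step route i -> k -> j over all k
        detour = min(D[i, .] + D[., j]')
        // count pairs whose direct distance exceeds that route
        if (D[i, j] > detour + 1e-8) viol++
    }
}
printf("pairs violating the triangle inequality: %g\n", viol)
end
```

A count of zero would mean the distances satisfy the triangle inequality up to the tolerance. The loop is cubic in the number of sequences, so it may be slow for large matrices.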
Many regards
Uli
On Wednesday, 21.03.2012, 21:39 +0000, Brendan Halpin wrote:
> On Wed, Mar 21 2012, Stefan Weih wrote:
>
> > Thanks for your comments, Brendan.
> >
> > However, as indicated in the correspondence earlier, I did use Ward's
> > method. Also no typo involved. For the complete syntax on my clustering
> > procedure, please see below:
> >
> > sqclusterdat
> > clustermat wardslinkage SQdist, name(wards) add
> > cluster tree wards, cutnumber(20)
> > sqclusterdat, return
>
>
> OK, that's fairly unambiguous. I'm puzzled. As far as I understand,
> reversals (where after combining clusters i and j, the distance from
> cluster k to the joint cluster is less than d(i,k) and/or d(j,k)) don't
> happen with Ward's method.
>
> Could your distance matrix be defective? For instance, could it be
> non-metric? Normally OM is guaranteed to generate metric distances, but
> if the substitution matrix is not metric, the distances are not
> guaranteed to be metric. That's a long shot, though -- I have no idea
> whether non-metric distances will give Ward's method indigestion.
>
>
> Another thing that might be helpful would be to post the output from the
> following:
>
> . sqclusterdat
> . clustermat wardslinkage SQdist, name(wards) add
> . cluster query wards
> . return list
> . cluster tree wards, cutnumber(20)
>
>
> Regards,
>
> Brendan
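For reference, the usual argument for why reversals are not expected with Ward's method can be written out with the Lance-Williams update. This is a sketch of the textbook recurrence on squared distances. That -clustermat wardslinkage- follows exactly this recurrence is an assumption here, and the symbols n_i, n_j, n_k (cluster sizes) and d (dissimilarity) are notation introduced for the illustration.

```latex
% Ward's linkage as a Lance-Williams update on squared distances.
% (ij) denotes the cluster formed by merging clusters i and j.
\[
d_{k,(ij)} = \frac{(n_i + n_k)\, d_{ik} + (n_j + n_k)\, d_{jk} - n_k\, d_{ij}}
                  {n_i + n_j + n_k}
\]
% If i and j are the closest pair, then d_{ik} \ge d_{ij} and d_{jk} \ge d_{ij}, so
\[
d_{k,(ij)} \ge \frac{(n_i + n_k) + (n_j + n_k) - n_k}{n_i + n_j + n_k}\, d_{ij} = d_{ij}.
\]
```

Every later merge therefore happens at a height of at least d_{ij}. The bound uses only the fact that i and j were the closest pair when merged. It does not use the triangle inequality.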