Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Ulrich Kohler <kohler@wzb.eu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Clustering with Sequence analysis/optimal matching
Date: Thu, 22 Mar 2012 09:22:22 +0100

I am the author of the sq package that is involved here. However, note
that -clustermat- and -cluster tree- are official Stata, and, frankly, I
do not know much about cluster analysis.

So I can only add that during verification of -sqclusterdat- I quite
regularly encounter the "currently can't handle dendrogram reversals"
message from -cluster tree- (and I don't remember whether it arises only
with methods other than Ward's). I usually solve the problem by simply
changing the cutnumber to something very close.

It is really the first time that I hear that dendrogram reversals should
not happen with Ward's method. If the problem arises from some feature
of the distance matrix created by -sqom-, I would be happy to hear what
I can do about it.

Many regards
Uli

On Wednesday, 21.03.2012, at 21:39 +0000, Brendan Halpin wrote:
> On Wed, Mar 21 2012, Stefan Weih wrote:
>
> > Thanks for your comments, Brendan.
> >
> > However, as indicated in the correspondence earlier, I did use Ward's
> > method. Also no typo involved. For the complete syntax on my clustering
> > procedure, please see below:
> >
> > sqclusterdat
> > clustermat wardslinkage SQdist, name(wards) add
> > cluster tree wards, cutnumber(20)
> > sqclusterdat, return
> >
>
> OK, that's fairly unambiguous. I'm puzzled. As far as I understand,
> reversals (where after combining clusters i and j, the distance from
> cluster k to the joint cluster is less than d(i,k) and/or d(j,k)) don't
> happen with Ward's method.
>
> Could your distance matrix be defective? For instance, could it be
> non-metric? Normally OM is guaranteed to generate metric distances, but
> if the substitution matrix is not metric, the distances are not
> guaranteed to be metric. That's a long shot, though -- I have no idea
> whether non-metric distances will give Ward's method indigestion.
>
>
> Another thing that might be helpful would be to post the output from the
> following:
>
> . sqclusterdat
> . clustermat wardslinkage SQdist, name(wards) add
> . cluster query wards
> . return list
> . cluster tree wards, cutnumber(20)
>
>
> Regards,
>
> Brendan
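
Brendan's question about whether the distance matrix is non-metric can be checked directly by testing the triangle inequality d(i,k) <= d(i,j) + d(j,k) for every triple of sequences. The following is a minimal sketch in Python rather than Stata, assuming the matrix has been exported from Stata as a square, symmetric list of lists. The function name triangle_violations and the tolerance tol are made up for this illustration and are not part of the sq package or of Stata, and the two small matrices are toy data, not real -sqom- output.

```python
from itertools import combinations


def triangle_violations(d, tol=1e-12):
    """Return triples (i, j, k) with d[i][k] > d[i][j] + d[j][k] + tol.

    d is assumed to be a square, symmetric matrix (list of lists) of
    pairwise distances. This helper is hypothetical, not part of sq.
    """
    n = len(d)
    violations = []
    # By symmetry it is enough to test each unordered pair (i, k) once.
    for i, k in combinations(range(n), 2):
        for j in range(n):
            if j == i or j == k:
                continue
            # A violation means the detour via j is shorter than d[i][k].
            if d[i][k] > d[i][j] + d[j][k] + tol:
                violations.append((i, j, k))
    return violations


# Toy matrices for illustration only (not real -sqom- output).
metric = [[0, 1, 2],
          [1, 0, 1],
          [2, 1, 0]]
non_metric = [[0, 1, 5],
              [1, 0, 1],
              [5, 1, 0]]

print(triangle_violations(metric))      # no triple violates the inequality
print(triangle_violations(non_metric))  # the detour via index 1 is shorter
```

An empty result only says that the matrix satisfies the triangle inequality. Whether a non-metric matrix is what actually triggers the reversal message remains, as Brendan says, a long shot. The check visits every triple, so the cost grows with the cube of the number of sequences.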