Stata The Stata listserver

Re: st: Cluster, new dissimilarity measures, and sequence analysis


From   Jean-Benoit Hardouin <[email protected]>
To   [email protected]
Subject   Re: st: Cluster, new dissimilarity measures, and sequence analysis
Date   Thu, 22 Jul 2004 20:42:09 +0200

I ran into this problem when building a procedure to cluster dichotomous variables, using as the proximity measure the weighted average of the correlations conditional on the individuals' scores. (The resulting clusters of variables must satisfy unidimensionality and local independence, two fundamental properties of Item Response Theory (IRT).) This method is known as HCA/CCPROX (Zhang and Stout 1999). My solution was to write my own Stata program to implement it. The resulting module is named hcaccprox (ssc describe hcaccprox; more information about the program is at http://anaqol.free.fr). The drawback of this program is that I did not build it on Stata's cluster tools, so it is not possible, for example, to draw a dendrogram with "cluster dendrogram". Moreover, the program clusters variables, not observations. But perhaps the code can help you.
I hope this helps

***************************************
Jean-Benoit Hardouin
Regional Health Observatory
Orléans - France
Email : [email protected]
***************************************

Matissa Hollister wrote:


I’m hoping someone can help me solve this problem,
although I’m beginning to think that it’s hopeless.
Basically I’ve created my own special measure of
dissimilarity that I want to use for clustering, but
I’m finding that there is no way to get Stata to allow
me to use this new dissimilarity measure. Any ideas
of ways to get around this problem would be greatly
appreciated.

Basically, I am using a procedure called Optimal
Matching, an algorithm designed to create a measure of
dissimilarity between two sequences of data. I am
using it to identify people who have similar career
patterns. I’ve created a do-file that accomplishes
the most difficult and unusual part of Optimal
Matching, which is creating the measure of
dissimilarity between each pair of sequences. I now
want to run a clustering procedure to identify groups
based upon this dissimilarity measure.
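
[Editorial illustration, not part of the original message.] For readers unfamiliar with Optimal Matching, the pairwise dissimilarity described above can be sketched as an edit distance between two sequences. The following is only a minimal sketch in Python, not the poster's do-file: the function names, the state codes (E, U, S), and the insertion/deletion and substitution costs are all assumptions chosen for illustration.

```python
def om_distance(seq_a, seq_b, indel=1.0, sub=2.0):
    """Cheapest total cost of insertions, deletions, and substitutions
    that turn seq_a into seq_b (costs here are illustrative assumptions)."""
    n, m = len(seq_a), len(seq_b)
    # prev[j]: cost of aligning the first i-1 states of seq_a with the first j of seq_b
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            change = 0.0 if seq_a[i - 1] == seq_b[j - 1] else sub
            curr[j] = min(
                prev[j] + indel,       # delete a state from seq_a
                curr[j - 1] + indel,   # insert a state from seq_b
                prev[j - 1] + change,  # keep or substitute the state
            )
        prev = curr
    return float(prev[m])


def pairwise_om(seqs, indel=1.0, sub=2.0):
    """Symmetric matrix of dissimilarities between every pair of sequences."""
    k = len(seqs)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = om_distance(seqs[i], seqs[j], indel, sub)
    return dist


if __name__ == "__main__":
    # Hypothetical career sequences: E = employed, U = unemployed, S = student
    careers = ["EEUU", "EEUE", "SSSS"]
    for row in pairwise_om(careers):
        print(row)
```

The result is a full matrix with one row and column per person, which is exactly the kind of input a clustering routine would need to accept.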

I found a post in the listserv archives (dated
November 18, 2002) where someone wanted to do
something similar (she wanted to create a geographic
distance measure). From the response I gather the
calling and running of the dissimilarity algorithms
occurs within the built-in Stata command _cluster and
is done within C, which is certainly beyond my
programming abilities. I’ve contemplated several
possibilities and would love help or advice on any of
them:

1) find a different software program that will allow me
to easily input a new dissimilarity measure into a
cluster command (preferably not expensive)

2) a way to alter Stata’s cluster command to allow for
this new dissimilarity measure

3) a way to get around this problem, e.g.:

A. use the ParseDist command within cluster.ado to
somehow cause the built-in command to call up a
different distance command

B. ways to enter the data so that a built-in Stata
dissimilarity measure will result in the same pairwise
distances (difficult because the pairwise
dissimilarities make up a multi-dimensional space, the
whole point is that they are difficult to summarize in
a few variables)

4) write my own clustering procedure
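
[Editorial illustration, not part of the original message.] To give a sense of what option 4 involves, here is a minimal Python sketch of agglomerative clustering that works directly from a precomputed dissimilarity matrix. It assumes average linkage and a naive search over all cluster pairs; the function name and the toy matrix are hypothetical, and this is not Stata's algorithm or the method either poster used.

```python
def average_linkage(dist, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns a list of clusters, each a sorted list of row indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def between(a, b):
        # Average linkage: mean dissimilarity over all cross-cluster pairs
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = between(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Toy dissimilarity matrix: items 0 and 1 are close, as are 2 and 3
    toy = [
        [0, 1, 9, 9],
        [1, 0, 9, 9],
        [9, 9, 0, 2],
        [9, 9, 2, 0],
    ]
    print(average_linkage(toy, 2))
```

The naive pair search is slow for large samples, but it shows that nothing in the clustering step depends on how the dissimilarities were produced.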

Please! Any help would be gratefully accepted. I
know that several other researchers have already used
Optimal Matching with clustering, so my guess is that
option #1 might be the most viable one, but I’m not
sure where to look.

Thanks,

Matissa





*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/





