Stata The Stata listserver

[no subject]



Nick 
[email protected] 

Matissa Hollister
 
> I'm hoping someone can help me solve this problem,
> although I'm beginning to think that it's hopeless. 
> Basically I've created my own special measure of
> dissimilarity that I want to use for clustering, but
> I'm finding that there is no way to get Stata to allow
> me to use this new dissimilarity measure.  Any ideas
> of ways to get around this problem would be greatly
> appreciated.
> 
> Basically, I am using a procedure called Optimal
> Matching, an algorithm designed to create a measure of
> dissimilarity between two sequences of data.  I am
> using it to identify people who have similar career
> patterns.  I've created a do-file that accomplishes
> the most difficult and unusual part of Optimal
> Matching, which is creating the measure of
> dissimilarity between each pair of sequences.  I now
> want to run a clustering procedure to identify groups
> based upon this dissimilarity measure.
> 
> I found a post in the listserv archives (dated
> November 18, 2002) where someone wanted to do
> something similar (she wanted to create a geographic
> distance measure).  From the response I gather the
> calling and running of the dissimilarity algorithms
> occurs within the built-in Stata command _cluster and
> is done within C, which is certainly beyond my
> programming abilities.  I've contemplated several
> possibilities and would love help or advice on any of
> them:
> 
> 1) find a different software program that will allow me
> to easily input a new dissimilarity measure into a
> cluster command (preferably not expensive)
> 
> 2) a way to alter Stata's cluster command to allow for
> this new dissimilarity measure
> 
> 3) a way to get around this problem, e.g.:
> 
>    A. use the ParseDist command within cluster.ado to
> somehow cause the built-in command to call up a
> different distance command
> 
>    B. ways to enter the data so that a built-in Stata
> dissimilarity measure will result in the same pairwise
> distances (difficult because the pairwise
> dissimilarities make up a multi-dimensional space; the
> whole point is that they are difficult to summarize in
> a few variables)
> 
> 4) write my own clustering procedure
> 
> Please!  Any help would be gratefully accepted.  I
> know that several other researchers have already used
> Optimal Matching with clustering, so my guess is that
> option #1 might be the most viable one, but I'm not
> sure where to look.
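
On options 1 and 4: hierarchical clustering needs only the pairwise dissimilarities, not the original variables, so a small clustering routine can run directly on the output of the Optimal Matching step. What follows is a minimal sketch in Python rather than Stata, assuming the dissimilarities have been exported as a full symmetric matrix. The function name average_linkage, the matrix D, and its values are made up for illustration and are not taken from the original do-file. Average linkage is just one choice of linkage rule.

```python
from itertools import combinations


def average_linkage(dissim, n_clusters):
    """Agglomerative (average-linkage) clustering on a precomputed,
    symmetric dissimilarity matrix given as a list of lists.

    Returns a list of clusters, each a sorted list of row indices.
    """
    # Start with every observation in its own cluster.
    clusters = [[i] for i in range(len(dissim))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise dissimilarity.
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(dissim[i][j] for i in clusters[a] for j in clusters[b])
            mean = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean < best[0]:
                best = (mean, a, b)
        _, a, b = best
        # Merge cluster b into cluster a (a < b, so deleting b is safe).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical dissimilarities between five career sequences.
D = [
    [0, 1, 2, 9, 8],
    [1, 0, 1, 8, 9],
    [2, 1, 0, 9, 9],
    [9, 8, 9, 0, 1],
    [8, 9, 9, 1, 0],
]

groups = average_linkage(D, 2)
print(groups)
```

This brute-force version recomputes every between-cluster mean at each step, so it is only practical for a modest number of sequences. For larger problems, SciPy's scipy.cluster.hierarchy.linkage accepts precomputed distances in condensed form and is free. Later Stata releases also document a clustermat command that clusters from a user-supplied dissimilarity matrix; check whether your version has it.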



