
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Frequency weighted cluster analysis


From   [email protected] (Brendan Halpin)
To   [email protected]
Subject   Re: st: Frequency weighted cluster analysis
Date   Wed, 11 Jan 2012 00:49:05 +0000

On Wed, Jan 11 2012, Nick Cox wrote:

> Why would a cluster analysis change because some observations are
> duplicated? The similarity or dissimilarity of objects is not affected
> by their frequency. What does this SAS statement do that should be
> replicated by Stata?

It depends on the linkage used. For linkages where the size of the
clusters matters in choosing which pair of clusters to agglomerate next,
having one versus many of a particular case will change the results.
Thus it doesn't matter for single or complete linkage, but it does for
Ward's.
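
A minimal sketch of that point, in Python rather than Stata or SAS, and not
how either package implements it. It assumes 1-D data and that identical
copies of a case merge with each other first at zero cost, so a case
duplicated w times behaves like a cluster of size w. The names ward_cost
and first_merge are made up for illustration. Ward's merge cost is
n_a*n_b/(n_a+n_b) times the squared distance between the cluster means,
so it depends on the counts; single and complete linkage look only at
distances between points, so the counts drop out.

```python
from itertools import combinations

def ward_cost(n_a, mean_a, n_b, mean_b):
    """Increase in within-cluster sum of squares from merging two 1-D clusters."""
    return n_a * n_b / (n_a + n_b) * (mean_a - mean_b) ** 2

def first_merge(points, weights, linkage):
    """Return the pair of distinct values that would be merged first.

    points:  distinct 1-D values
    weights: number of copies of each value (frequency weights)
    """
    best_pair, best_cost = None, float("inf")
    for i, j in combinations(range(len(points)), 2):
        if linkage == "ward":
            cost = ward_cost(weights[i], points[i], weights[j], points[j])
        else:
            # single or complete: extra identical copies do not change
            # the minimum or maximum distance between the two groups
            cost = abs(points[i] - points[j])
        if cost < best_cost:
            best_pair, best_cost = (points[i], points[j]), cost
    return best_pair

points = [0.0, 3.0, 7.0]
unweighted = [1, 1, 1]     # one copy of each case
weighted = [10, 10, 1]     # the first two cases duplicated ten times

ward_unweighted = first_merge(points, unweighted, "ward")
ward_weighted = first_merge(points, weighted, "ward")
single_unweighted = first_merge(points, unweighted, "single")
single_weighted = first_merge(points, weighted, "single")

print("Ward, unweighted:  ", ward_unweighted)
print("Ward, weighted:    ", ward_weighted)
print("single, unweighted:", single_unweighted)
print("single, weighted:  ", single_weighted)
```

With one copy of each case, Ward's joins 0 and 3 first; with the weights
above it joins 3 and 7 first, because merging two large groups costs more
than absorbing a singleton. Single linkage picks 0 and 3 either way.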

I'm speculating that it's possible to write the low-level code to take
advantage of the duplication. I'm guessing that the SAS command (which I
have only seen online -- I don't have access to SAS) exploits this
possibility.

At the moment I'm just fishing for clues -- I have an idle idea[1] for
an analysis that would exploit high levels of duplication.

Regards,

Brendan

[1] Procrastination, in other words. 
-- 
Brendan Halpin,   Department of Sociology,   University of Limerick,   Ireland
Tel: w +353-61-213147  f +353-61-202569  h +353-61-338562;  Room F1-009 x 3147
mailto:[email protected]    ULSociology on Facebook: http://on.fb.me/fjIK9t
http://teaching.sociology.ul.ie/bhalpin/wordpress         twitter:@ULSociology