The Stata listserver

st: RE: Jaccard Similarity Measure


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Jaccard Similarity Measure
Date   Mon, 26 Jul 2004 14:11:04 +0100

I can't see that this has anything to do 
with which similarity or dissimilarity 
measure you use. It depends on how you 
set up the analysis. 

If you enter 55 individuals into an analysis, 
40 from Australia and 15 from New Zealand, 
and then classify them, the fact that you 
have different numbers from each country 
is presumably one of many facts to be borne 
in mind, but it raises no obvious technical 
problems for cluster analysis. 

I am not, however, clear that this is 
what you want to do. 
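
A minimal sketch of the pooled setup described above, 
written in Python rather than Stata purely for 
illustration. The data are random 0/1 values, and the 
number of variables, the seed, and the handling of 
all-zero pairs are assumptions made here, not part of 
the original question. 

```python
import random

def jaccard(a, b):
    """Jaccard similarity of two equal-length 0/1 lists.

    Counts positions where both are 1, divided by positions
    where at least one is 1; joint absences are ignored.
    """
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumed convention: similarity 0 when neither has any 1s.
    return both / either if either else 0.0

random.seed(2004)
n_vars = 8  # hypothetical number of binary variables

# Hypothetical data: 40 Australian and 15 New Zealand individuals.
australia = [[random.randint(0, 1) for _ in range(n_vars)] for _ in range(40)]
new_zealand = [[random.randint(0, 1) for _ in range(n_vars)] for _ in range(15)]

# Pool the individuals; country is kept only as a label.
pooled = australia + new_zealand
country = ["AU"] * 40 + ["NZ"] * 15

# Pairwise similarities between individuals, which is what a
# cluster analysis would work from.
n = len(pooled)
sim = [[jaccard(pooled[i], pooled[j]) for j in range(n)] for i in range(n)]

print(n, len(sim), len(sim[0]))
```

Each similarity compares two individuals across the 
same variables, so the 40/15 split never enters the 
calculation. 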

Nick 
[email protected] 

D.Christodoulou
 
> I have two datasets of binary data with different numbers of
> observations, e.g. an Australian dataset with 40 individuals and
> a New Zealand dataset with 15 individuals. I want to use the
> Jaccard coefficient to measure the similarity between these two
> datasets.
> 
> Leaving aside the assumptions about the similarities between
> the two samples, is it a problem that the two datasets are not
> of the same size?



