Dear all,
I am preparing a cluster analysis of binary data in Stata. A similarity matrix for the objects has already been computed from the raw data set.
Is there an option to feed Stata a (dis-)similarity matrix directly, so that it does not treat the input as a raw data matrix and does not calculate a (dis-)similarity matrix of its own before the clustering starts?
The raw data matrix is very large (more than 1 million attributes), so the similarity matrix was calculated by routines outside Stata. When I run the clustering, however, Stata does not seem to treat the input as a similarity matrix; it calculates a further distance matrix from it before the clustering itself is performed. Calculating a distance matrix on top of a similarity matrix may cause methodological difficulties.
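To illustrate the concern, here is a minimal sketch in Python rather than Stata. The similarity values are made up, and the second step assumes that the software computes Euclidean distances between the rows of the input; that is only my assumption about what happens, not a statement of what Stata actually does. The sketch compares the dissimilarities I intend (1 minus similarity) with the distances obtained when the similarity matrix is treated as raw data.

```python
import math

# Hypothetical similarity matrix for three objects (values are made up).
S = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]


def one_minus(sim):
    """Dissimilarity taken directly from the similarities: d = 1 - s."""
    return [[1.0 - s for s in row] for row in sim]


def euclidean_between_rows(rows):
    """Euclidean distance between rows, i.e. the matrix treated as raw data."""
    n = len(rows)
    return [
        [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


intended = one_minus(S)
recomputed = euclidean_between_rows(S)

# Print each pair of objects with both versions of the dissimilarity.
for i in range(len(S)):
    for j in range(i + 1, len(S)):
        print(i, j, round(intended[i][j], 3), round(recomputed[i][j], 3))
```

In this toy case the two sets of values differ for every pair of objects, which is why I would like the clustering to use the supplied matrix as it is.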
So my question is: how can I skip the calculation of the distance matrix, so that the clustering is run directly on the matrix I supply?
Thanks a lot for your help.
Jochen Siegele