The Stata listserver

Re: st: RE:cluster analysis

From   [email protected]
To   [email protected]
Subject   Re: st: RE:cluster analysis
Date   Wed, 22 Sep 2004 19:32:39 -0500

Carlos de Los Rios <[email protected]> asks:

> I am performing a "cluster kmedian" analysis, and I am wondering if
> there is any tool that can measure the "goodness of fit" of the number
> of groups predetermined.

The -cluster stop- command provides the Calinski & Harabasz
pseudo-F index.  Most people think of using -cluster stop- only
after doing one of the hierarchical cluster analysis methods, but
the default rule(calinski) is also allowed after -cluster kmeans-
and -cluster kmedians-.

As an artificial example:

    . sysuse auto
    . set seed 123123
    . cluster kmedian mpg-gear, k(5) name(k5)
    . cluster kmedian mpg-gear, k(6) name(k6)
    . cluster kmedian mpg-gear, k(7) name(k7)

    . cluster stop k5

    |             |  Calinski/  |
    |  Number of  |  Harabasz   |
    |  clusters   |  pseudo-F   |
    |      5      |   232.59    |

    . cluster stop k6

    |             |  Calinski/  |
    |  Number of  |  Harabasz   |
    |  clusters   |  pseudo-F   |
    |      6      |   415.37    |

    . cluster stop k7

    |             |  Calinski/  |
    |  Number of  |  Harabasz   |
    |  clusters   |  pseudo-F   |
    |      7      |   598.36    |

This gives the pseudo-F for the kmedian clustering with 5, 6, and 7
clusters.  Larger values of the index indicate more distinct clustering.
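
The same comparison could also be written as a loop in a do-file.  This is
only a sketch: it assumes the auto data and seed used above, and the cluster
names km5 through km7 are made up for illustration.  Because -cluster
kmedians- picks its starting centers at random by default, the pseudo-F
values you get depend on the seed.

```stata
* Sketch: compare the Calinski/Harabasz pseudo-F across several values of k.
* The names km5, km6, km7 are illustrative; any unused cluster names work.
sysuse auto, clear
set seed 123123

forvalues k = 5/7 {
    * Partition the data into `k' groups with k-medians
    cluster kmedians mpg-gear_ratio, k(`k') name(km`k')

    * rule(calinski) is the default; it is spelled out here for clarity
    cluster stop km`k', rule(calinski)
}
```

Each pass prints one table like those shown above, so the values for
different k can be read off in sequence.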

See the "[CL] cluster stop" manual entry (page 94) for a similar
example that you could run starting with

    . webuse physed

to obtain the data.

Ken Higbee    [email protected]
StataCorp     1-800-STATAPC
