Two comments:
1. The squared correlation between observed and predicted
values is one measure you can use here. There are other measures
called pseudo-R^2, which do not necessarily agree with it or with
each other, numerically or algebraically. There is some discussion at
http://www.stata.com/support/faqs/stat/rsquared.html
and since then there has been at least one
thread on Statalist. The matter arises quite
often on Statalist, so an archive search may be
fruitful.
2. If you have missing data on any of the
covariates, then the model will not be fitted
on the corresponding observations even if
the response is non-missing in those observations.
In that circumstance -predict- will be predicting
out of sample. It's safer to stipulate -if e(sample)-.
This issue is touched upon in the FAQ above.
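
A minimal sketch of both points, for illustration only. It uses
the auto dataset shipped with Stata rather than the grouped data
in the question, and the variable name yhat is just a placeholder.
rep78 has some missing values, so the fit drops those observations.

```stata
* Illustrative data only: auto.dta ships with Stata
sysuse auto, clear

* Fit by OLS; observations with missing rep78 are not used in the fit
regress price mpg rep78

* Predict only for the observations actually used in estimation
predict double yhat if e(sample)

* Squared correlation between observed and fitted values,
* again restricted to the estimation sample
correlate yhat price if e(sample)
display "squared correlation = " r(rho)^2

* For unweighted OLS with a constant this should match R-squared
display "R-squared from regress = " e(r2)
```

With weights, as in the weighted -regress- in the question, an
unweighted -correlate- will not in general reproduce the reported
R-squared, so treat the two numbers as different summaries.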
Nick
Antonio Rodrigues Andres wrote:
> I am estimating grouped data logistic models using glogit or simply
> using regress with analytical weights
>
> generate double lcrp=log(ntprop/(pop-ntprop))
> generate double wt1=ntprop*(pop-ntprop)/pop
>
>
> glogit ntprop pop clprop $xvars $zvars $tvars
> regress lcrp clprop $xvars $zvars $tvars [w=wt1]
>
> If I want to get a measure of goodness of fit I might use the squared
> correlation between y and yhat
>
> regress lcrp clprop $xvars $zvars $tvars
> predict yhat1
> corr yhat1 lcrp
>
> Is that correct?