Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: prob by using binreg or logit
From
Steve Samuels <[email protected]>
To
[email protected]
Subject
st: RE: prob by using binreg or logit
Date
Fri, 14 Jun 2013 11:48:06 -0400
But you shouldn't use -binreg- with survey data. Instead, first -svyset- the
data with the design information (weights, clusters, strata), then
use one of:
• svy: logit or svy: logistic
• svy: glm with link(log) and family(binomial) options
• svy: regress, with 0-1 indicators, for risk differences
If you don't use these, then standard errors, p-values, and CIs will be
incorrect.
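A minimal sketch of the three options above. The variable names (psu, stratum, wt, particip, age, female) are hypothetical placeholders, not taken from the original poster's data, and the design statement will differ from survey to survey:

```stata
* Hypothetical names: psu (cluster), stratum, wt (probability weight),
* particip (0/1 outcome), age, female
svyset psu [pweight = wt], strata(stratum)

* Odds ratios
svy: logistic particip age i.female

* Risk ratios: binomial family with log link; eform exponentiates coefficients
svy: glm particip age i.female, family(binomial) link(log) eform

* Risk differences: linear regression on the 0-1 indicator
svy: regress particip age i.female
```

Note that the log-link binomial model can fail to converge when fitted probabilities approach 1.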
I use -margins- to compare grouped and predicted results on the probability scale.
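One possible way to set up such a comparison, assuming the data are already -svyset- and using hypothetical variables particip (0/1) and agegrp (a grouping factor):

```stata
* Hypothetical names: particip (0/1 outcome), agegrp, female
svy: logit particip i.agegrp i.female

* Model-based predicted probabilities for each group
margins agegrp

* Observed, design-weighted proportions for each group
svy: mean particip, over(agegrp)
```

The same two summary steps can be repeated after each candidate model.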
This is one way of deciding which method to use. Another is to look at ROCs (below):
After -svy: logit- or -svy: logistic-, you can test goodness of fit with the contributed
program -svylogitgof- (type -findit svylogitgof- to locate it).
If you intend to predict outcomes after -svy: logit-, you'll need receiver operating characteristics (ROCs).
To get these most easily, use plain, non-survey logit with frequency weights that agree with the
probability weights to the nearest integer:
. gen new_wt = round(old_weight,1)
. logit.... [fw = new_wt]
. lroc
. lsens
Steve
On Jun 13, 2013, at 4:06 AM, tshmak wrote:
Dear Carsten,
There certainly are differences. The -rr- option stands for risk ratio and the -or- option for odds ratio. I assume you know the difference between "odds" and "risk". -binreg- with the -rr- option assumes that the log of the risk (i.e., the probability) is a linear function of the covariates. -logit-, or -binreg- with the -or- option, assumes that the log of the odds is a linear function of the covariates. Because the two models use different link functions, their fitted probabilities will generally differ.
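Written out, the two assumptions look like this. The notation is mine, not from the thread: p_i is the participation probability for observation i, x_i the covariate vector, and beta the coefficient vector.

```latex
% Log link (binreg, rr): the log of the risk is linear in the covariates
\log p_i = x_i^{\top}\beta
\quad\Longrightarrow\quad
p_i = \exp\!\left(x_i^{\top}\beta\right),
\qquad \exp(\beta_j) = \text{risk ratio}

% Logit link (logit; binreg, or): the log of the odds is linear in the covariates
\log\frac{p_i}{1-p_i} = x_i^{\top}\beta
\quad\Longrightarrow\quad
p_i = \frac{\exp\!\left(x_i^{\top}\beta\right)}{1+\exp\!\left(x_i^{\top}\beta\right)},
\qquad \exp(\beta_j) = \text{odds ratio}
```

When the outcome is rare, 1 - p_i is close to 1, so odds and risk nearly coincide and the two models give similar answers; for a common outcome they can differ noticeably.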
HTH,
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of carsten hinrichsen
Sent: 12 June 2013 23:14
To: [email protected]
Subject: st: prob by using binreg or logit
Dear statalisters,
I am working with survey data and want to find the probability of participation by analyzing the binary variable of participation (yes/no).
As far as I know, I could use binreg with the rr option
or
I could use logit and use the odds to calculate the probability.
I've tried both and get slightly different results. I've been looking through the Stata help but can't figure out what the difference is between these two methods.
So I'm wondering: are there different assumptions behind these two methods that I should take into consideration?
And should I prefer one of the methods to the other?
Any help is appreciated.
Kind regards
Carsten Hinrichsen
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/