Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | daniel klein <klein.daniel.81@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Stuck in a logistic regression. Roadmap |
Date | Sat, 1 Dec 2012 10:48:23 +0100 |
Pedro,

regarding stepwise regression techniques, see:
http://www.stata.com/support/faqs/statistics/stepwise-regression-problems/

If you decide to stick with (backward) stepwise regression anyway, have a look at -help stepwise-. You might also want to look at Lindsey and Sheather (2010), even though the authors restrict their discussion to linear models.

You also ask about holding the sample size constant between models. There are different ways of doing this, but I think the most basic way is to use e(sample). Stata stores the estimation sample in this "variable" after any estimation. Think of e(sample) as an indicator (dummy) variable, where 1 indicates that an observation was used in the previous estimation. If you want to run two models, you need to run the "full" model first, then restrict the sample in the second model using an -if- qualifier.

Here is a short (nonsense) example

sysuse auto, clear

// create some missing values in price
replace price = . in 1/23

// note that price now has 23 missing values
su foreign price mpg

// run the "full" model
logit foreign price mpg

// note that only 51 out of the 74 observations are used in the model

// now run the "reduced" model
logit foreign mpg if e(sample)

// Stata uses the same 51 observations, indicated by e(sample)
// -if e(sample)- is just the short way of typing -if (e(sample) == 1)-

// if you need this specific sample another time,
// but want to run other models, you can "copy"
// the e(sample) variable to your dataset
g byte my_sample = e(sample)
ta my_sample

Best
Daniel

Lindsey, Charles, Sheather, Simon (2010). Variable selection in linear regression. The Stata Journal, 10(4), 650-669.

--

Dear Statalist,

I just came to Stata in the last month. Even though I find it easier and more flexible than my previous software for standard statistics, I am stuck performing a logistic regression because I find the style is very different from SPSS.
I would like to ask you some questions:

1) In the variable selection phase, I performed an lrtest of the constant-only model against the model with the variable I am trying to test. If the number of observations differs, the lrtest is not valid. What method do you recommend in this case?

2) In SPSS I did backward stepwise logistic regression based on the LR. Can I perform this kind of analysis in Stata? Is stepwise a recommended method for this kind of regression?

3) I used a macro called AllSetsReg in SPSS, with which I could obtain the best subsets based on Mallows' Cp and AIC. I know there are some packages to do that in Stata, but I have more than 6 variables. Do you know any package or method to do that?

4) I am following the Hosmer-Lemeshow way of performing the regression, but I don't know if that's the best way to do it. Is there a better way to perform a logistic regression? Do you have any suggestions or a modeling roadmap that works for you?
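
On question 1, a minimal sketch of how the e(sample) approach from the reply might be combined with -lrtest-, so that both models are fitted to the same observations. It reuses the (nonsense) auto example; the stored names "full" and "reduced" are arbitrary labels chosen for illustration, not anything from the original posts.

```stata
sysuse auto, clear

// create some missing values in price, as in the example in the reply
replace price = . in 1/23

// fit the larger model first; it defines the common estimation sample
logit foreign price mpg
estimates store full

// fit the nested model on exactly the same observations
logit foreign mpg if e(sample)
estimates store reduced

// likelihood-ratio test of the reduced model against the full model
lrtest full reduced
```

Because the second -logit- is restricted with -if e(sample)-, both stored models are based on the same observations, which is what -lrtest- requires. The order matters: the model with the most variables has to be run first, since it is the one that loses the most observations to missing values.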