Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: jaweria seth <jaweriaseth@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Looping over variables in more than one group
Date: Wed, 7 Mar 2012 09:12:03 -0600
Thanks Nick,

I understand this would result in a large number of models. However, I wouldn't be combining variables from the same category/group, as this would raise the issue of multicollinearity. For example, I know for sure I need to add one variable each from groups 1 and 2. Group 1 contains variables that measure the size/production of a business, and I am wondering which of those variables would be most significant in a multivariate model.

I am looking at t-stats in the regression output: if even one of the variables included is not significant at the 10% level, that model gets dropped (and as I'm running the regressions manually, I find that the majority of the combos are not significant). Does this make sense? If so, how can I implement it?

The way I am doing it right now: holding one variable from group 2 constant and looping through the group 1 (size) variables to find significance. However, this gets tricky when I try to include a third variable.

Thanks,

On Wed, Mar 7, 2012 at 2:34 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Before you even think of how to implement this, do the combinatorics
> of how many models this implies.
>
> So, for example,
>
> . di 30^4
> 810000
>
> . di 5^4
> 625
>
> Then bump up those numbers adding in the null choices, i.e. no
> variable from each group, as well.
>
> So you would need not only to do the looping but to ponder what it
> implies in terms of gathering results from thousands of models,
> finding the "best", whatever that means, including the implications
> for how you think about the resulting P-values, etc.
>
> Nick
>
> On Tue, Mar 6, 2012 at 10:01 PM, jaweria seth <jaweriaseth@gmail.com> wrote:
>
>> I would like to run regressions with up to 4 different variables. My
>> variables are separated into 4 groups with 5-30 variables in each
>> group. I would like to run regression combos of different variables to
>> find the best model:
>> How do I regress my y variable on 1 variable from group 1 and 1 from
>> group 2 and loop through different combos of each?
>> for ex:
>> regress Yvariable Group1 Group2
>>
>> Then I would like to add a variable from group 3, and so on..
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
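
For the two-group case asked about in this thread, one possible approach is a pair of nested -foreach- loops over local macros that hold the variable names of each group. What follows is only a minimal sketch under stated assumptions, not a method proposed by anyone in the thread: the auto dataset, the outcome price, and the two variable lists are stand-ins chosen so the code runs as written, and the "every regressor significant at the 10% level" screen is taken from the description in the post above. The counters nmodels and nkept are hypothetical names, added so the number of models actually fitted stays visible.

```stata
* Sketch only: the auto dataset and these variable lists are stand-ins,
* not the poster's data. Replace them with your own outcome and groups.
sysuse auto, clear

* Hypothetical group 1 ("size" measures) and hypothetical group 2
local group1 weight length displacement
local group2 mpg gear_ratio

local nmodels 0
local nkept 0

foreach v1 of local group1 {
    foreach v2 of local group2 {
        quietly regress price `v1' `v2'
        local ++nmodels

        * Two-sided p-value for each included regressor.
        * A missing p-value (e.g. an omitted variable) fails the screen,
        * because missing compares as larger than any number in Stata.
        local keep 1
        foreach v in `v1' `v2' {
            local p = 2*ttail(e(df_r), abs(_b[`v']/_se[`v']))
            if `p' >= 0.10 local keep 0
        }

        * Report only models in which every regressor passes the 10% rule
        if `keep' {
            local ++nkept
            display "`v1' `v2'" _col(30) "adj. R-sq = " %6.4f e(r2_a)
        }
    }
}

display "`nkept' of `nmodels' models kept"
```

Adding a variable from a third group would mean a third nested -foreach- and a third name in the inner significance check, and the number of models multiplies with each added group, which is the combinatorial point made earlier in the thread. P-values that survive a screen like this have been selected from many candidate models, so they should be read with that in mind.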