Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Change of reference group in linear multiple regression analysis
Date: Thu, 9 Aug 2012 15:49:01 +0100

No magic. You pick one out of the air with a bit of guesswork. In this
case, I assume that some residents could have immigrated ~100 years
ago, but skewness implies that a midpoint near 70 would be quite
wrong. If the guesswork looks crucial, a little sensitivity analysis
is called for.

I don't think literature references are always needed. If you think
something is a good idea, do it and publish it.

Nick

On Thu, Aug 9, 2012 at 4:39 PM, Richard Williams <[email protected]> wrote:
> At 08:38 AM 8/9/2012, Nick Cox wrote:
>>
>> Why not -recode- to say 1, 4, 8, 15, 25, 40 and use the quantitative
>> information you have? Sure, there is some arbitrariness involved, but
>> it is defensible.
>
>
> I've done this myself. But does anybody know of any literature that defends
> (or attacks) the practice? It seems like the most difficult part is the
> open-ended upper category -- how should you choose a value for that?
>
>> Nick
>>
>> On Thu, Aug 9, 2012 at 2:30 PM, Richard Goldstein
>> <[email protected]> wrote:
>>
>> > I assume you are using an up-to-date version of Stata; see -h fvvarlist-
>> > and look under "base level"
>>
>> On 8/9/12 9:25 AM, Amal Khanolkar wrote:
>>
>> >> Hi, I'm currently running a linear multiple regression analysis where
>> >> the principal explanatory variable is:
>> >>
>> >>  Years since |
>> >>  immigration |      Freq.     Percent        Cum.
>> >> -------------+-----------------------------------
>> >>       <3 yrs |    106,794        3.57        3.57
>> >>      3-5 yrs |     91,311        3.05        6.62
>> >>     6-10 yrs |     96,657        3.23        9.85
>> >>    11-19 yrs |     82,344        2.75       12.61
>> >>    20-29 yrs |     50,954        1.70       14.31
>> >>     >=30 yrs |  2,563,396       85.69      100.00
>> >> -------------+-----------------------------------
>> >>        Total |  2,991,456      100.00
>> >>
>> >> - Currently the default group is the first category, <3yrs. I would
>> >> however like to change this in the regression to the last group >=30 years.
>> >> How do I do this without having to recode the above variable?
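
The suggestions in this thread can be put together in a short do-file. What follows is only a sketch under stated assumptions, not anyone's actual analysis: the variable names yrsimm and outcome are hypothetical, the category variable is assumed to be coded 1 to 6 in the order of the table, and simulated data stand in for the real data set. The trial values 35, 40 and 50 for the open-ended top category are illustrative guesses.

```stata
* Sketch only: yrsimm and outcome are hypothetical names, and the
* simulated data merely stand in for the real data set.
clear
set seed 12345
set obs 600
* Assumed coding: 1 = "<3 yrs", 2 = "3-5 yrs", ..., 6 = ">=30 yrs"
generate byte yrsimm = 1 + floor(6 * runiform())
generate double outcome = rnormal()

* Change the base level without recoding (see -help fvvarlist-):
* ib6. makes level 6 the reference category.
regress outcome ib6.yrsimm

* Alternative: map each category to a representative number of years
* (1, 4, 8, 15, 25, 40, as suggested in the thread) and treat the
* result as a quantitative predictor.
recode yrsimm (1=1) (2=4) (3=8) (4=15) (5=25) (6=40), generate(yrs_mid)
regress outcome yrs_mid

* A little sensitivity analysis: vary the guessed value for the
* open-ended top category and compare the fitted slopes.
foreach top of numlist 35 40 50 {
    recode yrsimm (1=1) (2=4) (3=8) (4=15) (5=25) (6=`top'), generate(yrs_`top')
    regress outcome yrs_`top'
}
```

If the highest category is not coded 6 in the real data, ib(last).yrsimm selects the last level as the base whatever its numeric value. If the slope changes little across the trial values for the top category, the guesswork is probably not crucial.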