From: Matthew White <mwhite@poverty-action.org>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Creating dummy variables
Date: Wed, 16 Nov 2011 11:28:15 -0500

I see... Maybe this:

***BEGIN***
clear
input fips1 fips2
1001 1073
1001 1021
1001 1101
1003 12031
1003 1099
end

* One dummy per fips1 county
levelsof fips1
foreach county1 in `r(levels)' {
    generate fips1_`county1' = fips1 == `county1'
    label variable fips1_`county1' "fips1==`county1'"

    * Only the fips2 counties that are actually paired with this fips1 county
    levelsof fips2 if fips1 == `county1'
    foreach county2 in `r(levels)' {
        * capture: this fips2 dummy may already exist from an earlier fips1 county
        capture generate fips2_`county2' = fips2 == `county2'
        if !_rc label variable fips2_`county2' "fips2==`county2'"

        generate diff_`county1'_`county2' = fips1_`county1' - fips2_`county2'
        label variable diff_`county1'_`county2' "fips1_`county1' - fips2_`county2'"
    }
}

order fips1_* fips2_* diff*, after(fips2)
***END***

Best,
Matt

On Wed, Nov 16, 2011 at 10:52 AM, Michael Betz <betz.40@buckeyemail.osu.edu> wrote:
> Thanks Matt,
>
> This is getting close but there is still a hang-up. The program you wrote differences all "fips1" dummies with all "fips2" dummies. I need to get difference dummies only for the pairs (i.e., 1001-1073, 1001-1021, and 1001-1101, but not 1001-12031). Because I have 3,000 levels for each "fips" variable, this program would create 3,000 x 3,000 variables, which is where Stata runs into a problem.
>
> Mike
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Matthew White
> Sent: Wednesday, November 16, 2011 9:14 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Creating dummy variables
>
> Hi Mike,
>
> There's probably a more efficient way to do this, but here's one way:
>
> ***BEGIN***
> clear
> input fips1 fips2
> 1001 1073
> 1001 1021
> 1001 1101
> 1003 12031
> 1003 1099
> end
>
> forvalues i = 1/2 {
>     levelsof fips`i'
>     foreach county in `r(levels)' {
>         generate fips`i'_`county' = fips`i' == `county'
>         label variable fips`i'_`county' "fips`i'==`county'"
>
>         local dummies`i' `dummies`i'' fips`i'_`county'
>     }
> }
>
> foreach dummy1 of local dummies1 {
>     local num1 = substr("`dummy1'", strpos("`dummy1'", "_") + 1, .)
>
>     foreach dummy2 of local dummies2 {
>         local num2 = substr("`dummy2'", strpos("`dummy2'", "_") + 1, .)
>
>         generate diff_`num1'_`num2' = `dummy1' - `dummy2'
>         label variable diff_`num1'_`num2' "`dummy1' - `dummy2'"
>     }
> }
> ***END***
>
> Best,
> Matt
>
> On Tue, Nov 15, 2011 at 9:44 PM, Michael Betz
> <betz.40@buckeyemail.osu.edu> wrote:
>> Hi all,
>>
>> I have two categorical variables, "fips1" and "fips2", that record the US county of the observation. For each "fips1" there are many "fips2" counties, as below:
>>
>> fips1 fips2
>> 1001 1073
>> 1001 1021
>> 1001 1101
>> 1003 12031
>> 1003 1099
>>
>> I need to create dummy variables for each county in "fips1" and "fips2" and then create variables representing the difference between the two dummy variables, as below:
>>
>> fips1  fips2  dum1_1  dum1_2  dum2_1  dum2_2  dum2_3  dum2_4  1_1-2_1  1_1-2_2  1_1-2_3
>> 1001   1003   1       0       1       0       0       0       0        1        1
>> 1001   1021   1       0       0       1       0       0       1        0        1
>> 1001   1101   1       0       0       0       1       0       1        1        0
>> 1003   1021   0       1       0       1       0       0       0        0        0
>> 1003   1001   0       1       0       0       0       1       0        0        0
>>
>> One added constraint is that each of "fips1" and "fips2" creates 3,000 dummies, so Stata cannot hold variables representing the difference between all pairs of dummy variables. I need to calculate the difference in dummies only for the pairs that are in the data (i.e., according to the example above, I would not need the difference between the dummies for "fips1"=1001 and "fips2"=1001, because that pair doesn't exist in my data).
>>
>> I've been thinking all day trying to come up with a solution, but to no avail. I appreciate any help or suggestions.
>>
>> Thanks,
>> Mike

--
Matthew White
Data Coordinator
Innovations for Poverty Action
101 Whitney Avenue, New Haven, CT 06510 USA
+1 434-305-9861
www.poverty-action.org

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/