Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Patrick Lapid <patrick.lapid@gmail.com>
To: statalist <statalist@hsphsun2.harvard.edu>
Subject: Re: st: using svyset with pooled cross-sections from IPUMS-CPS
Date: Tue, 26 Apr 2011 13:02:31 -0700

Thank you, Stas. I'm not sure if declaring the survey design may be necessary, since I'm not necessarily generalizing the results to the total U.S. population. Should I just use regress with the vce(cluster) option? Would it be necessary to use pweight (sampling weights)?

Best,
Patrick

----Stas Kolenikov's <skolenik@gmail.com> reply----
I don't think this is right. CPS is a rotating design, with the same households appearing (say) in February, March, April and May in both (say) 2010 and 2011. With your -svyset-, they would be treated as if they belonged to separate strata, which is not right (and counterproductive, actually: this design is optimized to have small standard errors on the measures of change, with 3/4 overlap between consecutive months, and 1/2 overlap between consecutive years, which helps bring down the variances by probably 20% and 10%, respectively, off the top of my survey statistician's intuition).
---end reply---

On Tue, Apr 26, 2011 at 11:56 AM, Patrick Lapid <patrick.lapid@gmail.com> wrote:
> I'm currently working on a labor economics project using the U.S. Current
> Population Survey from 2006 to 2010, with the data downloaded from IPUMS.
> I'm concerned if I've declared the survey design correctly. I am attempting to
> analyze the data as pooled cross-sections, using survey estimation. I used
> the following Statalist post as a guide:
>
> http://www.stata.com/statalist/archive/2008-10/msg00521.html
>
> I have the following lines of code to declare the survey design:
>
> . egen hhXyear = group(serial year)
> . svyset hhXyear [pweight=perwt], strata(year)
>
> Since the original clusters (PSUs) were households, indexed by serial number,
> I used egen to create new clusters of households in a given year. I then used
> svyset to set the following:
>
> -clusters (PSUs): hhXyear
> -strata: year (each separate year of the CPS)
> -sampling weight: perwt (person weight, provided by IPUMS)
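
A minimal Stata sketch of the two options raised in this thread, for illustration only. It assumes a household identifier that stays the same across survey years, here called hhid (a hypothetical name; the thread does not say which IPUMS variable would serve), and placeholder outcome and regressor names y and x. Clustering on such an identifier, rather than on household-by-year groups, keeps a household's repeated appearances in one cluster, which is the concern raised in the reply. It is not a full description of the CPS design.

```stata
* Assumptions (not from the thread): hhid is a household identifier that is
* constant across years; y and x are placeholder variable names.

* Option 1: no survey declaration; weighted regression with
* cluster-robust standard errors at the household level
regress y x i.year [pweight=perwt], vce(cluster hhid)

* Option 2: declare households as PSUs, without treating years as strata
svyset hhid [pweight=perwt]
svy: regress y x i.year
```

Either form keeps the sampling weight perwt from the original post; whether weights are needed at all is the open question in the message above.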