Re: st: using svyset with pooled cross-sections from IPUMS-CPS
From: Stas Kolenikov <[email protected]>
To: [email protected]
Subject: Re: st: using svyset with pooled cross-sections from IPUMS-CPS
Date: Tue, 26 Apr 2011 12:22:46 -0500

Patrick Lapid is setting up his CPS data for survey estimation, and
provided his -svyset- syntax and a description of the design as follows:

On Tue, Apr 26, 2011 at 11:56 AM, Patrick Lapid <[email protected]> wrote:
> I'm currently working on a labor economics project using the U.S. Current
> Population Survey from 2006 to 2010, with the data downloaded from IPUMS.
> I'm concerned if I've declared the survey design correctly. I am attempting to
> analyze the data as pooled cross-sections, using survey estimation. I used
> the following Statalist post as a guide:
>
> http://www.stata.com/statalist/archive/2008-10/msg00521.html
>
> I have the following lines of code to declare the survey design:
>
> . egen hhXyear = group(serial year)
> . svyset hhXyear [pweight=perwt], strata(year)
>
> Since the original clusters (PSUs) were households, indexed by serial number,
> I used egen to create new clusters of households in a given year. I then used
> svyset to set the following:
>
> -clusters (PSUs): hhXyear
> -strata: year (each separate year of the CPS)
> -sampling weight: perwt (person weight, provided by IPUMS)

I don't think this is right. CPS is a rotating design, with the same
households appearing (say) in February, March, April and May of both
2010 and 2011. With your -svyset-, a household's 2010 and 2011
interviews would be treated as separate PSUs in separate strata, and
hence as independent, which is not right. It is also counterproductive:
the design is optimized to give small standard errors on measures of
change, with 3/4 overlap between consecutive months and 1/2 overlap
between consecutive years. My guess is that this brings the variances
of those change estimates down by about 20% and 10%, respectively, but
that is off the top of my survey statistician's intuition.
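
One way to keep a household's repeat interviews in the same cluster is
sketched below. This is an illustration under stated assumptions, not a
verified CPS design declaration: hhid is a hypothetical household
identifier that stays the same every time a household is interviewed,
and whether serial can play that role across years would need to be
checked against the IPUMS documentation. perwt is the person weight
from the original post.

```stata
* Sketch only. hhid is a hypothetical household identifier that is
* constant across all of a household's interviews (months and years).

* Cluster on the household itself and do not stratify by year, so that
* repeat interviews of one household fall in the same PSU.
svyset hhid [pweight=perwt]

* Inspect the declared design: strata, PSUs, and observations per PSU.
svydescribe
```

With a declaration along these lines, the same household is no longer
split into separate clusters in separate strata, so the correlation
between its interviews enters the variance estimates.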
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.