Stephen,
I do not know anything about SIPP, and I hope someone on this list can
provide a direct answer to your question. However, when I was in a
similar situation with NHSDA data, I entered these terms into the Google
search engine:
NHSDA 1997 Stata
I then scoured the internet until I found a journal article whose authors
had used Stata to analyse 1997 NHSDA data (the dataset I wanted to use). I
emailed the authors, and one of them wrote back with the information I
needed.
This approach might work with SIPP. For example, I backed up from the path
below to the US Census Bureau home page (the top level of the path) and used
the Google search engine there to look for
Stata SIPP
One hit is a slide show with 2 email contacts on the last slide
(http://lehd.dsd.census.gov/led/library/presentations/Synthetic-Data-Census-20041207.pdf).
I am not saying those 2 people will give you what you need, but the approach
may be useful.
Good luck,
Laurel Copeland
San Antonio VA
-----Original Message-----
From: Stephen Mennemeyer [mailto:[email protected]]
Sent: Thursday, February 10, 2005 5:17 PM
To: [email protected]
Subject: st: How to svyset for the SIPP
Dear Statalisters:
Can anyone give me some guidance on how to analyze the Survey of Income and
Program Participation (SIPP), especially the 1996 version, to take account
of the complex sample design? I have access to both SAS/SUDAAN and Stata.
Assuming for the moment that I use the svy commands in Stata and that I want
to do longitudinal analysis, I think I want to use the svyset command as
follows:
svyset [pweight=wpfinwgt], strata(gvarstr)
where wpfinwgt is the longitudinal weight for individuals and gvarstr is the
"variance stratum code".
I am confused about whether I can or should do anything with the options
for PSU and FPC.
According to the SIPP Manual
http://www.census.gov/apsd/techdoc/sipp/sipp96l.pdf page 8-1:
"The 1996 Panel of the SIPP sample is located in 322 Primary Sampling Units
(PSUs), each consisting of a county or a group of contiguous counties.
Within these PSUs, living quarters (LQs) were systematically selected from
lists of addresses...."
As far as I can tell, there is no SIPP variable for the PSU. The PSU code
is scrambled inside the ssuid variable (the household ID number) but I do
not think there is any way to tell which ssuids came from the same PSU.
However, from reading the Stata 8 manual at [U] 30, pp. 346-347, I wonder
if I should use the command:
svyset [pweight=wpfinwgt], strata(gvarstr) psu(ssuid) fpc(epppnum)
where epppnum is the individual person identifier within the household.
I think this is wrong, but my logic is that the SIPP samples individuals
who are "clustered" in households in which every member of the household
is interviewed. I am particularly concerned about the remark in the Stata
manual at [U] 30.2.2, p. 347: "For example if our PSUs were households and
we included every member of the household in our study, then a finite
population correction term would be appropriate where the households are
sampled using simple random sampling without replacement in each stratum."
Guidance would be much appreciated.
--
Stephen T. Mennemeyer Ph.D.
University of Alabama at Birmingham
School of Public Health
Dept. of Health Care Organization and Policy
U.S. Mail:
1530 3rd Ave. South 330 RPHB
Birmingham, Al 35294-0022
Express Delivery:
330 Ryals Public Health Building
1665 University Blvd.
Birmingham, Al 35294-0022
Phone: (205) 975-8965
FAX (205) 934-3347
e-mail: [email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/