st: Question about svyset command
I'm a beginner Stata user and have a question about the svyset command
in Stata that I hope someone can help me with.
For some background, I'm fitting a logistic regression model that
examines the likelihood of either a plaintiff or defendant filing a
post trial motion. The database I'm working with is the Civil Justice
Survey of State Courts (CJSSC). The CJSSC provides case level data for
all tort, contract, and real property trials concluded in a sample of 46
of the nation's 75 most populous counties in 2005. Data are collected
on about 8,000 trials in these 46 counties which are weighted to
represent about 10,500 trials concluded in the nation's 75 most
populous counties. I understand that one of the nice features of Stata
is that it allows you to take into account the sampling structure of a
dataset when doing logistic regression modeling. Here is the Stata code
that I used to take into account the sampling structure of these civil
trial data:
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1) || su2, fpc(fpc2)
where
sitecode = County where the civil trial took place
bwgt0 = Weights that scale the data from the 46 sampled counties to the
75 most populous counties
strata = Stratum in which each county is located. The dataset has 5 strata
fpc1 = The probability of a county appearing in the sample. For
example, a county with a weight of 2 would have a 50% probability of
appearing in the sample
su2 = Unique identifier for the trials that occurred in
each of the 46 counties
fpc2 = 1 for all 8,000 trials disposed in the 46 counties. I gave fpc2
a value of 1 because I wanted to tell Stata that the trials had a 100%
probability of showing up in these 46 counties.
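As a minimal sketch of how the declaration above could be checked, the lines below re-issue the same svyset command and then ask Stata to report how it has read the design. It assumes the dataset is already in memory with the variables named as in the list above. svydescribe is a standard Stata command, but what it reports for these data is not shown here.

```stata
* Declare the two-stage design on one logical line (same command as above)
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1) || su2, fpc(fpc2)

* Typing svyset with no arguments redisplays the current survey settings
svyset

* Summarize the declared design: strata, sampled units, and observations
svydescribe
```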
I think that I got the part of this programming that deals with the
first level of the sample design correct. It’s the second level that
I’m having some problems with. At the second level of the sample design,
I'm trying to correct for the fact that I have data for every civil
trial concluded in the 46 counties. Basically, I want to tell Stata
that part of this sample is actually a census of all trials concluded
in the 46 counties in 2005. I understand Stata has a finite population
correction option, fpc(), that takes into account the census-like format
of these data. The logistic regression results were the same irrespective
of whether I declared only the 1st stage or both the 1st and 2nd stages
of the sample design. I think this is telling me that Stata is not
correcting for the census-like aspect of this sample. Can anyone give me
some guidance as to whether I'm correctly taking into account the
sampling structure of these data?
In particular, I would like to know whether I'm using the fpc2 factor
correctly. Any assistance you could give on this matter would be very
much appreciated.
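The comparison described above could be reproduced roughly as follows. This is only a sketch: it assumes the comparison was between a one-stage and a two-stage declaration, and the outcome motion and the predictors x1 and x2 are hypothetical placeholders, not variables from the CJSSC file.

```stata
* One-stage declaration: counties as sampling units, first-stage FPC only
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1)
svy: logit motion x1 x2          // hypothetical outcome and predictors
estimates store one_stage

* Two-stage declaration: add trials (su2) with fpc2 = 1 at the second stage
svyset sitecode [pweight=bwgt0], strata(strata) fpc(fpc1) || su2, fpc(fpc2)
svy: logit motion x1 x2
estimates store two_stage

* Show coefficients and standard errors from both fits side by side
estimates table one_stage two_stage, se
```

The point estimates depend only on the weights, so any difference between the two declarations would appear in the standard errors.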
Thanks
Thomas Cohen
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/