
st: Dealing with survey data when the entire population is also in the dataset

From	Margo Schlanger <[email protected]>
To	[email protected]
Subject	st: Dealing with survey data when the entire population is also in the dataset
Date	Fri, 24 Jul 2009 19:06:33 -0400

Hi --

I have a dataset in which the observation is a "case".  I started with
a complete census of the ~4000 relevant cases; each of them gets a
line in my dataset.  I have data filling a few variables about each of
them.  (When they were filed, where they were filed, the type of
outcome, etc.)

I randomly sampled them using 3 strata (for one stratum, the sampling
probability was 1, for another about .5, and for a third, about .75).
I ended up with a sample of about 2000.  I know much more about this
sample.
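
To make the design concrete, here is a minimal sketch of how sampling
weights of this kind could be built, taking each weight as the inverse
of the stratum's realized sampling fraction.  The data are simulated,
and the variable names (stratum, sample, frac, pw), the stratum sizes,
and the seed are all assumptions made for illustration, not the actual
dataset.

```stata
* Hypothetical stand-in data: 4,000 cases in 3 strata (sizes are made up).
clear
set seed 12345
set obs 4000
generate byte stratum = 1 + (_n > 1000) + (_n > 2500)

* Assumed sampling probabilities of 1, .5, and .75 (illustrative only).
generate double p = cond(stratum == 1, 1, cond(stratum == 2, 0.5, 0.75))
generate byte sample = runiform() < p

* Realized sampling fraction within each stratum.
bysort stratum: egen double frac = mean(sample)

* Sampling weight = 1/fraction, defined for sampled cases only;
* non-sampled cases are left with a missing weight.
generate double pw = 1/frac if sample == 1

* Check: the weights should sum to the population size (4,000 here).
summarize pw if sample == 1
display r(sum)
```

With weights defined this way, each sampled case stands in for 1/frac
cases from its stratum, so the stratum sampled with probability 1 gets
a weight of exactly 1.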

OK, my questions:

1) How do I use the svyset command to describe this dataset?  It would
be easy if I just dropped all the non-sampled observations, but I
don't want to do that, because of question 2:

2) How do I compare something about the sample to the entire
population, just to demonstrate that my sample isn't very different
from that entire population on any of the few variables I actually
have comprehensive data about?  I could do this simply, if I didn't
have to worry about weighting:

tabulate year sample, chi2

But I need the weights.  In addition, I can't simply use weighting
commands, because in the population (when sample == 0), everything
should be weighted the same; the weights apply only to my sample (when
sample == 1).  And I can't (so far) use survey commands, because I
don't know the answer to (1), above.

NOTE: Nearly all the variables I care about are categorical:  year of
filing, type of case.  But it's easy enough to turn them into dummies,
if that's useful.


Thanks for any help with this.

Margo Schlanger

______________________
Professor of Law
University of Michigan Law School
Director, Civil Rights Litigation Clearinghouse
(http://clearinghouse.wustl.edu)


