st: RE: Random Sample Selection in Panel Data
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: Random Sample Selection in Panel Data
Date: Fri, 13 May 2011 14:57:54 +0100
One way to tackle this is to perform the sample selection on a dataset with just one observation per identifier, and then -merge- the result with the main dataset.
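A minimal sketch of the -merge- route, assuming the panel is in memory with an identifier variable named id. The variable names (id, rannum, grp), the seed, and the tempfile name are illustrative assumptions, not taken from the original question; runiform() is the current name for the older uniform().

```stata
* Sketch only: assumes the panel is in memory with an identifier variable id
set seed 12345                        // arbitrary seed, for reproducibility

preserve
keep id
duplicates drop                       // reduce to one observation per identifier
sort id                               // fixed order, so the seed gives the same draw each run
generate double rannum = runiform()
egen grp = cut(rannum), group(4)      // four roughly equal groups, coded 0 to 3
drop rannum
tempfile groups
save `groups'
restore

* attach each identifier's group to all of its observations
merge m:1 id using `groups', nogenerate
```

Because the random draw is made once per identifier, every year of a given id necessarily ends up in the same group.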
Equivalently, tag just one observation per identifier, sample within that subset, and then expand to include all observations for each identifier. -egen, max()- is one way to do the expansion.
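The tagging route might look like the following sketch, under the same assumptions (a panel in memory with variables id and year; tag, rannum, grp1 and grp are hypothetical names). Here the four groups are formed by hand from the sort order of the random numbers rather than with -egen, cut()-, so that the treatment of untagged observations is explicit.

```stata
* Sketch only: assumes variables id and year; all new variable names are illustrative
set seed 12345                                  // arbitrary seed, for reproducibility
sort id year                                    // fixed starting order

egen byte tag = tag(id)                         // 1 for exactly one observation per id, else 0
generate double rannum = runiform() if tag      // random number for tagged observations only

sort rannum                                     // missings sort last, so tagged observations come first
count if tag
generate byte grp1 = ceil(4 * _n / r(N)) if tag // groups 1 to 4 among tagged observations

egen grp = max(grp1), by(id)                    // spread the group to every observation of the id
drop tag rannum grp1
sort id year
```

The groups are coded 1 to 4 here, whereas -egen, cut()- with group(4) codes them 0 to 3.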
In fact
. search sample, faq
shows that this is an FAQ, and that you could have identified relevant material directly within Stata, e.g.
FAQ . . . . . . . . . . . . . . . . . . Sampling clusters, not individuals
. . . . . . . . . . . . . . . . . . . . . . N. J. Cox and S. Merryman
5/06 How can I sample clusters, not individuals?
http://www.stata.com/support/faqs/data/sampleby.html
Nick
[email protected]
Dennis Kramer
I have a large panel data set (4 years, 250,000+ records per year)
and I want to generate four random sample groups to test the stability
of the estimates. However, I want to ensure that if an ID is selected
in Year 1, then it is also selected into the same random
sample for Years 2, 3, and 4.
I know for a cross-sectional random sampling the code is as follows:
generate rannum = uniform()
egen grp2 = cut(rannum), group(4)
Does anyone have any insight into modifying the above syntax so that
an ID's observations for Years 2, 3, and 4 are automatically placed in
the same sample as its Year 1 observation?