Hello all,
I'd like to get a simple random sample of X% (weighted) of different subsamples of my data without replacement and without dropping the observations that were not selected.
For example, using the auto dataset, I'd like to create a new variable called "sample" which is equal to 1 for a randomly selected 75% of foreign cars and 75% of domestic cars (each share weighted by the variable weight), and equal to 0 for all other cars.
Here's my current attempt, which requires -xtile2- (SSC). Does anyone know if there is a way to do this in one line?
**** Start Code *****
sysuse auto
set seed 4635
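* Draw a uniform random number for each car
gen random = runiform()
* Weighted quartiles of the random number within each origin group
xtile2 rank = random [aw=weight], nq(4) by(foreign)
*Pick 75% for sample: quartiles 1-3 are selected, quartile 4 is not
gen sample = (rank<4)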
****** End Code *********
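To check the result, here is a rough sketch that computes the weighted share of each group that ended up in the sample. It assumes the code above has already been run. The names wsel, wtot and share are only illustrative and are not part of the attempt above. The shares should come out near 0.75 but not exactly, since whole cars are selected.

```stata
* Total weight of the selected cars within each origin group
egen wsel = total(weight * sample), by(foreign)
* Total weight of all cars within each origin group
egen wtot = total(weight), by(foreign)
* Weighted share selected (constant within each group)
gen share = wsel / wtot
tabstat share, by(foreign) statistics(mean)
```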
Thanks for your consideration.
Howie
Howie Lempel
Research Assistant
The Brookings Institution | Economic Studies
1775 Massachusetts Ave NW | Washington DC 20036
[email protected] | p: (202) 238-3576