Dear Susan,
I think the trick is to divide the sampling process into two steps.
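As background on why one call to sample is not enough: without the count option the number is read as a percentage, and by() only draws separately within each group, so every cluster survives either way. Here is a minimal sketch on made-up data; the layout (100 clusters of 100 observations) is assumed from your description, and the seed is arbitrary.

```stata
* sketch: percentage versus count in -sample- (synthetic data, assumed layout)
clear
* arbitrary seed so that the draws are reproducible
set seed 2010
set obs 10000
generate cluster_id = ceil(_n/100)

* without -count-, 10 means 10 percent of each cluster
preserve
sample 10, by(cluster_id)
* every cluster is still present
tab cluster_id
restore

* with -count-, 10 means 10 observations per cluster
sample 10, count by(cluster_id)
* again every cluster is still present
tab cluster_id
```

In both cases by() treats the clusters as strata rather than as units to be drawn, which is why the clusters themselves have to be sampled in a separate step.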
First, let's create an example dataset to work with...
* create example dataset
webuse highschool, clear
keep state id height
save example, replace
Now let's sample 20 states (clusters) from this dataset....
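use example, clear
* get one obs per state; keep only state so that the merge below
* does not copy this row's id and height onto every person in the state
keep state
duplicates drop state, force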
* sample 20 states
sample 20, count
Now that we have 20 states sampled, let's merge them with the original
dataset and keep just the matching observations. This gives us all of
the persons from the 20 states.
merge 1:m state using example
keep if _merge == 3
* show that there are 20 states
tab state
Now let's sample 10 per state from within the 20 states.
* now sample 10 per state
sample 10, count by(state)
* show that there are 20 states, 10 per state
tab state
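For data laid out like yours, the same two steps might look like the sketch below. It builds a synthetic dataset first so that it runs on its own; the variable names come from your output, while the number of clusters drawn (5), the seed, and the uniform coordinates are only assumptions for illustration.

```stata
* sketch: two-stage sampling on synthetic clustered data (assumed layout)
clear
* arbitrary seed so that the draws are reproducible
set seed 12345
set obs 10000
generate cluster_id = ceil(_n/100)
generate xcoord = 100*runiform()
generate ycoord = 100*runiform()
tempfile full
save `full'

* stage 1: one row per cluster, then draw 5 clusters (5 is an assumption)
keep cluster_id
duplicates drop cluster_id, force
sample 5, count

* stage 2: bring back all observations of the drawn clusters,
* then draw 10 observations within each of them
merge 1:m cluster_id using `full', keep(match) nogenerate
sample 10, count by(cluster_id)

* show that there are 5 clusters, 10 per cluster
tab cluster_id
```

With your real data you would skip the part that generates the data and save your own file in place of the temporary one.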
I hope that this helps!
Best regards,
Michael N. Mitchell
See the Stata tidbit of the week at...
http://www.MichaelNormanMitchell.com
Visit me on Facebook at...
http://www.facebook.com/MichaelNormanMitchell
On 2010-02-12 12.22 AM, Susan Olivia wrote:
Dear Stata listers,
I am wondering whether it is possible to randomly draw a few
clusters from a clustered sample in Stata?
Say my full sample consists of 10,000 observations (with 100
clusters and each cluster has 100 observations).
I want to randomly draw a few clusters with 10 observations in
each cluster. I tried using the 'sample' command, but it
is not doing what I am after. Below is my attempt; it still
gave me 100 clusters.
Any advice on this, much appreciated.
Thanks,
Susan
. summ
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      xcoord |     10000    48.35506    27.67569  -1.426747   100.8945
      ycoord |     10000    47.60003    27.35285  -.4297747   99.15934
  cluster_id |     10000        50.5    28.86751          1        100
. preserve
. sample 10, by(cluster_id)
(9000 observations deleted)
. summ
    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      xcoord |      1000    48.36142    27.67134   -.768847   100.4548
      ycoord |      1000    47.58521    27.34904   .2530959    98.3428
  cluster_id |      1000        50.5    28.88051          1        100
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/