Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: sample selection (-gsample) in stata
From
Steven Samuels <[email protected]>
To
[email protected]
Subject
Re: st: sample selection (-gsample) in stata
Date
Wed, 6 Jul 2011 18:18:18 -0500
Shikha, in Excel you have drawn a stratified sample with proportional allocation. -gsample- has drawn a sample of 30 observations per stratum (as should be clear from the -help-).
You are using the term "PPS" without understanding what it means. I've already given you my best advice, so I don't think that I will add anything more.
Steve
[email protected]
On Jul 6, 2011, at 5:32 PM, Shikha Sinha wrote:
Thanks, everyone, for the responses. I think two-stage PPS is complex.
However, to understand one-stage PPS in Stata, I still need your
input. I did it in Excel, and the results are below:
City            No. of companies   Prob (no. of companies/1397)   Number to be selected (300*prob)
Central         135                0.10                           29
Copperbelt      184                0.13                           40
Eastern         173                0.12                           37
Luapula         136                0.10                           29
Lusaka          87                 0.06                           19
North Western   130                0.09                           28
Northern        173                0.12                           37
Southern        231                0.17                           50
Western         148                0.11                           32
Total           1397               1                              300
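The arithmetic in the table above can be checked in Stata. The following is a minimal sketch using only built-in commands; the variable names (companies, frame_N, prob, n_h, n_total) are illustrative assumptions, not names from the original dataset. Note that rounding each stratum independently does not guarantee the target total: the rounded counts listed above add up to 301 rather than 300.

```stata
* Sketch: proportional allocation of n = 300 across the nine strata.
* Variable names here are illustrative assumptions.
clear
input str13 City companies
"Central"        135
"Copperbelt"     184
"Eastern"        173
"Luapula"        136
"Lusaka"          87
"North Western"  130
"Northern"       173
"Southern"       231
"Western"        148
end

* Frame size (1397) and each stratum's share of the frame
egen frame_N = total(companies)
generate prob = companies / frame_N

* Rounded stratum sample sizes and their sum
generate n_h = round(300 * prob)
egen n_total = total(n_h)

list City companies prob n_h, sep(0) noobs
display "Allocated sample size: " n_total[1]
```

If exactly 300 is required, one stratum's size would have to be adjusted by hand (for example, the one whose unrounded value is furthest below its rounded value).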
This is what I meant by PPS. From the sampling frame of 1397 companies
in 9 cities, I want to draw a random sample of 300 companies based on
PPS. Do you think I am doing it right in Excel?
Next, I tried to generate the same in Stata using -gsample-:
. bys City: gen freq = _N
. g pps=freq/1397
. gsample 30 [aw=pps], wor strata( pid)
(1127 observations deleted)
. tab Province
City            Freq.   Percent   Cum.
Central         30      11.11     11.11
Copperbelt      30      11.11     22.22
Eastern         30      11.11     33.33
Luapula         30      11.11     44.44
Lusaka          30      11.11     55.56
North Western   30      11.11     66.67
Northern        30      11.11     77.78
Southern        30      11.11     88.89
Western         30      11.11     100.00
Total           270     100.00
The Stata output is different from the Excel output. -gsample- draws 30
observations from each City, so how can it be based on PPS? Could you suggest
the right code using -gsample- to generate the Excel output? Or can
I use -samplepps-, and what would the code for that be?
Thanks,
Shikha
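For readers following the thread, here is one way the Excel allocation (a stratified sample with proportional allocation) could be drawn with built-in Stata commands. This is a sketch under stated assumptions, not the -gsample- syntax asked about: the frame is rebuilt synthetically from the stratum counts with -expand-, and the names N_h, n_h, u, and pw are hypothetical. With a real frame (one row per company and a City variable), only the lines after -set seed- would apply.

```stata
* Sketch: stratified simple random sample without replacement,
* with stratum sizes proportional to the number of companies.
clear
input str13 City companies
"Central"        135
"Copperbelt"     184
"Eastern"        173
"Luapula"        136
"Lusaka"          87
"North Western"  130
"Northern"       173
"Southern"       231
"Western"        148
end

* Synthetic frame: one row per company (1397 rows in total)
expand companies

* Arbitrary seed, chosen only so the draw is reproducible
set seed 20110706

* Stratum population size and rounded proportional sample size
bysort City: generate N_h = _N
generate n_h = round(300 * N_h / _N)

* Random order within each stratum; keep the first n_h companies
generate double u = runiform()
bysort City (u): keep if _n <= n_h

* Sampling weight: inverse of the within-stratum sampling fraction
generate double pw = N_h / n_h

tabulate City
```

Because every company in a stratum has the same chance of selection, this is stratified simple random sampling; the stratum sizes are proportional to size, but individual units are not selected with probability proportional to a size measure.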
On Wed, Jul 6, 2011 at 12:30 PM, Stas Kolenikov <[email protected]> wrote:
> On Tue, Jul 5, 2011 at 4:48 PM, Shikha Sinha <[email protected]> wrote:
>> -gsample- looks good, but I am still struggling. How do I calculate the
>> size for -gsample-? I want to select companies from each city and of
>> each type in each city.
>
> -gsample- will only produce appropriate PPS samples if you specify
> sampling with replacement (which is the approximation you would have
> to make at the analysis stage, anyway). PPS sampling without
> replacement is far more complicated, and if the phrase "Rao-Sampford
> algorithm" does not ring a bell, you will end up with wrong sampling
> weights.
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
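To illustrate what PPS sampling with replacement means in the sense Stas Kolenikov describes above, the sketch below selects cities (not companies) with probability proportional to their number of companies, using the cumulative-total method and built-in commands only. It is not -gsample- or -samplepps- syntax. The number of draws (3), the seed, and the names cum, hits, p_draw, and u_draw are assumptions made for illustration.

```stata
* Sketch: PPS sampling WITH replacement of cities, size = companies.
clear
input str13 City companies
"Central"        135
"Copperbelt"     184
"Eastern"        173
"Luapula"        136
"Lusaka"          87
"North Western"  130
"Northern"       173
"Southern"       231
"Western"        148
end

* Running total of the size measure; the last value is the frame size
generate double cum = sum(companies)
generate hits = 0

* Arbitrary seed, chosen only so the draws are reproducible
set seed 2011

* Each draw picks the city whose interval (cum - companies, cum]
* contains a uniform random point on [0, 1397)
forvalues d = 1/3 {
    scalar u_draw = runiform() * cum[_N]
    quietly replace hits = hits + 1 if u_draw >= cum - companies & u_draw < cum
}

* Per-draw selection probability of each city
generate double p_draw = companies / cum[_N]

list City companies p_draw hits, sep(0) noobs
```

A city can be hit more than once, which is what makes the selection probabilities, and hence the weights, straightforward; sampling without replacement does not have this property.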