Re: st: Randomly picking observations based on a certain condition
From: Andrew Dyck <[email protected]>
To: [email protected]
Subject: Re: st: Randomly picking observations based on a certain condition
Date: Wed, 13 Apr 2011 15:16:36 -0700
If, after considering the comments from Nick and J, you still wish to
proceed with your analysis as you initially stated it, I think the
following should work. Here I create some sample data with 50
observations and 5 groups (quintiles). See whether this works for your
data, given how I understood your question. I use a cutoff of 10 adults
instead of 100 to keep the dataset small.
* sample data: 50 households in 5 groups (quintiles)
clear
* arbitrary seed so the random draws can be reproduced
set seed 12345
set obs 50
egen group = seq(), from(1) to(5)
* number of adults per household, an integer from 1 to 5
gen adults = 1 + floor(runiform()*5)
* random variable for sorting
gen r = runiform()
* running total of adults, in random order within each group
bysort group (r): gen cumul_adults = sum(adults)
drop r
* keep households while the running total is at or below the cutoff;
* the kept total can fall short of 10 if the next household overshoots
keep if cumul_adults <= 10
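If the goal is to assign non-beneficiary status rather than to drop households, a variation along the following lines might be a starting point. This is only a sketch under my own assumptions: the variable names total_adults, cutoff, non_beneficiary and flagged_adults are hypothetical, the seed is arbitrary, and the 10 percent share stands in for the 1 percent in the question only so that the toy data give a usable target.

```stata
* hypothetical illustration: flag households instead of dropping them
clear
set seed 12345
set obs 50
egen group = seq(), from(1) to(5)
gen adults = 1 + floor(runiform()*5)

* cutoff per group: an assumed 10 percent of the group's adults
bysort group: egen total_adults = total(adults)
gen cutoff = 0.10 * total_adults

* random order within group, then running total of adults
gen r = runiform()
bysort group (r): gen cumul_adults = sum(adults)
drop r

* flag households while the running total stays within the cutoff
gen byte non_beneficiary = cumul_adults <= cutoff

* compare the flagged adults per group with the cutoff
bysort group: egen flagged_adults = total(adults * non_beneficiary)
tabstat total_adults cutoff flagged_adults, by(group) statistics(mean)
```

As with the cutoff of 10 above, the flagged adults can fall short of the target when the next household in the random order would overshoot it.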
Good luck,
Andrew
On Wed, Apr 13, 2011 at 2:02 PM, Nikhil Srivastava
<[email protected]> wrote:
>
> I am not trying to actually sample households. As I wrote in my reply
> to Nick, I am trying to look at the effectiveness of a transfer program
> targeted to adults of a household which has a certain exclusion error.
> The exclusion error that we are assuming is that 1 percent of eligible
> participants within each expenditure quintile do not receive the
> benefits. In my sample within the first quintile 1 percent of the
> total adults comes to around 100. Thus for the first quintile I need
> to randomly assign non-beneficiary status to households so that the
> total number of adults for these households comes to 100. Similarly I
> have to pick randomly 1 percent of adults for each quintile and assign
> them non-beneficiary status. In my previous mail I used the number 100
> as an example. Thanks
>
> Nikhil
>
> On Wed, Apr 13, 2011 at 1:06 PM, Joerg Luedicke
> <[email protected]> wrote:
> > On Wed, Apr 13, 2011 at 3:17 PM, Nikhil Srivastava
> > <[email protected]> wrote:
> >> Hi,
> >>
> >> I have a dataset at the household level which contains the expenditure
> >> details of a sample of households. The dataset also records the number
> >> of adults within each household. I have divided this dataset into 5
> >> quintiles based on the level of expenditure. Now I need to randomly
> >> select a set of observations within each quintile so that the sum of
> >> the adults for those observations comes to 100. Could somebody please
> >> help me write code for this part?
> >>
> >> I would really appreciate any help in this regard. Thanks
> >
> > Do I understand this right: you want to sample households, and within
> > each quintile of household expenditure, the number of household
> > members among sampled households is supposed to add up to 100? Why
> > would you do that? Why not just take a random sample of households,
> > or a stratified sample with respect to household size if that is a
> > concern? That way, you would at least have a clear picture of the
> > population you are targeting, whereas in the other case, this picture
> > becomes pretty blurry, no?
> >
> > J.
> > *
> > * For searches and help try:
> > * http://www.stata.com/help.cgi?search
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
> >
>