
RE: st: RE: sampling problem

From	"join allfish" <[email protected]>
To	[email protected]
Subject	RE: st: RE: sampling problem
Date	Wed, 13 Jun 2007 11:31:13 +0000

Dear Nick,
Thanks for this suggestion. I did think of doing this. The problem is that I have other variables, far more complicated and with many more values, which I also want to use for the counterfactuals. I was hoping there might be a program that could help, or at least some shortcut I could use.
Thanks,
John

From: "Nick Cox" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: RE: sampling problem
Date: Wed, 13 Jun 2007 11:50:03 +0100

Focusing on this (typos corrected)

I want to draw individuals from 2007 according to the distribution
of health in 1985 so I draw individuals
with health=1 with prob=0.4,
health=2 with prob=0,
health=4 with prob=0.1
and health=5 with prob=0.5
(where the probabilities come from the health1985 distribution).

You can work out the subsample sizes from your desired total
sample size. Suppose you want a sample of 1000, so that the three
groups need 400, 100 and 500 draws:

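* health == 1: 40% of 1000
use mydata, clear
bsample 400 if health == 1
save cfsample, replace

* health == 4: 10% of 1000
use mydata, clear
bsample 100 if health == 4
append using cfsample
* save again, otherwise the next -use- discards these draws
save cfsample, replace

* health == 5: 50% of 1000; memory then holds the full sample
use mydata, clear
bsample 500 if health == 5
append using cfsample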
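
The same per-category draws can be written as a loop, which may be less tedious when a variable takes many values. This is only a minimal sketch, not a tested program. It assumes the 2007 data are in mydata.dta with a variable named health, and it uses the shares quoted above. The local macro names (N, levels, shares) are illustrative. As far as I understand bsample, the requested size cannot exceed the number of observations available in a category, so this needs a real-sized dataset rather than a ten-row toy example.

```stata
* Sketch: counterfactual sample of total size N with assumed target shares
local N 1000
local levels 1 4 5          // health categories to draw from
local shares 0.4 0.1 0.5    // target share for each category, same order

tempfile cfsample
local first 1
local i 0
foreach h of local levels {
    local ++i
    local p : word `i' of `shares'
    local n = round(`N' * `p')

    * draw n observations with replacement from category h
    use mydata, clear
    bsample `n' if health == `h'

    * accumulate the draws across categories
    if !`first' {
        append using `cfsample'
    }
    save `cfsample', replace
    local first 0
}
* the data in memory now hold the combined sample
```

Categories with a target share of zero are simply left out of the list, and because of rounding the subsample sizes need not sum exactly to N for arbitrary shares.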

I would be happy to learn of a smarter solution. Naturally
you need do nothing about outcomes not to be included
in your sample. I can't comment on the status of samples
like this. Bootstrap experts may be able to help further.

Nick
[email protected]

join allfish (a.k.a. John)

> I want to sample data on the basis of counterfactuals - so what would
> the distribution of income in 2007 look like if individuals had the
> distribution of health of 1985.
>
> So imagine I have the following data
>
> id   income2007   health2007   health1985    wgt1985
>  1       10           1            1           65.38
>  2       10           1            1          153.91
>  3       20           1            1          458.34
>  4       20           1            1          484.2
>  5       40           2            1          906.1
>  6       40           2            4          943.96
>  7       60           4            5         1176.87
>  8       60           4            5         1389.91
>  9      100           5            5         1716.93
> 10      100           5            5         4067.68
>
> where wgt1985 holds the sampling weights for the 1985 data (I also
> have sampling weights for the 2007 data). The order of the 1985 data
> makes no difference to the 2007 data; it is just pasted in to obtain
> the health distribution.
> What I want to do is sample from the 2007 data to make the
> distribution of health in 2007 look like that in 1985. So I want to
> draw individuals from 2007 according to the distribution of health in
> 1985, so I draw individuals with health=1 with prob=0.4, health=2 with
> prob=0, health=4 with prob=0.1 and health=5 with prob=0.5 (where the
> probabilities come from the health1985 distribution). This should give
> me a hypothetical distribution of income in 2007 if the distribution
> of health was as in 1985.
> I cannot see how to do this with the bsample command. Further, I am
> not sure how to incorporate the sampling weights to ensure that my
> samples correctly represent the population distributions.
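
On the weights: one way to get the 1985 target shares is to sum the 1985 sampling weights within each health category. This is a sketch only. It assumes the variables health1985 and wgt1985 are in memory as in the listing above, and it addresses only the 1985 side. It does not use the 2007 weights in the draws themselves.

```stata
* Sketch: weighted share of each 1985 health category
quietly summarize wgt1985 if !missing(health1985)
local total = r(sum)

levelsof health1985, local(levels)
foreach h of local levels {
    * sum of weights in category h, divided by the overall sum
    quietly summarize wgt1985 if health1985 == `h'
    display "health1985 = `h'   weighted share = " %6.4f r(sum)/`total'
}
```

The displayed shares could then be used as the target proportions when working out the subsample sizes.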

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


