
RE: RE: st: RE: Random assignment

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	RE: RE: st: RE: Random assignment
Date	Mon, 5 Jan 2004 12:44:47 -0000

Richard Williams
> 
> So, when you give the command, gen y=uniform()<=0.6, about 60% of
> the time the statement will be true, returning a value of 1; about
> 40% of the time it will be false, returning a value of 0.
> 
> So, the result is that y is a 0-1 dichotomy with a mean of around .6
> (i.e. about 60% of the cases are coded 1, the rest are 0).  It won't
> be exactly .6 because of random variability; the bigger your sample
> size is, the more likely it is to be about .6.
> 
> In the original problem the author had 50 missing cases that he
> wanted to replace with 0s and 1s, proportionate to the numbers of 0s
> and 1s that were present in the nonmissing data.  So if, say, 60% of
> the nonmissing cases were 1s, about 60% of the missing cases could be
> randomly assigned a value of 1 while the rest were assigned 0s with
> the above command.
> 
> This is a nice way to create a dichotomy, but it might be nicer still
> to have a command something like gen y = dichot(.6), which would mean
> create a 0-1 dichotomy where 60% of the cases are 1s.  I suppose it
> would be easy enough to create such a command yourself if you had
> frequent need for it.
> 

I am not sure that we need -dichot()- given other 
functions. Getting exactly 60% of the values to equal
1 is easy enough if 0.6 * _N is an exact integer, but 
naturally that's not true in general. 
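
As a rough sketch of the exact-proportion case (not a 
definitive recipe): put the observations in random order 
and code the first round(0.6 * _N) of them as 1. The 
variable names u and y, the 50 observations and the seed 
are assumptions made up for illustration; rounding is one 
arbitrary way to handle a 0.6 * _N that is not an integer.

```stata
* Illustrative sketch only: u, y, the seed and the 50 observations are made up
clear
set obs 50
* arbitrary seed so that the result can be reproduced
set seed 20040105
* one uniform draw per observation, then sort to get a random order
gen double u = uniform()
sort u
* the first round(0.6 * _N) observations are coded 1, the rest 0
gen byte y = _n <= round(0.6 * _N)
tab y
```

Unlike -gen y = uniform() <= 0.6-, which fixes only the 
expected proportion of 1s, this fixes the count of 1s at 
round(0.6 * _N) and leaves to chance only which 
observations receive them.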

Also, for problems like this, check out the FAQ 
http://www.stata.com/support/faqs/data/trueorfalse.html

and the -egen- function -rndsub()- in -egenmore- from 
SSC. 

Nick 
[email protected] 

