Richard Williams
>
> So, when you give the command, gen y=uniform()<=0.6, about 60% of
> the time the statement will be true, returning a value of 1; about
> 40% of the time it will be false, returning a value of 0.
>
> So, the result is that y is a 0-1 dichotomy with a mean of around .6
> (i.e. about 60% of the cases are coded 1, the rest are 0). It won't
> be exactly .6 because of random variability; the bigger your sample
> size is, the more likely it is to be about .6.
>
> In the original problem the author had 50 missing cases that he
> wanted to replace with 0s and 1s, proportionate to the numbers of 0s
> and 1s that were present in the nonmissing data. So if, say, 60% of
> the nonmissing cases were 1s, about 60% of the missing cases could be
> randomly assigned a value of 1 while the rest were assigned 0s with
> the above command.
>
> This is a nice way to create a dichotomy, but it might be nicer still
> to have a command something like gen y = dichot(.6), which would mean
> create a 0-1 dichotomy where 60% of the cases are 1s. I suppose it
> would be easy enough to create such a command yourself if you had
> frequent need for it.
>
I am not sure that we need -dichot()- given other
functions. Getting exactly 60% of the values to equal
1 is easy enough if 0.6 * _N is an exact integer, but
naturally that's not true in general.
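
As a rough illustration of the exact case, here is a minimal
sketch. It assumes 50 observations, so that 0.6 * _N is 30, and it
uses a hypothetical helper variable u to put the observations in
random order; round() guards against floating-point noise and gives
the nearest achievable count when 0.6 * _N is not an integer.

```stata
* Sketch only: exactly 60% ones when 0.6 * _N is an integer.
* The variable names u and y are made up for this example.
clear
set seed 12345
set obs 50
gen double u = uniform()            // runiform() in newer Stata
sort u                              // put observations in random order
gen byte y = _n <= round(0.6 * _N)  // first 30 of 50 get 1, the rest 0
count if y == 1
```
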
Also, for problems like this, check out the FAQ
http://www.stata.com/support/faqs/data/trueorfalse.html
and the -egen- function -rndsub()- in -egenmore- from
SSC.
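
For the original problem, filling in the missing values in
proportion to the observed 0s and 1s could be sketched as below.
This is only an illustration: the variable x and the simulated data
are made up, and the observed proportion is taken from r(mean) left
behind by -summarize-. The proportion among the filled-in cases will
be about, not exactly, the observed one.

```stata
* Sketch only: x is a hypothetical 0/1 variable with missing values.
clear
set seed 12345
set obs 200
gen x = uniform() <= 0.6            // runiform() in newer Stata
replace x = . in 1/50               // make 50 cases missing, for illustration

summarize x, meanonly               // r(mean) = share of 1s among nonmissing
local p = r(mean)
replace x = uniform() <= `p' if missing(x)
tabulate x
```
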
Nick