Barth,
I am not sure I understood exactly what you are looking for. If you want to
generate two independent, normally distributed variables and, at the same
time, specify an effect size, then once the pooled standard deviation is
fixed the only thing you can vary is the difference between the means
(mu1-mu2).
For example, suppose you want to simulate two normally distributed
variables, whose effect size [based on your conjecture] is
((mu1-mu2)/sd_pooled) = 0.2 with a pooled SD of 2.
d = mu1-mu2 = sd_pooled*effect_size
d = 2*0.2 = 0.4
Since both groups have the same size and the same SD, sd_pooled is simply
that common SD (sd_pooled == sd of mu1 == sd of mu2).
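The arithmetic above can be sketched in Stata as follows. The scalar names (target_es, target_sd, delta, mean1, mean2) are hypothetical names chosen for illustration, and 119.6 is just one arbitrary choice for the second mean:

```stata
* Sketch: derive the mean difference implied by a target effect size
scalar target_es = 0.2             // desired effect size
scalar target_sd = 2               // desired pooled SD
scalar delta = target_sd*target_es // required difference mu1-mu2
scalar mean2 = 119.6               // any value works for the second mean
scalar mean1 = mean2 + delta       // first mean follows from delta
display "delta = " delta "   mean1 = " mean1 "   mean2 = " mean2
```

Any pair of means that differ by delta gives the same population effect size.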
*/----- example 1 -----------------
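clear
* set memory is only needed in older Stata (before version 12)
set memory 600m
set obs 1000000
* population SDs of the two groups
scalar sd1 = 2
scalar sd2 = 2
matrix SDS = (sd1,sd2)
* two independent normal variables whose means differ by 0.4
drawnorm mu1 mu2, means(120 119.6) sds(SDS)
* pooled SD for two groups of _N observations each
scalar sd_pooled = sqrt((((_N-1)*(sd1^2))+((_N-1)*(sd2^2)))/(_N+_N-2))
* standardized difference per observation; its mean estimates the effect size
generate effect_size = (mu1-mu2)/sd_pooled
summarize effect_size, detail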
*/----- example 1 -----------------
*/----- example 2 -----------------
clear
set memory 600m
set obs 1000000
scalar sd1 = 2
scalar sd2 = 2
matrix SDS = (sd1,sd2)
* different means than in example 1, but the same difference of 0.4
drawnorm mu1 mu2, means(-0.6 -1) sds(SDS)
scalar sd_pooled = sqrt((((_N-1)*(sd1^2))+((_N-1)*(sd2^2)))/(_N+_N-2))
generate effect_size = (mu1-mu2)/sd_pooled
summarize effect_size, detail
*/----- example 2 -----------------
In the examples above, the particular values of mu1 and mu2 do not matter,
as long as their difference is 0.4.
If you want effect_size == d, sd_pooled must be equal to 1. If, in
addition, you set one of the means to zero, the other mean is itself the
effect size.
Furthermore, check whether you are in fact already simulating the data
correctly and are simply seeing sampling variation. Because you draw a
sample rather than work with the population, the effect size you estimate
will rarely be exactly equal to the population effect size that you
specified.
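A minimal sketch of that sampling variation, assuming a hypothetical 50 observations per group and 1,000 replications. The program name sim_d, the seed, and these sizes are my own choices for illustration, not something taken from your setup:

```stata
* Sketch: how much the estimated effect size varies from sample to sample
clear
set seed 12345
capture program drop sim_d
program define sim_d, rclass
    drop _all
    set obs 50                          // hypothetical group size
    generate x1 = rnormal(120, 2)       // group 1: mean 120, SD 2
    generate x2 = rnormal(119.6, 2)     // group 2: mean 119.6, SD 2
    quietly summarize x1
    local m1 = r(mean)
    local v1 = r(Var)
    quietly summarize x2
    local m2 = r(mean)
    local v2 = r(Var)
    * equal group sizes, so the pooled variance is the mean of the two variances
    return scalar d = (`m1' - `m2')/sqrt((`v1' + `v2')/2)
end
* repeat the draw 1,000 times and collect the estimated effect sizes
simulate d = r(d), reps(1000) nodots: sim_d
summarize d, detail
```

With samples this small the estimated effect sizes scatter widely around the population value of 0.2, whereas with 1,000,000 observations, as in the examples, the estimate sits very close to it.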
Hope this helps.
Cheers!
Tiago