I have a small dataset that I'd like to use to conduct a power analysis for
an IV regression analysis on a much larger dataset that I hope to obtain.
The Y I have in my current dataset isn't on the same scale as the Y I hope
to have data on, but I have all the relevant right-hand-side variables in my
current dataset.
My plan has been to use the IV variance formula, var(b-hat) =
var(e)*var(Z)/(n*cov(T,Z)^2), where T is the endogenous regressor and Z is
the instrument (as in the equations below), with my covariates partialled
out of the variance and covariance terms. I can obtain estimates of var(Z)
and the covariance term from my data (though part of my question concerns
the easiest way to do so in Stata). Am I correct that if my (future) Y has
mean 0 and SD 1, I can try out different values for R-squared, translating
each into var(e) = 1 - R-squared, and then examine what n is needed to
detect an effect of b standard deviations, or what b I can detect with a
given n?
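
To make the plan concrete, here is a minimal sketch of the calculation in
Python. It assumes homoskedastic errors, a single instrument, a two-sided
test based on the normal approximation, and a Y standardized to variance 1
so that var(e) = 1 - R-squared, where R-squared is defined from the
structural residual. The function names and the numeric inputs (var_z,
cov_tz, the effect size 0.2, n = 5000) are hypothetical placeholders, not
values from my data.

```python
from math import ceil, sqrt
from statistics import NormalDist


def iv_se(var_e, var_z, cov_tz, n):
    """Approximate SE of the IV coefficient:
    sqrt(var(e) * var(Z) / (n * cov(T, Z)^2))."""
    return sqrt(var_e * var_z / (n * cov_tz ** 2))


def _z_sum(alpha, power):
    """Sum of the two-sided critical value and the power quantile."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def required_n(b, var_e, var_z, cov_tz, alpha=0.05, power=0.80):
    """Smallest n at which an effect of size b is detected with the given power."""
    m = _z_sum(alpha, power)
    return ceil(m ** 2 * var_e * var_z / (b ** 2 * cov_tz ** 2))


def detectable_b(n, var_e, var_z, cov_tz, alpha=0.05, power=0.80):
    """Minimum detectable effect (in SDs of Y) at sample size n."""
    return _z_sum(alpha, power) * iv_se(var_e, var_z, cov_tz, n)


# Hypothetical inputs: residual variance of Z and residual covariance of
# the endogenous regressor with Z, both after partialling out covariates.
var_z = 0.25
cov_tz = 0.05

for r2 in (0.1, 0.3, 0.5):
    var_e = 1.0 - r2  # assumes var(Y) = 1
    n_needed = required_n(0.2, var_e, var_z, cov_tz)
    mde = detectable_b(5000, var_e, var_z, cov_tz)
    print(f"R2={r2}: n for b=0.2 -> {n_needed}; detectable b at n=5000 -> {mde:.3f}")
```

In practice var_z and cov_tz would be replaced by the sample variance and
covariance of the residuals obtained by regressing Z and the endogenous
regressor on the covariates.
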
If yes (and even if no), I'm hoping for some help understanding the
mechanics of how Stata computes SEs in IV regression so that I can conduct
the power analyses. My equations are essentially:
Y=b1*X+b2*T+b3*W+e
T=c1*X+c2*Z+v
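
For reference, the variance formula above can be written in the notation of
these two equations as follows. This is a sketch that assumes homoskedastic
errors, a single instrument Z, and that W enters the first stage along with
X (2SLS uses all exogenous covariates there). The tildes are notation added
here; they denote residuals after regressing a variable on X and W.

```latex
\operatorname{Var}(\hat b_2) \approx
  \frac{\sigma_e^2 \, \operatorname{Var}(\tilde Z)}
       {n \, \operatorname{Cov}(\tilde T, \tilde Z)^2},
\qquad
\tilde T = T - \hat{E}[T \mid X, W], \quad
\tilde Z = Z - \hat{E}[Z \mid X, W]
```
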
If no (or maybe even if yes), any suggestions on how to conduct the power
analysis would be greatly appreciated. I don't do much matrix algebra on my
own, but I can usually work out what I need from examples. Thanks.
Scott Winship
Ph.D. Candidate in
Sociology & Social Policy
Harvard University
[email protected]