I would like to 'automate' the estimation of a series of OLS regressions.
The regressors can be divided into two sets of variables called, say, x1 -
x200 and x201 - x400 (it's a large data set).
I would like x1 - x200 to be included in every regression. However, I would
like to reduce the number of regressors from the set x201 - x400 so that,
after a number of re-estimations, only those with an absolute t-ratio
greater than 2.00 remain.
I would like to start with all regressors (x1 - x400) present and then
re-estimate having excluded those variables from x201 - x400 whose
(absolute) t-value in the first regression was less than, say, 0.25. I would
then re-estimate again, excluding those variables from x201 - x400 whose
(absolute) t-value in the second regression was less than, say, 0.50. I'd
like this process -- of
estimation, dropping variables from x201 - x400 on the basis of their
t-value, re-estimation, and dropping further variables from x201 - x400
using a gradually increasing t-value threshold -- to continue until, after
eight iterations, I am left with x1 - x200 and only those variables from
x201 - x400 whose t-value in the most recent regression is greater than
2.00.
Is it possible to program this?
Thanks.
Steve