Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Computing the proportion of significant variables after running numerous regressions

From	George Murray <[email protected]>
To	[email protected]
Subject	Re: st: Computing the proportion of significant variables after running numerous regressions
Date	Mon, 14 May 2012 18:30:10 +1000

Phil,

Thank you so much for your help, this worked perfectly.

I have one more query, however.

I also need a vector of the bias-corrected confidence intervals (which
can be obtained with the -estat bootstrap- command). I replaced two of
the commands you suggested with the following two commands:

-postfile `postfile' foreign _b_cons _se_cons _ci_bc_cons _b_mpg
_se_mpg using "`results'"- .............(all I did was add
"_ci_bc_cons")

-post `postfile' (`level') (_b[_cons]) (_se[_cons]) (_ci_bc[_cons])
(_b[mpg]) (_se[mpg])- .............(all I did was add
"(_ci_bc[_cons])")

I also ran -estat bootstrap- after the -bootstrap, rep(10)...- command.

However, I get the following error:

_ci_bc not found
post:  above message corresponds to expression 3, variable _ci_bc_cons
r(111);

Does anyone know how to solve this problem?

Thanks in advance,

George.
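
One possible way around this error, sketched here under stated assumptions: _ci_bc[] is not a system variable in the way _b[] and _se[] are, so -post- cannot evaluate it. The sketch assumes that -bootstrap- leaves the bias-corrected limits in the matrix e(ci_bc), with one column per coefficient, the lower limit in row 1 and the upper limit in row 2. The variable names _ci_bc_lo_cons and _ci_bc_hi_cons and the temporary matrix name `ci' are illustrative and do not come from the original example.

```stata
* load dataset
sysuse auto, clear

* set up temporary file for results, plus a temporary matrix name
tempfile results
tempname postfile ci
postfile `postfile' foreign _b_cons _se_cons _ci_bc_lo_cons _ci_bc_hi_cons ///
    _b_mpg _se_mpg using "`results'"

* run bootstrapped regression for each level of foreign
set seed 1
levelsof foreign, local(levels)
foreach level of local levels {
    bootstrap, rep(10): regress price mpg if foreign==`level'
    * copy the bias-corrected limits (assumed to be in e(ci_bc)) to a matrix
    matrix `ci' = e(ci_bc)
    * find the column that belongs to the constant
    local j = colnumb(`ci', "_cons")
    post `postfile' (`level') (_b[_cons]) (_se[_cons]) ///
        (`ci'[1,`j']) (`ci'[2,`j']) (_b[mpg]) (_se[mpg])
}
postclose `postfile'

* display results
use "`results'", clear
list
```

Because the limits are read directly from the stored results, running -estat bootstrap- inside the loop should only be needed if you also want the table displayed. The `///` line continuations assume the code is run from a do-file.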



On Mon, May 14, 2012 at 12:05 AM, Phil Clayton
<[email protected]> wrote:
> George,
>
> There are various ways to do this. One is to use -post- after each bootstrapped regression to store the results of that regression in a "results" dataset, similar to a Monte Carlo simulation. You can then access the results dataset and manipulate it however you like.
>
> Here's a basic example that uses the auto dataset and loops over the levels of "foreign" (i.e., 0 and 1), runs a bootstrapped regression of price on mpg for each level, and displays the resulting coefficients and standard errors.
>
> --------- begin example ---------
> * load dataset
> sysuse auto, clear
>
> * set up temporary file for results
> tempfile results
> tempname postfile
> postfile `postfile' foreign _b_cons _se_cons _b_mpg _se_mpg using "`results'"
>
> * run bootstrapped regression for each level of foreign
> set seed 1 // so that you can repeat your analysis
> levelsof foreign, local(levels)
> foreach level of local levels {
>        bootstrap, rep(10): regress price mpg if foreign==`level'
>        post `postfile' (`level') (_b[_cons]) (_se[_cons]) (_b[mpg]) (_se[mpg])
> }
> postclose `postfile'
>
> * display results
> use "`results'", clear
> list
> --------- end example ---------
>
> Since you're running ~1000 models you may wish to change "foreach" to "qui foreach", and monitor the iterations using the _dots command (see Harrison DA. Stata tip 41: Monitoring loop iterations. Stata Journal 2007;7(1):140, available at http://www.stata-journal.com/article.html?article=pr0030)
>
> Phil
>
>
> On 13/05/2012, at 10:06 PM, George Murray wrote:
>
>> Dear Statalist,
>>
>> I am using the -foreach- command to run approximately 1000
>> (bootstrapped) regression models; however, I require an efficient way
>> of calculating the proportion of the regression models which have a
>> statistically significant constant at the 5% level and, of the
>> constants which are statistically significant, the proportion which
>> are positive.  Below each of the 1000 regressions I run, a table is
>> displayed with the following format:
>>
>> ------------------------------------------------------------------------------
>>              |    Observed               Bootstrap
>>           V0 |       Coef.       Bias    Std. Err.    [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>           V1 |   .00968169  -.0000537   .00057051     .008721   .0111218  (BC)
>>           V2 |  -.00110469   .0000782     .000691   -.0023101    .000459  (BC)
>>           V3 |   .00468313  -.0001562   .00084971    .0031954   .0064538  (BC)
>>        _cons |  -.00076976   .0001811   .00176677   -.0044496   .0025584  (BC)
>> ------------------------------------------------------------------------------
>>
>> I would be *very* grateful if someone knew the commands which would
>> allow me to calculate this. In the past, I have used (a highly tedious
>> and embarrassing approach on) Excel where I filtered every Nth row,
>> and wrote a command to display 1 if the coefficient lies within the
>> confidence interval, and 0 if not. This time, however, I am running
>> numerous models and require a quicker approach.
>>
>> One more question -- is there a way to create a new variable where the
>> coefficients of V1 (for example) are saved, so I can calculate the
>> mean, standard deviation, etc. of V1?
>>
>> If someone could answer at least one of these two questions, it would
>> be very much appreciated.
>>
>> George Murray.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
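
Returning to the question that opened the thread: once there is one row per model in a results dataset, the two proportions reduce to means of indicator variables. The following is a minimal sketch, not a definitive method. It assumes a normal-approximation test on the bootstrap standard error (|b/se| above the 97.5th normal percentile) rather than the bias-corrected intervals shown in the table, and the names sig_cons and pos_cons are illustrative.

```stata
* rebuild the results dataset as in Phil's example (one row per model)
sysuse auto, clear
tempfile results
tempname postfile
postfile `postfile' foreign _b_cons _se_cons _b_mpg _se_mpg using "`results'"
set seed 1
levelsof foreign, local(levels)
quietly foreach level of local levels {
    bootstrap, rep(10): regress price mpg if foreign==`level'
    post `postfile' (`level') (_b[_cons]) (_se[_cons]) (_b[mpg]) (_se[mpg])
}
postclose `postfile'
use "`results'", clear

* 1 if the constant differs from zero at the 5% level (normal approximation)
generate byte sig_cons = abs(_b_cons/_se_cons) > invnormal(0.975) ///
    if !missing(_b_cons, _se_cons) & _se_cons > 0

* among significant constants only: 1 if positive, 0 otherwise
generate byte pos_cons = _b_cons > 0 if sig_cons == 1

* the mean of a 0/1 indicator is the proportion of ones
summarize sig_cons, meanonly
display "Proportion with significant constant:  " r(mean)
summarize pos_cons, meanonly
display "Proportion positive among significant: " r(mean)

* mean, standard deviation, etc. of a stored coefficient
summarize _b_mpg
```

If interval limits are stored in the results dataset instead, the significance indicator could be defined as the interval excluding zero. If no constant is significant, the second proportion is displayed as missing.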

Follow-Ups:
- Re: st: Computing the proportion of significant variables after running numerous regressions
  - From: Nick Cox <[email protected]>

References:
- st: Computing the proportion of significant variables after running numerous regressions
  - From: George Murray <[email protected]>
- Re: st: Computing the proportion of significant variables after running numerous regressions
  - From: Phil Clayton <[email protected]>
