Peter Muhlberger replied to Krishna D Rao:
> You could also try bootstrapping your coefficients from a random effects
> model, which would eliminate the small sample bias in your variance
> estimates.
A nice answer to Krishna's original question, and one I've been able to
simulate while also incorporating Roger Newson's suggestion to fit a fixed
effects model to his data. I've followed Wood's (2004) suggestion of
running 2000 bootstrap replications.
Since, as Roger points out, Krishna doesn't give us any detailed
information on his variables, I've assumed that the response variable in
the dataset simulated below is continuous and uniformly distributed over
the range 0 to 100:
. clear
. set more off
. set seed `=date("2005-01-07", "ymd")'
. set obs 360
obs was 0, now 360
. g id=_n
. g group=ceil(uniform()*12)
. expand 2
(360 observations created)
. tab group
      group |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         50        6.94        6.94
          2 |         72       10.00       16.94
          3 |         82       11.39       28.33
          4 |         60        8.33       36.67
          5 |         54        7.50       44.17
          6 |         58        8.06       52.22
          7 |         60        8.33       60.56
          8 |         50        6.94       67.50
          9 |         62        8.61       76.11
         10 |         54        7.50       83.61
         11 |         54        7.50       91.11
         12 |         64        8.89      100.00
------------+-----------------------------------
      Total |        720      100.00
. g y=uniform()*100
. g x1=uniform()
. g x2=uniform()*5
. g x3=invnorm(uniform())
. sort id
. by id: gen time=_n
. l id group y x1 x2 x3 time in 1/10
     +----------------------------------------------------------------+
     | id   group          y         x1         x2          x3   time |
     |----------------------------------------------------------------|
  1. |  1       6   24.03016   .1445585   .6673999   -1.081646      1 |
  2. |  1       6   90.45777   .6376978   3.680331   -1.410077      2 |
  3. |  2       3   12.22887    .989752    1.70654   -1.028015      1 |
  4. |  2       3   95.63952   .6426014    4.66782    -.621906      2 |
  5. |  3      11   20.20495   .9287896   4.912792   -1.984773      1 |
     |----------------------------------------------------------------|
  6. |  3      11    57.8842   .6628636   3.113226   -.3708619      2 |
  7. |  4      10   50.49711   .3376878   2.095944    .2773025      1 |
  8. |  4      10   12.29717   .9934924   .8407423    .6124281      2 |
  9. |  5      12   88.63429   .2615153   .8085947    .1702638      1 |
 10. |  5      12    31.2398   .1373881    .556083    .4438728      2 |
     +----------------------------------------------------------------+
. tsset id time
. bs "areg y x1 x2 x3, absorb(id) cluster(group)" _b _se, size(360) reps(2000) saving(kris) dots
command:      areg y x1 x2 x3 , absorb(id) cluster(group)
statistics:   b_x1       = _b[x1]
              b_x2       = _b[x2]
              b_x3       = _b[x3]
              b_cons     = _b[_cons]
              se_x1      = _se[x1]
              se_x2      = _se[x2]
              se_x3      = _se[x3]
              se_cons    = _se[_cons]
[...]
Bootstrap statistics                            Number of obs    =       720
                                                Replications     =      2000
----------------------------------------------------------------------------
Variable   |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-----------+----------------------------------------------------------------
      b_x1 |  2000 -4.629344 -.0557279  11.65842  -27.49327   18.23458   (N)
           |                                      -28.52434   17.79685   (P)
           |                                       -28.4749   17.84085  (BC)
      b_x2 |  2000  1.275952 -.0159533  2.563146  -3.750765   6.302669   (N)
           |                                      -3.645857   6.343257   (P)
           |                                      -3.524056   6.556383  (BC)
      b_x3 |  2000  .5331025  .1028936  3.621918  -6.570027   7.636232   (N)
           |                                      -6.388688   7.618769   (P)
           |                                      -6.421975   7.581013  (BC)
    b_cons |  2000  49.81735   .001473  8.255104   33.62784   66.00686   (N)
           |                                       33.13453    65.7711   (P)
           |                                       32.77082   65.22783  (BC)
     se_x1 |  2000  8.710757  11.37133   5.44168  -1.961202   19.38272   (N)
           |                                        11.2756    32.2606   (P)
           |                                       5.797402   5.797402  (BC)
     se_x2 |  2000  1.648332  2.705572  1.060601  -.4316664    3.72833   (N)
           |                                       2.447499   6.681056   (P)
           |                                       1.554579   1.554579  (BC)
     se_x3 |  2000  2.375892  3.526444   1.52015  -.6053518   5.357137   (N)
           |                                       3.307063   9.179192   (P)
           |                                       1.835775   1.835775  (BC)
   se_cons |  2000  5.117283  8.791256  3.831319  -2.396513   12.63108   (N)
           |                                       7.339995   22.68564   (P)
           |                                       4.573055   4.573055  (BC)
----------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected
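The -bs- call above resamples single rows (360 of the 720, with
replacement), so a given draw may keep only one of an individual's two
rows. A variation would be to resample whole individuals instead, so that
every draw keeps both time points. What follows is an untested sketch, not
what was run above: it assumes -bs- accepts the cluster() and idcluster()
options, and newid is a hypothetical working copy of -id- that -bs- would
overwrite in each resample:
. * newid: hypothetical copy of id, relabelled by -bs- in each resample
. g newid=id
. * draw whole individuals; absorb the relabelled identifier
. bs "areg y x1 x2 x3, absorb(newid) cluster(group)" _b _se, cluster(id) idcluster(newid) reps(2000) dots
Relabelling matters because an individual drawn twice would otherwise
share one -id- and be absorbed as a single unit with four rows.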
To get -areg- to run, I had to take the liberty of -expand-ing the dataset
so that each individual has at least 2 rows (here, exactly 2) and creating
a -time- variable, thus simulating repeated observations for each
individual (I've also assumed that this panel dataset is balanced).
Otherwise, -areg- returns an "insufficient observations r(2000)" error.
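Note that the two options do different jobs. -absorb(id)- controls for the
_individual_ fixed effects, whereas -cluster(group)- does not add _group_
fixed effects: it adjusts the standard errors for correlation within
groups. Since each individual here belongs to a single group, any
group-level effect is already absorbed along with the individual ones.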
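To see what each option contributes, one could fit the model with and
without -cluster()- and compare. A minimal sketch on the simulated data
above; the point estimates should be identical in both fits, with only the
standard errors changing:
. * individual fixed effects, conventional standard errors
. areg y x1 x2 x3, absorb(id)
. * same model, standard errors clustered on group
. areg y x1 x2 x3, absorb(id) cluster(group)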
Unfortunately, I cannot simulate a dependent variable which induces
heteroscedasticity, but this example should now give Krishna enough
ammunition to solve his dilemma.
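For what it's worth, a rough starting point for anyone wanting to try is
to let the spread of the error term depend on a covariate. This is only a
sketch: yhet is a hypothetical variable name, and the intercept, slope and
error scale are made-up values rather than anything taken from Krishna's
data:
. * yhet: hypothetical response whose error s.d. (1 + x2) grows with x2
. g yhet=50+2*x1+(1+x2)*invnorm(uniform())
. areg yhet x1 x2 x3, absorb(id) cluster(group)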
CLIVE NICHOLAS |t: 0(044)7903 397793
Politics |e: [email protected]
Newcastle University |http://www.ncl.ac.uk/geps
Reference:
Wood M (2004) "Statistical Inference Using Bootstrap Confidence Intervals"
SIGNIFICANCE 1(4): 180-2.