Dear all,
here are the outputs I obtain when I drop observations (after the
heading OUTLIERS) in two different ways. The two should be equivalent in terms
of the regression sample I use, but in fact they are not.
Thank you very much for your attention,
Mariarosaria
-------------------------------------------------------------------------------
--------------------------------------------------------
. clear
. set memory 400m
. use c:\data\prin\stockingdatapanel\stockdata
. tsset id year
panel variable: id (strongly balanced)
time variable: year, 2000 to 2006
.
.
. *OUTLIERS
. centile(earn_shar bookvalshar closp_jun), centile(0.5, 99.5)
                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
   earn_shar |     535          .5    -155.0676        -1435.97   -3.818947*
             |                99.5       702.09        407.5342     1013.94*
 bookvalshar |     542          .5            0             -.7         .36*
             |                99.5     5846.003        5143.451    10543.75*
   closp_jun |    1214          .5          .05             .04    .3109532
             |                99.5      4660.25        2092.229    5164.371
Lower (upper) confidence limit held at minimum (maximum) of sample
.
. drop if earn_shar< -155.0676
(2 observations deleted)
. drop if earn_shar> 702.09
(860 observations deleted)
. drop if bookvalshar< 0
(1 observation deleted)
. drop if bookvalshar> 5846.003
(1 observation deleted)
. drop if closp_jun< .05
(0 observations deleted)
. drop if closp_jun> 4660.25
(26 observations deleted)
.
*******************************************************************************
*
. ** Dep Var= closing price of june ****
.
*******************************************************************************
*
. g p_6mafter=f.closp_jun
(189 missing values generated)
.
. ************************************
. *******FE model
. ************************************
.
. xi: xtreg p_6mafter earn_shar bookvalshar i.year , fe
i.year _Iyear_2000-2006 (naturally coded; _Iyear_2000 omitted)
Fixed-effects (within) regression               Number of obs      =       314
Group variable (i): id                          Number of groups   =       157
R-sq:  within  = 0.1391                         Obs per group: min =         1
       between = 0.4608                                        avg =       2.0
       overall = 0.4804                                        max =         6
                                                F(7,150)           =      3.46
corr(u_i, Xb)  = -0.6242                        Prob > F           =    0.0018
------------------------------------------------------------------------------
   p_6mafter |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   earn_shar |   2.463018    1.20279     2.05   0.042       .08642    4.839617
 bookvalshar |   .3692797   .1047816     3.52   0.001     .1622412    .5763183
 _Iyear_2001 |   17.42329   36.93282     0.47   0.638    -55.55247    90.39905
 _Iyear_2002 |   58.12118   36.98952     1.57   0.118    -14.96661     131.209
 _Iyear_2003 |   38.97863   36.85583     1.06   0.292    -33.84499    111.8023
 _Iyear_2004 |    44.2795   36.24992     1.22   0.224    -27.34691    115.9059
 _Iyear_2005 |   46.22128   42.66067     1.08   0.280    -38.07216    130.5147
 _Iyear_2006 |  (dropped)
       _cons |   -55.7415    36.0336    -1.55   0.124    -126.9405    15.45748
-------------+----------------------------------------------------------------
     sigma_u |  221.96966
     sigma_e |   129.6183
         rho |  .74571609   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(156, 150) =     4.32            Prob > F = 0.0000
.
.
end of do-file
. clear
. set memory 400m
.
. use c:\data\prin\stockingdatapanel\stockdata
. tsset id year
panel variable: id (strongly balanced)
time variable: year, 2000 to 2006
.
.
. *OUTLIERS
. centile(earn_shar bookvalshar closp_jun), centile(0.5, 99.5)
                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
   earn_shar |     535          .5    -155.0676        -1435.97   -3.818947*
             |                99.5       702.09        407.5342     1013.94*
 bookvalshar |     542          .5            0             -.7         .36*
             |                99.5     5846.003        5143.451    10543.75*
   closp_jun |    1214          .5          .05             .04    .3109532
             |                99.5      4660.25        2092.229    5164.371
Lower (upper) confidence limit held at minimum (maximum) of sample
.
. drop if earn_shar< -155.0676 & earn_shar!=.
(2 observations deleted)
. drop if earn_shar> 702.09 & earn_shar!=.
(2 observations deleted)
. drop if bookvalshar< 0 & bookvalshar!=.
(1 observation deleted)
. drop if bookvalshar> 5846.003 & bookvalshar!=.
(1 observation deleted)
. drop if closp_jun< .05 & closp_jun!=.
(4 observations deleted)
. drop if closp_jun> 4660.25 & closp_jun!=.
(6 observations deleted)
.
.
*******************************************************************************
*
. ** Dep Var= closing price of june ****
.
*******************************************************************************
*
. g p_6mafter=f.closp_jun
(335 missing values generated)
.
.
. ************************************
. *******FE model
. ************************************
.
. xi: xtreg p_6mafter earn_shar bookvalshar i.year , fe
i.year _Iyear_2000-2006 (naturally coded; _Iyear_2000 omitted)
Fixed-effects (within) regression               Number of obs      =       463
Group variable (i): id                          Number of groups   =       185
R-sq:  within  = 0.1076                         Obs per group: min =         1
       between = 0.4208                                        avg =       2.5
       overall = 0.4480                                        max =         6
                                                F(7,271)           =      4.67
corr(u_i, Xb)  = -0.2414                        Prob > F           =    0.0001
------------------------------------------------------------------------------
   p_6mafter |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   earn_shar |   .3493612   .5254214     0.66   0.507    -.6850655    1.383788
 bookvalshar |    .335608   .0774586     4.33   0.000     .1831109    .4881051
 _Iyear_2001 |   8.676075    27.0212     0.32   0.748    -44.52208    61.87423
 _Iyear_2002 |   51.34743   27.08677     1.90   0.059    -1.979827    104.6747
 _Iyear_2003 |   31.29411   26.56825     1.18   0.240    -21.01229    83.60052
 _Iyear_2004 |   38.94315   24.84464     1.57   0.118    -9.969898     87.8562
 _Iyear_2005 |   41.47083   25.08507     1.65   0.099    -7.915569    90.85723
 _Iyear_2006 |  (dropped)
       _cons |  -23.16047   23.71827    -0.98   0.330    -69.85596    23.53502
-------------+----------------------------------------------------------------
     sigma_u |  169.01447
     sigma_e |  99.859362
         rho |  .74124375   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(184, 271) =     7.52            Prob > F = 0.0000
.
end of do-file
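
One possible reading of the two logs, offered as a sketch and not as a
confirmed diagnosis: in the first log, -drop if earn_shar> 702.09- removes 860
observations instead of 2, presumably because it also removes every row where
earn_shar is missing. Such a row may still hold a nonmissing closp_jun. Once it
is gone, f.closp_jun has nothing to look ahead to for the previous year of the
same id, so p_6mafter is missing there and that earlier row leaves the
regression as well. The toy panel below is entirely made up (the ids, years,
and the variables x and y are hypothetical) and only illustrates that
mechanism.

```stata
* Hypothetical toy panel: 2 ids, 3 years; x plays the regressor, y the price
clear
input id year x y
1 2000   1 10
1 2001   . 11
1 2002   3 12
2 2000   2 20
2 2001 500 21
2 2002   4 22
end
tsset id year

* Variant 1: missing counts as larger than 450, so id 1 / year 2001 goes too
preserve
drop if x > 450
generate y_next = f.y        // lead of y within id
count if !missing(y_next, x) // rows usable in a regression of y_next on x
restore

* Variant 2: only the genuine outlier (x = 500) is dropped
drop if x > 450 & !missing(x)
generate y_next = f.y
count if !missing(y_next, x)
```

In this toy case the first -count- should report 0 usable rows and the second
1: the row id 1 / year 2000 keeps its lead only when the row with missing x is
left in the data.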
-------------------------------------------------------------------------------
--------------------------------------------------------
Quoting Nick Cox <[email protected]>:
> I think we need to see some precise evidence of
> what you think is problematic. Thus, we need to
> see _exactly_ what you typed and _exactly_ what
> Stata did -- as the FAQ advises.
>
> Nick
> [email protected]
>
> [email protected]
>
> > Thank you for your answers.
> > Evidently my question was not clear.
> > I am aware that Stata does not drop observations with missing values
> > when using the regress command; it just does not use them. I would
> > therefore expect the same regression sample whether I drop the
> > missing observations before the regression or just run the
> > regression without dropping them.
> > So I don't understand why Stata runs the regression on different
> > samples when I use the two commands described before.
>
> Maarten Buis
>
> > > --- [email protected] wrote:
> > > > when I use the following command:
> > > > drop if x>450
> > > > STATA drops a lot of observations, while when I exclude missing
> > > > values as follows:
> > > > drop if x>450 & x!=.
> > > > STATA eliminates just a couple of observations
> > >
> > > This is well-known behaviour: in Stata, missing values are treated
> > > as larger than any number, so a missing value is larger than 450.
> > > As a result, if you type -drop if x>450-, the missing values will
> > > also be dropped.
> > >
> > > > I realized this when I ran a regression including x as a
> > > > regressor. If Stata drops missing data with the first command,
> > > > shouldn't it drop the same observations when I run the
> > > > regression after using the second command?
> > >
> > > I don't think I understand the question. Do you think that -regress-
> > > should influence the way -drop- behaves?
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>