Dear Statalist members,
I have a somewhat weird problem while calculating summary stats with
sub-samples.
I created a dummy variable, sme, equal to 1 for small and 0 for large.
However, when I created further dummy variables with the same condition,
Stata 9.0/SE seemingly produced two different sub-samples. Please see the
log-file excerpts below.
/* ************ */
Three dummy variables are created with identical conditions. Stata computes
the first two with identical sub-samples, but the third yields a different
sub-sample. Hence, the summary stats differ!
I ran the program several times, and the same difference turned up each time.
Can anyone enlighten me about this?
/* ***** From Log Files ******** */
. /* ------------------------------------------------------------ */
generate SaleAsst_us=0; /* -- First dummy -- */
. replace SaleAsst_us=1 if (turn_us < 50000) | (toas_us < 43000);
(318680 real changes made)
/* ------------------------------------------------------------ */
generate sale_sme=0; /* -- Second dummy -- */
. replace sale_sme=1 if (turn_us < 50000) | (toas_us < 43000);
(318680 real changes made)
/* ------------------------------------------------------------ */
/* ----- Check with sub-samples ----- */
generate SME=0; /* -- Third dummy -- */
. replace SME=1 if (turn_us < 50000) | (toas_us < 43000);
(117151 real changes made)
. /*
/* ************* */
. /* ------------------------------------------------------------ */
generate SaleAsst_us = 0; /* -- First dummy -- */ /* --- space around = --- */
. replace SaleAsst_us = 1 if (turn_us < 50000) | (toas_us < 43000);
(318680 real changes made)
/* ------------------------------------------------------------ */
generate sale_sme = 0; /* -- Second dummy -- */
. replace sale_sme = 1 if (turn_us < 50000) | (toas_us < 43000);
(318680 real changes made)
/* ----- Check with sub-samples ----- */
generate SME=0; /* -- Third dummy -- */
. replace SME=1 if (turn_us < 50000) | (toas_us < 43000);
(117151 real changes made)
. /* ------------------------------------------------------------ */
/* ---------- For checking sub-sample stats --------------- */
tabstat `varlist1' if (year==2003) & (SME==1) & (quoted=="No"),
by(country) stats(n median) column(variable) save;
Summary statistics: N, p50
by categories of: country
         country |      Var1      Var2      Var3      Var4      Var5      Var6      Var7      Var8
-----------------+--------------------------------------------------------------------------------
         Austria |       286       286       286       319       278       318       319       319
                 |  1.267237  .4993207         0  .4867293       100  .0693649  .2346598  .0029079
/* ********* */
/* ---------- For checking sub-sample stats --------------- */
tabstat `varlist1' if (year==2003) & (sale_sme==1) & (quoted=="No"),
by(country) stats(n median) column(variable) save;
Summary statistics: N, p50
by categories of: country
         country |      Var1      Var2      Var3      Var4      Var5      Var6      Var7      Var8
-----------------+--------------------------------------------------------------------------------
         Austria |       280       280       280       313       272       312       313       313
                 |  1.300142  .5070484         0  .4955109       100  .0687543  .2352267  .0028752
/* ********* */
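One way to see where the third dummy diverges would be to compare the
variables directly. The following is only a sketch that has not been run on
this data. It assumes that SaleAsst_us, sale_sme and SME are all still in
memory, and that the default delimiter is in effect rather than the
"#delimit ;" mode used in the log above.

```stata
* Hypothetical check, not part of the original run; assumes the default
* delimiter and that all three dummies are still in memory.

* Cross-tabulate the second and third dummies; off-diagonal cells are mismatches
tabulate sale_sme SME, missing

* Count the observations where the two dummies disagree
count if sale_sme != SME

* Summarize the two source variables for the disagreeing observations
summarize turn_us toas_us if sale_sme != SME

* Stata treats missing values as larger than any number, so also count
* the observations where either source variable is missing
count if missing(turn_us) | missing(toas_us)

* The first two dummies should be identical; assert stops with an error if not
assert SaleAsst_us == sale_sme
```

If the count of disagreements is nonzero, the values of turn_us and toas_us
may have changed between the second and third replace commands.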
TIA. --- Akihito