
st: -moments- available from SSC

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: -moments- available from SSC
Date   Tue, 28 Sep 2004 10:41:29 +0100

Thanks to Kit Baum, a program -moments- is now 
available in a package of the same name from 
SSC. Stata 8.2 is required. 
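
For anyone who has not installed from SSC before, a minimal sketch of the usual route, assuming a net-aware Stata with a working internet connection:

```stata
* Install the -moments- package from SSC
ssc install moments

* Confirm that Stata can find the program, then read its help
which moments
help moments
```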

-moments- has been mentioned a couple of times
in recent postings to Statalist. The point was 
made that if you are doing something like -sktest-, 
you should also look at the skewness and kurtosis
(a graph too, naturally). 

-moments- calculates number of observations, mean, 
standard deviation, skewness and kurtosis for a list 
of variables.  
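
As a reminder of what the last two measures are, here is a rough sketch that computes them by hand for one variable. It assumes the moment-based definitions that -summarize- uses, namely skewness = m3 / m2^(3/2) and kurtosis = m4 / m2^2, where m2, m3 and m4 are means of powers of deviations from the mean (divisor n). The names xbar, dev2, dev3, dev4, m2, m3, m4, skew and kurt are made up for this illustration, and this is not a description of the -moments- code itself.

```stata
sysuse auto, clear
quietly summarize price
scalar xbar = r(mean)

* Powers of deviations from the mean
generate double dev2 = (price - scalar(xbar))^2
generate double dev3 = (price - scalar(xbar))^3
generate double dev4 = (price - scalar(xbar))^4

* Moments about the mean, with divisor n
quietly summarize dev2
scalar m2 = r(mean)
quietly summarize dev3
scalar m3 = r(mean)
quietly summarize dev4
scalar m4 = r(mean)

* Moment-based skewness and kurtosis (kurtosis is 3 for a normal)
scalar skew = scalar(m3) / scalar(m2)^1.5
scalar kurt = scalar(m4) / scalar(m2)^2
display "skewness = " %9.3f scalar(skew) "   kurtosis = " %9.3f scalar(kurt)
```

The results can be checked against r(skewness) and r(kurtosis) after -summarize price, detail-.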

Your reaction to that is likely to be one or both of 
two things: 

(1) Surely -summarize- does that already. 

(2) Surely -tabstat- is already available for customised
tables of summary statistics. 

If you thought that, you are correct. The merits 
of -moments- are purely matters of convenience or presentation. 

-summarize- produces these measures, but together with 
a lot of other stuff: 

. su price, detail 

      Percentiles      Smallest
 1%         3291           3291
 5%         3748           3299
10%         3895           3667       Obs                  74
25%         4195           3748       Sum of Wgt.          74

50%       5006.5                      Mean           6165.257
                        Largest       Std. Dev.      2949.496
75%         6342          13466
90%        11385          13594       Variance        8699526
95%        13466          14500       Skewness       1.653434
99%        15906          15906       Kurtosis       4.819188
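
If only a couple of those numbers are wanted for a single variable, one possibility is to run -summarize- quietly and pick up its saved results. A small sketch, using the auto data as above:

```stata
sysuse auto, clear

* -summarize, detail- leaves its results in r(); -return list- shows them all
quietly summarize price, detail
display "n = " r(N) "  skewness = " %9.3f r(skewness) "  kurtosis = " %9.3f r(kurtosis)
```

That gets tedious for more than one or two variables.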

-tabstat- is the obvious answer to that problem. 

. tabstat price-foreign, c(s) s(n mean sd skew kurt) 

    variable |         N      mean        sd  skewness  kurtosis
       price |        74  6165.257  2949.496  1.653434  4.819188
         mpg |        74   21.2973  5.785503  .9487176  3.975005
       rep78 |        69  3.405797  .9899323 -.0570331  2.678086
    headroom |        74  2.993243  .8459948  .1408651  2.208453
       trunk |        74  13.75676  4.277404  .0292034  2.192052
      weight |        74  3019.459  777.1936  .1481164  2.118403
      length |        74  187.9324  22.26634 -.0409746   2.04156
        turn |        74  39.64865  4.399354  .1238259  2.229458
displacement |        74  197.2973  91.83722  .5916565  2.375577
  gear_ratio |        74  3.014865  .4562871  .2191658  2.101812
     foreign |        74  .2972973  .4601885  .8869686  1.786713

When I see a table like that, I want fewer decimal places. I 
tend to go for 3, and on some criteria that is way too many: 

. tabstat price-foreign, c(s) s(n mean sd skew kurt)  format(%4.3f) 

    variable |         N      mean        sd  skewness  kurtosis
       price |    74.000  6165.257  2949.496     1.653     4.819
         mpg |    74.000    21.297     5.786     0.949     3.975
       rep78 |    69.000     3.406     0.990    -0.057     2.678
    headroom |    74.000     2.993     0.846     0.141     2.208
       trunk |    74.000    13.757     4.277     0.029     2.192
      weight |    74.000  3019.459   777.194     0.148     2.118
      length |    74.000   187.932    22.266    -0.041     2.042
        turn |    74.000    39.649     4.399     0.124     2.229
displacement |    74.000   197.297    91.837     0.592     2.376
  gear_ratio |    74.000     3.015     0.456     0.219     2.102
     foreign |    74.000     0.297     0.460     0.887     1.787

That is clearly better, but some small details are irritating. 

1. If I use a non-default -format()-, I get it everywhere. (My 
punishment is that I got what I asked for.) In the case of 
number of observations, this looks a little silly. As -tabstat- 
accepts at most frequency or analytical weights, that column N 
is always going to contain integers. I've previously suggested
that -tabstat- be modified to ignore -format()- in the case of N, 
but to no effect. 

2. That's the only control over small details of 
presentation that you get. (You can transpose the table, which 
is on occasion very useful.) 
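
A crude workaround for the first irritation, offered only as a sketch, is to call -tabstat- twice with different formats, at the cost of getting two tables rather than one:

```stata
sysuse auto, clear

* Counts on their own, shown as integers
tabstat price-foreign, c(s) s(n) format(%9.0f)

* The other statistics, shown to 3 decimal places
tabstat price-foreign, c(s) s(mean sd skew kurt) format(%9.3f)
```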

The default output of -moments- is like this: 

. moments 

                n = 69 |       mean          SD    skewness    kurtosis
                 Price |   6146.043    2912.440       1.688       5.032
         Mileage (mpg) |     21.290       5.866       0.995       3.997
    Repair Record 1978 |      3.406       0.990      -0.057       2.678
        Headroom (in.) |      3.000       0.853       0.197       2.144
 Trunk space (cu. ft.) |     13.928       4.343      -0.044       2.159
         Weight (lbs.) |   3032.029     792.851       0.118       2.073
          Length (in.) |    188.290      22.747      -0.076       2.000
     Turn Circle (ft.) |     39.797       4.441       0.071       2.228
Displacement (cu. in.) |    198.000      93.148       0.581       2.354
            Gear Ratio |      2.999       0.463       0.279       2.109
              Car type |      0.304       0.464       0.850       1.723

The default is now %9.3f. Well, I like that. 

Also, by default casewise deletion is used: statistics are computed for 
the sample that is not missing for any of the variables.  The constant 
n = 69 can thus be tucked away in a corner. That's the other way 
round from -summarize- or -tabstat-. Naturally, you can get the opposite 
behaviour if you wish: 

. moments, allobs

              Variable |          n        mean          SD    skewness    kurtosis
                 Price |         74    6165.257    2949.496       1.653       4.819
         Mileage (mpg) |         74      21.297       5.786       0.949       3.975
    Repair Record 1978 |         69       3.406       0.990      -0.057       2.678
        Headroom (in.) |         74       2.993       0.846       0.141       2.208
 Trunk space (cu. ft.) |         74      13.757       4.277       0.029       2.192
         Weight (lbs.) |         74    3019.459     777.194       0.148       2.118
          Length (in.) |         74     187.932      22.266      -0.041       2.042
     Turn Circle (ft.) |         74      39.649       4.399       0.124       2.229
Displacement (cu. in.) |         74     197.297      91.837       0.592       2.376
            Gear Ratio |         74       3.015       0.456       0.219       2.102
              Car type |         74       0.297       0.460       0.887       1.787
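
For comparison, the casewise sample can be mimicked with -tabstat- by counting missing values across the variables first. This is a sketch only: nmiss is a made-up variable name, and I believe the -egen- function -rowmiss()- was called -rmiss()- in Stata 8, so check -help egen- for your version.

```stata
sysuse auto, clear

* Number of missing values in each observation across the variables of interest
egen byte nmiss = rowmiss(price-foreign)

* Restrict to observations with no missing values on any of those variables
tabstat price-foreign if nmiss == 0, c(s) s(n mean sd skew kurt) format(%9.3f)
```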

With -allobs-, the number of observations is still shown as an integer. You can 
specify up to four numeric formats, which apply in order to the mean, standard 
deviation, skewness and kurtosis. Any statistic without a format of its own 
keeps the default: 

. moments, format(%2.1f %2.1f) 

                n = 69 |       mean          SD    skewness    kurtosis
                 Price |     6146.0      2912.4       1.688       5.032
         Mileage (mpg) |       21.3         5.9       0.995       3.997
    Repair Record 1978 |        3.4         1.0      -0.057       2.678
        Headroom (in.) |        3.0         0.9       0.197       2.144
 Trunk space (cu. ft.) |       13.9         4.3      -0.044       2.159
         Weight (lbs.) |     3032.0       792.9       0.118       2.073
          Length (in.) |      188.3        22.7      -0.076       2.000
     Turn Circle (ft.) |       39.8         4.4       0.071       2.228
Displacement (cu. in.) |      198.0        93.1       0.581       2.354
            Gear Ratio |        3.0         0.5       0.279       2.109
              Car type |        0.3         0.5       0.850       1.723

You'll notice the variable labels, shown by default. You can override 
that too: 

. moments, format(%2.1f %2.1f)  variablenames

      n = 69 |       mean          SD    skewness    kurtosis
       price |     6146.0      2912.4       1.688       5.032
         mpg |       21.3         5.9       0.995       3.997
       rep78 |        3.4         1.0      -0.057       2.678
    headroom |        3.0         0.9       0.197       2.144
       trunk |       13.9         4.3      -0.044       2.159
      weight |     3032.0       792.9       0.118       2.073
      length |      188.3        22.7      -0.076       2.000
        turn |       39.8         4.4       0.071       2.228
displacement |      198.0        93.1       0.581       2.354
  gear_ratio |        3.0         0.5       0.279       2.109
     foreign |        0.3         0.5       0.850       1.723

-moments- is also just smart enough to filter out any string variables
fed to it, rather than choking on them (-tabstat-) or giving a line 
of output flagging 0 observations (-summarize-). 
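
If you want the same filtering with -tabstat-, one approach is to let -ds- pick out the numeric variables first. A sketch, assuming a Stata recent enough for -ds- to support the -has()- option:

```stata
sysuse auto, clear

* -ds- leaves the names of the selected variables in r(varlist);
* the string variable make is left out
quietly ds, has(type numeric)
tabstat `r(varlist)', c(s) s(n mean sd skew kurt) format(%9.3f)
```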

There are some other features too, but that's enough on -moments-. 

[email protected] 
