Dear all,
I am working on a descriptive survey about nutritional habits in Ukraine.
This survey was carried out in two different seasons and in different places
where levels of contamination following the Chernobyl accident varied.
My first goal is to provide detailed tables describing the
intakes by age, level of contamination and season. The following example
concerns salofat:
. table age period zone, c(mean salofat sd salofat) format(%9.3f) center
---------------------------------------------------------------------------------------------------------------------
            |                                    level of contamination and period
            | --------- <1 Ci --------- --------- 1-5 Ci -------- -------- 5-15 Ci -------- --------- >15 Ci --------
        Age |  spring 2000    fall 2001  spring 2000    fall 2001  spring 2000    fall 2001  spring 2000    fall 2001
------------+--------------------------------------------------------------------------------------------------------
18-29 years |       47.000                    51.818       35.000       53.125       87.500       45.357       52.143
            |       21.095        0.000       25.620       17.321       22.510       38.891       18.341       21.185
            |
30-39 years |       38.125                    57.143       56.667       71.818       59.286       35.769       24.167
            |       26.449        0.000       29.277        5.774       37.899       19.242       18.802       21.866
            |
40-59 years |       55.000                    58.000       75.000       60.769       55.417       53.971       37.143
            |       25.071        0.000       22.345       37.796       28.929       11.958       28.326       23.077
            |
60-74 years |       50.833                    46.364       58.333       53.654       70.000       95.000       51.000
            |       18.005        0.000       28.468       11.690       22.607       39.306       63.640       24.083
            |
75 + | 50.000 40.000
62.500 50.000
| 7.071 0.000 12.649 0.000
34.776 0.000
---------------------------------------------------------------------------------------------------------------------
Note that in this example nobody ate salofat in fall 2001 in the least
contaminated region (<1 Ci).
In a second step, I need to provide adjusted means. Here is what I did:
. xi: regress salofat period i.age i.zone
i.age             _Iage_1-5           (_Iage_1 for age==18-29 years omitted)
i.zone            _Izone_0-4          (naturally coded; _Izone_0 omitted)

      Source |       SS       df       MS              Number of obs =     271
-------------+------------------------------           F(  8,   262) =    2.86
       Model |  15878.8172     8  1984.85216           Prob > F      =  0.0046
    Residual |  182138.019   262  695.183277           R-squared     =  0.0802
-------------+------------------------------           Adj R-squared =  0.0521
       Total |  198016.836   270  733.395688           Root MSE      =  26.366

------------------------------------------------------------------------------
     salofat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      period |   -.958352   3.642396    -0.26   0.793    -8.130448    6.213744
     _Iage_2 |  -3.164714   5.099201    -0.62   0.535    -13.20535    6.875917
     _Iage_3 |   4.166338   4.687264     0.89   0.375    -5.063164    13.39584
     _Iage_4 |   2.707066   5.308515     0.51   0.611    -7.745717    13.15985
     _Iage_5 |  -2.127529   7.024452    -0.30   0.762    -15.95909    11.70404
    _Izone_2 |   5.593309   5.739183     0.97   0.331    -5.707485     16.8941
    _Izone_3 |   13.24345   5.624438     2.35   0.019     2.168599    24.31831
    _Izone_4 |  -3.979932    5.78657    -0.69   0.492    -15.37403     7.41417
       _cons |   1964.247   7284.588     0.27   0.788    -12379.54    16308.04
------------------------------------------------------------------------------
. adjust _Iage_2 _Iage_3 _Iage_4 _Iage_5 , by(period zone) se format(%9.3f) center

------------------------------------------------------------------------------
     Dependent variable: salofat     Command: regress
Variables left as is: _Izone_2, _Izone_3, _Izone_4
Covariates set to mean: _Iage_2 = .20557491, _Iage_3 = .29268292, _Iage_4 = .20092915, _Iage_5 = .07665505
------------------------------------------------------------------------------

------------------------------------------------
            |      level of contamination
     period |   <1 Ci   1-5 Ci  5-15 Ci   >15 Ci
------------+-----------------------------------
spring 2000 |  48.493   54.086   61.736   44.513
            | (4.696)  (3.317)  (3.181)  (3.356)
            |
  fall 2001 |  47.535   53.128   60.778   43.555
            | (5.900)  (4.066)  (3.706)  (3.664)
------------------------------------------------
     Key:  Linear Prediction
           (Standard Error)
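
As far as I can tell from the "Key" line, adjust reports a linear prediction from the fitted model rather than an observed cell mean. The sketch below is my reading of what is computed. The symbols are my own notation, not Stata's: b for the coefficients of the regression output, \bar{x} for the covariate means listed in the adjust header, p for the value of period and z for the zone. That period differs by one unit between the two seasons is an assumption on my part.

```latex
% Assumed form of the prediction in each (period, zone) cell
\[
\hat{y}(p, z) = b_{\mathrm{cons}} + b_{\mathrm{period}}\, p
  + \sum_{k=2}^{5} b_{\mathrm{age},k}\, \bar{x}_{\mathrm{age},k}
  + b_{\mathrm{zone},z},
\qquad b_{\mathrm{zone},z} = 0 \ \text{for the omitted zone } (<1\ \mathrm{Ci})
\]

% Age term, from the coefficients and the covariate means
\[
\sum_{k=2}^{5} b_{\mathrm{age},k}\, \bar{x}_{\mathrm{age},k}
  \approx -0.6506 + 1.2194 + 0.5439 - 0.1631 \approx 0.950
\]

% Differences between cells of the adjust table reproduce the coefficients
\begin{align*}
54.086 - 48.493 &= 5.593   \approx b_{\mathrm{zone},2} \\
61.736 - 48.493 &= 13.243  \approx b_{\mathrm{zone},3} \\
44.513 - 48.493 &= -3.980  \approx b_{\mathrm{zone},4} \\
47.535 - 48.493 &= -0.958  \approx b_{\mathrm{period}} \quad (\Delta p = 1 \text{ assumed})
\end{align*}
```

If this reading is right, every cell of the adjust table is filled in from the same additive formula, including the fall 2001, <1 Ci cell.
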
The problem is that the age-adjusted mean for fall 2001 in the <1 Ci region
is not zero, as I expected from my first table.
Was I wrong in my analysis?
* If so, how should I have done it?
* If not, how can this result be explained, and how can I avoid it?
Thanks in advance
Bertrand Gagnière
Laboratoire d'épidémiologie
Institut de Radioprotection et de Sûreté Nucléaire
BP 17
92 262 Fontenay aux Roses cedex
Tel: + 33 1 58 35 95 56
Fax: + 33 1 46 57 86 03