I have a three-level hierarchical dataset and have run both xtlogit and
gllamm, with similar results apart from the condition number and group-level
variances. I do not know what the large condition number and the near-zero
variances are telling me.
I have a dataset with various life history and social network attributes for
women over their reproductive lives. The dataset contains one record for
each woman-year combination, and has variables for whether they gave birth
in that year to a child that survived (birth5), whether they were married
polygynously or monogamously (mstat0-4plus), how many cowives they had
(cowives), and whether and what sort of relatives they lived near in that
year (frel). The years vary from 1921 to 1995, with any particular woman
having a maximum of 40 records from her age 15 to 55 if she is 55 or older.
There are 3266 records for 225 women. I analysed this as a panel dataset
with a binary dependent variable using xtlogit. This regression (output
below) looks at the effect of different attributes of marital status
(mstat1-4plus, and cowives), and presence of kin (frel) on the probability
of giving birth to a surviving child in a particular year (birth5),
controlling for age (using centred age, agex, and centred age squared,
agexsq), year and number of previous marriages (prevmno).
But a colleague pointed out that some of these 225 women are related to each
other, as daughters, sisters, or mothers; indeed, 92 women are related to
each other, in clusters ranging in size from 1 to 8 (mean 2.4). This is captured in
the data through the variable oldmum, which is the id of the most senior
related female, equal to self if the woman is not related to anyone else in
the village sample.
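
A check of the kin clusters described above might look like the following
sketch. It runs on made-up toy data: only the variable names pno and oldmum
come from the dataset described here, while the values and the helper
variables onewoman, onecluster and nwomen are hypothetical names chosen for
illustration.

```stata
* Toy woman-year records; pno and oldmum are as described in the post,
* the values themselves are invented for illustration.
clear
input pno oldmum year
1 1 1950
1 1 1951
2 1 1950
3 3 1950
3 3 1951
4 4 1960
end

* A three-level model assumes each woman (pno) belongs to exactly one
* kin cluster (oldmum); this stops with an error if any woman does not.
bysort pno (oldmum): assert oldmum[1] == oldmum[_N]

* Flag one record per woman, then count the women in each kin cluster.
egen byte onewoman = tag(pno)
bysort oldmum: egen nwomen = total(onewoman)

* Flag one record per kin cluster and describe the cluster sizes.
egen byte onecluster = tag(oldmum)
tabulate nwomen if onecluster
summarize nwomen if onecluster
```

On the real data, dropping the clear and input lines and running the rest
should show how many clusters contain more than one woman, which is the
information the third level has to work with.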
I thought that the best way to incorporate these relationships was to use
gllamm, as described in chapter 3 of the gllamm manual (reference below).
So I ran gllamm, and the coefficients and significance of the variables are
quite close to those obtained by xtlogit, but the condition number is large
(310220), and the variances for pno and oldmum are 5.863e-25 (9.488e-14) and
3.308e-24 (2.274e-13) respectively. The commands and results are listed
below.
I have 2 questions, plus 2 follow ups.
1. Is gllamm the right tool to use? And if not, what should I do?
2. If yes, should I worry about the condition number and variances? And
what could I do to improve on them?
Any help much appreciated,
Alexandra Wilson
Commands and results
XTLOGIT
iis pno
xtlogit birth5 agex agexsq year cowives prevmno mstat1 mstat2 mstat3
mstat4plus frel,re
Random-effects logistic regression              Number of obs      =      3266
Group variable (i): pno                         Number of groups   =       225
Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =      14.5
                                                               max =        40
                                                Wald chi2(10)      =     73.09
Log likelihood = -1712.1297                     Prob > chi2        =    0.0000
------------------------------------------------------------------------------
      birth5 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        agex |   .0318623   .0073886     4.31   0.000     .0173809    .0463436
      agexsq |  -.0048922   .0006102    -8.02   0.000    -.0060881   -.0036963
        year |   .0027409   .0034771     0.79   0.431    -.0040741    .0095559
     cowives |   .0102273   .1178123     0.09   0.931    -.2206805    .2411351
     prevmno |    .067444   .0962329     0.70   0.483    -.1211691     .256057
      mstat1 |  -.0539663    .180829    -0.30   0.765    -.4083846    .3004519
      mstat2 |  -.1910877   .1933577    -0.99   0.323    -.5700619    .1878866
      mstat3 |  -.2002851   .2967272    -0.67   0.500    -.7818598    .3812896
  mstat4plus |     .26714    .574198     0.47   0.642    -.8582674    1.392547
        frel |   .2079005   .0923417     2.25   0.024      .026914     .388887
       _cons |  -6.402292   6.862936    -0.93   0.351     -19.8534    7.048815
-------------+----------------------------------------------------------------
    /lnsig2u |  -3.826084   .2686081                     -4.352547   -3.299622
-------------+----------------------------------------------------------------
     sigma_u |   .1476306   .0198274                      .1134636    .1920862
         rho |   .0065812   .0017561                       .003898     .011091
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) = 6.79   Prob >= chibar2 = 0.005
GLLAMM
gllamm birth5 agex agexsq year cowives prevmno mstat1 mstat2 mstat3
mstat4plus frel, i(pno oldmum) family(binomial) link(logit) nip(5) adapt
trace
last output:
number of level 1 units = 3266
number of level 2 units = 225
number of level 3 units = 142
Condition Number = 310220
gllamm model
log likelihood = -1708.7352
------------------------------------------------------------------------------
      birth5 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        agex |   .0316857   .0072937     4.34   0.000     .0173902    .0459812
      agexsq |  -.0048713    .000607    -8.03   0.000    -.0060609   -.0036816
        year |   .0027582   .0033604     0.82   0.412    -.0038281    .0093445
     cowives |   .0065394   .1154256     0.06   0.955    -.2196907    .2327694
     prevmno |   .0663426   .0928466     0.71   0.475    -.1156334    .2483186
      mstat1 |  -.0529302   .1776275    -0.30   0.766    -.4010736    .2952133
      mstat2 |  -.1850067   .1896384    -0.98   0.329    -.5566912    .1866779
      mstat3 |  -.1957051   .2900493    -0.67   0.500    -.7641912     .372781
  mstat4plus |   .2568191   .5623738     0.46   0.648    -.8454132    1.359051
        frel |   .2073378   .0889529     2.33   0.020     .0329933    .3816823
       _cons |  -6.431496   6.631985    -0.97   0.332    -19.42995    6.566956
------------------------------------------------------------------------------
Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (pno)
var(1): 5.863e-25 (9.488e-14)
***level 3 (oldmum)
var(1): 3.308e-24 (2.274e-13)
Reference:
Rabe-Hesketh, Sophia, Anders Skrondal and Andrew Pickles. 2004. GLLAMM manual.
Berkeley Electronic Press: University of California, Berkeley, Division of
Biostatistics Working Papers, no. 160.
http://www.bepress.com/ucbiostat/paper160