
st: request for help - multi-level modelling with a big dataset using xtlogit

From	"Alves, Bernadette" <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: request for help - multi-level modelling with a big dataset using xtlogit
Date	Fri, 19 Jul 2002 13:41:05 +0100

I'm an MSc student seeking help with my dissertation, which examines factors
associated with delivery by caesarean section. It is an analysis of a
database of about half a million records of women who gave birth in
hospital. I am using logistic regression and, because my data are naturally
grouped, a multi-level approach to take account of the correlation
between women in the same hospital. I am therefore using xtlogit (rather
than logit). I find that I cannot run xtlogit on all 500,000
records: Stata returns an error saying that it needs to be able to
set matsize to approximately 18,000. Unfortunately, the matsize limit for
Stata 7.0 is 800.

I then took a 4% sample (approximately 20,000 records), which is the largest
that Stata can cope with at a matsize of 800. But here is the odd
thing that I need help with: the parameter estimates are very dependent
on the sample I take. For some samples I get a p-value of 0.05; for others I
get a p-value of 0.7. Here is an example of what I do to test whether
xdelmid is a predictor of emergency caesarean section.

        sample 4  /* this gives me the 4% sample */

        xi: xtlogit emerg i.gestat i.age i.xdelmid, pa corr(exch) robust i(provid)

        testparm _Ixdel*  /* this does a Wald test on xdelmid */

Taking 10 different 4% samples, I find that my estimates differ considerably
and my p-values range from 0.04 to 0.71.
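
The repeated-sample comparison could be scripted roughly as follows, so that
each draw is reproducible. This is only a sketch under assumptions: the seed
values 1 to 10 are arbitrary and hypothetical, and it assumes that testparm
leaves the Wald p-value in r(p).

        /* sketch only: the seeds 1-10 are arbitrary, not the ones used above */
        forvalues s = 1/10 {
                preserve                  /* keep the full dataset for the next draw */
                set seed `s'              /* makes this 4% draw reproducible */
                sample 4
                quietly xi: xtlogit emerg i.gestat i.age i.xdelmid, pa corr(exch) robust i(provid)
                quietly testparm _Ixdel*  /* Wald test on xdelmid; assumed to set r(p) */
                display "seed `s': p = " %6.4f r(p)
                restore                   /* bring back all records */
        }

Listing the p-value against the seed makes it possible to return to any
single sample and inspect it more closely.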

Why can't Stata cope with the full dataset, and why are the parameter
estimates so sensitive to the sample taken?

I would be extremely grateful if someone could help me with this.

Bernadette


