Re: st: Abnormal logistic results
From: Maarten Buis <[email protected]>
To: [email protected]
Subject: Re: st: Abnormal logistic results
Date: Mon, 15 Oct 2012 09:55:02 +0200
On Mon, Oct 15, 2012 at 4:13 AM, Ras Dondo wrote:
> I ran a logistic regression on my data and got abnormal results, and I would like advice on how to rectify the problem. I have a dataset containing five variables:
> 1. disease status of the child (binary 1/0), 2. mother's age (grouped in 5-year intervals), 3. state (12 states), 4. child's year of birth (grouped into 5 levels), and 5. drug X (binary). My objective was to calculate the OR and associated 95% CI of the baby having the disease when it was exposed to drug X in the womb, adjusting for maternal age, state, and child's year of birth.
> I had a sample size of 6,168 children: 89 had the disease, of whom 1 was exposed to the drug, and 6,079 did not have the disease, of whom 1 was also exposed to the drug in the womb.
The exposure to the drug is just too rare. That means you have
virtually no information in the dataset. The information in a dataset
that we use in these models comes from comparing groups. The group
exposed to drug X is very small (2 observations), so we know very
little about that group. Even though you have more than 6,000
observations, they contain a lot of information only about the
group that is not exposed to drug X. To compare the exposed and
the unexposed, you need to know a lot about both groups.
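To make the point concrete, here is a minimal sketch of the crude (unadjusted) odds ratio and a Wald 95% interval computed from the counts given in the question. The Wald interval and the function name wald_odds_ratio_ci are assumptions chosen for illustration; with cells this small an exact method would normally be preferred, and this is not the adjusted model the question asks about.

```python
import math


def wald_odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald interval from a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    (Hypothetical helper for illustration only.)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): each cell contributes 1/count,
    # so the smallest cells dominate the uncertainty.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper


# Counts from the question: 89 with the disease (1 exposed),
# 6,079 without the disease (1 exposed).
odds_ratio, se_log_or, lower, upper = wald_odds_ratio_ci(a=1, b=1, c=88, d=6078)
print(f"OR = {odds_ratio:.1f}, SE(log OR) = {se_log_or:.2f}, "
      f"95% CI = ({lower:.1f}, {upper:.0f})")
```

The two exposed children contribute 1/1 + 1/1 = 2 to the variance of the log odds ratio, while the more than 6,000 unexposed children together contribute only about 0.01. The resulting interval runs from roughly 4 to over 1,000, which is what "virtually no information" looks like in practice, and that is before any covariates are added.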
So I am not surprised that you cannot adjust for mother's age, state,
and child's year of birth. You need to simplify your model: e.g. adjust
for coarser groupings of states (in the US context you can think of
south versus non-south) instead of state, and adjust for mother's age
with a linear spline with one knot instead of 5 categories; same with
child's year of birth. For linear splines see: -help mkspline-. Also
make sure you center mother's age and child's year of birth at a
meaningful value within the range of the data. Even if you do all
that, I would not be surprised if you still do not get a meaningful
answer; your data are very extreme, with so few observations
that used drug X(*).
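As a rough illustration of what a linear spline with one knot plus centering amounts to, the sketch below builds the two spline terms by hand. The function name linear_spline_one_knot, the knot at 30, the centering value of 25, and the use of interval midpoints for the grouped ages are all hypothetical choices for illustration, not values taken from the data; in Stata the spline terms themselves would come from -mkspline-.

```python
def linear_spline_one_knot(x, knot, center=0.0):
    """Return the two terms of a linear spline with a single knot.

    below: the part of x up to the knot, shifted so that it is 0 at `center`
    above: the amount by which x exceeds the knot (0 if x <= knot)
    Assumes center <= knot, so both terms are 0 when x == center.
    (Hypothetical helper for illustration only.)
    """
    below = min(x, knot) - center
    above = max(x - knot, 0.0)
    return below, above


# Hypothetical midpoints of 5-year maternal age groups.
age_midpoints = [17.5, 22.5, 27.5, 32.5, 37.5]
KNOT = 30.0    # assumed knot location
CENTER = 25.0  # assumed meaningful reference age within the data range

spline_terms = [linear_spline_one_knot(age, KNOT, CENTER) for age in age_midpoints]
for age, (below, above) in zip(age_midpoints, spline_terms):
    print(f"age {age}: below-knot term = {below}, above-knot term = {above}")
```

This replaces four dummy-variable coefficients for five age categories with two slopes: the coefficient on the first term is the slope up to the knot, and the coefficient on the second is the slope beyond it. Because both terms are 0 at the centering value, the constant of the model then refers to a mother of that age rather than to age 0.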
Hope this helps,
Maarten
(*) I suspect that this is one of those cases where as a researcher
you would want this to be less rare, but as a person you are glad this
happens so rarely.
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------