Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Abnormal logistic results
From
Ras Dondo <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: Abnormal logistic results
Date
Tue, 16 Oct 2012 03:42:42 -0700 (PDT)
Thanks Maarten. But would it make any difference if, instead of comparing those exposed and those not exposed to drug X, I compared those with the disease to those without the disease in an attempt to determine the odds of exposure to the drug?
Thanks
----- Original Message -----
From: Maarten Buis <[email protected]>
To: [email protected]
Cc:
Sent: Monday, October 15, 2012 8:55 AM
Subject: Re: st: Abnormal logistic results
On Mon, Oct 15, 2012 at 4:13 AM, Ras Dondo wrote:
> I ran a logistic regression on my data and got abnormal results, and I wanted to ask for advice on how to rectify the problem. I have a dataset containing five variables:
> 1. disease condition in the child (binary 1/0), 2. mother's age (grouped in 5-year intervals), 3. state (12 states), 4. child's year of birth (grouped into 5 levels), and 5. drug X (binary). My objective was to calculate the OR and associated 95% CI for the baby having the disease when it was exposed to drug X in the womb, adjusting for maternal age, state, and child's year of birth.
> I had a sample size of 6,168 children, of which 89 had the disease, with 1 child exposed to the drug, and 6,079 children without the disease, also with 1 child exposed to the drug in the womb.
The exposure to the drug is just too rare. That means you have
virtually no information in the dataset about the effect of drug X.
The information that we use in these models comes from comparing
groups. The group exposed to drug X is very small (2 observations), so
we know very little about that group. Even though you have more than
6,000 observations, nearly all of that information is about the group
that is not exposed to drug X. To compare those exposed and those not
exposed you need to know a lot about both groups.
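To see how little the two exposed children contribute, the crude
(unadjusted) 2x2 table can be entered directly from the counts quoted
above: 89 cases with 1 exposed, 6,079 non-cases with 1 exposed. This
is only a sketch using Stata's immediate command -cci-; it ignores
mother's age, state, and year of birth entirely.

```stata
* Crude 2x2 table built from the counts in the original question.
* Argument order for cci: exposed cases, unexposed cases,
*                         exposed controls, unexposed controls
cci 1 88 1 6078
```

The point estimate is (1 x 6078)/(88 x 1), roughly 69, but it rests on
a single exposed child in each group, so the confidence interval
around it should be expected to be extremely wide.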
So I am not surprised that you cannot adjust for mother's age, state,
child's year of birth. You need to simplify your model: e.g. adjust
for rougher groupings of states (in the US context you can think of
south versus non-south) instead of state, adjust for mother's age with
a linear spline with one knot instead of 5 categories, same with
child's year of birth. For linear splines see: -help mkspline-. Also
make sure you center mother's age and child's year of birth at a
meaningful value within the range of the data. Even if you do all
that, I would not be surprised if you still do not get a meaningful
answer; your data are very extreme, with so few observations that
used drug X(*).
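The simplifications suggested above might look roughly like the sketch
below. Everything specific in it is an assumption: the variable names
(disease, drugx, state, mage, byear) are hypothetical, the choice of
which state codes form a region is a placeholder, and the centering
values (age 30, year 2000) are made up and should be replaced by
meaningful values within the range of the data. It also assumes that
age and year of birth are available as numbers (or as group midpoints)
rather than only as category codes.

```stata
* Sketch only: all variable names and cut points are hypothetical.

* Rougher grouping of the 12 states into 2 regions
generate byte region = inlist(state, 1, 2, 3, 4) if !missing(state)

* Center at a meaningful value inside the observed range
generate mage_c  = mage  - 30
generate byear_c = byear - 2000

* Linear splines with a single knot at the centering value (0)
mkspline mage_lo  0 mage_hi  = mage_c
mkspline byear_lo 0 byear_hi = byear_c

* Simplified model: fewer parameters than the full sets of categories
logistic disease i.drugx i.region mage_lo mage_hi byear_lo byear_hi
```

Centering before calling -mkspline- puts the knot at 0, so the
reference point for both splines is the centering value itself.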
Hope this helps,
Maarten
(*) I suspect that this is one of those cases where as a researcher
you would want this to be less rare, but as a person you are glad this
happens so rarely.
---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany
http://www.maartenbuis.nl
---------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/