Re: st: strange and differing results for mi vs. ice mlogit

From	Maarten buis <[email protected]>
To	[email protected]
Subject	Re: st: strange and differing results for mi vs. ice mlogit
Date	Mon, 18 Oct 2010 16:32:07 +0100 (BST)

--- On Mon, 18/10/10, Mary E. Mackesy-Amiti wrote:
> information, add an "unknown" category to the occupation variable.

I guess that part of your message got "eaten by the monster 
that lives on the statalist server and eats the first line of every
statalist post".

I interpret your partial message as follows: Why not avoid multiple
imputation and add an extra category "unknown occupation" instead?
This is a very intuitive, but unfortunately often also a very wrong,
suggestion.

Consider the following example: We are interested in the effect
of x on y while controlling for occupation. We have two occupation
categories, high and low. We follow your suggestion and add a 
category unknown for those with missing values on occupation. 
Next we create two dummies, one for high occupation and one for
the unknowns (so the reference category is low). 

The following happens for complete observations:
y = b0 + b1*x + b2*high + b3*unknown
y = b0 + b1*x + b2*high + b3*0
y = b0 + b1*x + b2*high 

So b1 is the effect of x while controlling for occupation.

The following happens for observations with missing values on
occupation:
y = b0 + b1*x + b2*high + b3*unknown
y = b0 + b1*x + b2*0 + b3*1
y = b0* + b1*x              (b0* = b0 + b3)

So b1 is now the effect of x while _not_ controlling for 
occupation.

To make things worse, our model constrains the two b1s to be
equal, so the estimate becomes some unknown mixture of the effect
of x while controlling and while not controlling for occupation.
So by adding this category we have made things worse, not better.
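
A small simulation can illustrate this argument. The sketch below
is in Python rather than Stata and uses only the standard library.
Everything in it is an assumption made up for illustration: the
true effect of x is set to 1, the effect of high occupation to 2,
x is correlated with occupation, and half of the occupation values
are set to missing completely at random. The helper ols() is a
hypothetical name, not part of any package.

```python
import random

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(2010)
n = 20000  # assumed sample size

unknown_rows, y_all = [], []          # model with the extra "unknown" dummy
complete_rows, y_complete = [], []    # complete cases, controlling for high
missing_rows, y_missing = [], []      # unknowns only, no control possible

for _ in range(n):
    high = 1 if random.random() < 0.5 else 0
    x = high + random.gauss(0, 1)                  # x correlated with occupation
    y = 1 + 1 * x + 2 * high + random.gauss(0, 1)  # true effect of x is 1
    missing = random.random() < 0.5                # missing completely at random
    y_all.append(y)
    if missing:
        # columns: constant, x, high, unknown
        unknown_rows.append([1, x, 0, 1])
        missing_rows.append([1, x])
        y_missing.append(y)
    else:
        unknown_rows.append([1, x, high, 0])
        complete_rows.append([1, x, high])
        y_complete.append(y)

b_complete = ols(complete_rows, y_complete)
b_missing_only = ols(missing_rows, y_missing)
b_unknown = ols(unknown_rows, y_all)

print("b1, complete cases, controlling for high:", round(b_complete[1], 3))
print("b1, unknowns only, not controlling      :", round(b_missing_only[1], 3))
print("b1, model with 'unknown' category       :", round(b_unknown[1], 3))
```

Under these assumptions the complete-case estimate should land near
the true value of 1, the estimate among the unknowns alone near 1.4,
and the estimate from the model with the extra category somewhere in
between. The complete cases do well here only because the missing
values were generated completely at random.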

There is one exception: this approach does make sense when a 
missing value is itself a substantively meaningful value. For 
example, say our observations are women and the missing values 
are the homemakers. Mary's solution would in effect be 
equivalent to adding the unpaid "occupation" homemaker to our 
occupation variable, which in many instances would make perfect 
sense.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------



      
