Re: st: Areg, absorb

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: Areg, absorb
Date	Mon, 11 Apr 2011 14:03:16 +0100 (BST)

--- On Mon, 11/4/11, emanuele mazzini wrote:
> do you know a way to not omit the variables that the
> command xi i.varname generates? I tried with the option
> noomit, but it seems that it does not work, i.e. it
> still keeps on omitting the first country of my sample.

Imagine you have two countries, Aistan and Bland, and that
we want to predict a variable y. Let's first understand what
happens when we omit one of the dummies. Assume we use one
dummy variable called bland, which is 1 when the country is
Bland and 0 when it is not Bland (and thus Aistan). In that
case we have omitted the dummy aistan.

In this case we have the following equation:
y_hat = b0 + b1 * bland

If the country is Bland, then its predicted value is
y_hat = b0 + b1 * 1 = b0 + b1

If the country is Aistan, then its predicted value is
y_hat = b0 + b1 * 0 = b0

So the constant b0 is the predicted y for Aistan, and b1
is the difference in predicted y between Bland and Aistan
(Bland minus Aistan).
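
To make this concrete, here is a minimal sketch in Python (not
Stata) with made-up data. The numbers and the helper name
fit_one_dummy are illustrative assumptions, not taken from the
original question. It fits y on the single dummy bland by
ordinary least squares and compares the coefficients with the
two country means.

```python
from statistics import mean

# Made-up data: three observations per country (illustrative only).
bland = [0, 0, 0, 1, 1, 1]            # 1 = Bland, 0 = Aistan
y = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]


def fit_one_dummy(x, y):
    """Least-squares fit of y = b0 + b1 * x for a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_one_dummy(bland, y)

# Group means of y, computed directly from the data.
mean_aistan = mean(yi for xi, yi in zip(bland, y) if xi == 0)
mean_bland = mean(yi for xi, yi in zip(bland, y) if xi == 1)

print(b0, mean_aistan)        # constant vs. mean of y in Aistan
print(b0 + b1, mean_bland)    # constant + slope vs. mean of y in Bland
```

With only a dummy on the right-hand side, the fitted values are
just the group means, which is why b0 and b0 + b1 line up with
them.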

What will happen when we also include the dummy aistan?
In this case we have the following equation:
y_hat = b0 + b1 * bland + b2 * aistan

If the country is Bland, then its predicted value is
y_hat = b0 + b1 * 1 + b2 * 0 = b0 + b1

If the country is Aistan, then its predicted value is
y_hat = b0 + b1 * 0 + b2 * 1 = b0 + b2

So now there are three parameters to represent two
predicted values, which means that the parameters are
not all identified. For example, we could think that b0
is 2; then b1 is the predicted y - 2 for Bland and b2
is the predicted y - 2 for Aistan. Or we could think
that b0 is 3; then b1 is the predicted y - 3 for
Bland and b2 is the predicted y - 3 for Aistan. You
can see that you get exactly the same predictions
for different values of b0, just by adjusting the
two remaining parameters. There is thus no way to
distinguish the fit of these different models.
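
The same point as a small Python sketch (not Stata). The two
target predictions (6 for Bland, 2 for Aistan) and the function
name predict are hypothetical, chosen only for illustration. For
any value of b0 we can solve for b1 and b2 so that the
predictions come out the same.

```python
def predict(b0, b1, b2, country):
    """Predicted y from y_hat = b0 + b1 * bland + b2 * aistan."""
    bland = 1 if country == "Bland" else 0
    aistan = 1 if country == "Aistan" else 0
    return b0 + b1 * bland + b2 * aistan


# Hypothetical predicted values the model has to reproduce.
target = {"Bland": 6.0, "Aistan": 2.0}

# Pick any b0, then solve for the two dummy coefficients.
candidates = []
for b0 in (0.0, 2.0, 3.0, -10.0):
    b1 = target["Bland"] - b0
    b2 = target["Aistan"] - b0
    candidates.append((b0, b1, b2))

# Every parameter triple yields the same pair of predictions.
for b0, b1, b2 in candidates:
    print((b0, b1, b2),
          predict(b0, b1, b2, "Bland"),
          predict(b0, b1, b2, "Aistan"))
```

Every row prints the same two predictions, so the data cannot
tell these parameter triples apart.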

In order to be able to estimate the model you must
constrain one of the parameters. By default we
constrain the parameter of one of the dummies to
be 0 (i.e. we just exclude that variable from our
model). Alternatively, we could constrain the
constant to be 0 with the -nocons- option.
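
As a rough illustration of how the two constraints relate, the
Python sketch below (not Stata) converts between them. The
function names from_default and to_default and the starting
values b0 = 2, b1 = 4 are assumptions made up for this example.

```python
def from_default(b0, b1):
    """Default constraint (aistan coefficient fixed at 0) -> no-constant form."""
    c_bland = b0 + b1     # predicted y for Bland
    c_aistan = b0         # predicted y for Aistan
    return c_bland, c_aistan


def to_default(c_bland, c_aistan):
    """No-constant form (constant fixed at 0) -> default form."""
    b0 = c_aistan
    b1 = c_bland - c_aistan
    return b0, b1


# Hypothetical estimates under the default constraint.
b0, b1 = 2.0, 4.0
c_bland, c_aistan = from_default(b0, b1)

print(c_bland, c_aistan)              # coefficients when the constant is dropped
print(to_default(c_bland, c_aistan))  # back to the default parameterisation
```

With the constant constrained to 0, each dummy coefficient is
simply the predicted y for that country; with the default
constraint, the same information is stored as a baseline plus a
difference.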

Anyhow, from your previous question I gathered
that you are not interested in these effects; you
even want to suppress the display of these variables.
In that case I would just stick to the default, since
all these models are mathematically equivalent anyhow.
But if you are substantively interested in the
effects of these variables, then dropping the constant
and keeping all the dummies can sometimes be a really
nice trick that helps the interpretation of your model.
Notice, however, that this does not change your model,
just the way it is displayed.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
