Javier,
Everything is as it should be here. var4 and var7 are the reference (or
comparison) categories of your two sets of dummies, var2-var4 and var5-var7.
Normally you would (at least I would) omit one dummy from each set yourself,
directly in the regression command. I would choose either the one with the
lowest or highest coefficient, or the one which makes the most theoretical
sense as a basis for comparison.
However, as Kit already pointed out, you NEED to exclude one dummy variable
for each categorical variable.
The reason is very simple: if you know the values of var2 and var3 for an
observation in your sample, you also know the value of var4 (it is
1 - var2 - var3). Hence you have perfect multicollinearity, which violates
the OLS assumptions.
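
In case it helps, here is a minimal sketch of both points using the variable
names from your sample. It assumes that var2-var4 and var5-var7 really are
complete sets (exactly one dummy per set equals 1 in every row), which is what
your sample suggests. Leaving out var4 and var7 is just an arbitrary choice
for illustration.

```stata
* Check the linear dependency: within each set the dummies sum to 1
* in every observation, i.e. they add up to the constant.
* (assert stops with an error if this does not hold in your data.)
assert var2 + var3 + var4 == 1
assert var5 + var6 + var7 == 1

* Leave out one dummy per set yourself (here var4 and var7),
* so that they become the reference categories.
regress var1 var2 var3 var5 var6

* Difference between two included categories of the same set.
lincom var2 - var3

* Predicted var1 for the reference cell (var4 == 1 and var7 == 1)
* and for the cell with var2 == 1 and var7 == 1.
lincom _cons
lincom _cons + var2
```

In this setup there is no separate coefficient for var4 to estimate: it is the
category that var2 and var3 are measured against, so its effect relative to
itself is zero by construction. What you can recover with lincom are the
combinations shown above.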
Hope you're not offended by the suggestion to consult a textbook again on
the use of categorical variables in regression analysis; those are often
more helpful than a discussion group like this one.
Best from wet West Scotland
Christian
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Javier
Bacarreza
Sent: 17 April 2005 23:40
To: [email protected]
Subject: Re: st: re: WLS and dummies
Thanks for the answer Kit and all, I tried to use the
command lincom in the following way:
The data set is (a sample):
var1   var2 var3 var4 var5 var6 var7
.576   1    0    0    1    0    0
.354   0    0    1    0    1    0
.542   1    0    0    0    0    1
.6354  0    1    0    1    0    0
.875   1    0    0    0    1    0
.524   0    1    0    0    0    1
.324   0    0    1    0    1    0
.643   1    0    0    1    0    0
.523   0    1    0    0    0    1
.089   0    0    1    0    1    0
and I regress:
reg var1 var2 var3 var4 var5 var6 var7
and it drops var4 and var7 due to multicollinearity. And
I use lincom:
lincom var4
But I still cannot get the estimate of this
parameter or its standard error, since it is reported as
dropped. Do you have any idea how I can solve this?
Thanks in advance,
Javier
--- Kit Baum <[email protected]> wrote:
> Javier wrote
>
> I would like to ask a question that I had before but couldn't
> solve. The problem I have is the following: I'm applying the
> shift-share method in a regression on panel data. My dependent
> variable is the growth in employment, and I have as independent
> variables dummies for industries, dummies for years and dummies
> for regions. The problem arises because I lose one region, one
> year and one industry because of the use of dummies. I don't
> want to lose these since they are useful for the analysis I'm
> doing. Does anyone have an idea how to do this?
>
> You can always include ONE complete (mutually exclusive and
> exhaustive) set of dummies in a regression by excluding the
> constant term. You gain nothing by doing so; the coefficients
> are just measured vs. zero rather than vs. the excluded class.
> But you cannot include TWO or more complete sets of dummies in
> a regression, regardless of the treatment of the constant term,
> since each set of dummies sums to an iota vector of length N,
> and you can't have two of those in your regression. But then
> you don't need to; you can always calculate all the
> coefficients, in point and interval form, from the (G-1)
> dummies' coefficients. The lincom command is useful here.
>
> Kit Baum, Boston College Economics
> http://ideas.repec.org/e/pba1.html