Eric Neumayer <[email protected]> asked how to normalize dummy
variables. David M. Drukker <[email protected]> gave an example, and
Eric has now written:
> thanks for your help. However, I tried to replicate your example and
> I get stuck at
> mat dums=bfdum[1,2,3,4,5,6,7,8,9,10]
> which returns "bfdum not found r(111);". (Btw: I guess it should
> probably read bfdum[2,3,...11], right?). Why can't I replicate your
> example? Also: What do you do if some country dummies are dropped
> due to missing observations? It seems that your method only
> works if you can be sure that all cdum1-cdum10 are estimated.
There are two questions here. First, let me address the error question.
The lines I actually suggested were
. mat bfdum = e(b)
. mat dums = bfdum[1,2...]
The line that Eric suggested
> mat dums=bfdum[1,2,3,4,5,6,7,8,9,10]
is not a valid command. If Eric wants columns 2 through 10 from the
first row of a matrix called -bfdum-, the command is
. mat dums = bfdum[1,2..10]
(Two dots give a range with both endpoints; three dots, as in 2..., run to
the last column. See [P] matrix define for more details about how to get
parts of a matrix.)
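As a small sketch of the subscripting rules, the following uses a made-up
1 x 5 matrix in place of e(b). The values and the names -tail- and -mid- are
only for illustration and do not come from Eric's model.

```stata
* Hypothetical 1 x 5 row vector standing in for e(b)
matrix bfdum = (0.5, 10, 20, 30, 40)

* Row 1, column 2 through the last column (three dots, no upper bound)
matrix tail = bfdum[1, 2...]

* Row 1, columns 2 through 4 only (two dots, both endpoints given)
matrix mid = bfdum[1, 2..4]

matrix list tail
matrix list mid
```

Here -tail- should have four columns and -mid- three.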
I have appended a copy of the do file that I used to produce my example. I
suggest that Eric run it, copy it to a new file, and then modify that new
file to fit his problem.
Now let me address the second question about dropped dummies. There are a
number of solutions to the problems caused by dropped dummies. The one that
requires the least programming is not to include collinear dummy variables.
If Eric does not want to prune down his model, then a programming solution
is available. Basically, he should iterate over the estimated dummies and
adjust the denominator used in calculating the average before calculating
the dependent variable in the auxiliary regression. Recall that
. gen newdep = original_depvar - ave_effect
. regress newdep <indepvars> <dummies> , nocons
produced the correct normalization. The same result still applies, but
with dropped dummy variables Eric needs to divide by the number of dummies
actually estimated when calculating ave_effect. I have appended an example
of how to do this to the original do file, which is included below.
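To see why the denominator matters, here is a minimal sketch with a made-up
coefficient vector in which the third dummy was dropped and so shows up as 0.
The numbers and the local macro names -kept- and -dev- are assumptions for
illustration only.

```stata
* Hypothetical dummy coefficients; the third dummy was dropped (reported as 0)
matrix dums = (4, 8, 0, 6)

* Count the dummies that were actually estimated
local cols = colsof(dums)
local kept = 0
forvalues i = 1/`cols' {
    if dums[1,`i'] != 0 {
        local kept = `kept' + 1
    }
}

* Sum of all coefficients divided by the number of estimated dummies
matrix ave_effect = (dums*J(`cols',1,1))/`kept'

* Sum the deviations from the average over the estimated dummies only
local dev = 0
forvalues i = 1/`cols' {
    if dums[1,`i'] != 0 {
        local dev = `dev' + dums[1,`i'] - ave_effect[1,1]
    }
}
display "kept = `kept', ave_effect = " ave_effect[1,1]
display "sum of normalized effects = `dev'"
```

With these numbers the average is 18/3 = 6, and the deviations -2, 2, and 0
sum to zero. Dividing by 4 instead would give 4.5, and the deviations of the
three estimated dummies would no longer sum to zero.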
--David
[email protected]
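* Load the Grunfeld panel and create one dummy per company
use http://www.stata-press.com/data/r7/grunfeld.dta, clear
tab company, gen(cdum)
* Full set of dummies, no constant
regress invest mvalue cdum1-cdum10, nocons
mat bfdum = e(b)
* Skip the mvalue coefficient; keep the 10 dummy coefficients
mat dums = bfdum[1,2...]
mat list dums
* Dummy coefficients as deviations from their average
mat dums_t = dums - J(1,10,1)*(dums*J(10,1,1)/10)
mat list dums_t
* Check: the normalized coefficients sum to zero
mat nsum = dums_t*J(10,1,1)
mat list nsum
* Average dummy effect
mat ave_effect = (dums*J(10,1,1)/10)
mat list ave_effect
* Auxiliary regression: the dummy coefficients now match dums_t
gen double invest2 = invest-ave_effect[1,1]
regress invest2 mvalue cdum1-cdum10, nocons
* For comparison: constant included, cdum1 as the base category
regress invest mvalue cdum2-cdum10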
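/* Simple repeat with a dropped dummy */
* dropme duplicates cdum2, so one of the two is dropped as collinear.
* The loop below relies on a dropped dummy appearing in e(b) with a 0.
gen dropme = cdum2
regress invest mvalue cdum1-cdum10 dropme, nocons
mat bfdum = e(b)
mat dums = bfdum[1,2...]
mat list dums
* Count the dummies that were actually estimated (nonzero coefficient)
local cols = colsof(dums)
local notd = 0
forvalues i = 1/`cols' {
    if reldif(dums[1,`i'], 0) > 1e-16 {
        local notd = `notd' + 1
    }
}
di "number of dummy variables not dropped from the model is `notd'"
* Average over the estimated dummies only
mat ave_effect = (dums*J(`cols',1,1)/`notd')
mat list ave_effect
replace invest2 = invest-ave_effect[1,1]
regress invest2 mvalue cdum1-cdum10 dropme, nocons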