Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Dummy Variable Trap, urgent
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Dummy Variable Trap, urgent
Date: Fri, 6 Sep 2013 14:38:36 +0100

Your urgency is your problem, not ours. Do you think the other people
asking questions don't want answers as soon as possible? Trying this
kind of pressure also shows that you have not read the Statalist FAQ
carefully, which does advise against this.
Nick
[email protected]
On 6 September 2013 14:27, Salikhov, Talgat
<[email protected]> wrote:
> Dear All,
>
> I need some help with my model. This is for my dissertation, which is due very soon, so I would greatly appreciate it if anyone could reply as soon as possible.
>
> Context:
>
> I have panel data and am using Stata 11. I am running an employment model with fixed effects. I control for a number of factors and area characteristics, including 6 dummy variables for area type according to the level of urbanization. I also included 7 year dummies.
>
> Problem:
>
> When I run the model with the fixed-effects specification, the coefficients for the area type dummies are omitted because of collinearity. I realise this looks like the dummy variable trap. Note that the coefficients for the year dummies are estimated without any problems (with one year omitted, as expected). However, even when I drop one of the area type dummy variables, the remaining ones still show as omitted. I don't know what the problem is. I checked the data set for potential collinearity with other variables (possible 'doubling' of fixed effects) by removing variables from the model one at a time, but that did not help.
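
One possible reading of the omissions described above, assuming each county keeps the same area type in every year (the post does not say so explicitly): the fixed-effects estimator subtracts each county's own mean from every variable, so a dummy that never changes within a county becomes all zeros and cannot be estimated, no matter how many of the dummies are dropped. The sketch below illustrates that within transformation in Python rather than Stata; the six rows, the two counties and the helper within_transform are made up for illustration and are not taken from the poster's data.

```python
# Hypothetical toy panel: two counties observed for three years.
# Each row is (county_id, year, majorurban, log_output).
rows = [
    (1, 2004, 1, 8.0), (1, 2005, 1, 8.5), (1, 2006, 1, 9.0),
    (2, 2004, 0, 7.0), (2, 2005, 0, 7.1), (2, 2006, 0, 7.5),
]


def within_transform(rows, col):
    """Subtract each county's own mean from column `col`.

    This is the demeaning step behind the fixed-effects (within) estimator.
    """
    groups = {}
    for r in rows:
        groups.setdefault(r[0], []).append(r[col])
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    return [r[col] - means[r[0]] for r in rows]


# A dummy that is constant within each county has no within variation left.
dummy_demeaned = within_transform(rows, 2)
# A variable that changes over time keeps some within variation.
output_demeaned = within_transform(rows, 3)

print(dummy_demeaned)
print(output_demeaned)
```

If that assumption holds for the real data, the year dummies survive because they vary within each county, while the area type dummies do not. In Stata, xtsum run after xtset reports a within standard deviation for each variable, and a value of zero there would be consistent with this explanation.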
>
> The list of my commands with the results is as follows:
>
> clear
>
> . *(26 variables, 1050 observations pasted into data editor)
>
> . summarize output
>
> Variable | Obs Mean Std. Dev. Min Max
> -------------+--------------------------------------------------------
> output | 1050 6636.304 5337.696 1497 33800.75
>
> . sum
>
> Variable | Obs Mean Std. Dev. Min Max
> -------------+--------------------------------------------------------
> country | 0
> year | 1050 2007 2.000953 2004 2010
> region | 0
> uacountyname | 0
> tot_emp | 1050 150599.6 117570 12800 632963
> -------------+--------------------------------------------------------
> priv_tot_emp | 1050 120319.6 96998.76 10000 541053
> road_density | 1050 6.399432 4.106369 .1963984 18.20378
> output | 1050 6636.304 5337.696 1497 33800.75
> propertytax | 1050 1068.173 197.2193 552.77 1782.42
> expen_edu | 1050 28032.39 23728.64 0 207409
> -------------+--------------------------------------------------------
> expen_pss | 1050 2208.517 2677.185 0 30889
> expen_transp | 1050 16999.31 15949.01 23 132291
> expen_hous~g | 1050 23994.46 38973.29 0 612599
> expen_libc~r | 1050 2840.263 4349.239 -30 54474
> unemployment | 1050 6.361048 2.550556 1.2 16.3
> -------------+--------------------------------------------------------
> nvq3 | 1050 47.42981 7.48426 27.6 71.9
> nvq4 | 1050 27.90019 8.680573 12 63.6
> under16 | 1050 64679.43 48988.63 7100 274400
> over65 | 1050 54866 47158.07 6400 258500
> benefitcla~s | 1050 29992.23 20152.8 1230 147780
> -------------+--------------------------------------------------------
> majorurban | 1050 .38 .4856177 0 1
> largeurban | 1050 .1733333 .3787156 0 1
> otherurban | 1050 .1466667 .3539419 0 1
> significan~l | 1050 .1466667 .3539419 0 1
> rural50 | 1050 .1266667 .3327577 0 1
> -------------+--------------------------------------------------------
> rural80 | 1050 .0266667 .1611841 0 1
>
> . replace expen_edu = 1 if (expen_edu == 0)
> (20 real changes made)
>
> . replace expen_pss = 1 if (expen_pss == 0)
> (20 real changes made)
>
> . replace expen_transp = 1 if (expen_transp == 0)
> (0 real changes made)
>
> . replace expen_housing = 1 if (expen_housing == 0)
> (187 real changes made)
>
> . replace expen_libculher = 1 if (expen_libculher == 0)
> (19 real changes made)
>
> . tabulate year, gen(y)
>
> year | Freq. Percent Cum.
> ------------+-----------------------------------
> 2004 | 150 14.29 14.29
> 2005 | 150 14.29 28.57
> 2006 | 150 14.29 42.86
> 2007 | 150 14.29 57.14
> 2008 | 150 14.29 71.43
> 2009 | 150 14.29 85.71
> 2010 | 150 14.29 100.00
> ------------+-----------------------------------
> Total | 1,050 100.00
>
> . gen log_priv_tot_emp = ln(priv_tot_emp)
>
> . gen log_road_density = ln(road_density)
>
> . gen log_output = ln(output)
>
> . gen log_propertytax = ln(propertytax)
>
> . gen log_expen_edu = ln(expen_edu)
>
> . gen log_expen_pss = ln(expen_pss)
>
> . gen log_expen_transp = ln(expen_transp)
>
> . gen log_expen_housing = ln(expen_housing)
>
> . gen log_expen_libculher = ln(expen_libculher)
> (1 missing value generated)
>
> . replace log_expen_libculher = 0 if (log_expen_libculher == .)
> (1 real change made)
>
> . gen log_under16 = ln(under16)
>
> . gen log_over65 = ln(over65)
>
> . gen log_benefitclaimants = ln(benefitclaimants)
>
> . bysort uacountyname : gen county_id = _n == 1
>
> . replace county_id = sum(county_id)
> (1049 real changes made)
>
> . xtset county_id year, yearly
> panel variable: county_id (strongly balanced)
> time variable: year, 2004 to 2010
> delta: 1 year
>
> . xtreg log_priv_tot_emp log_road_density log_output log_propertytax log_expen_edu log_expen_pss log_expen_transp log_exp
>> en_housing log_expen_libculher unemployment nvq3 nvq4 log_under16 log_over65 log_benefitclaimants majorurban largeurban
>> otherurban significantrural rural50 rural80 y1 y2 y3 y4 y5 y6 y7, fe vce(robust)
> note: majorurban omitted because of collinearity
> note: largeurban omitted because of collinearity
> note: otherurban omitted because of collinearity
> note: significantrural omitted because of collinearity
> note: rural50 omitted because of collinearity
> note: rural80 omitted because of collinearity
> note: y1 omitted because of collinearity
>
> Fixed-effects (within) regression Number of obs = 1050
> Group variable: county_id Number of groups = 150
>
> R-sq: within = 0.3287 Obs per group: min = 7
> between = 0.7905 avg = 7.0
> overall = 0.7888 max = 7
>
> F(20,149) = 11.83
> corr(u_i, Xb) = 0.5373 Prob > F = 0.0000
>
> (Std. Err. adjusted for 150 clusters in county_id)
> ------------------------------------------------------------------------------
> | Robust
> log_priv_t~p | Coef. Std. Err. t P>|t| [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> log_road_d~y | -.0308976 .0190012 -1.63 0.106 -.0684442 .006649
> log_output | .2778278 .0626751 4.43 0.000 .1539811 .4016746
> log_proper~x | -.2212403 .1382738 -1.60 0.112 -.4944711 .0519905
> log_expen_~u | .0002099 .0024071 0.09 0.931 -.0045467 .0049664
> log_expen_~s | -.0005283 .002037 -0.26 0.796 -.0045535 .0034969
> log_expen_~p | -.0037324 .0038912 -0.96 0.339 -.0114214 .0039566
> log_expen_~g | .0048323 .0014797 3.27 0.001 .0019083 .0077563
> log_expen_~r | -.0013391 .0009824 -1.36 0.175 -.0032803 .0006021
> unemployment | -.003091 .0012582 -2.46 0.015 -.0055772 -.0006049
> nvq3 | .0002088 .0009244 0.23 0.822 -.0016178 .0020353
> nvq4 | .0006803 .0011502 0.59 0.555 -.0015925 .002953
> log_under16 | .0060861 .0982632 0.06 0.951 -.1880832 .2002555
> log_over65 | .4174694 .0960725 4.35 0.000 .2276289 .6073099
> log_benefi~s | -.0309202 .0774981 -0.40 0.690 -.1840575 .1222171
> majorurban | (omitted)
> largeurban | (omitted)
> otherurban | (omitted)
> significan~l | (omitted)
> rural50 | (omitted)
> rural80 | (omitted)
> y1 | (omitted)
> y2 | .0067527 .0071768 0.94 0.348 -.0074287 .0209341
> y3 | .012156 .0135523 0.90 0.371 -.0146236 .0389355
> y4 | .0122614 .0202253 0.61 0.545 -.0277041 .0522268
> y5 | .0143446 .0249928 0.57 0.567 -.0350415 .0637307
> y6 | .0523563 .028217 1.86 0.066 -.0034009 .1081135
> y7 | .0167271 .0312961 0.53 0.594 -.0451144 .0785686
> _cons | 6.445186 1.679938 3.84 0.000 3.125607 9.764764
> -------------+----------------------------------------------------------------
> sigma_u | .36708293
> sigma_e | .03455236
> rho | .99121795 (fraction of variance due to u_i)
> ------------------------------------------------------------------------------
> Sincerely,
> Talgat
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/