Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Dummy Variable Trap, urgent
From: Stephen Martin <[email protected]>
To: [email protected]
Subject: Re: st: Dummy Variable Trap, urgent
Date: Fri, 6 Sep 2013 16:09:04 +0100
Hi Talgat,
>>>> I tried to
>>>> check the data set for potential collinearity with other variables
>>>> (possible 'doubling' of fixed effects) and was deleting one variable by
>>>> one from the model, but did not help.
That's because the collinearity is not between your urban/rural dummies
and the other covariates, but between your urban/rural dummies and the
county fixed effects.
I think the problem is that your urban/rural dummies do not vary within
county, and it is only this within-county variation that the FE
estimator uses.
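One way to check this, sketched here on the assumption that the panel is
already declared with -xtset county_id year- as in the log below, is to
look at the within standard deviation reported by -xtsum-. The variable
names are taken from the posted output; sd_major is a hypothetical helper
variable introduced only for this illustration.

```stata
* Decompose each area-type dummy into between- and within-county variation.
* A within Std. Dev. of 0 means the dummy never changes inside a county.
xtsum majorurban largeurban otherurban significantrural rural50 rural80

* Same check by hand for one dummy: the per-county standard deviation
* is 0 in every county if the dummy is time-invariant.
bysort county_id: egen sd_major = sd(majorurban)
summarize sd_major
```

If the within standard deviations are all zero, the county fixed effects
absorb these dummies completely, which would explain why -xtreg, fe-
reports them as omitted.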
You might want to try re-estimating your model using OLS but adding
dummies for the counties. Then re-estimate it with your urban/rural
dummies added. I'd guess that the latter will be dropped, which will
illustrate that the collinearity is between your county and
urban/rural dummies.
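The two regressions could be sketched roughly as follows. This assumes a
do-file (the /// line continuation does not work at the interactive
prompt), Stata 11 factor-variable notation for the county and year
dummies, and the variable names from the posted log. xvars is a
hypothetical local macro name, and rural80 is left out as the base
area-type category.

```stata
* Covariates from the original model (without area-type and year dummies)
local xvars log_road_density log_output log_propertytax log_expen_edu ///
    log_expen_pss log_expen_transp log_expen_housing log_expen_libculher ///
    unemployment nvq3 nvq4 log_under16 log_over65 log_benefitclaimants

* Step 1: OLS with a full set of county dummies (LSDV) and year dummies
regress log_priv_tot_emp `xvars' i.year i.county_id, vce(cluster county_id)

* Step 2: the same model plus the area-type dummies (rural80 as base)
regress log_priv_tot_emp `xvars' i.year i.county_id ///
    majorurban largeurban otherurban significantrural rural50, ///
    vce(cluster county_id)
```

In the second regression Stata should report collinearity and omit either
the area-type dummies or an equivalent number of county dummies; which set
it drops can depend on the order of the variables. The slope coefficients
from step 1 should match those from -xtreg, fe-.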
Hope this helps.
Steve
On 06/09/2013, Maarten Buis <[email protected]> wrote:
> I mean what I said, I cannot be clearer than that.
>
> On Fri, Sep 6, 2013 at 4:10 PM, Salikhov, Talgat
> <[email protected]> wrote:
>> Maarten,
>>
>> Do you mean completely removing these dummy variables from the model? Or
>> just replacing all their binary values with zero?
>>
>> Thanks
>>
>> Regards,
>> Talgat
>> ________________________________________
>> From: [email protected]
>> [[email protected]] on behalf of Maarten Buis
>> [[email protected]]
>> Sent: 06 September 2013 15:00
>> To: [email protected]
>> Subject: Re: st: Dummy Variable Trap, urgent
>>
>> If you have fixed effects, you automatically include area effects. So,
>> dropping the variables does not mean you no longer adjust for them.
>> So, just drop them, and you will adjust for region through the fixed
>> effects.
>>
>> -- Maarten
>>
>>
>> On Fri, Sep 6, 2013 at 3:53 PM, Salikhov, Talgat
>> <[email protected]> wrote:
>>> Maarten,
>>>
>>> Thank you for the comment. My theory strongly recommends including area
>>> type variables, so I cannot drop them. If possible, could you recommend
>>> any of the tricks you mentioned so that I can estimate their effects?
>>>
>>> Thanks for the advice as well.
>>>
>>> Regards,
>>> Talgat
>>> ________________________________________
>>> From: [email protected]
>>> [[email protected]] on behalf of Maarten Buis
>>> [[email protected]]
>>> Sent: 06 September 2013 14:38
>>> To: [email protected]
>>> Subject: Re: st: Dummy Variable Trap, urgent
>>>
>>> Your units appear to be counties, so it is no surprise that the region
>>> is constant. With fixed effects you filter out anything that is
>>> constant within the unit, regardless of whether it is observed or not.
>>> This is why fixed effects models are so popular. But it also means
>>> that you cannot estimate (at least not without resorting to some
>>> tricks) the effects of variables that are fixed within units, as you
>>> noticed. If you added region because you wanted to adjust your
>>> estimates for it, but are not substantively interested in it, then you
>>> can just leave those variables out and let the fixed effects take care
>>> of the adjusting (as it is already doing automatically). If you are
>>> substantively interested in these region effects you will need to do
>>> something else.
>>>
>>> -- Maarten
>>>
>>> Ps. I realize that your plea for urgency is sincere, but I would
>>> strongly advise against it. To quote the Statalist FAQ:
>>> "Urgency is only your concern. Pleas of urgency, desperation, and the
>>> like are widely deprecated by Statalist members. What is urgent for
>>> you is unlikely to translate into urgency for other members of the
>>> list. It is simplest and best to just ask your question directly."
>>>
>>> On Fri, Sep 6, 2013 at 3:27 PM, Salikhov, Talgat
>>> <[email protected]> wrote:
>>>> Dear All,
>>>>
>>>> I need some help with my model. This is for my dissertation, which is
>>>> due very soon, so I would greatly appreciate if anyone could reply
>>>> asap.
>>>>
>>>> Context:
>>>>
>>>> I have panel data. I am using Stata 11. I am running an employment
>>>> model with fixed effects. I have a number of variables to
>>>> control for various factors and area characteristics, including 6
>>>> categorical dummy variables to control for the area type according to
>>>> the level of urbanization. I also included 7 year dummies.
>>>>
>>>> Problem:
>>>>
>>>> When I run the model with fixed effects specification the coefficients
>>>> for area type dummies get omitted because of collinearity. I realise
>>>> this is a dummy variable trap. Note that coefficients for year dummies
>>>> are estimated without any problems (with one year omitted as expected).
>>>> However even though I drop one of the area type dummy variables, it
>>>> still shows as omitted. I don't know what the problem is. I tried to
>>>> check the data set for potential collinearity with other variables
>>>> (possible 'doubling' of fixed effects) and was deleting one variable by
>>>> one from the model, but did not help.
>>>>
>>>> The list of my commands with the results is as follows:
>>>>
>>>> . clear
>>>>
>>>> . *(26 variables, 1050 observations pasted into data editor)
>>>>
>>>> . summarize output
>>>>
>>>> Variable | Obs Mean Std. Dev. Min Max
>>>> -------------+--------------------------------------------------------
>>>> output | 1050 6636.304 5337.696 1497 33800.75
>>>>
>>>> . sum
>>>>
>>>> Variable | Obs Mean Std. Dev. Min Max
>>>> -------------+--------------------------------------------------------
>>>> country | 0
>>>> year | 1050 2007 2.000953 2004 2010
>>>> region | 0
>>>> uacountyname | 0
>>>> tot_emp | 1050 150599.6 117570 12800 632963
>>>> -------------+--------------------------------------------------------
>>>> priv_tot_emp | 1050 120319.6 96998.76 10000 541053
>>>> road_density | 1050 6.399432 4.106369 .1963984 18.20378
>>>> output | 1050 6636.304 5337.696 1497 33800.75
>>>> propertytax | 1050 1068.173 197.2193 552.77 1782.42
>>>> expen_edu | 1050 28032.39 23728.64 0 207409
>>>> -------------+--------------------------------------------------------
>>>> expen_pss | 1050 2208.517 2677.185 0 30889
>>>> expen_transp | 1050 16999.31 15949.01 23 132291
>>>> expen_hous~g | 1050 23994.46 38973.29 0 612599
>>>> expen_libc~r | 1050 2840.263 4349.239 -30 54474
>>>> unemployment | 1050 6.361048 2.550556 1.2 16.3
>>>> -------------+--------------------------------------------------------
>>>> nvq3 | 1050 47.42981 7.48426 27.6 71.9
>>>> nvq4 | 1050 27.90019 8.680573 12 63.6
>>>> under16 | 1050 64679.43 48988.63 7100 274400
>>>> over65 | 1050 54866 47158.07 6400 258500
>>>> benefitcla~s | 1050 29992.23 20152.8 1230 147780
>>>> -------------+--------------------------------------------------------
>>>> majorurban | 1050 .38 .4856177 0 1
>>>> largeurban | 1050 .1733333 .3787156 0 1
>>>> otherurban | 1050 .1466667 .3539419 0 1
>>>> significan~l | 1050 .1466667 .3539419 0 1
>>>> rural50 | 1050 .1266667 .3327577 0 1
>>>> -------------+--------------------------------------------------------
>>>> rural80 | 1050 .0266667 .1611841 0 1
>>>>
>>>> . replace expen_edu = 1 if (expen_edu == 0)
>>>> (20 real changes made)
>>>>
>>>> . replace expen_pss = 1 if (expen_pss == 0)
>>>> (20 real changes made)
>>>>
>>>> . replace expen_transp = 1 if (expen_transp == 0)
>>>> (0 real changes made)
>>>>
>>>> . replace expen_housing = 1 if (expen_housing == 0)
>>>> (187 real changes made)
>>>>
>>>> . replace expen_libculher = 1 if (expen_libculher == 0)
>>>> (19 real changes made)
>>>>
>>>> . tabulate year, gen(y)
>>>>
>>>> year | Freq. Percent Cum.
>>>> ------------+-----------------------------------
>>>> 2004 | 150 14.29 14.29
>>>> 2005 | 150 14.29 28.57
>>>> 2006 | 150 14.29 42.86
>>>> 2007 | 150 14.29 57.14
>>>> 2008 | 150 14.29 71.43
>>>> 2009 | 150 14.29 85.71
>>>> 2010 | 150 14.29 100.00
>>>> ------------+-----------------------------------
>>>> Total | 1,050 100.00
>>>>
>>>> . gen log_priv_tot_emp = ln(priv_tot_emp)
>>>>
>>>> . gen log_road_density = ln(road_density)
>>>>
>>>> . gen log_output = ln(output)
>>>>
>>>> . gen log_propertytax = ln(propertytax)
>>>>
>>>> . gen log_expen_edu = ln(expen_edu)
>>>>
>>>> . gen log_expen_pss = ln(expen_pss)
>>>>
>>>> . gen log_expen_transp = ln(expen_transp)
>>>>
>>>> . gen log_expen_housing = ln(expen_housing)
>>>>
>>>> . gen log_expen_libculher = ln(expen_libculher)
>>>> (1 missing value generated)
>>>>
>>>> . replace log_expen_libculher = 0 if (log_expen_libculher == .)
>>>> (1 real change made)
>>>>
>>>> . gen log_under16 = ln(under16)
>>>>
>>>> . gen log_over65 = ln(over65)
>>>>
>>>> . gen log_benefitclaimants = ln(benefitclaimants)
>>>>
>>>> . bysort uacountyname : gen county_id = _n == 1
>>>>
>>>> . replace county_id = sum(county_id)
>>>> (1049 real changes made)
>>>>
>>>> . xtset county_id year, yearly
>>>> panel variable: county_id (strongly balanced)
>>>> time variable: year, 2004 to 2010
>>>> delta: 1 year
>>>>
>>>> . xtreg log_priv_tot_emp log_road_density log_output log_propertytax
>>>> > log_expen_edu log_expen_pss log_expen_transp log_expen_housing
>>>> > log_expen_libculher unemployment nvq3 nvq4 log_under16 log_over65
>>>> > log_benefitclaimants majorurban largeurban otherurban significantrural
>>>> > rural50 rural80 y1 y2 y3 y4 y5 y6 y7, fe vce(robust)
>>>> note: majorurban omitted because of collinearity
>>>> note: largeurban omitted because of collinearity
>>>> note: otherurban omitted because of collinearity
>>>> note: significantrural omitted because of collinearity
>>>> note: rural50 omitted because of collinearity
>>>> note: rural80 omitted because of collinearity
>>>> note: y1 omitted because of collinearity
>>>>
>>>> Fixed-effects (within) regression               Number of obs      =      1050
>>>> Group variable: county_id                       Number of groups   =       150
>>>>
>>>> R-sq:  within  = 0.3287                         Obs per group: min =         7
>>>>        between = 0.7905                                        avg =       7.0
>>>>        overall = 0.7888                                        max =         7
>>>>
>>>>                                                 F(20,149)          =     11.83
>>>> corr(u_i, Xb)  = 0.5373                         Prob > F           =    0.0000
>>>>
>>>>                            (Std. Err. adjusted for 150 clusters in county_id)
>>>> ------------------------------------------------------------------------------
>>>>              |               Robust
>>>> log_priv_t~p |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>> log_road_d~y |  -.0308976   .0190012    -1.63   0.106    -.0684442     .006649
>>>>   log_output |   .2778278   .0626751     4.43   0.000     .1539811    .4016746
>>>> log_proper~x |  -.2212403   .1382738    -1.60   0.112    -.4944711    .0519905
>>>> log_expen_~u |   .0002099   .0024071     0.09   0.931    -.0045467    .0049664
>>>> log_expen_~s |  -.0005283    .002037    -0.26   0.796    -.0045535    .0034969
>>>> log_expen_~p |  -.0037324   .0038912    -0.96   0.339    -.0114214    .0039566
>>>> log_expen_~g |   .0048323   .0014797     3.27   0.001     .0019083    .0077563
>>>> log_expen_~r |  -.0013391   .0009824    -1.36   0.175    -.0032803    .0006021
>>>> unemployment |   -.003091   .0012582    -2.46   0.015    -.0055772   -.0006049
>>>>         nvq3 |   .0002088   .0009244     0.23   0.822    -.0016178    .0020353
>>>>         nvq4 |   .0006803   .0011502     0.59   0.555    -.0015925     .002953
>>>>  log_under16 |   .0060861   .0982632     0.06   0.951    -.1880832    .2002555
>>>>   log_over65 |   .4174694   .0960725     4.35   0.000     .2276289    .6073099
>>>> log_benefi~s |  -.0309202   .0774981    -0.40   0.690    -.1840575    .1222171
>>>>   majorurban |  (omitted)
>>>>   largeurban |  (omitted)
>>>>   otherurban |  (omitted)
>>>> significan~l |  (omitted)
>>>>      rural50 |  (omitted)
>>>>      rural80 |  (omitted)
>>>>           y1 |  (omitted)
>>>>           y2 |   .0067527   .0071768     0.94   0.348    -.0074287    .0209341
>>>>           y3 |    .012156   .0135523     0.90   0.371    -.0146236    .0389355
>>>>           y4 |   .0122614   .0202253     0.61   0.545    -.0277041    .0522268
>>>>           y5 |   .0143446   .0249928     0.57   0.567    -.0350415    .0637307
>>>>           y6 |   .0523563    .028217     1.86   0.066    -.0034009    .1081135
>>>>           y7 |   .0167271   .0312961     0.53   0.594    -.0451144    .0785686
>>>>        _cons |   6.445186   1.679938     3.84   0.000     3.125607    9.764764
>>>> -------------+----------------------------------------------------------------
>>>>      sigma_u |  .36708293
>>>>      sigma_e |  .03455236
>>>>          rho |  .99121795   (fraction of variance due to u_i)
>>>> ------------------------------------------------------------------------------
>>>> Sincerely,
>>>> Talgat
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>>>
>>>
>>> --
>>> ---------------------------------
>>> Maarten L. Buis
>>> WZB
>>> Reichpietschufer 50
>>> 10785 Berlin
>>> Germany
>>>
>>> http://www.maartenbuis.nl
>>> ---------------------------------
>>>