
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Dummy Variable Trap, urgent


From   Stephen Martin <[email protected]>
To   [email protected]
Subject   Re: st: Dummy Variable Trap, urgent
Date   Fri, 6 Sep 2013 16:09:04 +0100

Hi Talgat,

>>>> I tried to check the data set for potential collinearity with other
>>>> variables (possible 'doubling' of fixed effects) and was deleting one
>>>> variable by one from the model, but did not help.

That's because the collinearity is not between your urban/rural dummies
and the other covariates, but between your urban/rural dummies and the
fixed effects.
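
A short sketch of that collinearity in symbols. The notation is an assumption for illustration, not anything from the original post: alpha_i is the county effect, D_i is one area-type dummy, and 150 is the number of counties reported in the xtreg output. It also assumes that each county's area type is the same in every year.

```latex
y_{it} = \alpha_i + x_{it}'\beta + \gamma\, D_i + \varepsilon_{it},
\qquad
D_i = \sum_{c=1}^{150} D_c \cdot \mathbf{1}[\, i = c \,]
\;\;\Longrightarrow\;\;
\alpha_i + \gamma D_i = \tilde{\alpha}_i .
```

Under that assumption D_i is an exact linear combination of the county indicators, so only the sum alpha_i + gamma*D_i is identified and gamma cannot be separated from the county effect.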

I think that your problem is that there is no within-county variation
in your urban/rural dummies: each county appears to keep the same area
type in every year, and it is only this within variation that is used
by the FE estimator.
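
One way to check this directly, as a minimal sketch: it assumes the data are already xtset by county_id and year as in the original post, and sd_major is a hypothetical helper variable name.

```stata
* xtsum splits each variable's standard deviation into overall,
* between-county and within-county parts.
xtsum majorurban largeurban otherurban significantrural rural50 rural80

* Alternative check for a single dummy: the standard deviation of
* majorurban within each county (sd_major is a hypothetical name).
bysort county_id: egen sd_major = sd(majorurban)

* Counts observations belonging to counties where majorurban changes
* over time.
count if sd_major > 0 & !missing(sd_major)
```

If the area type never changes within a county, the "within" standard deviation reported by xtsum should be zero for every dummy, and the final count should be zero as well.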

You might want to try re-estimating your model using OLS but adding
dummies for the counties.  Then re-estimate it with your urban/rural
dummies added as well.  I'd guess that the urban/rural dummies will be
dropped, which will illustrate that the collinearity is between your
county dummies and your urban/rural dummies.
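
A sketch of those two regressions, using the variable names from the original post. The macro names controls and area are hypothetical, and clustering on county_id is an assumption chosen to mirror what vce(robust) does after xtreg, fe.

```stata
* Run this as a do-file so that the local macros and the line
* continuations are read together.
local controls log_road_density log_output log_propertytax log_expen_edu ///
    log_expen_pss log_expen_transp log_expen_housing log_expen_libculher ///
    unemployment nvq3 nvq4 log_under16 log_over65 log_benefitclaimants ///
    y2 y3 y4 y5 y6 y7
local area majorurban largeurban otherurban significantrural rural50 rural80

* Step 1: OLS with one dummy per county (factor-variable notation,
* available from Stata 11 onward).
regress log_priv_tot_emp `controls' i.county_id, vce(cluster county_id)

* Step 2: the same model with the area-type dummies added after the
* county dummies.
regress log_priv_tot_emp `controls' i.county_id `area', vce(cluster county_id)
```

The coefficients on the time-varying covariates in step 1 should match the xtreg, fe estimates, since OLS with county dummies is the same estimator. In step 2 Stata has to omit something to break the collinearity: either the area-type dummies or an equivalent set of county dummies, depending on how it resolves the ordering.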

Hope this helps.

Steve


On 06/09/2013, Maarten Buis <[email protected]> wrote:
> I mean what I said, I cannot be clearer than that.
>
> On Fri, Sep 6, 2013 at 4:10 PM, Salikhov, Talgat
> <[email protected]> wrote:
>> Maarten,
>>
>> Do you mean completely removing these dummy variables from the model? Or
>> just substituting all binary values by zero?
>>
>> Thanks
>>
>> Regards,
>> Talgat
>> ________________________________________
>> From: [email protected]
>> [[email protected]] on behalf of Maarten Buis
>> [[email protected]]
>> Sent: 06 September 2013 15:00
>> To: [email protected]
>> Subject: Re: st: Dummy Variable Trap, urgent
>>
>> If you have fixed effects, you automatically include area effects. So,
>> dropping the variables does not mean you no longer adjust for them.
>> So, just drop them, and you will adjust for region through the fixed
>> effects.
>>
>> -- Maarten
>>
>>
>> On Fri, Sep 6, 2013 at 3:53 PM, Salikhov, Talgat
>> <[email protected]> wrote:
>>> Maarten,
>>>
>>> Thank you for the comment. My theory strongly recommends including area
>>> type variables so I cannot drop them. If possible, could you recommend
>>> any of the tricks you mentioned so that I can estimate their effects
>>> correctly?
>>>
>>> Thanks for the advice as well.
>>>
>>> Regards,
>>> Talgat
>>> ________________________________________
>>> From: [email protected]
>>> [[email protected]] on behalf of Maarten Buis
>>> [[email protected]]
>>> Sent: 06 September 2013 14:38
>>> To: [email protected]
>>> Subject: Re: st: Dummy Variable Trap, urgent
>>>
>>> Your units appear to be counties, so it is no surprise that the region
>>> is constant. With fixed effects you filter out anything that is
>>> constant within the unit, regardless of whether it is observed or not.
>>> This is why fixed effects models are so popular. But it also means
>>> that you cannot estimate (at least not without resorting to some
>>> tricks) the effects of variables that are fixed within units, as you
>>> noticed. If you added region because you wanted to adjust your
>>> estimates for it, but are not substantively interested in it, then you
>>> can just leave those variables out and let the fixed effects take care
>>> of the adjusting (as it is already doing automatically). If you are
>>> substantively interested in these region effects you will need to do
>>> something else.
>>>
>>> -- Maarten
>>>
>>> Ps. I realize that your plea for urgency is sincere, but I would
>>> strongly advise against it. To quote the Statalist FAQ:
>>> "Urgency is only your concern. Pleas of urgency, desperation, and the
>>> like are widely deprecated by Statalist members. What is urgent for
>>> you is unlikely to translate into urgency for other members of the
>>> list. It is simplest and best to just ask your question directly."
>>>
>>> On Fri, Sep 6, 2013 at 3:27 PM, Salikhov, Talgat
>>> <[email protected]> wrote:
>>>> Dear All,
>>>>
>>>> I need some help with my model. This is for my dissertation, which is
>>>> due very soon, so I would greatly appreciate if anyone could reply
>>>> asap.
>>>>
>>>> Context:
>>>>
>>>> I have panel data. I am using Stata 11. I am running an employment
>>>> model with fixed effects. I have a number of variables to control for
>>>> various factors and area characteristics, including 6 categorical
>>>> dummy variables to control for the area type according to the level
>>>> of urbanization. I also introduced 7 year dummies.
>>>>
>>>> Problem:
>>>>
>>>> When I run the model with the fixed effects specification, the
>>>> coefficients for the area type dummies get omitted because of
>>>> collinearity. I realise this is a dummy variable trap. Note that the
>>>> coefficients for the year dummies are estimated without any problems
>>>> (with one year omitted as expected). However, even when I drop one of
>>>> the area type dummy variables, the remaining ones are still shown as
>>>> omitted. I don't know what the problem is. I tried to
>>>> check the data set for potential collinearity with other variables
>>>> (possible 'doubling' of fixed effects) and was deleting one variable by
>>>> one from the model, but did not help.
>>>>
>>>> The list of my commands with the results is as follows:
>>>>
>>>>  clear
>>>>
>>>> . *(26 variables, 1050 observations pasted into data editor)
>>>>
>>>> . summarize output
>>>>
>>>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>>>> -------------+--------------------------------------------------------
>>>>       output |      1050    6636.304    5337.696       1497   33800.75
>>>>
>>>> . sum
>>>>
>>>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>>>> -------------+--------------------------------------------------------
>>>>      country |         0
>>>>         year |      1050        2007    2.000953       2004       2010
>>>>       region |         0
>>>> uacountyname |         0
>>>>      tot_emp |      1050    150599.6      117570      12800     632963
>>>> -------------+--------------------------------------------------------
>>>> priv_tot_emp |      1050    120319.6    96998.76      10000     541053
>>>> road_density |      1050    6.399432    4.106369   .1963984   18.20378
>>>>       output |      1050    6636.304    5337.696       1497   33800.75
>>>>  propertytax |      1050    1068.173    197.2193     552.77    1782.42
>>>>    expen_edu |      1050    28032.39    23728.64          0     207409
>>>> -------------+--------------------------------------------------------
>>>>    expen_pss |      1050    2208.517    2677.185          0      30889
>>>> expen_transp |      1050    16999.31    15949.01         23     132291
>>>> expen_hous~g |      1050    23994.46    38973.29          0     612599
>>>> expen_libc~r |      1050    2840.263    4349.239        -30      54474
>>>> unemployment |      1050    6.361048    2.550556        1.2       16.3
>>>> -------------+--------------------------------------------------------
>>>>         nvq3 |      1050    47.42981     7.48426       27.6       71.9
>>>>         nvq4 |      1050    27.90019    8.680573         12       63.6
>>>>      under16 |      1050    64679.43    48988.63       7100     274400
>>>>       over65 |      1050       54866    47158.07       6400     258500
>>>> benefitcla~s |      1050    29992.23     20152.8       1230     147780
>>>> -------------+--------------------------------------------------------
>>>>   majorurban |      1050         .38    .4856177          0          1
>>>>   largeurban |      1050    .1733333    .3787156          0          1
>>>>   otherurban |      1050    .1466667    .3539419          0          1
>>>> significan~l |      1050    .1466667    .3539419          0          1
>>>>      rural50 |      1050    .1266667    .3327577          0          1
>>>> -------------+--------------------------------------------------------
>>>>      rural80 |      1050    .0266667    .1611841          0          1
>>>>
>>>> . replace expen_edu = 1 if (expen_edu == 0)
>>>> (20 real changes made)
>>>>
>>>> . replace expen_pss = 1 if (expen_pss == 0)
>>>> (20 real changes made)
>>>>
>>>> . replace expen_transp = 1 if (expen_transp == 0)
>>>> (0 real changes made)
>>>>
>>>> . replace expen_housing = 1 if (expen_housing == 0)
>>>> (187 real changes made)
>>>>
>>>> . replace expen_libculher = 1 if (expen_libculher == 0)
>>>> (19 real changes made)
>>>>
>>>> . tabulate year, gen(y)
>>>>
>>>>        year |      Freq.     Percent        Cum.
>>>> ------------+-----------------------------------
>>>>        2004 |        150       14.29       14.29
>>>>        2005 |        150       14.29       28.57
>>>>        2006 |        150       14.29       42.86
>>>>        2007 |        150       14.29       57.14
>>>>        2008 |        150       14.29       71.43
>>>>        2009 |        150       14.29       85.71
>>>>        2010 |        150       14.29      100.00
>>>> ------------+-----------------------------------
>>>>       Total |      1,050      100.00
>>>>
>>>> . gen log_priv_tot_emp = ln(priv_tot_emp)
>>>>
>>>> . gen log_road_density = ln(road_density)
>>>>
>>>> . gen log_output = ln(output)
>>>>
>>>> . gen log_propertytax = ln(propertytax)
>>>>
>>>> . gen log_expen_edu = ln(expen_edu)
>>>>
>>>> . gen log_expen_pss = ln(expen_pss)
>>>>
>>>> . gen log_expen_transp = ln(expen_transp)
>>>>
>>>> . gen log_expen_housing = ln(expen_housing)
>>>>
>>>> . gen log_expen_libculher = ln(expen_libculher)
>>>> (1 missing value generated)
>>>>
>>>> . replace log_expen_libculher = 0 if (log_expen_libculher == .)
>>>> (1 real change made)
>>>>
>>>> . gen log_under16 = ln(under16)
>>>>
>>>> . gen log_over65 = ln(over65)
>>>>
>>>> . gen log_benefitclaimants = ln(benefitclaimants)
>>>>
>>>> . bysort  uacountyname : gen county_id = _n == 1
>>>>
>>>> . replace county_id = sum(county_id)
>>>> (1049 real changes made)
>>>>
>>>> . xtset county_id year, yearly
>>>>        panel variable:  county_id (strongly balanced)
>>>>         time variable:  year, 2004 to 2010
>>>>                 delta:  1 year
>>>>
>>>> . xtreg log_priv_tot_emp log_road_density log_output log_propertytax
>>>> >     log_expen_edu log_expen_pss log_expen_transp log_expen_housing
>>>> >     log_expen_libculher unemployment nvq3 nvq4 log_under16 log_over65
>>>> >     log_benefitclaimants majorurban largeurban otherurban
>>>> >     significantrural rural50 rural80 y1 y2 y3 y4 y5 y6 y7, fe vce(robust)
>>>> note: majorurban omitted because of collinearity
>>>> note: largeurban omitted because of collinearity
>>>> note: otherurban omitted because of collinearity
>>>> note: significantrural omitted because of collinearity
>>>> note: rural50 omitted because of collinearity
>>>> note: rural80 omitted because of collinearity
>>>> note: y1 omitted because of collinearity
>>>>
>>>> Fixed-effects (within) regression               Number of obs      =      1050
>>>> Group variable: county_id                       Number of groups   =       150
>>>>
>>>> R-sq:  within  = 0.3287                         Obs per group: min =         7
>>>>        between = 0.7905                                        avg =       7.0
>>>>        overall = 0.7888                                        max =         7
>>>>
>>>>                                                 F(20,149)          =     11.83
>>>> corr(u_i, Xb)  = 0.5373                         Prob > F           =    0.0000
>>>>
>>>>                             (Std. Err. adjusted for 150 clusters in county_id)
>>>> ------------------------------------------------------------------------------
>>>>              |               Robust
>>>> log_priv_t~p |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>> log_road_d~y |  -.0308976   .0190012    -1.63   0.106    -.0684442     .006649
>>>>   log_output |   .2778278   .0626751     4.43   0.000     .1539811    .4016746
>>>> log_proper~x |  -.2212403   .1382738    -1.60   0.112    -.4944711    .0519905
>>>> log_expen_~u |   .0002099   .0024071     0.09   0.931    -.0045467    .0049664
>>>> log_expen_~s |  -.0005283    .002037    -0.26   0.796    -.0045535    .0034969
>>>> log_expen_~p |  -.0037324   .0038912    -0.96   0.339    -.0114214    .0039566
>>>> log_expen_~g |   .0048323   .0014797     3.27   0.001     .0019083    .0077563
>>>> log_expen_~r |  -.0013391   .0009824    -1.36   0.175    -.0032803    .0006021
>>>> unemployment |   -.003091   .0012582    -2.46   0.015    -.0055772   -.0006049
>>>>         nvq3 |   .0002088   .0009244     0.23   0.822    -.0016178    .0020353
>>>>         nvq4 |   .0006803   .0011502     0.59   0.555    -.0015925     .002953
>>>>  log_under16 |   .0060861   .0982632     0.06   0.951    -.1880832    .2002555
>>>>   log_over65 |   .4174694   .0960725     4.35   0.000     .2276289    .6073099
>>>> log_benefi~s |  -.0309202   .0774981    -0.40   0.690    -.1840575    .1222171
>>>>   majorurban |  (omitted)
>>>>   largeurban |  (omitted)
>>>>   otherurban |  (omitted)
>>>> significan~l |  (omitted)
>>>>      rural50 |  (omitted)
>>>>      rural80 |  (omitted)
>>>>           y1 |  (omitted)
>>>>           y2 |   .0067527   .0071768     0.94   0.348    -.0074287    .0209341
>>>>           y3 |    .012156   .0135523     0.90   0.371    -.0146236    .0389355
>>>>           y4 |   .0122614   .0202253     0.61   0.545    -.0277041    .0522268
>>>>           y5 |   .0143446   .0249928     0.57   0.567    -.0350415    .0637307
>>>>           y6 |   .0523563    .028217     1.86   0.066    -.0034009    .1081135
>>>>           y7 |   .0167271   .0312961     0.53   0.594    -.0451144    .0785686
>>>>        _cons |   6.445186   1.679938     3.84   0.000     3.125607    9.764764
>>>> -------------+----------------------------------------------------------------
>>>>      sigma_u |  .36708293
>>>>      sigma_e |  .03455236
>>>>          rho |  .99121795   (fraction of variance due to u_i)
>>>> ------------------------------------------------------------------------------
>>>> Sincerely,
>>>> Talgat
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>>
>>>
>>> --
>>> ---------------------------------
>>> Maarten L. Buis
>>> WZB
>>> Reichpietschufer 50
>>> 10785 Berlin
>>> Germany
>>>
>>> http://www.maartenbuis.nl
>>> ---------------------------------
>>>

