Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Agnese Romiti <romitiagnese@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: First stage F stats - xtivreg |
Date | Tue, 21 Jun 2011 19:06:07 +0200 |
Dear Austin,

When I used region-year, or region alone, as the cluster unit, I had to
run ivreg2 on data that I had previously transformed into deviations
from the mean (the within transformation), because xtivreg2 requires
that no panel overlaps more than one cluster. That is, panels must be
uniquely assigned to clusters.

I tried instead to run xtivreg2 with two clusters, as you suggested, but
I received the error message "cluster(): too many variables specified",
apparently because I don't have the latest version of the commands. I
have just done an "update all", and my Stata seems to be updated to
30 March 2011 (exe and ado) and to 1 Sept 2010 (utilities).
Is there a reason why I still get the error?

Thanks
Agnese

2011/6/21 Austin Nichols <austinnichols@gmail.com>:
> Agnese Romiti <romitiagnese@gmail.com>:
> I don't see how it matters that individuals move across clusters,
> unless you want to cluster by individual as well, and -xtivreg2-
> allows two dimensions of clustering. When you cluster by region-year,
> you assume that a draw from the dgp of person i in year t is
> independent from a draw from the dgp of person i in year t+1, which is
> clearly problematic. You should try clustering by individual, by
> region, and then try two dimensions of clustering. Let us know how
> the first stage diagnostic statistics and SEs on main variables of
> interest, in each of those 3 cases, compare to your
> region-year-clustered version.
>
> On Tue, Jun 21, 2011 at 10:47 AM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>> Austin,
>>
>> The reason why I chose region-year as the cluster unit is
>> that individuals - around 8 percent of them - move
>> across regions over time, so the region was not unique for them.
>>
>> Many thanks again for your help and the ref.
>> Agnese
>>
>> 2011/6/21 Austin Nichols <austinnichols@gmail.com>:
>>> Agnese Romiti <romitiagnese@gmail.com>:
>>> In that case the cluster-robust SE will be biased downward slightly,
>>> resulting in overrejection and your first-stage F stat overstated, but
>>> I expect it will still outperform the SE and F clustering by
>>> region-year. You would have to do simulations matching your exact
>>> setup to be sure; see e.g.
>>> http://www.stata.com/meeting/13uk/nichols_crse.pdf
>>>
>>> On Tue, Jun 21, 2011 at 3:27 AM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>>>> Hi,
>>>> Thanks again
>>>> In my data I have 19 regions, and around 18 percent of the data in the
>>>> largest region.
>>>>
>>>> Agnese
>>>>
>>>>
>>>> 2011/6/21 Austin Nichols <austinnichols@gmail.com>:
>>>>> Agnese Romiti <romitiagnese@gmail.com>:
>>>>> No, you should cluster by region to correctly account for possible
>>>>> serial correlation,
>>>>> assuming you have sufficiently many regions in your data; how many are there?
>>>>> What percent of the data is in the largest region?
>>>>>
>>>>> On Mon, Jun 20, 2011 at 5:19 PM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>>>>>> Many thanks Austin,
>>>>>>
>>>>>> I'm actually clustering the standard errors at region-year level
>>>>>> rather than at region because I have one regressor with variability at
>>>>>> region-year level. Is that correct?
>>>>>> Do you think that the high first stage F stats might be a signal of a
>>>>>> bad instrument? Like a failure of the exogeneity requirement?
>>>>>>
>>>>>> Agnese
>>>>>>
>>>>>>
>>>>>> 2011/6/20 Austin Nichols <austinnichols@gmail.com>:
>>>>>>> Agnese Romiti <romitiagnese@gmail.com>:
>>>>>>> Are you clustering by region to account for the likely correlation of
>>>>>>> errors within region?
>>>>>>> Also see
>>>>>>> http://www.stata.com/meeting/boston10/boston10_nichols.pdf
>>>>>>> for an alternative model that allows your dep var to be nonnegative.
>>>>>>>
>>>>>>> On Mon, Jun 20, 2011 at 3:49 AM, Agnese Romiti <romitiagnese@gmail.com> wrote:
>>>>>>>> Dear Statalist users,
>>>>>>>>
>>>>>>>> I'm running a fixed-effects model with IV (xtivreg2); my dependent
>>>>>>>> variable is a measure of labor supply at the individual level (working
>>>>>>>> hours), whereas my endogenous variable varies only at the
>>>>>>>> region-year level.
>>>>>>>> My question is about the first-stage statistics: the weak
>>>>>>>> identification test gives an extremely high F statistic, which
>>>>>>>> makes me worry that something is wrong, i.e. F=3289.
>>>>>>>> Do you have any clue about potential reasons driving this odd result?
>>>>>>>>
>>>>>>>> Many thanks in advance for your help.
>>>>>>>>
>>>>>>>> Agnese
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
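
The nesting condition discussed in this thread (no panel may overlap more than one cluster) can be checked on the data before estimation. What follows is only a minimal Python sketch on made-up person-year rows, not Agnese's data and not part of xtivreg2; the function name `panels_spanning_clusters` and the toy values are hypothetical.

```python
from collections import defaultdict

def panels_spanning_clusters(rows, cluster_of):
    """Return {panel_id: sorted cluster ids} for panels seen in more than one cluster.

    rows       : iterable of (person, region, year) tuples (hypothetical layout)
    cluster_of : function mapping a row to its cluster id
    """
    clusters_by_panel = defaultdict(set)
    for row in rows:
        person = row[0]
        clusters_by_panel[person].add(cluster_of(row))
    return {p: sorted(c) for p, c in clusters_by_panel.items() if len(c) > 1}

# Hypothetical toy panel: three people observed in two years; person 2 moves.
rows = [
    (1, "North", 2001), (1, "North", 2002),
    (2, "North", 2001), (2, "South", 2002),
    (3, "South", 2001), (3, "South", 2002),
]

# Cluster by region: only the mover violates nesting.
by_region = panels_spanning_clusters(rows, lambda r: r[1])

# Cluster by region-year: every person observed in two years violates nesting.
by_region_year = panels_spanning_clusters(rows, lambda r: (r[1], r[2]))

print("region clusters, non-nested panels:", by_region)
print("region-year clusters, non-nested panels:", sorted(by_region_year))
```

On this toy data, clustering by region flags only the person who moves, while clustering by region-year flags every person observed in more than one year. That is consistent with the thread: movers are what break nesting for region clusters, and region-year clusters can never contain a whole multi-year panel.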