Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | George Shoukry <gshoukry@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | st: Three Fixed Effects with Millions of Observations |
Date | Wed, 19 Mar 2014 16:06:20 -0500 |
I have a data set with over 10 million observations, and each observation is uniquely identified by three variables (say time, firm, county). I would like to include fixed effects for the three identifying variables, cluster the standard errors at the firm level, and run OLS and Poisson regressions for some variables in the data. I have two questions:

1. Ideally I want to run "reg y x i.firm i.time i.county, vce(cluster firm)", but this takes too long (I am not sure exactly how long, because I stopped it after a while). So far I have been able to get OLS estimates on my computer using the undocumented _regress command with the absorb() option. The county identifier has the largest number of distinct values, so I run something like "_regress y x i.firm i.time, absorb(county)". The problem is that I cannot seem to cluster the errors at the firm level with the _regress command, and I can't find documentation for it. Any ideas on the fastest way in Stata to obtain OLS estimates with clustered errors in this case? Note: I tried some other options, but they also seem to take too long (how long do you leave commands running before you stop them?).

2. Does anyone have experience with the best way to run a fixed-effects Poisson regression with a large dataset and several fixed effects?

Thanks!
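
To make question 1 concrete, here is a minimal sketch on a small simulated data set showing what absorbing the county effects means. Everything in it is an assumption for illustration: the variable names, sizes, and coefficients are made up, the simulated observations are not uniquely identified by (time, firm, county), and the documented areg command stands in for the undocumented _regress call described above. It only checks that the coefficient on x is the same whether the county dummies are estimated or absorbed. It says nothing about run time on 10 million observations or about how the two commands adjust the clustered standard errors.

```stata
* Simulated toy data; all names and sizes are illustrative assumptions.
clear
set seed 12345
set obs 5000
generate firm   = ceil(runiform() * 50)
generate time   = ceil(runiform() * 10)
generate county = ceil(runiform() * 200)
generate x = rnormal()
generate y = 0.5 * x + firm / 50 + time / 10 + county / 200 + rnormal()

* Full dummy-variable specification from the post (slow on large data).
regress y x i.firm i.time i.county, vce(cluster firm)
scalar b_full = _b[x]

* Same model with the county effects absorbed instead of estimated.
areg y x i.firm i.time, absorb(county) vce(cluster firm)
scalar b_absorb = _b[x]

* The two point estimates for x should agree up to numerical precision.
display b_full - b_absorb
```

The reported standard errors from the two commands are worth comparing before relying on either, since the degrees-of-freedom correction may be handled differently.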