Re: st: Efficient way to run regressions with many dummy variables?

From: David Jacobs <[email protected]>
To: [email protected]
Subject: Re: st: Efficient way to run regressions with many dummy variables?
Date: Mon, 27 Apr 2009 16:58:52 -0400

Look up the Stata routine called -areg-.
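
As a rough sketch of what that might look like here, assuming the variable
names from the message quoted below (y, x1-x3, observation1, observation2):
-areg- absorbs a single categorical variable, so one set of dummies goes into
absorb() and the other still has to be created by -xi-. This has not been run
on the data in question.

```stata
* Sketch only: variable names are taken from the quoted message.
* absorb() removes the observation1 effects without creating dummies for them;
* xi still expands i.observation2 into explicit indicator variables.
xi: areg y x1 x2 x3 i.observation2, absorb(observation1)
```

Because the coefficients on the dummies are not of interest, it makes sense to
absorb whichever identifier has more categories. The other set of indicators is
still estimated explicitly, so this cuts the problem roughly in half rather than
removing it.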

Dave Jacobs

At 03:32 PM 4/27/2009, you wrote:
Dear Stata-list,

I am using a data set of 963,966 observations with 26 variables (after
dropping all variables not needed for my estimation). The observations are
dyadic: I have in fact (1400 squared)/2 pairs of observations (divided by 2
because the relationship is nondirectional), so in the regressions I need to
control for 1400*2 dummy variables. I run a regression of the form:
xi: reg y x1 x2 x3 i.observation1 i.observation2
where my dataset consists of dyadic relationships between each
observation1 and each observation2.

The problem I run into is that each regression takes an incredibly long
time (and the server crashes regularly).

In an alternative regression, I use Fafchamps and Gubert's -ngreg-. I run:
xi: ngreg y x1 x2 x3, id(observation1 observation2)
This also takes an incredibly long time.

My question is: Is there a more efficient way to run regressions in Stata
with such an enormous number of dummy variables?

PS: I do not care about the coefficient on the dummies per se.

Thank you very much in advance for your response.


Pauline Grosjean
Ciriacy Wantrup Fellow, Department of Agricultural and Resource Economics
University of California Berkeley
Mobile: 510 384 0141
