Mark,
Thanks for your response. Maybe I should have made clear that the firms do not all have the same number of employees; far from it. If the first-stage regression is run on the 300000 individual-level observations, the largest firms contribute the most observations. When the largest firms "dominate" the first-stage regression in this way, they also dominate the coefficients and therefore the predictions. I don't think it is smart to let the largest firms dominate, but I'm not sure about this.
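One way to see what "dominate" means here, as a minimal sketch on simulated data: when the first stage contains only firm-level variables, running it on one row per employee gives the same coefficients as a firm-level regression frequency-weighted by the number of employees. Every variable name and number below (firm, n_emp, z, profits) is hypothetical and not taken from the real data, and the equivalence is exact only when no individual-level regressors enter the first stage.

```stata
* Simulated firm-level data: all names and numbers are hypothetical
clear
set seed 12345
set obs 500
gen firm    = _n
gen n_emp   = ceil(1000 * runiform())   // unequal firm sizes, 1 to 1000
gen z       = rnormal()                 // firm-level instrument
gen profits = 0.5 * z + rnormal()       // firm-level endogenous variable

* Firm-level first stage, unweighted: every firm counts once
regress profits z
scalar b_unweighted = _b[z]

* Firm-level first stage, weighted by number of employees
regress profits z [fweight = n_emp]
scalar b_weighted = _b[z]

* Individual-level first stage: one row per employee
expand n_emp
regress profits z
scalar b_individual = _b[z]

* b_weighted and b_individual agree up to rounding;
* b_unweighted will generally differ from both
display b_unweighted, b_weighted, b_individual
```

So the choice between the 500-observation and the 300000-observation first stage is, in this simplified setting, a choice between giving every firm equal weight and weighting firms by size.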
What do you think?
Jens
________________________________
From: [email protected] on behalf of Mark Schaffer
Sent: Mon 13-09-2004 12:59
To: [email protected]
Subject: Re: st: matched employer-employee panel data, IV-estimation, first stage: employer level, second stage: employee level
Jens,
Subject: st: matched employer-employee panel data, IV-estimation, first stage: employer level, second stage: employee level
Date sent: Mon, 13 Sep 2004 11:57:08 +0200
From: "Jens Therkelsen" <[email protected]>
To: <[email protected]>
Send reply to: [email protected]
> I have a matched panel with 500 firms (j) and 300000 employees (i) .
>
> I want to do a regression like this
>
> on the left:
> wage(it)
>
> on the right:
> individual variables such as age(it), education(it) and gender(i),
> firm variables such as profits_per_employee(jt), firmsize(jt) and
> fixed_assets_per_employee(jt) and industry dummies(jt).
>
> But profits_per_employee is endogenous in the model, as wages are
> costs. And firm variables (j) are clustered.
>
> If I use "ivreg" or "xtivreg" the first stage regression seems to be
> performed on 300000 observations, and that must be wrong (?).
I don't think this is "wrong", at least not in the sense you suggest.
There are at least two ways to make this point. One is to think of
IV as a one-step estimator and the requirements to make it
consistent. The issue you've raised isn't one that violates these
requirements.
Another way to think about it is to ask what would be wrong with
your first stage regression. I think the answer is that the standard
errors would be too small; but you don't need the first-stage SEs
when you do the second-stage of IV.
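The one-step view can be illustrated with a small sketch on simulated data. It runs the two stages by hand on the individual-level rows and compares the coefficient with the one-step IV estimate. All variable names and numbers are hypothetical, and the sketch uses ivregress 2sls, the current official Stata command, in place of the older ivreg mentioned in the question.

```stata
* Simulated matched data: all names and numbers are hypothetical
clear
set seed 1309
set obs 500
gen firm    = _n
gen z       = rnormal()                        // firm-level instrument
gen u_firm  = rnormal()                        // firm-level unobservable
gen profits = 0.5*z + u_firm + rnormal()       // endogenous via u_firm
gen n_emp   = ceil(100 * runiform())
expand n_emp                                   // one row per employee
gen age     = 20 + floor(40 * runiform())
gen wage    = 1 + 0.3*profits + 0.02*age + u_firm + rnormal()

* Manual two-step: the first stage includes all exogenous regressors
regress profits z age
predict double profits_hat, xb
regress wage profits_hat age
scalar b_manual = _b[profits_hat]

* One-step IV on the same rows
ivregress 2sls wage age (profits = z)
scalar b_iv = _b[profits]

* The two point estimates agree up to rounding
display b_manual, b_iv
```

Only the point estimates match. The standard errors printed by the manual second step are not the correct IV standard errors, which is one reason to prefer the one-step command.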
You do have a problem, though, with the clustering by firm. This
will affect your first stage regression diagnostics (unless you
adjust for clustering by firm or something like that) as well as your
main regression.
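A minimal sketch of adjusting for clustering by firm, again on simulated data with hypothetical variable names and numbers. It uses the current regress and ivregress 2sls syntax with vce(cluster firm); with the older ivreg the corresponding option would be cluster(firm).

```stata
* Simulated matched data: all names and numbers are hypothetical
clear
set seed 2004
set obs 500
gen firm    = _n
gen z       = rnormal()                        // firm-level instrument
gen u_firm  = rnormal()                        // firm-level unobservable
gen profits = 0.5*z + u_firm + rnormal()       // endogenous via u_firm
gen n_emp   = ceil(100 * runiform())
expand n_emp                                   // one row per employee
gen age     = 20 + floor(40 * runiform())
gen wage    = 1 + 0.3*profits + 0.02*age + u_firm + rnormal()

* First-stage diagnostics with firm-clustered standard errors
regress profits z age, vce(cluster firm)
test z                                         // cluster-robust test of the instrument

* Main equation: one-step IV with firm-clustered standard errors
ivregress 2sls wage age (profits = z), vce(cluster firm)
```

Clustering changes the standard errors and test statistics in both regressions but leaves the coefficients themselves unchanged.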
Hope this helps.
--Mark
> Another option is to do the two steps separately:
>
> profits_per_employee = instruments........ on 500 observations and
> save predictions
>
> wage = age education gender predicted_profits_per_employee
> firmsize...,cluster(firm) on 300000 observations
>
> But am I getting it right this way?
>
> Any suggestions?
>
> Thanks for your time!
>
> Jens
>
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
Prof. Mark E. Schaffer
Director
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS UK
44-131-451-3494 direct
44-131-451-3008 fax
44-131-451-3485 CERT administrator
http://www.som.hw.ac.uk/cert