I have a matched panel with 500 firms (j) and 300000 employees (i).
I want to run a regression like this.
On the left-hand side:
wage(it)
On the right-hand side:
individual variables such as age(it), education(it) and gender(i),
firm variables such as profits_per_employee(jt), firmsize(jt) and fixed_assets_per_employee(jt),
and industry dummies (jt).
But profits_per_employee is endogenous in the model, since wages are costs. And the firm variables (j) are clustered.
If I use "ivreg" or "xtivreg", the first-stage regression seems to be performed on 300000 observations, and that must be wrong (?).
Another option is to do the two steps separately:
profits_per_employee = instruments........ on 500 observations, and save the predictions
wage = age education gender predicted_profits_per_employee firmsize..., cluster(firm) on 300000 observations
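
To make the two options concrete, here is a rough sketch on simulated data. Everything in it is an assumption for illustration only: the single instrument z, the coefficients, 20 employees per firm instead of the real panel, and a single cross-section with no time dimension. It uses "ivregress 2sls", the successor to "ivreg" in Stata 10 and later, and shortens profits_per_employee to profits.

```stata
* Hypothetical simulated data: 500 firms, 20 employees each, one period
clear
set seed 12345
set obs 500
gen firm = _n
* z is a placeholder firm-level instrument (not from my real data)
gen z = rnormal()
* u_firm is a firm-level shock that makes profits endogenous
gen u_firm = rnormal()
gen profits = 0.8*z + u_firm + rnormal()
gen firmsize = rnormal()
* Go from one row per firm to one row per employee
expand 20
gen age = 40 + 10*rnormal()
gen wage = 1 + 0.5*profits + 0.02*age + 0.3*firmsize + u_firm + rnormal()

* Option 1: one command, first stage run on all employee rows,
* standard errors clustered on firm
ivregress 2sls wage age firmsize (profits = z), vce(cluster firm) first
scalar b_iv = _b[profits]

* Option 2: first stage on one row per firm, save predictions,
* then an ordinary regression on all employee rows
egen byte onefirm = tag(firm)
regress profits z firmsize if onefirm
* predict fills in the fitted value for every row, not only the tagged ones
predict double profits_hat, xb
regress wage age firmsize profits_hat, vce(cluster firm)
scalar b_manual = _b[profits_hat]

display "one-step: " b_iv "   two-step: " b_manual
```

In option 2 the second regression treats profits_hat as if it were ordinary data, which is part of what I am unsure about.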
But am I getting it right this way?
Any suggestions?
Thanks for your time!
Jens