Dear Stata-Users,
I'm working with a large (40,000 observations), three-dimensional,
unbalanced panel data set, and I have some methodological questions
that I am unsure about. I would really appreciate any suggestions.
Specifically, I'm exploring how firm and analyst characteristics
influence analysts' forecast errors. I have individual analysts'
forecast errors for particular firms in particular years. The data
cover 1,000 firms and 1,700 analysts over 13 years. The number of
observations differs from firm to firm and from year to year (e.g.
1995: 989 obs; 2000: 5,111 obs). The observations for each firm are
not consecutive in time (missing data problem). In addition, I have
a simultaneity problem in my model yijt=f(x1it, x2it, X1jt, x2jt),
where i=1..n indexes firms, j=1..m indexes analysts, t indexes years,
and X1jt is an endogenous variable.
If I had two-dimensional panel data, I would analyse the data in
this way: (1) run a fixed/random effects model; (2) run 2SLS with
fixed effects (2sls, fe) and check whether the results change. In my
case, however, I have the problem of a third dimension.
I've tried to address the issue by creating a new variable, ID, that
identifies each firm-analyst pair (and so combines the firm and
analyst effects), declaring the data with tsset id time, and running
fe and 2sls, fe models. For example:
ID    Firm    Analyst    Year
1     1       1          1998
1     1       1          2000
2     2       1          1997
3     2       2          1999
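In Stata terms, the setup described above is roughly the following.
This is a sketch only: it assumes the data are already in memory, and
ferror, x1_firm, x2_firm, x1_an, x2_an, firm, analyst and year are
placeholder names rather than my actual variables. z stands for an
excluded instrument for the endogenous analyst variable, which I have
not described in this message.

```stata
* Sketch only: all variable names below are placeholders.
* z is a hypothetical excluded instrument for the endogenous x1_an.

* One panel unit per firm-analyst pair
egen id = group(firm analyst)

* Declare the panel; gaps in year are permitted
tsset id year

* Show the participation pattern and observations per pair
xtdes

* Fixed effects on the firm-analyst pair
xtreg ferror x1_firm x2_firm x1_an x2_an, fe

* Fixed-effects 2SLS, instrumenting x1_an with z
xtivreg ferror x1_firm x2_firm x2_an (x1_an = z), fe
```

Pairs observed in only one year contribute nothing to the within
estimator, which is part of why the small group sizes worry me.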
However, I've got some doubts about my assumptions:
(1) Am I right in assuming that by running fixed effects on ID, I
capture both the firm and the analyst effects? And if so, is an
average of 1.7 observations per group (max 7) sufficient for this
purpose? Is this in a way a "kind of" two-way error model that
allows me to run a 2sls, fe (because it has only two dimensions)?
(2) I could run a two-way error component model (xi: reg depvar
i.firm i.analyst indvar), but is there a way of addressing the
simultaneity issue and testing for heteroskedasticity and
autocorrelation in this context?
(3) Would a nested error component model be another option for
exploring my data? And if so, would gllamm be the command that I have
to use? Or maybe you could suggest a better way of exploring my
three-dimensional data.
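For reference, the two-way model in (2) can also be written with
areg, so that one set of effects is absorbed instead of entering as
dummies. This is only a sketch with placeholder variable names
(ferror, x1_firm, x2_firm, x1_an, x2_an, firm, analyst), and it does
nothing about the simultaneity problem:

```stata
* Sketch only: all variable names are placeholders.
* Analyst effects are absorbed; firm effects enter as xi dummies.
* Slope estimates match xi: reg with both sets of dummies.
xi: areg ferror x1_firm x2_firm x1_an x2_an i.firm, absorb(analyst)
```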
Any suggestions would be very much appreciated. Thanks a lot in
advance.
Svetlana