Using panel data, I published a paper in which I claim that universities with a patent policy produce more patents.
Then I submitted a companion paper in which I wanted to show that higher royalty shares for inventors and their departments lead to more patents.
The model was something like:
brev = f (inv dip [...] brev1 size south med time)
where brev is the dependent variable (patent counts), inv and dip are my most important independent variables (the % of revenues shared with the inventor and with his or her department), [...] stands for nine additional dummies (not so important), brev1 is the dependent variable lagged by one year (not so important), and the remaining four variables are exogenous.
In this new paper, I focused on the universities and years in which a patent policy is in place (about 20% of the observations). I therefore described my situation as:
(1) if the patent policy dummy = 0, then inv, dip and the dummies are missing
(2) if the patent policy dummy = 1, then inv, dip and the dummies are nonmissing (in particular, inv > 0). Please note that once a university issues a patent policy, the policy is unlikely to change during the observation period.
The reviewers asked me to account for potential endogeneity (royalties may be determined by some structural characteristics of the university, and perhaps by past patenting activity) and to include all observations. They suggested that my situation could equally be described as having two regimes:
(1) if inv = 0, then dip and the additional nine dummies are all zero
(2) if inv > 0, then dip and the additional nine dummies may be zero or not
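To make the reviewers' coding concrete, here is a minimal sketch of how the two regimes could be set up in Stata. The indicator name policy (1 if a patent policy is in place, 0 otherwise) is hypothetical, and the nine dummies are left out because they are not named above.

```stata
* policy is a hypothetical 0/1 indicator: 1 if a patent policy is in place.
* Regime (1): no policy, so the royalty shares are coded as zero, not missing.
replace inv = 0 if policy == 0
replace dip = 0 if policy == 0
* The nine dummies (not named here) would be set to zero in the same way.

* Check the two regimes: inv is 0 without a policy and strictly positive with one.
assert inv == 0 if policy == 0
assert inv > 0 & !missing(inv) if policy == 1
```

The !missing(inv) condition is needed because Stata treats missing values as larger than any number, so inv > 0 alone would also be true for missing values.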
To me, this sounded like a request for some kind of Heckman model for panel data. Supposing that I am right (i.e. that a Heckman model is the best way to handle my data), and that -xtheckman- does not exist, can you suggest a second-best solution? I guess I could use -heckman- (thus ignoring the panel structure) or -xtivreg brev (inv dip = size south med time)-, in which I could also instrument one or two of the nine dummies but would have to ignore the lagged dependent variable. But maybe you know of something better, or can help me decide between the two.
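For concreteness, the two fallbacks might be written roughly as follows. This is only a sketch under stated assumptions: policy (a 0/1 patent-policy indicator) and univ (the university identifier) are hypothetical names, the nine dummies are omitted, and the choice of selection regressors and of the random-effects estimator is illustrative rather than a tested specification.

```stata
* Option 1: pooled Heckman selection model, ignoring the panel structure.
* policy and univ are hypothetical variable names.
* Using the same exogenous variables in the selection equation is an
* assumption for illustration; with no excluded variable, identification
* rests on functional form alone.
heckman brev inv dip brev1 size south med time, ///
    select(policy = size south med time) vce(cluster univ)

* Option 2: panel IV, treating inv and dip as endogenous and using the four
* exogenous variables as instruments; the lagged dependent variable is dropped.
* Random effects is chosen here only so that time-invariant variables such as
* south are not swept out; that choice is also an assumption.
xtset univ
xtivreg brev (inv dip = size south med time), re
```

Clustering on univ in the first command at least allows for within-university correlation in the standard errors, although it does not model the panel structure itself.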
Many thanks for your suggestions,
Nicola