Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Interpretation of interaction term in log linear (non linear) model
From: Suryadipta Roy <[email protected]>
To: [email protected]
Subject: Re: st: Interpretation of interaction term in log linear (non linear) model
Date: Tue, 18 Jun 2013 18:05:48 -0400
Dear David,
Thank you very much for the comments and the wonderful suggestions!
These almost read like a referee report! I had to take some time to
reply to your comments since the issues that you have raised are
substantive. Theoretical work in the gravity model of trade literature
mainly suggests the importance of structural factors that prevent
countries from trading with each other. Some of the important papers
have used -heckman- selection models but that model is more suitable
to explain why countries export (or do not export), while my research
question focuses on the propensity to import. Moreover, the exclusion
restrictions in the selection equation are still not very well
founded. The papers that have used Poisson models have not reported
the goodness of fit. -poisson- by itself reports a pseudo-R-squared,
which is not comparable to the linear R-squared, while -xtpoisson- and
-xtpqml-, which I have used, do not report any R-squared. The
cluster-robust standard errors address both overdispersion and
serial correlation (Cameron and Trivedi, Microeconometrics Using
Stata, 2010). It is theory here that guides
the use of fixed effects, e.g. I need to incorporate 5638 trading pair
fixed effects (since I have 76 countries with complete data, I can
have a maximum of 76*75 = 5700 trading pair relationships).
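As a quick check on the arithmetic above, here is a minimal Python sketch that counts ordered importer-exporter pairs. It assumes that trading pairs are directional (imports of country i from country j are distinct from imports of j from i), which is what 76*75 implies. The country labels are made-up placeholders, not the actual sample.

```python
from itertools import permutations


def max_directed_pairs(n_countries: int) -> int:
    """Number of ordered (importer, exporter) pairs with importer != exporter."""
    return n_countries * (n_countries - 1)


# Cross-check the closed form by enumerating hypothetical country labels.
countries = [f"C{i:02d}" for i in range(76)]
enumerated = sum(1 for _ in permutations(countries, 2))

max_pairs = max_directed_pairs(76)
assert enumerated == max_pairs

observed_pairs = 5638  # number of pair fixed effects quoted in the message
missing_pairs = max_pairs - observed_pairs

print(max_pairs, missing_pairs)
```

Under these assumptions, the gap between the maximum and the 5638 pairs in the data is the number of directed pairs that never appear in the estimation sample.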
Your suggestions on model building by splitting the data have been
extremely illuminating. However, could you give me somewhat more
concrete suggestions on how to go about it? For example, could you
kindly point me to the paper in which you used the data-splitting
approach, so that I could read more about it?
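To make the question concrete, the 50% / 25% / 25% split described in David's message quoted below might be sketched as follows in Python. This is only an illustration under my own assumptions: whole trading pairs, rather than individual pair-year observations, are assigned at random to a part, so that all years of a pair stay together. The function name, the seed, and the part labels are hypothetical.

```python
import random


def split_pairs(pair_ids, seed=12345, fractions=(0.50, 0.25, 0.25)):
    """Randomly assign each trading pair to 'build', 'tune', or 'validate'.

    Assumption: splitting is done by pair, so a pair's whole time series
    lands in exactly one part.
    """
    ids = sorted(set(pair_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)

    n = len(ids)
    cut1 = int(n * fractions[0])
    cut2 = cut1 + int(n * fractions[1])

    labels = {}
    for i, pid in enumerate(ids):
        if i < cut1:
            labels[pid] = "build"     # no-holds-barred model building
        elif i < cut2:
            labels[pid] = "tune"      # fine-tuning the "final" model
        else:
            labels[pid] = "validate"  # clean estimate of prediction error
    return labels


# Hypothetical pair identifiers: 5638 pairs, as in the message above.
pair_ids = list(range(5638))
assignment = split_pairs(pair_ids)

counts = {}
for part in assignment.values():
    counts[part] = counts.get(part, 0) + 1

print(counts)
```

Whether the split should instead be stratified (for example by region or by the share of zero trade flows) is exactly the kind of detail I would like to read more about.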
Best regards,
Suryadipta.
On Sun, Jun 16, 2013 at 9:13 PM, David Hoaglin <[email protected]> wrote:
> Dear Suryadipta,
>
> Thanks for the further explanation.
>
> If your dependent variable (Trade) is zero in about 15% of the
> observations, I am skeptical that it would be adequate to use a
> fixed-effects Poisson model without explicitly accounting for the
> source of the zeros. If a "zero" is due to non-reporting, shouldn't
> it be a missing value? With such a continuous dependent variable, it
> would be unlikely to observe zero by chance. If a pair of countries
> is not able to trade, that would need to be accounted for as a
> "structural zero," either in a separate part of the model or by
> omitting those observations from the analysis. That huge literature
> may argue in favor of Poisson, but what is the empirical evidence on
> how well the models fit?
>
> I agree that it would be problematic to have 5000 explicit fixed
> effects. What is the source of such a large number of fixed effects?
>
> As a model-building strategy it might be instructive to set aside the
> idea of fixed effects and see what happens when you use random effects
> instead.
>
> Another useful strategy when you have a large amount of data is to
> split the data into parts (usually at random, with appropriate
> stratification if needed), perhaps two halves. Set one of the parts
> aside (out of sight) for use later in validating the final model. Do
> the model-building on the other part. In one project that I worked on
> several years ago, we used 50% of the data for no-holds-barred model
> building, another 25% for fine-tuning the "final" model, and the other
> 25% to get a clean estimate of prediction error.
>
> If the data come from time series, what does the analysis do about
> serial correlation?
>
> Regards,
>
> David Hoaglin
>
> On Fri, Jun 14, 2013 at 9:59 AM, Suryadipta Roy <[email protected]> wrote:
>> Dear David,
>> Thank you for the suggestions! I have cross country time series data
>> (unbalanced panel) where the dependent variable is zero for about 15%
>> of the observations. Many papers have recorded more zeros, e.g. the
>> paper by Santos Silva and Tenreyro that I mentioned in the previous email
>> reports about 50% of zero observations for the dependent variable
>> (Bilateral Import/Export). I started with a fixed effects log-linear
>> model (more traditional in the trade literature) and moved on to fixed
>> effects Poisson (following Bill Gould's Stata Blog suggestions and the
>> Stata meeting presentation by Austin Nichols :
>> http://www.stata.com/meeting/boston10/boston10_nichols.pdf , as well
>> as some other papers in the literature). I have indeed tried Negative
>> Binomial and might report the results in the paper. However, Stata does
>> not have a true fixed effects NB model, since the coefficients of the
>> time-invariant explanatory variables are reported (Paul Allison, "Fixed
>> Effects Regression Models", Sage, 2009; some other issues are
>> discussed in Guimarães, P. (2008), The fixed effects negative
>> binomial model revisited, Economics Letters, 99, pp. 63–66), and the
>> bootstrap standard errors for NB are taking forever to run with my data.
>> Based on the theoretical development in the literature, I must control
>> for fixed effects in my regressions. I have also tried -zip- and
>> -zinb-, but there is no conditional fixed effects version of them in
>> Stata. I did not venture to introduce about 5000 fixed effects in my
>> regressions with -zip- / -zinb-; most likely these would take forever
>> to run (with more than 100,000 observations), would not converge, and
>> would suffer from the incidental parameters problem. I have also
>> looked into hurdle models, but the question is whether the zeros are
>> due to non-reporting of data or whether countries are not able to
>> trade for some other reason. There is a huge literature in this area
>> that has argued in favor of Poisson.
>> Thank you very much for all the comments and helpful suggestions!
>>
>> Best regards,
>> Suryadipta.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/