Re: st: Dependent variable in a tobit model
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Dependent variable in a tobit model
Date: Tue, 16 Nov 2010 08:08:15 -0500
Estrella Gomez <[email protected]> :
How to trick -tobit- is described in my original email, and there is
much more detail in the text referenced. Stata's -tobit- drops
observations with a missing dependent variable, of course, which is
why some people would have you replace the log of zero (missing) with
a negative number smaller than any observed value of ln(y). Note that
the answer you get will depend on the number you choose, and in most
cases there is no clear justification for any particular number.
Others would have you use -heckman-, where a probit of y>0 on the
regressors X is estimated jointly with OLS of ln(y) on the same X,
which strikes me as an even worse idea.
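A minimal sketch of that dependence, on simulated data rather than trade data. The sample size, the data-generating process, and the fill values -5, -10, and -20 are all arbitrary assumptions chosen for illustration; each fill value lies below the smallest observed ln(y), which is ln(.01) here.

```stata
* simulated data: y is recorded to the nearest .01, so small values become 0
clear
set seed 1
set obs 1000
generate x = rnormal()
generate y = round(exp(x - 5 + rnormal()), .01)
* ln(0) is missing in Stata, so lny is missing wherever y == 0
generate lny = ln(y)
* refit -tobit- with three different stand-ins for ln(0)
foreach c in -5 -10 -20 {
    generate ly = cond(missing(lny), `c', lny)
    quietly tobit ly x, ll(`c')
    display "fill value `c':  coefficient on x = " %9.4f _b[x]
    drop ly
}
```

If the displayed coefficients differ across fill values, that is the sensitivity described above: the censoring point enters the likelihood directly, so an arbitrary choice feeds through to the estimates.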
The graph is drawn with -twoway kdensity-, but see also -kdens- on SSC.
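A graph of that kind could be produced along the following lines. This is only a sketch and not the code behind the slides: the program name simest, the sample size, the number of replications, and the choice of -tobit- versus -poisson- as the two estimators are assumptions made for illustration.

```stata
* hypothetical helper: draw one sample, return two estimates of the slope on x
capture program drop simest
program define simest, rclass
    drop _all
    set obs 200
    generate x = rnormal()
    generate y = round(exp(x - 5 + rnormal()), .01)
    generate ly = max(ln(y), ln(.005))
    tobit ly x, ll(`=ln(.005)')
    return scalar b_tobit = _b[x]
    poisson y x, vce(robust)
    return scalar b_pois = _b[x]
end

* collect the estimates over repeated samples
simulate b_tobit=r(b_tobit) b_pois=r(b_pois), reps(200) seed(1) nodots: simest

* overlay the kernel densities; the vertical line marks the slope of 1
* on x used in exp(x - 5 + e) when generating the data
twoway (kdensity b_tobit) (kdensity b_pois), xline(1) ///
    legend(order(1 "tobit" 2 "poisson")) xtitle("estimated coefficient on x")
```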
The one case where -tobit- is a fairly good solution is where the log
link is appropriate but the depvar is rounded, in which case there is
a justification for replacing the log of zero with a particular
number, namely the log of the largest number that can be rounded to
zero. I say "fairly" good because -intreg- is the correct solution for
a rounded depvar with a homoskedastic normal error, so -tobit- is
still not right, but it provides a very good approximation to
-intreg-. Neither works in the heteroskedastic or non-normal case.
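* simulate a lognormal outcome recorded to the nearest .01
drawnorm x e, n(100) clear seed(1)
g y=round(exp(x-5+e),.01)
* ln(0) is missing, so lny is missing wherever y rounds to zero
g lny=ln(y)
* replace the missings with ln(.005), the log of the rounding threshold
g ly=max(lny,ln(.005))
tobit ly x, ll(`=ln(.005)')
* selection model on the same regressor, for comparison
heckman lny x, sel(x)
* interval bounds; y1 is missing when y==0, i.e. left-censored at y2
g y2=ln(y+.005)
g y1=ln(y-.005)
intreg y1 y2 x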
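To see how close the -tobit- approximation is to -intreg- on the same simulated data, the two fits can be stored and tabulated side by side. This is a sketch only; the stored-estimate names m_tobit and m_intreg are arbitrary.

```stata
* same simulated rounded outcome as in the example above
drawnorm x e, n(100) clear seed(1)
generate y = round(exp(x-5+e), .01)

* tobit with ln(0) replaced by the log of the rounding threshold
generate ly = max(ln(y), ln(.005))
quietly tobit ly x, ll(`=ln(.005)')
estimates store m_tobit

* interval regression on the bounds implied by rounding to .01
generate y1 = ln(y-.005)
generate y2 = ln(y+.005)
quietly intreg y1 y2 x
estimates store m_intreg

* coefficients and standard errors from both fits in one table
estimates table m_tobit m_intreg, b se
```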
On Tue, Nov 16, 2010 at 4:00 AM, Estrella Gomez <[email protected]> wrote:
> Dear Austin
>
> The thing is that I am comparing estimation methods to study the
> properties of each of these methods, that's why I am interested in
> estimating a Tobit model, as well as Poisson and others. But I am not
> sure if Stata automatically drops out the observations of the
> dependent variable that are missings (those are the observations that
> are zeros in the original variable in levels) and hence, I should
> specify instead an equation with the dependent variable in levels
> instead of logs. Is that correct?
>
> Thank you very much for your advice and your ppt. Could I ask you
> which command you used to plot the graph in which you compare the
> estimators?
>
> Best regards,
> Estrella
>
> 2010/11/15 Austin Nichols <[email protected]>:
>> Estrella Gomez <[email protected]> :
>> Why not use -poisson- or -glm- with cluster-robust standard errors
>> (and fixed effects if you like)? Cameron and Trivedi would have you
>> "trick" Stata's -tobit- command using a negative y in place of
>> ln(zero), but I argue in
>> http://repec.org/bost10/nichols_boston2010.pdf
>> that tricking Stata this way is a bad idea; see also
>> Santos Silva and Tenreyro. 2006. "The Log of Gravity." Review of
>> Economics and Statistics, 88(4): 641-658.
>> Cameron and Trivedi. 2009. Microeconometrics Using Stata. Stata Press,
>> College Station TX.
>>
>> Note that a lot of zeros is not a problem in general, since those
>> observations may also have very low predicted values conditional on
>> observables.
>>
>> On Mon, Nov 15, 2010 at 5:19 AM, Estrella Gomez <[email protected]> wrote:
>>> Dear Statalist:
>>>
>>> I am estimating a gravity equation of exports and I have decided to
>>> use tobit model, since the number of zeros in the dataset is high.
>>> This model is usually estimated in its loglinear version. However, if
>>> I use the logarithm of trade as dependent variable, the zeros are
>>> dropped (since the logarithm of zero is undefined).
>>>
>>> This is the command I have introduced in Stata 11:
>>>
>>> xttobit lxi lGDP_i lGDP_j contig comla smctry ldist RTAboth i.exporter
>>> i.importer i.year, ll(0)
>>>
>>> where "lxi" is the logarithm of exports and the independent variables
>>> are the standard variables in a gravity equation
>>>
>>> Of course, the number of censored observations is zero, since the
>>> dependent variable is introduced in logs. I think this is not correct,
>>> but I have seen this specification in most of the gravity-related
>>> articles
>>>
>>> Could somebody help me with this?
>>>
>>> In addition, I do not know if it is correct to introduce exporter,
>>> importer and time effects in the model
>>>
>>> Thank you in advance,
>>> Estrella Gómez
>>