In that case, selection on a regressor generally costs you nothing
worse than a loss of efficiency. If you believe that the same model
applies to big firms and small firms alike, you can estimate it on
the existing sample without worry. Nor does it matter whether the
dependent variable is discrete or continuous. Am I totally off on
this? That's what Rose wrote, too.
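
To make the point concrete, here is a small simulation sketch in Python
(standard library only). It uses a linear model with one regressor
rather than a multinomial logit, and the coefficients, cutoffs and
sample size are all made up for illustration. It compares the slope
estimated on the full sample, on a sample selected on the regressor,
and, as a contrast, on a sample selected on the outcome.

```python
import random

random.seed(12345)  # fixed seed so the run is reproducible


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


# Hypothetical data-generating process: y = 1 + 2*x + noise.
TRUE_SLOPE = 2.0
n = 20000
x = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + TRUE_SLOPE * xi + random.gauss(0, 1) for xi in x]

# Selection on the regressor: keep only "big" x (cutoff is arbitrary).
sel_x = [(xi, yi) for xi, yi in zip(x, y) if xi > 0.5]
# Selection on the outcome, for contrast: keep only large y.
sel_y = [(xi, yi) for xi, yi in zip(x, y) if yi > 2.0]

slope_full = ols_slope(x, y)
slope_sel_x = ols_slope(*zip(*sel_x))
slope_sel_y = ols_slope(*zip(*sel_y))

print("full sample:        ", round(slope_full, 3))
print("selected on x:      ", round(slope_sel_x, 3))
print("selected on y:      ", round(slope_sel_y, 3))
```

Under these assumptions the first two slopes should both land near 2,
with the selected-on-x estimate somewhat noisier, while the
selected-on-y slope should be visibly attenuated. The last case is the
one sample selection models are meant for.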
One other thing you might try, if you badly want results for the
other types of firms, is to reweight your data, giving higher weights
to underrepresented firms. But you also said that you have the
population of firms, with some firms missing information on certain
variables, right? In the linear regression context, you could estimate
the overall covariance matrix of (y,X) piece by piece, using the long
lists of firms for the parts of the matrix whose variables are
observed for everyone and the short lists of big firms for the
remaining parts, and then recover your regression parameters as
b=(X'X)^{-1} X'y, where X'X is the lower right block of the overall
covariance matrix and X'y is the lower left block (sub-column). The
standard errors get out of control, of course. I don't know of any
smart way to generalize that to the multinomial logit.
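
A rough Python sketch of the covariance-block idea for the linear
case, standard library only. Everything in it is an assumption made
for illustration: two regressors x1 and x2, made-up coefficients, and
x2 observed for a random half of the firms (missing completely at
random, which may well not hold when the short list consists of big
firms only). Each element of the covariance matrix is estimated from
whichever list has the variables it needs, and the slopes are then
solved from the X block and the (X,y) sub-column.

```python
import random

random.seed(2006)  # fixed seed so the run is reproducible


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)


# Hypothetical model: y = 1 + 2*x1 - 1*x2 + noise, with x1 and x2 correlated.
n = 50000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [1.0 + 2.0 * a - 1.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# "Short list": firms for which x2 is observed (assumed random here).
short = [i for i in range(n) if random.random() < 0.5]
x1_s = [x1[i] for i in short]
x2_s = [x2[i] for i in short]
y_s = [y[i] for i in short]

# Elements that need only y and x1 come from the long list ...
s11 = cov(x1, x1)
s1y = cov(x1, y)
# ... elements involving x2 come from the short list.
s22 = cov(x2_s, x2_s)
s12 = cov(x1_s, x2_s)
s2y = cov(x2_s, y_s)

# Solve the 2x2 system [[s11, s12], [s12, s22]] b = [s1y, s2y] by hand.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

print("b1:", round(b1, 3), " b2:", round(b2, 3))
```

With these made-up settings the slopes should come out near 2 and -1.
The sketch says nothing about standard errors, which is where this
approach gets ugly, and a matrix assembled from different subsamples is
not guaranteed to be positive definite in small samples.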
On 1/24/06, R.E. De Hoyos <[email protected]> wrote:
> Andre's case is different. He observes the complete distribution of
> the dependent variable but has truncated independent variables
> (firm data). Moreover his dependent variable is a discrete one.
>
> Rafa
>
> ----- Original Message -----
> From: "Rose Medeiros" <[email protected]>
> To: <[email protected]>
> Sent: Saturday, January 21, 2006 6:35 PM
> Subject: Re: st: Re: sample selection in multinomial logit
>
>
> > Rafa (and Andre),
> > I am not sure that the approach Rafa recommended is correct. If I recall
> > correctly, in sample selection models, what is being adjusted for is
> > missing values in the dependent variable. The case here appears to be one
> > in which the independent variables are missing. If you don't have any
> > information on the firms that aren't limited liability companies, then you
> > may (I say may in deference to anyone who knows otherwise) be limited to
> > running your analysis on only limited liability companies, since they are
> > the population on which you have data. If you have some data, for example,
> > the type of company they are for the other firms, then you could examine
> > the relationship between your outcome and type of company (or whatever
> > other data you have).
> > HTH,
> > Rose
> >
> >>
> >> ----- Original Message ----- From: "André Paul" <[email protected]>
> >> To: <[email protected]>
> >> Sent: Saturday, January 21, 2006 5:24 PM
> >> Subject: st: sample selection in multinomial logit
> >>
> >>
> >>> Dear all
> >>>
> >>> I would like to estimate the effect of firm size (and some other
> >>> variables) on the outcome of a credit application (accepted, refused,
> >>> withdrawn by the firm, still under examination), using a multinomial
> >>> logit model.
> >>> I can observe the outcome for all firms (the data are exhaustive for a
> >>> certain population), but I have only the firm data (i.e. the explanatory
> >>> variables) for limited liability companies, which, I guess, excludes a
> >>> lot of small firms.
> >>> Could someone advise me which statistical method would be appropriate
> >>> in this case and how to handle it with Stata?
> >>>
> >>> Thanks a lot,
> >>> Andre
> >>>
--
Stas Kolenikov
http://stas.kolenikov.name