Dear All,
I sent a question yesterday about trying to infer the population
regression parameter relating parents' and their offspring's years of
education from a sample of teenage children only (see below). Maarten Buis
sent an interesting suggestion about estimating logits for transitions
between levels. However, I want to find a single "b" parameter as in:
EduChild=a+b*EduParent
and not one parameter for each transition. I thought about some form
of tobit that would allow different censoring values for different
observations. The problem with the equation above is that some of
the children will have completed exactly 12 years of education, say,
while others have 12 years so far but will carry on with their
education, and only the latter are censored.
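A minimal sketch of what such a model might look like, assuming Stata's
-intreg- is a suitable stand-in for a tobit with observation-specific
right-censoring points. The data are simulated and every variable name
(eduparent, edustar, cpoint, inschool, eduobs, edu_lo, edu_hi) is
hypothetical; education is treated as continuous purely for illustration.

```stata
*------------begin sketch--------------
clear
set seed 12345
set obs 2000
* hypothetical simulated data, not from any real survey
gen eduparent = 6 + floor(11*runiform())           // parent's years, 6 to 16
gen edustar   = 4 + 0.6*eduparent + rnormal(0, 2)  // child's final years (unobserved for some)
gen cpoint    = 9 + floor(4*runiform())            // years reached at interview, 9 to 12
gen inschool  = edustar > cpoint                   // still studying, so right-censored
gen eduobs    = cond(inschool, cpoint, edustar)    // what the survey records
* -intreg- takes two dependent variables: equal bounds for an exact value,
* missing upper bound for a right-censored value
gen edu_lo = eduobs
gen edu_hi = eduobs if !inschool
regress eduobs eduparent         // ignores the censoring
intreg edu_lo edu_hi eduparent   // censoring point differs by child
*--------------end sketch--------------
```

In a setup like this the -regress- slope would typically be pulled towards
zero, while -intreg- targets the b of the latent equation, provided the
normality assumption behind the tobit-type likelihood holds.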
Suggestions welcome.
Many thanks
A. Carrizo
PS: I changed my email address!
----------------------------------
Dear All,
I would like to estimate the relationship between parents' and
children's years of education:
EduChild=a+b*EduParent
I only have data for children up to 19 years old, and some of them will
go on studying, so the estimate of b will be biased (at least it will
be if I want to infer the population's beta from this subset of the
population). The dependent variable is thus "censored" for some
observations: those who I know are still at school when they are
observed.
Is there a way to estimate some form of Tobit model in Stata, where I
can specify different censoring points for each observation and obtain
the parameter b? Is it more convenient to attempt the estimation of b
by OLS?
Thanks in advance
AC
----------------------------------------
From: Maarten buis <[email protected]>
Phuong:
The easiest way to deal with this problem is to estimate what is known
in the social stratification literature as a Mare model (Mare 1980 and
Mare 1981). Say the educational system you study has four levels, and
everybody has to finish all lower levels in order to obtain a certain
level. Then knowing someone's highest achieved level of education also
implies knowing all the transitions that person must have passed in
order to get there. So a person who has finished level two must have
passed the first and second transitions. The Mare model models the
probability of passing a transition. You can estimate one by making
three dummy variables: one that equals one if the person passed the
first transition and zero if he/she failed; one that equals one if the
person passed the second transition, zero if he/she failed, and missing
if he/she failed the first transition; and one that equals one if the
person passed the third transition, zero if he/she failed, and missing
if he/she failed either the first or second transition. Estimate a
separate -logit- or -probit- on each variable. See the example below.
The big advantage for you is that it deals with right censoring in a
quite natural way: censored cases can be treated like any other, as
long as you know the highest achieved level of education at the time of
the interview. The disadvantage is that you no longer get one effect
for each explanatory variable, but as many effects as there are
transitions.
HTH,
Maarten
Mare, Robert D. 1980. "Social Background and School Continuation
Decisions." Journal of the American Statistical Association 75(370),
pp. 295-305.
Mare, Robert D. 1981. "Change and Stability in Educational
Stratification." American Sociological Review 46(1), pp. 72-87.
*------------begin example--------------
sysuse nlsw88, clear
/*preliminary data prep*/
tab grade
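* Stata treats missing as larger than any number, so exclude
* missing values explicitly in every >= comparison
gen ed = grade>=12 if !missing(grade)
replace ed = 2 if grade >=13 & grade <16
replace ed = 3 if grade >=16 & !missing(grade)
tab ed
tab race
gen white = race == 1
/*generate transition dummies*/
* each dummy is missing for those who never reached that transition
gen ed01 = ed>=1 if !missing(ed)
gen ed12 = ed>=2 if ed>=1 & !missing(ed)
gen ed23 = ed>=3 if ed>=2 & !missing(ed)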
/*estimate the Mare model*/
logit ed01 white south
logit ed12 white south
logit ed23 white south
*--------------end example--------------
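If one effect per explanatory variable is wanted, a possible variant is
to stack the transitions and fit a single -logit- with
transition-specific intercepts. This is only a sketch: it assumes the
effects really are equal across transitions, the variable names trans
and pass are made up for illustration, and the resulting coefficient is
on the log-odds scale, not the "b" of a linear years-of-education
equation.

```stata
*------------begin sketch--------------
sysuse nlsw88, clear
* same four levels as above: 0 = <12, 1 = 12, 2 = 13-15, 3 = 16+
gen ed = (grade>=12) + (grade>=13) + (grade>=16) if !missing(grade)
gen white = race == 1
* one row per person per transition (1, 2, 3)
expand 3
bysort idcode: gen trans = _n
* passed transition t if ed >= t; at risk only if ed >= t-1
gen pass = ed >= trans if ed >= trans - 1 & !missing(ed)
* transition-specific intercepts, slopes constrained to be equal;
* standard errors clustered because each person contributes up to 3 rows
logit pass i.trans white south, vce(cluster idcode)
*--------------end sketch--------------
```

Interacting i.trans with white and south would relax the constraint and
give the same point estimates as the three separate logits.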
--- Phuong Lan Nguyen <[email protected]> wrote:
I am working on a years-of-schooling variable for all individuals who are
in school or have already completed their education. Since I plan to run
an ordered probit regression, I guess I need a special command for the
censored values in the ordered probit regression. Has anyone run this
before? Please give me advice on how to set up the dependent variable and
the ordered probit regression.