Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: panel data analysis advice
From: Robert Paul <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: panel data analysis advice
Date: Mon, 10 Mar 2014 14:09:49 -0700 (PDT)
Dear Statalist,
I have demographic and treatment information for patients with a chronic disease (N=60,000). I have permission to link a subset of these patients (18.5%) to income data. For this subset I have 20 years of panel data.
In long format, the data look like this:
id   year   income   age   …
1    1990   100      45
1    1991   110      45
1    1992   125      45
1    1993   132      45
.
.
.
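Before any xt command will run, the data have to be declared as a panel. A minimal sketch, using only the four example rows shown above as toy data (the real file is of course much larger):

```stata
* Toy long-format panel copied from the rows shown above
clear
input id year income age
1 1990 100 45
1 1991 110 45
1 1992 125 45
1 1993 132 45
end

* Declare the panel structure: id is the panel variable, year the time variable
xtset id year

* Summarize the participation pattern (gaps, number of years per id)
xtdescribe
```

xtdescribe is a quick way to see how balanced the linked subset is before fitting any model.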
My aims are:
a- to estimate the effect of demographic characteristics, treatment, and being a chronic disease patient on patients’ income; and
b- to evaluate differences in income between patients and the general population (once the data are linked to a control population).
To address these aims, I plan:
a- to start by running fixed-effects and random-effects models, then run a Hausman test …
b- to obtain a control group from the general population without chronic disease, matched on demographic variables. For this analysis I plan to use the Hausman-Taylor estimator, which uses variables already in the model as instruments and provides a parameter estimate for the time-invariant variable (the major variable of interest: chronic disease patient or not).
Dependent variable: log equivalized income
RHS variables: age at end of follow-up, age^2, age at diagnosis, treatment type
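The two derived variables might be built as follows. This is only a sketch: it assumes the income variable is already equivalized and strictly positive (ln() of zero or a negative value yields missing), and it reuses the toy rows from the layout above.

```stata
* Toy data; income is assumed to be already equivalized and > 0
clear
input id year income age
1 1990 100 45
1 1991 110 45
1 1992 125 45
1 1993 132 45
end

* Log of income (missing wherever income <= 0)
generate logincome = ln(income)

* Squared age term for the quadratic age profile
generate age_square = age^2
```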
1. xtreg logincome age age_square age_at_diagnosis treatment_type_dummies ..., fe
2. xtreg logincome age age_square age_at_diagnosis treatment_type_dummies ..., re
3. xtreg logincome age age_square age_at_diagnosis treatment_type_dummies ..., re vce(robust) or
4. xtreg logincome age age_square age_at_diagnosis treatment_type_dummies ..., re vce(cluster id)
The aim of using vce(robust) or vce(cluster id) is to obtain a consistent VCE estimator when the disturbances are not identically distributed across panels.
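Steps 1 to 4 and the Hausman test could be sketched roughly as below. Everything here is simulated: the data, the coefficient values, and the variable on_treatment (a made-up time-varying treatment dummy) are assumptions for illustration, and age is assumed to change from year to year. Two points worth keeping in mind: the fe estimator cannot identify coefficients on regressors that do not vary within a patient, and hausman has to be run on the default-VCE fits, because it does not accept estimates obtained with vce(robust) or vce(cluster).

```stata
* Simulated panel; all names and values are hypothetical
clear
set seed 2014
set obs 200                                // 200 simulated patients
generate id = _n
generate u = rnormal()                     // unobserved patient-level effect
generate start_age = 30 + floor(30*runiform())
expand 10                                  // 10 yearly records per patient
bysort id: generate year = 1989 + _n       // years 1990 to 1999
generate age = start_age + year - 1990     // assumed to vary over time
generate age_square = age^2
generate on_treatment = runiform() < 0.4   // hypothetical time-varying dummy
generate logincome = 4 + 0.05*age - 0.0005*age_square ///
    - 0.1*on_treatment + u + rnormal(0, 0.3)
xtset id year

* Fixed effects, then random effects, storing each set of estimates
xtreg logincome age age_square on_treatment, fe
estimates store fe_est
xtreg logincome age age_square on_treatment, re
estimates store re_est

* Hausman test: consistent estimator (FE) first, efficient estimator (RE) second
hausman fe_est re_est

* RE again, now with standard errors clustered on the panel identifier
xtreg logincome age age_square on_treatment, re vce(cluster id)
```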
5. Hausman-Taylor estimation:
xthtaylor logincome age age_square age_at_diagnosis treatment_type_dummies, endog(age treatment_type_dummies)
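A rough sketch of how a Hausman-Taylor fit with a time-invariant patient indicator might look once the control group is linked. The data are simulated, and the variable names (patient, female, on_treatment) and the choice of which variables go into endog() are assumptions made purely for illustration, not a recommendation for the real data. xthtaylor works out for itself which regressors are time-invariant, and it needs at least as many exogenous time-varying regressors as endogenous time-invariant ones.

```stata
* Simulated patients and controls; all names and values are hypothetical
clear
set seed 2014
set obs 300
generate id = _n
generate u = rnormal()                        // unobserved individual effect
generate start_age = 30 + floor(30*runiform())
generate female = runiform() < 0.5            // time-invariant, assumed exogenous
generate patient = runiform() < start_age/80  // time-invariant, assumed endogenous
expand 10                                     // 10 yearly records per person
bysort id: generate year = 1989 + _n
generate age = start_age + year - 1990        // time-varying, assumed exogenous
generate age_square = age^2
generate on_treatment = patient * (runiform() < 0.5)  // only patients are treated
generate logincome = 4 + 0.05*age - 0.0005*age_square - 0.1*on_treatment ///
    - 0.2*patient + 0.1*female + u + rnormal(0, 0.3)
xtset id year

* Variables listed in endog() are treated as correlated with the individual effect
xthtaylor logincome age age_square on_treatment female patient, ///
    endog(on_treatment patient)
```

The coefficient on patient is the time-invariant effect that a fixed-effects model would not be able to estimate.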
As I am new to panel data analysis, my question is whether this is the right way to address my aims.
1. Do I need to calculate weights because I am using a subset of the population? If so, how do I do that?
2. I am not sure, but would dynamic models be more appropriate?
3. I need advice on my analysis procedure. This is of critical importance for my project, and I would appreciate your comments.
Thanks