Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Robert Paul <pt162591@yahoo.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: st: panel data analysis advice
Date: Mon, 10 Mar 2014 14:09:49 -0700 (PDT)
Dear Statalist,

I have demographic and treatment information for patients with a chronic disease (N=60,000). I got permission to link a subset of my data (18.5%) to income data. For this subset I have 20 years of panel data. The data in long format look like this:

Id   year   income   age   …
1    1990   100      45
1    1991   110      45
1    1992   125      45
1    1993   132      45
. . .

My aims are
a- to estimate the effect of demographic characteristics, treatment, and being a chronic disease patient on the patient’s income; and
b- to evaluate differences in income between patients and the general population (when linked to a control population).

To address these issues I plan
a- to start with fixed-effects and random-effects models and then run a Hausman test …
b- to obtain a control group for my data (from the general population without chronic disease, matched on demographic variables). For this I plan to use the Hausman-Taylor estimator, which uses the variables in the model as instruments and provides a parameter estimate for a time-invariant variable (the major variable of interest: chronic disease patient or not).

Dependent variable: log equivalized income
RHS variables: age at end of follow-up, age^2, age at diagnosis, treatment type

1. Run xtreg logincome age age_square age at diagnosis treatment type dummies . . , fe
2. xtreg logincome age age_square age at diagnosis treatment type dummies . . . . , re
3. xtreg logincome age age_square age at diagnosis treatment type dummies . . . , re vce(robust)
or
4. xtreg logincome age age_square age at diagnosis treatment type dummies . . . , re vce(cluster id)

The aim of using vce(robust) or vce(cluster id) is to obtain a consistent VCE estimator when the disturbances are not identically distributed across the panels.

5. ** Hausman Taylor estimation
. xthtaylor logincome age age_square age at diagnosis treatment_type dummies, endog(age treatment type dummies)

As I am new to panel data analysis, my question is whether this is the right way to address my aims.

1. Do I need to calculate weights because I am using a subset of the population? If yes, how do I do that?
2. I am not sure, but perhaps dynamic models would be more appropriate.
3. I need advice on my analysis procedure.

This is of critical importance for my project. I appreciate your valuable comments.

Thanks

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
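
The five estimation steps listed in the message could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual analysis: it uses Stata's example panel psidextract (loaded with webuse, which needs an internet connection) as a stand-in for the linked income data, so lwage plays the role of logincome, exp and exp2 stand in for age and age^2, and fem, blk and ed stand in for time-invariant variables such as patient status or age at diagnosis. The choice of variables in endog() is copied from that dataset's usual textbook specification and is an assumption here too.

```stata
* Run as a do-file so the local macros stay defined across commands.
* Stand-in data: Stata's psidextract example panel, NOT the poster's data.
webuse psidextract, clear
xtset id t

* Time-varying regressors (stand-ins for age, age^2, treatment dummies)
local tv exp exp2 wks occ ind south smsa ms union
* Time-invariant regressors (stand-ins for patient status, age at diagnosis)
local ti fem blk ed

* Steps 1-2: fixed effects and random effects, then the Hausman test.
* Time-invariant regressors are left out of the FE model because the
* within transformation cannot identify their coefficients.
xtreg lwage `tv', fe
estimates store fe
xtreg lwage `tv' `ti', re
estimates store re
hausman fe re, sigmamore

* Steps 3-4: random effects with standard errors clustered on the panel id
xtreg lwage `tv' `ti', re vce(cluster id)

* Step 5: Hausman-Taylor. endog() lists the regressors that are allowed
* to be correlated with the individual-level effect.
xthtaylor lwage `tv' `ti', endog(exp exp2 wks ms union ed)
```

Two points about this sketch. The Hausman test is run on the models with conventional standard errors, since hausman is not designed for estimates stored with a robust or clustered VCE. And steps 3 and 4 appear as a single command because, for xtreg, vce(robust) and vce(cluster id) should give the same standard errors when id is the panel variable.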