Many thanks for this, Austin. I tried the statsby command and got the slope and intercept for each person.
While CD4 data is generally transformed in part to linearize the CD4-time relationship, a statistician who
worked with this dataset extensively told me that transformation is not needed in this data, given that
the transformed scale fits no better than the original scale. I would have loved to consult with this
statistician directly, but he unfortunately passed away.
I was thinking of doing individual trends because (1) I've seen another paper on my subject use this approach
and (2) I was unsure of the best way to proceed, which is why I was asking here whether that approach was faulty.
I was considering the slope comparison in HC users vs. non-users as a first look, and then hopefully finding
a more sophisticated strategy in which I could control for potential confounders. It seems from other responses
like GLLAMM might be the way to do that, so now I'm just trying to educate myself on how to go about it. My
data structure is (I think) a bit unusual, since subjects have measurements taken at different times since the
origin as I have defined it, and different numbers of measurements, so I wasn't quite sure how that should affect
the approach that I use.
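
A minimal sketch of the first-look slope comparison described above, assuming the data are in long format (one row per CD4 measurement). The variable names id, cd4, days (days since seroconversion), and hc (1 = hormonal contraception user, 0 = non-user) are hypothetical placeholders, not the actual names in this dataset:

```stata
* Hypothetical variable names: id, cd4, days, hc (0/1, constant within woman).
* Data are assumed to be in long format: one row per CD4 measurement.

* Fit a separate OLS line per woman and keep one row per woman
* holding her slope and intercept (hc is carried along via by()).
statsby slope=_b[days] intercept=_b[_cons], by(id hc) clear: regress cd4 days

* Compare mean slopes between users and non-users, allowing unequal variances.
ttest slope, by(hc) unequal
```

Unequal measurement times and differing numbers of measurements do not prevent the per-woman regressions from running, but a woman with only two measurements gets a slope with no residual degrees of freedom, and the t-test weights every slope equally regardless of how many points it rests on.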
I really appreciate all of the advice!
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Austin Nichols
Sent: Sunday, August 16, 2009 6:03 PM
To: [email protected]
Subject: Re: st: Thinking through best way to do a longitudinal analysis
Chelsea <[email protected]> :
You can use -statsby- or just calculate directly using a loop over
individuals with -foreach- or using time series operators and the
formula for regression. But why do you think a linear trend is
appropriate? Or are you planning to transform the depvar first? More
importantly, why do you want the individual trends? Start with
-xtmixed- regressions of the depvar on a spline in time interacted
with the dummy for contraception. But contraception is not
exogenous--do you have some strategy for causal inference? I.e. how
do you know any observed difference is the result of contraception
versus some other characteristic associated with contraception?
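
One way the -xtmixed- suggestion above might look in practice is sketched below. This is only an illustration under stated assumptions: the variable names (id, cd4, days, hc) are hypothetical, the single knot at 365 days is arbitrary, and the interactions are built by hand so the code does not depend on factor-variable notation:

```stata
* Hypothetical variable names: id, cd4, days (since seroconversion), hc (0/1).
* The knot at 365 days is an arbitrary illustration, not a recommendation.
mkspline days1 365 days2 = days

* Interact each linear spline segment with the contraception dummy by hand.
generate hc_days1 = hc * days1
generate hc_days2 = hc * days2

* Random intercept and random slope (first segment) for each woman.
xtmixed cd4 hc days1 days2 hc_days1 hc_days2 || id: days1, covariance(unstructured)

* Joint test that the time trend differs by contraception status.
test hc_days1 hc_days2
```

A model of this form uses every available measurement, so it accommodates differing numbers of measurements and irregular measurement times without any rebalancing. It does nothing, however, to address the exogeneity concern raised above.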
On Sun, Aug 16, 2009 at 4:23 PM, Polis, Chelsea B.<[email protected]> wrote:
> Thanks so much to both of you for the suggestions and all of these great references! I will download all of these articles and read them. In the meantime, do you think calculating individual slopes is a bad idea? If not, do you know how I might go about doing this in STATA?
>
> Thank you!
> Chelsea
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Cameron McIntosh
> Sent: Friday, August 14, 2009 10:36 PM
> To: STATA LIST
> Subject: RE: st: Thinking through best way to do a longitudinal analysis
>
> Chelsea,
>
> I'll try and point you in what I think might be a good direction. You may want to read up on growth mixture modeling:
>
> Wang, M., & Bodner, T.E. (2007). Growth Mixture Modeling: Identifying and Predicting Unobserved Subpopulations With Longitudinal Data. Organizational Research Methods, 10(4), 635-656.
>
> Dolan, C.V., Schmittmann, V.D., Lubke, G.H., & Neale, M.C. (2005). Regime switching in the latent growth curve mixture model. Structural Equation Modeling, 12, 94-119.
> http://users.fmg.uva.nl/cdolan/semmix.pdf
>
> Muthén, B.O. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D.Kaplan (Ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage.
>
> In this case you "know" your "latent classes" (hormonal contraception or not), but that's no problem. The varying assessment times and missing data could be a bit tricky but there is some literature out there:
>
> Blozis, S.A., & Cho, Y.I. (2008). Coding and centering of time in latent curve models in the presence of interindividual time heterogeneity. Structural Equation Modeling, 15(3), 413-433.
>
> Biesanz, J.C., Deeb-Sossa, N., Papadakis, A.A., Bollen, K.A., & Curran, P.J. (2004). The role of coding time in estimating and interpreting growth curve models. Psychological Methods, 9(1), 30-52.
> http://www.unc.edu/~curran/pdfs/Biesanz,Deeb-Sossa,Papadakis,Bollen&Curran(2004).pdf
>
> Duncan, S.C., & Duncan, T.E. (1994). Modeling incomplete longitudinal substance use data using latent variable growth curve methodology. Multivariate Behavioral Research, 29(4), 313-338.
>
> -gllamm- might be able to do what you want (if not, Mplus for sure). Hope this is helpful,
>
> Cam
>
> ----------------------------------------
>> Date: Fri, 14 Aug 2009 13:27:15 -0700
>> From: [email protected]
>> Subject: Re: st: Thinking through best way to do a longitudinal analysis
>> To: [email protected]
>>
>> Another option is to use a linear mixed model approach.
>>
>> Scott Millis
>>
>>
>>
>>
>> --- On Fri, 8/14/09, Polis, Chelsea B. wrote:
>>
>>> From: Polis, Chelsea B.
>>> Subject: st: Thinking through best way to do a longitudinal analysis
>>> To: "[email protected]"
>>> Date: Friday, August 14, 2009, 2:38 PM
>>> I am trying to do an analysis on the CD4 decline trajectory of 190 HIV+ women, comparing
>>> those who were on hormonal contraception around the time of HIV seroconversion against those
>>> who weren't. Each subject in my sample has at least two (and as many as six) CD4 measurements,
>>> the first and the last of which span at least one year. I created a variable to anchor the
>>> CD4 measurements in time: it indicates how many days after HIV seroconversion each CD4
>>> measurement was taken. The data are not balanced (women have anywhere from 2 to 6
>>> measurements), and CD4 counts were not measured for each individual at the same point in time
>>> after seroconversion (for example, the first measurement available for each woman ranges in
>>> days since seroconversion from 69 to 1919).
>>>
>>> I think one way to go about this would be to calculate the individual slope for each subject and
>>> compare the slopes between contraceptive users and non-users using a t-test. Is there a command
>>> to obtain that kind of individual regression slope for each woman, and would my data have to be
>>> in long or wide format?
>>>
>>> Or am I thinking about this improperly? Would it be better to construct a longitudinal marginal
>>> model with generalized estimating equations, and if so, can someone point me in the direction of
>>> a text that might help me figure out how to do that with what I think is probably an unusual data
>>> structure (many of the examples I have seen in coursework use data measured at regular intervals)?
>>>
>>> Many thanks,
>>> Chelsea
>>>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/