st: re:re: data creation for hazard regression
From: Kenisha Russell <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: re:re: data creation for hazard regression
Date: Mon, 11 Jun 2012 09:58:08 +0000
Hi Austin,
I am using Stata 10.1.
Thank you for taking the time to answer my question. I do apologise if I was not clear; it was my first time posting.
Research goal: Using a discrete-time competing-risks hazard model, I would like to analyse entry into first union (i.e., marriage or cohabitation), with pregnancy as one of the explanatory variables.
The data have been transformed into person-months (i.e., what I previously referred to as century-month data).
Because I had the year and month in which each child was born, I then executed the steps outlined below:
Step 1: I created childbearing histories
/* Create century months for the birth of each child (here the maximum number of children is 3),
using a loop that runs the code first for the 1st, then the 2nd, then the 3rd child */
forval x = 1/3 {
    gen CMchild`x' = ym(childy`x', childm`x')
    /* flag women without an xth child with a sentinel value */
    recode CMchild`x' (. = 999999)
}
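Step 2: I then created a pregnancy variable for each child, set to 7 months before the birth:
forval x = 1/3 {
    /* pregnancy month = birth month minus 7, only where the child exists */
    gen CMpregnancy`x' = CMchild`x' - 7 if CMchild`x' != 999999
    /* carry the sentinel value over for women without an xth child */
    replace CMpregnancy`x' = 999999 if CMpregnancy`x' == .
}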
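The arithmetic in Steps 1 and 2 can be checked on a single made-up record. This is only a sketch: the birth date (March 1980) is hypothetical and not taken from my data. Note that ym() counts months from January 1960 (1960m1 = 0).

```stata
* Hypothetical toy record: first child born in March 1980 (not from the real data)
clear
set obs 1
gen childy1 = 1980
gen childm1 = 3

* ym() returns months elapsed since 1960m1
gen CMchild1 = ym(childy1, childm1)

* pregnancy month taken as 7 months before the birth, as in Step 2
gen CMpregnancy1 = CMchild1 - 7

* show both as calendar months rather than raw month counts
format CMchild1 CMpregnancy1 %tm
list
```

Applying the %tm format makes it easier to compare the derived months against the original year and month variables by eye.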
After stset and running the above commands, my data currently look like this:
id _t0 _t _d _st _origin CMchild1 CMchild2 CMchild3
3 0 68 0 1 1997m6 583 999999 999999
4 75 278 0 1 1985m10 999999 999999 999999
11 476 0 1 1969m4 248 338 999999
12 258 0 1 1987m6 401 424 509
13 27 230 0 1 1989m10 421 999999 999999
14 0 198 0 1 1992m6 999999 999999 999999
15 68 86 1 1 1986m5 476 999999 999999
I have checked the math and the outcomes seem to be correct; for example, for id 11, where CMchild2 == 338, CMpregnancy2, calculated as 7 months before the birth, is 331.
So my question is: if the above is correct, do I now need to stsplit the dataset so that there is one data row per person per month at risk of pregnancy?
If I do split the records, I assume, if my reasoning is correct, that I would need to stop each pregnancy at the point where the child is born.
Is that correct? If so, how would I do that? You suggested that I create a contemporaneous time variable; can you explain how?
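To make the question concrete, here is a rough sketch of what I imagine the split and the time-varying indicator might look like. It rests on several assumptions that may be wrong: that the data are stset with id(id) and analysis time in months, that _origin holds the calendar month at which analysis time is zero, and that CMpregnancy1-CMpregnancy3 exist as described in Step 2. The names month, cm and pregnant are made up for illustration.

```stata
* Sketch only; assumes the data are already stset with id(id), time in months
* one record per person per month of analysis time
stsplit month, every(1)

* calendar month at the start of each interval
* (assumes _origin is the calendar month where analysis time is 0)
gen cm = _origin + _t0

* pregnant = 1 from the pregnancy month up to, but not including, the birth month
gen byte pregnant = 0
forval x = 1/3 {
    replace pregnant = 1 if cm >= CMpregnancy`x' & cm < CMchild`x'
}
```

Because women without an xth child carry the value 999999, the condition cm >= CMpregnancy`x' is never true for them, so they would stay at pregnant == 0.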
Also, with regard to your earlier questions, Austin:
Are you sure every child is a biological child? Yes, I am sure all the children are biological.
Are there women with more than 3 children in the data? There are no women with more than 3 children in this data.
Do you have any information on gestational age at birth? I have no information about gestational age at birth.
I hope that this time my goal and question are much clearer.
Best,
Kenisha