Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: preserve-restore and stset
From: Grethe Søndergaard <[email protected]>
To: [email protected]
Subject: st: preserve-restore and stset
Date: Mon, 11 Oct 2010 13:45:12 +0200
Dear Statalist
I have a couple of questions about the preserve-restore procedure and stset.
My dataset:
id  father_id  mother_id  death  var  ...
1   1          10         0      1
2   1          10         1      1
3   1          20         1      1
4   1          20         0      1
5   2          10         1      1
6   2          10         0      1
7   3          30         0      1
8   3          30         1      1
...
save "\Temp\hs.dta", replace
I want to compare all maternal half siblings within a family as well
as all paternal half siblings within a family. To do this, I start
out by creating an id variable for each of full siblings, maternal half
siblings and paternal half siblings, and afterwards I run a preserve-restore loop:
*full siblings*
egen gruppe = group(father_id mother_id)
*maternal half siblings*
egen mgr = group(mother_id)
*paternal half siblings*
egen fgr = group(father_id)
forvalues x = 1/8 {
    preserve
    *MATERNAL HALF SIBLINGS*
    gen strata_mother = `x' if ((mgr==mgr[`x']) & gruppe != gruppe[`x']) | _n==`x'
    *strata in which no events occur*:
    by strata_mother, sort: egen n_dead_mother = total(death)
    replace strata_mother=. if n_dead_mother==0
    *strata with only one person*
    by strata_mother, sort: egen n_var_mother = total(var)
    replace strata_mother=. if n_var_mother==1
    *PATERNAL HALF SIBLINGS*
    gen strata_father = `x' if ((fgr==fgr[`x']) & gruppe != gruppe[`x']) | _n==`x'
    *strata in which no events occur*:
    by strata_father, sort: egen n_dead_father = total(death)
    replace strata_father=. if n_dead_father==0
    *strata with only one person*
    by strata_father, sort: egen n_var_father = total(var)
    replace strata_father=. if n_var_father==1
    drop if strata_mother==. & strata_father==.
    if `x' == 1 {
        save "\Temp\hs.dta", replace
    }
    else {
        append using "\Temp\hs.dta"
        save "\Temp\hs.dta", replace
    }
    restore
}
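One detail of the loop above is worth spelling out: `by ..., sort:` reorders the observations, so after the first sort the subscripts `[`x']` and the test `_n==`x'` no longer necessarily point at the same person as before. The following is only a minimal sketch of one way to keep the reference person fixed, assuming that id uniquely identifies each person; the local macro names (this_id, this_mgr, this_fgr, this_grp) are illustrative and not part of the original code.

```stata
* Sketch only: assumes id, mgr, fgr, gruppe and death exist as above
* and that id is unique per person. Local names are hypothetical.
forvalues x = 1/8 {
    preserve
    * Store the reference person's values before any sort reorders the data
    local this_id  = id[`x']
    local this_mgr = mgr[`x']
    local this_fgr = fgr[`x']
    local this_grp = gruppe[`x']

    * Maternal stratum: identify the reference person by id, not by _n
    gen strata_mother = `x' if (mgr == `this_mgr' & gruppe != `this_grp') | id == `this_id'
    by strata_mother, sort: egen n_dead_mother = total(death)

    * The data are now sorted by strata_mother, so use the stored locals
    gen strata_father = `x' if (fgr == `this_fgr' & gruppe != `this_grp') | id == `this_id'
    by strata_father, sort: egen n_dead_father = total(death)

    * (filtering, append and save steps as in the loop above)
    restore
}
```

Because restore brings back the original sort order at the end of each pass, reading the values with `[`x']` at the top of the loop refers to the same person on every run.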
I have the following questions:
1. Is there any way to make preserve-restore run faster? My dataset
contains more than 2 million observations, so it takes about two days to
run.
2. I am worried that creating “strata_father” after creating
“strata_mother” is problematic. Is it okay to do that in the same
preserve-restore statement?
3. I want to use “strata_father” and “strata_mother” as strata
variables in a Cox regression analysis, and I want to perform the
analyses separately for females and males. Since preserve-restore runs
slowly, I want to specify the sex restriction after having run it.
However, it does not seem to work to state in stset that I only want to
include e.g. males (if sex==M). As far as I can see, a male who
experiences an event, and who has no brothers but has a half sister,
still counts as an event. Is there any way to state in stset that I only
want to compare males, and that I only want to include an event if the
male who experiences it has one or more brothers?
I hope this is clear, but since I am not an experienced Stata user,
please let me know if you need more details.
Thank you
Grethe