Re: st: preserve-restore
From: Ulrich Kohler <[email protected]>
To: [email protected]
Subject: Re: st: preserve-restore
Date: Thu, 14 Oct 2010 10:09:13 +0200
On Thursday, 14 Oct 2010, at 08:16 +0200, Grethe Søndergaard wrote:
> Dear Statalist
>
> For some reason, the layout of the e-mail I sent to you a couple of
> days ago was quite messy, since a lot of symbols had sneaked into it
> after sending it. Therefore I am sending it again:
>
> I have a couple of questions about the preserve-restore procedure and stset.
>
> My dataset:
> id father-id mother-id death var ...
> 1 1 10 0 1
> 2 1 10 1 1
> 3 1 20 1 1
> 4 1 20 0 1
> 5 2 10 1 1
> 6 2 10 0 1
> 7 3 30 0 1
> 8 3 30 1 1
> ...
> save " \Temp\hs.dta", replace
>
>
> I want to compare all maternal half siblings within a family as well
> as all paternal siblings within a family. In order to do this, I start
> out by creating an id-variable for full siblings, paternal half
> siblings or maternal half siblings and afterwards I run preserve-restore:
>
> egen gruppe = group(father-id mother-id)
> egen mgr = group(mother-id)
> egen fgr = group(father-id)
>
>
> forvalues x = 1/8{
> preserve
>
> *MATERNAL HALF SIBLINGS*
> gen strata_mother = `x' if ((mgr==mgr[`x']) & gruppe != gruppe[`x']) |_n==`x'
>
> *PATERNAL HALF SIBLINGS*
> gen strata_father = `x' if ((fgr==fgr[`x']) & gruppe != gruppe[`x']) | _n==`x'
>
> drop if strata_mother==. & strata_father==.
>
> if `x' == 1 {
> save " \Temp\hs.dta", replace
> }
> else {
> append using " \Temp\hs.dta"
> save " \Temp\hs.dta", replace
> }
> restore
> }
>
> I have the following questions:
> 1. Is there any way to make preserve-restore run faster? (My dataset
> contains more than 2 million observations, so it takes about two days
> to run.)
> 2. Is it problematic to create strata_father after creating
> strata_mother in the same preserve-restore statement?
> 3. I want to use strata_father and strata_mother as strata
> variables in a Cox regression analysis, and I want to perform the
> analyses separately for females and males. Since preserve-restore runs
> slowly, I want to specify this restriction after having run it. However,
> it seems that restricting stset to e.g. males (if sex==M) does not work
> as intended: a male who experiences an event, and who has no brothers
> but does have a half sister, still counts as an event. Is there any way
> to tell stset that I only want to compare males, and that I only want
> to include an event if the male who experiences it has one or more
> brothers?
>
> I hope this is clear but since I am not an experienced user of
> Stata, please let me know if you need more details.
I propose putting -preserve- outside the loop and using -restore, preserve-
at the end of each iteration. This way, the file is only saved once:
-restore, preserve- restores the data while keeping it preserved.

    preserve
    forv x = 1/8 {
        ...
        restore, preserve
    }

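To see what -restore, preserve- does in isolation, here is a minimal sketch on the auto dataset that ships with Stata. The dataset and the -keep- step are only stand-ins for illustration; they are not part of the original problem.

```stata
sysuse auto, clear
preserve                      // snapshot the full dataset once
forvalues x = 1/3 {
    keep if rep78 == `x'      // destructive step: only a subset remains
    count                     // counts the subset currently in memory
    restore, preserve         // bring back the full data, keep the snapshot
}
restore                       // final restore also cancels the preserve
count                         // the full dataset is back in memory
```

Each pass through the loop starts from the full data again, but the snapshot is written only once, before the loop.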
In terms of speed it should be faster to save each of the "small" files
and do the append afterwards:

    forv x = 1/8 {
        ...
        save f`x'              // one small file per iteration
    }
    use f1
    forv x = 2/8 {
        append using f`x'      // stack the remaining files onto f1
    }
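
Putting both suggestions together, the whole procedure might look roughly like the sketch below. It is only a sketch under some assumptions: the toy data mirror the eight rows from the question, the variable names fid and mid are hypothetical stand-ins for father-id and mother-id (hyphens are not allowed in Stata variable names), and temporary files created with -tempfile- replace the named files f1 to f8 so that nothing is left on disk afterwards.

```stata
clear
input id fid mid death
1 1 10 0
2 1 10 1
3 1 20 1
4 1 20 0
5 2 10 1
6 2 10 0
7 3 30 0
8 3 30 1
end

egen gruppe = group(fid mid)   // full-sibling groups
egen mgr    = group(mid)       // same mother
egen fgr    = group(fid)       // same father

preserve                       // snapshot once, outside the loop
forvalues x = 1/8 {
    * index person `x' plus his or her maternal / paternal half siblings
    gen strata_mother = `x' if (mgr == mgr[`x'] & gruppe != gruppe[`x']) | _n == `x'
    gen strata_father = `x' if (fgr == fgr[`x'] & gruppe != gruppe[`x']) | _n == `x'
    drop if missing(strata_mother) & missing(strata_father)

    tempfile part`x'           // hypothetical name; holds one small file
    save "`part`x''"
    restore, preserve          // full data back, snapshot kept
}
restore, not                   // cancel the snapshot without restoring

use "`part1'", clear           // stack the small files in one pass
forvalues x = 2/8 {
    append using "`part`x''"
}
```

The -restore, not- before -use- matters: without it, Stata would restore the original data automatically when the do-file ends and the appended result would be lost. With 2 million observations the loop bound would of course be the number of observations rather than 8.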