Maybe the solution I was thinking of is not the most efficient, so
please allow me to lay it on the table.
In short, my do-file (hem.do) contains the following:
tempfile tmp1 tmp2
use "E:\Data\clinvisDir_3jun05.dta", clear /* id dov dob sex geno etc*/
drop weight height _merge clinic
keep if cohort==1
keep if geno==1|geno==34
recode geno (34=2)
label define genolbl 1 AA 2 SS
label values geno genolbl
sort id dov
save `tmp1'
use "E:\Data\haem_22jun2005.dta", clear /*id dov hba2 hb hbf etc*/
drop haem_age micrf nrbc - rdw
sort id dov
save `tmp2'
use `tmp1', clear
merge id dov using `tmp2'
drop if _merge==2
drop _merge
rename code c
rename numb n
* Mark steady state codes as one, zero otherwise
mark ssc if c==62 | c==111 | c==230 | c==268 | c==369 | /*
*/ c==73 | c==112 | c==231 | c==269 | c==370
label var ssc "Steady state codes"
label define ssclbl 0 "no" 1 "yes", modify
label values ssc ssclbl
** Keep any combination of the above steady state codes
bysort id dov (ssc): generate flag = ssc[1]
drop if flag==0
drop flag
keep if geno==2
drop if dov >= d(1jan90) /*MY PROBLEM LINE*/
** Duplicate hematology records on the same date exist (due to merging)
** This can overestimate the mean
collapse (mean) hba2 hb hbf, by(dov id)
** SS steady state
collapse (mean) hba2 hb hbf, by(id)
exit
My challenge is to create the following data set given id and date of
onset from the above:
id   date of onset   hba2   hb   hbf
 5   1jan89          x      y    z
 5   2jan90          x      y    z
 5   2jan96
 7   11oct97
 7   8nov99          x      y    z
Rather than using a single cutoff date for all subjects, I need the
mean of hba2, hb and hbf up to 1jan89 for the first observation, then
the mean of hba2, hb and hbf up to 2jan90 for the second, and so on,
each time using only that subject's own records. This is my plight.
Can it be done?
Raphael
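
P.S. One way this might be done in a single pass, offered as an
untested sketch rather than a definitive solution: pair every onset
date with every visit of the same id using -joinby-, blank out the
values from visits on or after that onset, and then collapse twice as
in hem.do. The file names onsets.dta (variables id and onset) and
steady.dta (the merged data as it stands just before the problem line
in hem.do, with id, dov, hba2, hb and hbf) are hypothetical; neither
exists in the code above.

```stata
* Sketch only: onsets.dta and steady.dta are assumed file names.
use onsets, clear

* Form all pairs of (onset date, visit) within the same id.
* unmatched(master) keeps onset rows for ids with no visits at all.
joinby id using steady, unmatched(master)

* Blank out, rather than drop, values from visits on or after the
* onset date, so onset rows with no earlier visits are retained.
foreach v of varlist hba2 hb hbf {
    replace `v' = . if dov >= onset
}

* Average within visit date first, so duplicated dates count once,
* then average over the remaining visits for each id and onset date.
collapse (mean) hba2 hb hbf, by(id onset dov)
collapse (mean) hba2 hb hbf, by(id onset)

format onset %d
list, sepby(id)
```

Onset dates with no earlier visits should come out with missing means,
which matches the blank rows in the layout above. The comparison uses
dov >= onset to mirror the problem line; if "up to" is meant to include
the onset date itself, dov > onset would be the condition instead.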
On 11/24/05, Nick Cox wrote:
> Without knowing any more than Svend does what
> is in Raphael's -hem.do-, it does seem likely
> that the whole strategy of producing hundreds
> of little data files and then -append-ing them
> could be replaced by using just one file and
> looping over the possibilities in place.
>
> Nick
>
> Svend Juul
>
> > Raphael wrote:
> >
> > I have a do-file called hem.do which I would like to run. But before I
> > run this do-file I need to drop some observations if dov >= d(1jan90).
> > Hence
> >
> > use data, clear
> > drop if dov >= d(1jan90)
> > do hem.do
> > keep if id==5
> > save newdata1, replace
> >
> > I would like to repeat this process about a thousand times with
> > different dates and ids located in a separate file (SEE SNIPPET) then
> > post to a single file. For example,
> >
> > use data, clear
> > drop if dov >= d(2feb90)
> > do hem.do
> > keep if id==5
> > save newdata2, replace
> >
> > use newdata1, clear
> > forvalues i=2/1000 {
> > append using newdata`i'
> > }
> >
> > SNIPPET
> > id date_of_onset
> > 5 1jan90
> > 5 2feb90
> > 5 6jun96
> > 7 10oct97
> > 7 25dec99
> >
> > Can this be done?
> > ----------------------------------------------------------------------
> >
> > Raphael,
> >
> > Take a look at this; it may - or may not - do what you want.
> >
> > Good luck,
> > Svend
> >
> > // generating testdata
> > clear
> > input id str7 sdov
> > 1 1jan89
> > 2 2jan96
> > 5 2jan90
> > 7 11oct97
> > end
> > gen dov=date(sdov,"dmy",2006)
> > format dov %d
> > drop sdov
> > save data , replace
> >
> > // generating snippet
> > clear
> > input nid str7 sonset
> > 5 1jan90
> > 5 2feb90
> > 5 6jun96
> > 7 10oct97
> > 7 25dec99
> > end
> > gen onset = date(sonset,"dmy",2006)
> > format onset %d
> > drop sonset
> > local N=_N
> > save snippet , replace
> >
> > // combine them, snippet first; it has `N' observations
> > append using data
> > save data2 , replace
> >
> > . list
> >
> >      +----------------------------------+
> >      | nid       onset   id         dov |
> >      |----------------------------------|
> >   1. |   5   01jan1990    .           . |
> >   2. |   5   02feb1990    .           . |
> >   3. |   5   06jun1996    .           . |
> >   4. |   7   10oct1997    .           . |
> >   5. |   7   25dec1999    .           . |
> >      |----------------------------------|
> >   6. |   .           .    1   01jan1989 |
> >   7. |   .           .    2   02jan1996 |
> >   8. |   .           .    5   02jan1990 |
> >   9. |   .           .    7   11oct1997 |
> >      +----------------------------------+
> >
> > forvalues i=1/`N' {
> > use data2 , clear
> > drop if dov >= onset[`i'] & dov < .
> > do hem.do
> > keep if id==nid[`i']
> > save newdata`i' , replace
> > }
> >
> > use newdata1 , clear
> > forvalues i=2/`N' {
> > append using newdata`i'
> > }
> >
> > . list
> >
> >      +------------------------------+
> >      | nid   onset   id         dov |
> >      |------------------------------|
> >   1. |   .       .    5   02jan1990 |
> >   2. |   .       .    5   02jan1990 |
> >   3. |   .       .    7   11oct1997 |
> >      +------------------------------+
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>