Sarah Mustillo
> I have a longitudinal data set in long form with annual obs
> for 6 years on 1450 kids. I am trying to create a cumulative variable
> such that the value for age 10 is the sum of the values at age 9 and age
> 10, the value for age 11 is the sum of the value for 9, 10, & 11 and so
> on. I can figure out how to do this in wide form, but I am going to have
> to repeat it for about 50 different variables. It would be much easier if
> I could just keep the data in long form. I experimented with _n, and by
> id age, but couldn't get what I wanted (e.g., for age 11 I could get the
> sum of age 10 (_n-1) and age 11, but not age 9, 10, &11).
My advice is definitely to try to keep the data long. In Stata, most things
are easier done long. There are exceptions, as when what you want is
provided by some -egen, r*()- function, but for most longitudinal
stuff, long is better in my experience.
You have an identifier for each child -id-, an -age-, and, generically,
a -response-.
It sounds as if you need just the result of -sum()-, which gives
the cumulative sum. So we need to sort on -id- and then within
-id- on -age-.
bysort id (age) : gen Cresponse = sum(response)
-bysort id- ensures that we do this separately for each -id-.
-bysort id (age)- ensures that we do it separately and in the
right age order.
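To see what this does, here is a small sketch on made-up data. The two children and their values are purely hypothetical, and the observations are deliberately entered out of age order to show that -bysort id (age)- takes care of the sorting.

```stata
* Toy data: two hypothetical children, ages 9-11, entered out of order
clear
input id age response
1 10 2
1  9 1
1 11 4
2 11 5
2  9 3
2 10 0
end

* Running total within each child, accumulated in age order
bysort id (age) : gen Cresponse = sum(response)

list, sepby(id)
```

Within each -id-, -Cresponse- at age 10 is the sum for ages 9 and 10, and at age 11 the sum for ages 9, 10 and 11. Note that -sum()- treats missing values as zero, so a missing -response- leaves the running total unchanged rather than making it missing.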
> Once I create this new variable, is there an easy way to repeat the same
> thing for 50 different variables? I usually just copy and paste in my do
> file, changing variable names, but lately the do files have gotten rather
> long. It seems there is probably an easy way to have it do the same thing
> over and over for different variables, but I can't figure it out.
We just need to -sort- once
sort id age
and then use -foreach- to cycle through a varlist
foreach v of var <varlist> {
    by id : gen C`v' = sum(`v')
}
You must plug in your <varlist>. It can be in abbreviated form,
using * ? and/or -, etc.
Naturally, you can use your own naming convention, but
this is usually easiest with just some short prefix or suffix
added to the original variable name to give a new name.
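As an illustration only, here is the same loop with a concrete varlist and a suffix instead of a prefix. The variable names -sym1- and -sym2-, the data, and the suffix -_cum- are all invented for the example; substitute your own.

```stata
* Toy data: hypothetical variables sym1 and sym2 for two children
clear
input id age sym1 sym2
1  9 1 0
1 10 2 1
1 11 4 1
2  9 3 2
2 10 0 0
2 11 5 3
end

sort id age

* sym* expands to every variable whose name starts with "sym"
foreach v of varlist sym* {
    by id : gen `v'_cum = sum(`v')
}

list, sepby(id)
```

The varlist is expanded once, before the loop starts, so the new -sym1_cum- and -sym2_cum- are not themselves picked up by -sym*- on this pass. They would be on a second run, where -gen- would then fail because the new names already exist. Whatever prefix or suffix you choose, the resulting names must still fit within Stata's 32-character limit.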
This particular problem can also be done with -for-.
My main reservation about -for- is that it doesn't
grow gracefully when extended to more complicated
problems, whereas -foreach- typically does.
Nick