I have a simple dataset with incomes and other variables in long form.
I need to compute the covariance between a variable at time 1 and a
variable at time 2. I know how to do this by transforming the dataset
into wide format, but I would rather use the long form.
The dataset is something like this (note that y1 y2 a1 a2 are just the
original variables split by period of time):
id t y a y1 y2 a1 a2
1 1 10 3 10 . 3 .
2 1 20 4 20 . 4 .
3 1 30 5 30 . 5 .
1 2 11 5 . 11 . 5
2 2 22 6 . 22 . 6
3 2 33 7 . 33 . 7
I need the correlation between, say, y1/a2 and y2/a1 by id. Collapsing
the dataset works fine (giving basically the wide form), but I thought
there might be a simple way of creating a new variable containing the
information for each id but from the other period of time. In my case,
for instance, a variable like aa that just inverts the order of the
original a within each id:
id t aa a1 a2
1 1 5 3 .
2 1 6 4 .
3 1 7 5 .
1 2 3 . 5
2 2 4 . 6
3 2 5 . 7
I hope my question is clear enough. Thank you very much!
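>>> I am not clear on how you want this information, but with just two
time periods here is one way of getting your -aa- variable. If -a- is
not already in the dataset, rebuild it first from -a1- and -a2-
(max() ignores missing arguments):
gen a = max(a1, a2)
Then, within each id, pick up the value of -a- from the other
observation:
bysort id : gen aa = a[3 - _n]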
Within each id, _n takes on two values, 1 and 2. When _n is 1, 3 - _n
is 2; when _n is 2, 3 - _n is 1. So each observation gets the value of
-a- from the other observation with the same id.
We don't even need to sort within -id- on time, although that
would do no harm:
bysort id (t) : gen aa = a[3 - _n]
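To see this on the example data, here is a minimal sketch. It assumes
the time variable is named -t-, as in the listing, and that every id
has exactly two observations; with any other number, a[3 - _n] would
point at the wrong observation or return missing. The if qualifier on
-correlate- is my guess at what is wanted: at t == 1, y holds y1 and
aa holds a2.

```stata
clear
input id t y a
1 1 10 3
2 1 20 4
3 1 30 5
1 2 11 5
2 2 22 6
3 2 33 7
end

* within each id, sorted on t, take -a- from the other period
bysort id (t) : gen aa = a[3 - _n]

sort t id
list id t y a aa, sepby(t)

* at t == 1, y is y1 and aa is a2, so this is the y1/a2 pair
correlate y aa if t == 1

* same pair, reported as covariances rather than correlations
correlate y aa if t == 1, covariance
```

The -aa- column listed here should match the one in the question, and
the data stay in long form throughout.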
Nick