I think you're moving in the right direction
for a -reshape- to wide. As for the ANOVA,
your original data structure looks better.
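* build one string identifier per treatment cell, e.g. "0_50"
egen j = concat(s1*), p("_")
drop s1*
* one row per animal, one s2peak variable per treatment cell
reshape wide s2peak, i(animal) j(j) string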
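
On the way back to long that the question asks about, here is a minimal sketch of the full round trip. It is only an illustration under assumptions: the data are the first 12 rows of the quoted listing typed in by hand, s2peakvalue is renamed to s2peak so that the wide names come out as s2peak0_50 and so on, and the names base, ppi, and f are made up for this example.

```stata
* Sketch only: first 12 rows of the quoted listing, entered by hand.
clear
input s1level s1s2delay str6 animal s2peakvalue
 0  50 "1_1_0F"  773.75
 0 100 "1_1_0F" 1001.63
75  50 "1_1_0F"  472.5
75 100 "1_1_0F"  927.875
85  50 "1_1_0F"  611.375
85 100 "1_1_0F"  654.375
 0  50 "1_1_1F" 1116.88
 0 100 "1_1_1F" 1101.38
75  50 "1_1_1F"  544.875
75 100 "1_1_1F"  567.875
85  50 "1_1_1F"  443.875
85 100 "1_1_1F"  466
end

* Assumption: a shorter stub, so the wide variables are s2peak0_50, ...
rename s2peakvalue s2peak

* Long -> wide, keyed on a string identifier built from both factors.
egen j = concat(s1level s1s2delay), punct("_")
drop s1level s1s2delay
reshape wide s2peak, i(animal) j(j) string

* Percent change relative to the mean of the two s1level == 0 cells.
generate base = (s2peak0_50 + s2peak0_100) / 2
foreach c in 75_50 75_100 85_50 85_100 {
    generate ppi`c' = 100 * (base - s2peak`c') / base
}

* Wide -> long again. ppi0_50 and ppi0_100 were never created,
* so ppi is missing in the s1level == 0 rows.
reshape long s2peak ppi, i(animal) j(j) string

* Recover the two numeric factors from the string identifier.
split j, parse("_") generate(f) destring
rename f1 s1level
rename f2 s1s2delay
drop j base
sort animal s1level s1s2delay
```

The result is back in one-row-per-cell layout with s1level and s1s2delay numeric again, so the long-form commands in the question should apply unchanged. Any value labels on the factors would have to be reattached separately.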
Nick
David Airey wrote:
> I have a reshape question. I find this one of the hardest commands to
> remember how to use.
>
> I cannot find a help example that exactly parallels my situation. I
> have an identifier that is split between variables. This situation is
> common in ANOVA, where treatment cells may be identified by more than
> one factor.
>
> I have data like:
>
> . list, sep(6)
>
>      +----------------------------------------+
>      | s1level   s1s2de~y   animal   s2peak~e |
>      |----------------------------------------|
>   1. |       0         50   1_1_0F     773.75 |
>   2. |       0        100   1_1_0F    1001.63 |
>   3. |      75         50   1_1_0F      472.5 |
>   4. |      75        100   1_1_0F    927.875 |
>   5. |      85         50   1_1_0F    611.375 |
>   6. |      85        100   1_1_0F    654.375 |
>      |----------------------------------------|
>   7. |       0         50   1_1_1F    1116.88 |
>   8. |       0        100   1_1_1F    1101.38 |
>   9. |      75         50   1_1_1F    544.875 |
>  10. |      75        100   1_1_1F    567.875 |
>  11. |      85         50   1_1_1F    443.875 |
>  12. |      85        100   1_1_1F        466 |
>      |----------------------------------------|
>  13. |       0         50   1_1_2F      309.5 |
>  14. |       0        100   1_1_2F    336.286 |
>  15. |      75         50   1_1_2F    442.625 |
> etc.
>
> where the first two variables, s1level and s1s2delay, define 6 treatment
> conditions under which s2peakvalue was measured for each animal. I
> would like to reshape these data to calculate a ratio from the
> conditions within each animal. I would like to get a data set that
> looks like:
>
> animal s2peak0_50 s2peak0_100 s2peak75_50 s2peak75_100 s2peak85_50 s2peak85_100
>
> in order to calculate ratios of each of variables 4-7 to the
> average of variables 2 and 3. I can do this directly in the long form
> with the following code:
>
> egen step = seq(), from(0) to(5) block(1)
> gen ppi2 = ((s2peak[_n-step]+s2peak[_n-step+1])/2 - ///
>     s2peak[_n])/((s2peak[_n-step]+s2peak[_n-step+1])/2)*100
> drop if s1level == 0
>
>      +----------------------------------------------------+
>      | s1level   s1s2de~y   animal   s2peak~e        ppi2 |
>      |----------------------------------------------------|
>   1. |      75         50   1_1_0F      472.5    46.77181 |
>   2. |      75        100   1_1_0F    927.875   -4.527213 |
>   3. |      85         50   1_1_0F    611.375    31.12723 |
>   4. |      85        100   1_1_0F    654.375    26.28318 |
>      |----------------------------------------------------|
>   5. |      75         50   1_1_1F    544.875    50.87344 |
>   6. |      75        100   1_1_1F    567.875    48.79973 |
>   7. |      85         50   1_1_1F    443.875    59.97971 |
>   8. |      85        100   1_1_1F        466     57.9849 |
>      |----------------------------------------------------|
>   9. |      75         50   1_1_2F    442.625   -37.08108 |
>  10. |      75        100   1_1_2F        265    17.92943 |
>  11. |      85         50   1_1_2F      264.5    18.08428 |
>  12. |      85        100   1_1_2F    192.375    40.42141 |
>      |----------------------------------------------------|
>  13. |      75         50   1_1_3F    448.875    50.06605 |
>  14. |      75        100   1_1_3F    462.143     48.5901 |
>  15. |      85         50   1_1_3F    576.875    35.82702 |
> etc.
>
> but I'm wondering if reshape to wide and then back to long would not be
> more reliable. As long as data are not missing, I currently have no
> problems. Must I, before I go for wide, say something like,
>
> . egen treatment = group(s1level s1s2delay), label
> . drop s1level s1s2delay
>
>      +------------------------------+
>      | animal   s2peak~e   treatm~t |
>      |------------------------------|
>   1. | 1_1_0F     773.75       0 50 |
>   2. | 1_1_0F    1001.63      0 100 |
>   3. | 1_1_0F      472.5      75 50 |
>   4. | 1_1_0F    927.875     75 100 |
>   5. | 1_1_0F    611.375      85 50 |
>   6. | 1_1_0F    654.375     85 100 |
>      |------------------------------|
>   7. | 1_1_1F    1116.88       0 50 |
>   8. | 1_1_1F    1101.38      0 100 |
>   9. | 1_1_1F    544.875      75 50 |
>  10. | 1_1_1F    567.875     75 100 |
>      +------------------------------+
> etc.
>
> and only then,
>
> . reshape wide s2peakvalue, i(animal) j(treatment)
>
>        animal   s2peak~1   s2peak~2   s2peak~3   s2peak~4   s2peak~5   s2peak~6
>   1.  1_1_0F     773.75    1001.63      472.5    927.875    611.375    654.375
>   2.  1_1_1F    1116.88    1101.38    544.875    567.875    443.875        466
>   3.  1_1_2F      309.5    336.286    442.625        265      264.5    192.375
> etc.
>
> but then I lose my way back to the proper long format for ANOVA, as well
> as the factor labels, etc.