Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: choosing how to collapse very large datasets
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: choosing how to collapse very large datasets
Date: Thu, 21 Oct 2010 23:34:08 -0400
Hind Sbihi <[email protected]>:
You should also have a participant id, right? Try e.g.
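* one observation per participant, song, and whole second
g bysec=floor(time)
collapse hr-rating, by(id songid bysec) fast
* number each participant-song combination
egen g=group(id songid)
su g, meanonly
loc maxg=r(max)
foreach v of var hr-skintemp {
  g mean_`v'=.
  g trend_`v'=.
  g var_`v'=.
  qui forv i=1/`maxg' {
    * linear time trend within one participant-song group
    qui reg `v' bysec if g==`i'
    replace trend_`v'=_b[bysec] if g==`i'
    * fitted value at 22.5 s, the midpoint of a 45-second excerpt
    replace mean_`v'=_b[_cons]+22.5*_b[bysec] if g==`i'
    * root mean squared error of the fit
    replace var_`v'=e(rmse) if g==`i'
  }
}
xtreg rating mean* trend* var*, i(id)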
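A note on the 22.5 in the loop: the expression _b[_cons]+22.5*_b[bysec] is the fitted value of each per-group regression at second 22.5, the midpoint of a 45-second excerpt. The sketch below shows how that relates to the plain group mean. It assumes that bysec takes the integer values 0 through 44 with one collapsed observation each, which the thread does not confirm.

```latex
% OLS fit within one (id, songid) group: y_t = a + b t + e_t,  t = 0, 1, ..., 44
\hat{a} = \bar{y} - \hat{b}\,\bar{t}, \qquad \bar{t} = \frac{1}{45}\sum_{t=0}^{44} t = 22
% so the fitted value at the sample mean of t reproduces the group mean:
\hat{a} + 22\,\hat{b} = \bar{y}
% and the value stored in mean_v is shifted by half a second of trend:
\hat{a} + 22.5\,\hat{b} = \bar{y} + 0.5\,\hat{b}
```

Under that assumption the two differ only by half the estimated per-second trend, so the stored value is close to the group mean whenever the trend is small. Whether 22 or 22.5 is the better evaluation point depends on how time is recorded.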
On Thu, Oct 21, 2010 at 11:14 PM, Hind Sbihi <[email protected]> wrote:
> Hello stata users
>
> The data I have collected has physiological measurements (variables in col 3 to 7) collected at 256Hz while study participants listen to a song and give the song a rating (last column).
> Because of the chosen frequency we generated 256 observations per second.
> Every study participant (n=50) listens to 45 second excerpt for each of 37 songs.
> The volume of the data set is simply overwhelming at this stage and I am considering different options for starting at least to visualize the data (e.g. rating vs. physiologic responses) before doing any analysis.
>
> My question is: how can I aggregate the data?
> Collapse() seems to be the appropriate command but I am wondering which arguments should go in the command.
> Below is a snapshot of what the data looks like for the first song for one participant.
>
> time songid hr hraccel scr dscr emg resprate skintemp rating
> 0 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .0039063 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .0078125 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .011719 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .015625 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .019531 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .023438 1 000063.73 -00000.87 000001.72 -00000.00 000003.49 000050.44 000028.15 4
> .027344 1 000063.73 -00000.87 000001.72 -00000.00 000003.48 000050.44 000028.15 4
> .03125 1 000063.73 -00000.87 000001.72 -00000.00 000003.48 000050.44 000028.15 4
> .035156 1 000063.73 -00000.87 000001.72 -00000.00 000003.48 000050.43 000028.15 4
>
> Many thanks in advance for your suggestions.
>
> Hind