Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: choosing how to collapse very large datasets
From: hind lazrak <[email protected]>
To: [email protected]
Subject: Re: st: choosing how to collapse very large datasets
Date: Fri, 22 Oct 2010 16:17:53 -0700
Thank you very much, Austin, for your reply. It helped me a lot.
You are right: the participant id is missing because I wanted to
start by looking at one participant at a time.
Hind
p.s.: due to a system problem with my previous address
([email protected]) I created this new address to close this
thread and acknowledge Austin's helpful reply.
>
> -----Original Message-----
> From: Austin Nichols [mailto:[email protected]]
> Sent: Thursday, October 21, 2010 8:34 PM
> To: [email protected]
> Subject: Re: st: choosing how to collapse very large datasets
>
> Hind Sbihi <[email protected]>:
> You should also have a participant id, right? Try e.g.
>
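> * bin the 256 Hz samples into whole seconds
> g bysec=floor(time)
> * one row per participant, song, and second (collapse takes means by default)
> collapse hr-rating, by(id songid bysec) fast
> * index each participant-song pair
> egen g=group(id songid)
> su g, mean
> loc maxg=r(max)
> foreach v of var hr-skintemp {
>   g mean_`v'=.
>   g trend_`v'=.
>   g var_`v'=.
>   qui forv i=1/`maxg' {
>     * linear trend over the excerpt for this participant-song pair
>     qui reg `v' bysec if g==`i'
>     replace trend_`v'=_b[bysec] if g==`i'
>     * fitted value at the 22.5 s midpoint of the 45 s excerpt
>     replace mean_`v'=_b[_cons]+22.5*_b[bysec] if g==`i'
>     * residual spread around the trend (root MSE)
>     replace var_`v'=e(rmse) if g==`i'
>   }
> }
> xtreg rating mean* trend* var*, i(id)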
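
As a possible alternative to the explicit loop over groups, -statsby- can run one regression per participant-song pair and keep the slope, intercept, and root MSE. This sketch is not part of the quoted reply. It uses made-up toy data (2 participants, 3 songs, 4 samples per second, a simulated hr only) so that it runs on its own, and the names cons, trend, rmse, and mean_hr are illustrative assumptions.

```stata
* Hypothetical toy data: 2 participants, 3 songs, 45 s at 4 samples per second
clear
set seed 12345
set obs 2
generate id = _n
expand 3
bysort id: generate songid = _n
expand 180
bysort id songid: generate time = (_n - 1)/4
generate hr = 60 + 0.1*time + rnormal()

* Bin to whole seconds and average within participant, song, and second
generate bysec = floor(time)
collapse (mean) hr, by(id songid bysec) fast

* One regression per participant-song pair; keep slope, intercept, and root MSE
statsby trend=_b[bysec] cons=_b[_cons] rmse=e(rmse), by(id songid) clear: regress hr bysec

* Fitted value at the 22.5 s midpoint, as in the loop version
generate mean_hr = cons + 22.5*trend
list, sepby(id)
```

Unlike the loop, which keeps the per-second rows, -statsby- with the clear option leaves one row per participant-song pair, so the result would have to be merged back on id and songid if the per-second data are still needed.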
>
> On Thu, Oct 21, 2010 at 11:14 PM, Hind Sbihi <[email protected]> wrote:
>> Hello stata users
>>
>> The data I have collected contain physiological measurements (variables in
>> columns 3 to 7) sampled at 256 Hz while study participants listen to a song
>> and give the song a rating (last column).
>> Because of the chosen frequency, we generated 256 observations per second.
>> Every study participant (n=50) listens to a 45-second excerpt of each of
>> 37 songs.
>> The volume of the data set is simply overwhelming at this stage, and I am
>> considering different options for at least starting to visualize the data
>> (e.g. rating vs. physiologic responses) before doing any analysis.
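
For scale, the design described here (256 Hz, 45-second excerpts, 37 songs, 50 participants) implies the counts below. This is a back-of-the-envelope sketch that assumes every participant completes every excerpt with no dropped samples; the scalar names are illustrative.

```stata
* Sizes implied by the stated design: 256 Hz, 45 s, 37 songs, 50 participants
scalar per_excerpt = 256*45
scalar total_raw   = per_excerpt*37*50
scalar per_second  = 45*37*50
display "observations per excerpt:    " %12.0fc per_excerpt
display "raw observations in total:   " %12.0fc total_raw
display "rows after per-second means: " %12.0fc per_second
```

Under those assumptions there are 11,520 observations per excerpt and 21,312,000 in total, against 83,250 rows once the data are averaged within each second.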
>>
>> My question is: how can I aggregate the data?
>> collapse seems to be the appropriate command, but I am wondering which
>> arguments should go in the command.
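
On the arguments themselves, a minimal sketch of the -collapse- syntax: each statistic goes in parentheses before the variables it applies to, and by() names the cells to aggregate within. The toy data (one participant, 2 songs, 3 seconds at 256 samples per second) and the names bysec, hr_mean, and hr_sd are assumptions made for illustration only.

```stata
* Hypothetical toy data: one participant, 2 songs, 3 s at 256 samples per second
clear
set seed 2010
set obs 2
generate songid = _n
expand 3*256
bysort songid: generate time = (_n - 1)/256
generate hr = 60 + songid + rnormal()
generate rating = songid + 2

* Whole-second bins, then the mean and SD of hr within each song and second;
* rating is constant within a song here, so (first) simply carries it along
generate bysec = floor(time)
collapse (mean) hr_mean=hr (sd) hr_sd=hr (first) rating, by(songid bysec)
list, sepby(songid)
```

With several participants in the file, the participant id would also go in by().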
>> Below is a snapshot of what the data looks like for the first song for
>> one participant.
>>
>> time      songid  hr         hraccel    scr        dscr       emg        resprate   skintemp   rating
>> 0         1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .0039063  1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .0078125  1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .011719   1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .015625   1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .019531   1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .023438   1       000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15  4
>> .027344   1       000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.44  000028.15  4
>> .03125    1       000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.44  000028.15  4
>> .035156   1       000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.43  000028.15  4
>>
>> Many thanks in advance for your suggestions.
>>
>> Hind
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
>