Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: choosing how to collapse very large datasets

From	hind lazrak <[email protected]>
To	[email protected]
Subject	Re: st: choosing how to collapse very large datasets
Date	Fri, 22 Oct 2010 16:17:53 -0700

Thank you very much Austin for your reply. This helped me a lot.

You are right: the participant id is missing because I wanted to
start by looking at one participant at a time.

Hind
p.s.: due to a system problem with my previous address
([email protected]) I created this new address to close this
thread and acknowledge Austin's helpful reply.



>
> -----Original Message-----
> From: Austin Nichols [mailto:[email protected]]
> Sent: Thursday, October 21, 2010 8:34 PM
> To: [email protected]
> Subject: Re: st: choosing how to collapse very large datasets
>
> Hind Sbihi <[email protected]>:
> You should also have a participant id, right?  Try e.g.
>
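> * one observation per participant, song, and whole second (means by default)
> g bysec=floor(time)
> collapse hr-rating, by(id songid bysec) fast
> * index each participant-song pair and count the pairs
> egen g=group(id songid)
> su g, mean
> loc maxg=r(max)
> foreach v of var hr-skintemp {
>  g mean_`v'=.
>  g trend_`v'=.
>  g var_`v'=.
>  qui forv i=1/`maxg' {
>   * linear trend over the excerpt, one participant-song pair at a time
>   qui reg `v' bysec if g==`i'
>   replace trend_`v'=_b[bysec] if g==`i'
>   * fitted value at the 22.5-second midpoint
>   replace mean_`v'=_b[_cons]+22.5*_b[bysec] if g==`i'
>   * root MSE of the fit; the stored result is e(rmse), not e(rsme)
>   replace var_`v'=e(rmse) if g==`i'
>  }
> }
> xtreg rating mean* trend* var*, i(id)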
>
> On Thu, Oct 21, 2010 at 11:14 PM, Hind Sbihi <[email protected]> wrote:
>> Hello Stata users,
>>
>> The data I have collected contain physiological measurements (variables in
>> col 3 to 7) recorded at 256 Hz while study participants listen to a song
>> and give the song a rating (last column). At this sampling frequency we
>> generate 256 observations per second. Every study participant (n=50)
>> listens to a 45-second excerpt of each of 37 songs.
>>
>> The volume of the data set is simply overwhelming at this stage, and I am
>> considering different options for at least starting to visualize the data
>> (e.g. rating vs. physiologic responses) before doing any analysis.
>>
>> My question is: how can I aggregate the data? collapse seems to be the
>> appropriate command, but I am wondering which arguments should go in the
>> command. Below is a snapshot of what the data look like for the first
>> song for one participant.
>>
>>     time  songid         hr    hraccel        scr       dscr        emg   resprate   skintemp  rating
>>        0       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>> .0039063       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>> .0078125       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>>  .011719       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>>  .015625       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>>  .019531       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>>  .023438       1  000063.73  -00000.87  000001.72  -00000.00  000003.49  000050.44  000028.15       4
>>  .027344       1  000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.44  000028.15       4
>>   .03125       1  000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.44  000028.15       4
>>  .035156       1  000063.73  -00000.87  000001.72  -00000.00  000003.48  000050.43  000028.15       4
>>
>> Many thanks in advance for your suggestions.
>>
>> Hind
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
>

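For readers who want to try the approach from the quoted reply without the full 256 Hz data, here is a minimal sketch on simulated data. The toy dimensions (10 participants, 6 songs, 8 samples per second), the simulated values, and the rmse_ variable names are assumptions made for illustration only; only hr, skintemp, and rating are carried along, and the final step that keeps one row per participant and song is an addition that is not in the quoted code.

```stata
* Toy data (hypothetical): 10 participants x 6 songs x 45 s at 8 samples/s.
clear
set seed 12345
set obs 10
gen id = _n
expand 6
bysort id: gen songid = _n
expand 360
bysort id songid: gen time = (_n - 1)/8
gen hr       = 60 + 0.05*time + rnormal()
gen skintemp = 28 + rnormal(0, 0.1)
gen rating   = 1 + mod(id + songid, 5)   // constant within participant-song

* Step 1: one row per participant, song, and whole second.
gen bysec = floor(time)
collapse (mean) hr skintemp rating, by(id songid bysec) fast

* Step 2: level, linear trend, and residual spread per participant-song pair.
egen g = group(id songid)
summarize g, meanonly
local maxg = r(max)
foreach v of varlist hr skintemp {
    gen mean_`v'  = .
    gen trend_`v' = .
    gen rmse_`v'  = .
    forvalues i = 1/`maxg' {
        quietly regress `v' bysec if g == `i'
        quietly replace trend_`v' = _b[bysec] if g == `i'
        * bysec runs 0..44, so 22 is its mean and the fitted value here
        * equals the pair's mean; the quoted reply uses 22.5 instead.
        quietly replace mean_`v'  = _b[_cons] + 22*_b[bysec] if g == `i'
        quietly replace rmse_`v'  = e(rmse) if g == `i'
    }
}

* Step 3 (an addition, not in the quoted code): keep one row per pair.
collapse (first) rating mean_* trend_* rmse_*, by(id songid)
list in 1/5
```

The result has one observation per participant and song, which is small enough to plot rating against the summary measures before fitting any model.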