I think your approach requires 40-day moving averages and
selecting the 20 days on either side of the observation
with the highest moving average. This leaves open the
question of what you do if the peak falls in the first 20 or
last 20 days.
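One way to sidestep the edge question is to compare only complete
windows, that is, to total production over every run of 40
consecutive days and keep the run with the largest total. What
follows is a rough sketch rather than a tested solution. It assumes
the variables are named group, day and production, that day is an
integer, and that there is at most one observation per group and day;
the names cumprod, win40, maxwin, endday and u are made up for
illustration.

```stata
* Sketch only: variable names are assumptions, not taken from the data
tsset group day
tsfill                      // add rows for skipped days (production missing)

* Running total within group; sum() treats missing values as zero
bysort group (day): gen double cumprod = sum(production)

* Total over the 40 consecutive days ending on the current day;
* left missing until 40 days exist, so only full windows compete
bysort group (day): gen double win40 = ///
    cumprod - cond(_n > 40, cumprod[_n-40], 0) if _n >= 40

* Last day of the best window in each group (earliest one if tied)
egen double maxwin = max(win40), by(group)
egen long endday = min(cond(win40 == maxwin & win40 < ., day, .)), by(group)

* Keep that window and drop the rows that tsfill added
keep if inrange(day, endday - 39, endday) & production < .

* Draw up to 25 of the remaining days at random in each group
* (runiform() is the newer name for uniform())
set seed 12345
gen double u = runiform()
bysort group (u): keep if _n <= 25
```

A group with fewer than 40 days between its first and last
observation has no complete window and would be dropped entirely
by this sketch. If days are missing inside the chosen window, fewer
than 40 observations remain to draw from.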
More generally, this approach seems to be bending over
backwards to deny any dependence structure in your data.
I am not clear that such random sampling really is a good way
to work with repeated measures.
Nick
[email protected]
Knag Anne-Christine wrote:
> (the 40 days have to be consecutive days, so it is not the 40
> days of max production, just the period of max production)
>
>
> On 6/19/07 2:08 PM, "Maarten Buis" <[email protected]> wrote:
>
> > keep 40 top production dates:
> >
> > gen minusprod = - production
> > bys group (minusprod): keep if _n <= 40
> >
> > select 25 random cases:
> >
> > set seed 12345
> > gen random = uniform()
> > bys group (random): keep if _n <= 25
> Knag Anne-Christine
> > I have a question about how to pull out random values.
> > The data may be summarized as this:
> >
> > group   day                   daily production
> > 1       42                    1025200
> > 1       45                    52000
> > 1       etc (up to day 145)   etc
> > 2       36                    2355000
> > 2       37                    450003
> > 2       etc (up to day 150)   etc
> > 3       65
> > up to group 9
> >
> > From each of the nine groups' daily production I want to pull out
> > 25 random values from the period where peak production occurs. I
> > have decided on a 40-day interval of peak production in each of the
> > groups. The datasheet contains all production days, so I first need
> > to pull out the 40 days and then find 25 random days within this
> > period.
> > Can Stata pull out 25 production days from the 40-day interval and
> > display the results in a table? I would like to run an ANOVA between
> > and within groups on these 25 random days in the 40-day maximum
> > production period.