Thanks for this. Posting code that doesn't work together with the
problem is a good Statalist strategy too. If people don't understand the
code or want to start afresh, they can ignore the code and concentrate
on the problem. What is certain is that people can't comment very
helpfully on code they can't see.
To repeat: -egen, tag()- happens to tag the first of any group, first
meaning first in the current sort order. Whether that is the first in
any other sense necessarily may depend on that sort order.
Nick
[email protected]
Marietta Jones
Thank you very much for the suggested solution.
I tried writing my own do file based on your suggestions to my
previous post, but I couldn't get the code to work.
My main problem was that while I knew how to tag the first observation
of a group, I did not know how to successfully tag the last
observation. gsort appears to offer a good solution that I was not
aware of.
On Tue, Aug 12, 2008 at 1:51 PM, Nick Cox <[email protected]> wrote:
> If I understand your question correctly, it yields to a minor
variation
> on the technique documented in
>
> SJ-7-3 pr0033 . . . . . . . . . . . . . . Stata tip 51: Events in
> intervals
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N.
> J. Cox
> Q3/07 SJ 7(3):440--443 (no
> commands)
> tip for counting or summarizing irregularly spaced
> events in intervals
>
> and explained in my posting replying to your previous question:
>
>
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist
> .0808/date/article-20.html>
>
> local N = _N
> gsort Company_No - Date
> gen avex = .
> qui forval i = 1/`N' {
> egen tag = tag(Investor_id) ///
> if Company_no == Company_no[`i'] & ///
> inrange(Date, Date[`i'] - 90, Date[`i'])
>
> su x if tag, meanonly
>
> replace avex = r(mean) in `i'
>
> drop tag
> }
>
> The obvious variations on the previous example are using -summarize,
> meanonly- to get the mean and specifying the previous 90 days within
an
> -inrange()- condition. (This code includes the current date; modify
> according to taste.)
>
> The not-so-obvious variation is using -egen, tag()- to tag the _last_
> relevant observation for each distinct Investor_id.
>
> -egen, tag()- tags just one of any repetitions of each distinct value
of
> the variable specified. It's not documented, but the observation
tagged
> is in fact the _first_ such occurrence in the dataset in the current
> sort order. Hence use -gsort- first to reverse date order in each
panel,
> so that tagging catches the last such occurrence in time.
>
> Nick
> [email protected]
>
> Marietta Jones
>
> I have a new problem related to the dataset I presented in my original
> email (see below).
>
> For each observation in the dataset, I would like to calculate the
> mean of variable x for the same company_no on a rolling prior 90 day
> basis.
> The extra complexity arises by the fact that there may be more than
> one observations of variable x for the same investor_id for a given
> company_no.
> I would only like to include the last observation per investor_id for
> a given company_no in the calculation of the mean.
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/