Amani asked:
> I have an observation to make, I have a data set that is sorted by
> site, subject and rnum and I am always creating / manipulating
> variables at the subject level (rnum=1,..,20), as follows
>
> /*** 1st run **/
> sort site subject rnum
> forval X = 1/6 {
> genl prevcheck`X'=moa`X'[_n-1] if tag`X'==1
> }
>
> /*** 2nd run **/
> sort site subject rnum
> forval X = 1/6 {
> genl prevcheck`X'=moa`X'[_n-1] if tag`X'==1 , by(site subject)
> }
>
> The variables prevcheck1-prevcheck6 were identical from the two
> runs - can I trust that "by(site subject)" will always be redundant
> once Stata registers the sorting structure.

This in turn provokes a second question: what is -genl-?
It is a program published by Jeroen Weesie in STB-35 in 1997.
To answer the question, we need to look inside. Here are the relevant
lines of code, given an expression `exp' defining a new variable
`varlist', and simplifying a bit:

    tempvar x
    rename `varlist' `x'
    sort `by'
    local By "by `by': "
    quietly `By' replace `x' = `exp'

In Amani's case, invoking, for example,

    genl prevcheck1=moa1[_n-1] if tag1==1 , by(site subject)

is thus equivalent to (in Stata 7 or 8 terms)

    bysort site subject: gen prevcheck1 = moa1[_n-1] if tag1 == 1

and this is, I think, safe given Amani's prior sort.
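
To see what the -by- prefix contributes, here is a minimal sketch on made-up toy data (one site, two subjects, three records each; every value is hypothetical). It uses plain -generate- rather than -genl-, on the assumption that -genl- may not be installed. Without -by-, _n counts observations across the whole data set, so moa1[_n-1] on a subject's first record reaches back into the previous subject. With -by-, _n restarts in each group and that value is missing. The two versions therefore agree whenever the tag variable is never 1 on the first record of a subject, which may be why the two runs looked identical.

```stata
* Toy data: all values are hypothetical
clear
input site subject rnum moa1 tag1
1 1 1 10 0
1 1 2 11 1
1 1 3 12 0
1 2 1 20 1
1 2 2 21 0
1 2 3 22 1
end

sort site subject rnum

* No by(): _n runs over the whole data set
gen noby = moa1[_n-1] if tag1 == 1

* What by(site subject) amounts to: _n restarts within each subject
by site subject: gen withby = moa1[_n-1] if tag1 == 1

* Observation 4 is subject 2's first record, and it is tagged:
* noby picks up 12 from subject 1, withby is missing
list, sepby(site subject)
```
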
However, time and Stata move on.
1. Positively, -genl- has a key feature: it automatically
creates a variable label and a characteristic that contains
the defining expression.
2. Negatively, as a program written for Stata 5, -genl- could not use
a key feature introduced in Stata 7: -by- can sort within groups on
extra variables given in parentheses.

To be absolutely sure in all circumstances that your sort order is
maintained, you can always do this:

    bysort site subject (rnum): gen prevcheck1 = moa1[_n-1] if tag1 == 1

Doing this explicitly is, in my view, the best way to be _sure_ that
you get what you want. You might well be able to pass

    site subject (rnum)

as an argument to -by()-, but I've not tried it.
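
As a rough check of the explicit form, the sketch below scrambles some made-up toy data on purpose and then lets -bysort- restore the order. The data values and the use of -gsort- to disturb the order are assumptions for illustration only.

```stata
* Toy data: all values are hypothetical
clear
input site subject rnum moa1 tag1
1 1 1 10 0
1 1 2 11 1
1 1 3 12 0
1 2 1 20 1
1 2 2 21 0
1 2 3 22 1
end

* Deliberately put the records in the wrong order
gsort -rnum -subject

* Sort on site, subject and rnum, but form groups on site and subject only
bysort site subject (rnum): gen prevcheck1 = moa1[_n-1] if tag1 == 1

* The data are now recorded as sorted by: site subject rnum
display "`: sortedby'"
list, sepby(site subject)
```

Because the sort is part of the same command that creates the variable, the result does not depend on whatever order the data happened to be in beforehand.
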
Nick
P.S. Amani just added a footnote:

    /*** Sorry I needed to modify to tag`X'[_n] **/

Not so: the change does no harm, but it also changes nothing. Stata
always understands -varname- to mean -varname[_n]- in contexts like
these.
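
A quick way to confirm this, sketched on made-up values: create the variable both ways and let -assert- compare them.

```stata
* Toy data: all values are hypothetical
clear
input moa1 tag1
10 0
11 1
12 1
end

* Bare variable name in the if condition
gen a = moa1[_n-1] if tag1 == 1

* Explicit [_n] subscript in the if condition
gen b = moa1[_n-1] if tag1[_n] == 1

* -assert- stops with an error if the two differ in any observation
assert a == b
```
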