Re: st: _N in by-groups
From: Phil Schumm <[email protected]>
To: [email protected]
Subject: Re: st: _N in by-groups
Date: Fri, 19 Aug 2011 04:29:39 -0500
On Aug 19, 2011, at 2:41 AM, Matthew White wrote:
> But in this (admittedly silly) example, _N seems to be the number of observations in the data set:
> program dispby, byable(recall)
> disp `0'
> end
> sysuse auto
> bys foreign: dispby _N
> Both times 74 is displayed, instead of 52 in the first by-group and 22 in the second.
As Nick just pointed out, in this example, the string "_N" is being passed to your program, so that within -dispby-, the line
disp `0'
is being expanded to
disp _N
which evaluates to (and displays) 74 in all cases. But this minor issue is a red herring here, for I suspect you would be equally surprised by
program dispby, byable(recall)
    syntax [if]
    marksample touse
    count if `touse'
end
sysuse auto
bys foreign: dispby if _n == _N
Or, for an example using official commands (also with the auto dataset), compare
bys foreign: li make if _n<=5
to
bys foreign: reg mpg weight if _n<=5
The issue you have stumbled across is that merely specifying -byable()- when you define a program does not automatically mean that Stata interprets _n and _N with respect to the by-groups when you call the program. If you want that behavior, you have to program it explicitly.
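As a minimal sketch of what "programming it explicitly" can look like for the group size alone: under -byable(recall)-, -marksample- restricts the sample marker to the current by-group, so counting the marked observations gives the by-group count rather than the dataset-wide _N. The program name -dispbyn- is made up for this illustration.

```stata
capture program drop dispbyn
program dispbyn, byable(recall)
    // parse optional if/in qualifiers so marksample can use them
    syntax [if] [in]
    // under byable(recall), touse is 0 outside the current by-group
    marksample touse
    // count the marked observations: the by-group size, not the dataset _N
    quietly count if `touse'
    display r(N)
end

sysuse auto, clear
bysort foreign: dispbyn
```

With the auto dataset this should display 52 for the first by-group and 22 for the second. Note that this only recovers the group size; an expression such as _n == _N typed by the user in an -if- qualifier is still evaluated against the whole dataset.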
Now, your original question had to do with efficiency; namely, you were concerned about calling -preserve- and -restore- for each by-group. While it's true that there may be faster ways to accomplish what you're trying to do, it's usually better to wait until you know you have a problem (e.g., via profiling your command under a range of conditions) before doing a lot of extra work to try to improve performance. Nonetheless, I suspect that the answer may be to use -byable(onecall)- and to call one or more of Stata's built-in commands to handle the by-group processing. However, without knowing exactly what you are trying to do, it's not possible to give a specific recommendation.
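To illustrate the general shape of the -byable(onecall)- approach (not a recommendation for your particular problem, which I don't know): the program is called once, rebuilds the -by- prefix from the macro `_byvars', and hands the by-group processing to a built-in command, which then interprets _n and _N within each group. The program name -lastobs- and the choice of -list- are assumptions made only for this sketch.

```stata
capture program drop lastobs
program lastobs, byable(onecall)
    // the program is called once; rebuild the by-prefix ourselves
    if _by() {
        // _byvars holds the by-variables supplied by the caller
        local by "by `_byvars':"
    }
    // under the prefix, the built-in command evaluates _n and _N per group
    `by' list make if _n == _N
end

sysuse auto, clear
bysort foreign: lastobs
```

Here the last observation in each by-group is listed, because -list- itself runs under -by- and no -preserve- or -restore- per group is needed.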
-- Phil