At 05:49 PM 4/25/2003 -0700, Daniel Sabath wrote:
> Hi David,
> Thank you. A lot of what you mentioned makes sense. Please see my previous
> reply to Fred and Nick, as I think I explained myself a little more
> clearly there.
> Now on to the nitty gritty...
[...]
I am presently unable to analyze everything you wrote, but I will give you 
a few pointers.
> It was quite a surprise to find out that the if statement only evaluates
> its condition once and not on each row. As a result, I'm not sure when it
> would be useful.
The -if- command is very useful, usually for conditions that are above the
level of individual observations.  (For row-by-row conditions, use the -if-
qualifier on a command instead.)  Here's an example:
capture assert ~mi(myvar)
if _rc ~=0 {
 disp as error "myvar has missings"
 exit 459
}
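To make the contrast concrete, here is a minimal sketch of the -if- command next to the -if- qualifier.  It assumes the auto dataset that ships with Stata; the variable "expensive" is a made-up name for illustration.

```stata
* Illustrative only: uses the auto dataset shipped with Stata.
sysuse auto, clear

* -if- command: the expression is evaluated once.  A bare variable
* name here means its value in the first observation, i.e. price[1].
if price > 5000 {
    display "price in observation 1 exceeds 5000"
}

* -if- qualifier: the expression is evaluated for every observation.
generate byte expensive = 0
replace expensive = 1 if price > 5000
```

The first form decides whether a block of commands runs at all; the second decides which observations a single command touches.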
> At what point are scalars and macros evaluated? Can you reset the value in
> the middle of the run depending on other calculations? For example:
> x = 0;
> replace y = z if x < 10, x++
They are evaluated whenever you reference them.  They are set when you set 
them.  But you can only set them *between* any -generate- or -replace- 
operations.  (-generate- and -replace- operate on variables.)
The code you wrote above is not Stata code.  The "x = 0" will not 
work.  You need to prefix it with...
 "generate" or "replace" if x is a variable
 "scalar" if x is a scalar
 "local" if x is a local macro
(Other possibilities exist, but these are the basics.)
The second statement is fine (assuming y is a variable) up to the
", x++", which is not allowed.  (Stata 8 does have a ++ operator, but
this is not where you use it.)
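As a rough sketch of what legal code along those lines could look like, the counter is set and changed in separate statements between the -replace- commands.  The data here are invented (five observations, variables y and z), and x and i are just illustrative names.

```stata
* Made-up data for illustration.
clear
set obs 5
generate z = _n
generate y = .

* x as a scalar: set it, use it, then change it in its own statement.
scalar x = 0
replace y = z if scalar(x) < 10
scalar x = scalar(x) + 1

* The same idea with a local macro.  `i' is expanded to its current
* value before the -replace- command runs.
local i = 0
replace y = z if `i' < 10
local i = `i' + 1
* In Stata 8 or later, -local ++i- is a shorthand for the line above.
```

Note that the condition has the same value for every observation, so each -replace- changes either all of them or none of them.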
> > 2: Most Stata statements that operate on the data do so on the whole
> > dataset at once. (Actually, there is a sequential aspect to the action that
> > processes the statement, but you usually don't need to think about it.)  It
> > may help to remember that, for example, in your code...
> >   gen k = 2
> >   gen l = 3
> >
> > first, k is created and set to 2 for all observations; then l is created
> > and set to 3 for all observations.
> I believe that this is one of the fundamental differences (and a hard one
> to get your head around) between Stata and other stats languages. The
> implicit loop through the data exists on each *line* of the do file and
> not around the program as a whole. Other languages work on the data a row
> at a time and allow you to make as many calculations / modifications as
> you like before proceeding. Please correct me if I am missing something.
You are correct here.  And this is truly a fundamental difference between 
Stata and the others.  Once you get this, you are on your way to using 
Stata effectively.  It is a more holistic approach to handling the
data.  (Also, it may help to remember that what you said applies to 
commands entered interactively.  A do file is just a way of preparing your 
commands.)
But there are a few situations where it either doesn't work as smoothly as 
traditional programming methods, or requires a very different way of 
thinking.  Your task of picking the three maximal values from among several 
variables is one such situation.  It is actually easy to pick the one 
maximal value:
 egen ... rmax()
Picking two or more maximal values is trickier; Scott Merryman gave you one 
possibility.  Another might be to write some "traditional-looking" code 
within a loop that references individual observations.  That is the route of
last resort.  Yet another was suggested by Nick Cox -- to reshape long and 
then sort.  After the sort, retain the three cases at the end of each 
group.  (Then, if you want, reshape wide.)  This latter method is a good 
example of the "different way of thinking" that is characteristic of Stata.
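Here is one way the reshape-and-sort idea might be sketched.  The data and all names (id, v1-v5, which, top, rank) are invented for illustration; this is not Nick's actual code.

```stata
* Made-up data: two observations, five candidate variables each.
clear
input id v1 v2 v3 v4 v5
1 3 9 4 7 1
2 8 2 6 5 10
end

* One row per (id, variable) pair; "which" records the source variable.
reshape long v, i(id) j(which)

* Sort within id and keep the last three rows (the largest values).
* Missing values sort last in Stata, so drop them first if they occur.
sort id v
by id: keep if _n > _N - 3

* Rank the survivors (1 = largest) and go back to one row per id.
bysort id (v): generate rank = _N - _n + 1
rename v top
reshape wide top which, i(id) j(rank)
list
```

The result should hold top1-top3 with the three largest values per id, and which1-which3 with the number of the variable each one came from.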
Incidentally, I believe that none of your code examples contains a reference
to an individual observation, though you may have thought that they
did.  But don't try.  Referencing an individual observation is useful in
relatively rare situations, but is generally avoided.
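For completeness, this is roughly what a reference to an individual observation looks like, using explicit subscripts and the -in- qualifier.  The data are made up, and x and diff are illustrative names.

```stata
* Made-up data for illustration.
clear
set obs 5
generate x = _n^2

* Explicit subscript: the value of x in observation 3.
display x[3]

* -in- restricts a command to a given observation (or range).
replace x = 0 in 3

* Subscripts can be relative: x[_n-1] is the previous observation's
* value, so diff is missing in the first observation.
generate diff = x - x[_n-1]
```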
Good luck.
--David
David Kantor
Institute for Policy Studies
Johns Hopkins University
410-516-5404