At 05:49 PM 4/25/2003 -0700, Daniel Sabath wrote:
Hi David,
Thank you. A lot of what you mentioned makes sense. Please see my previous
reply to Fred and Nick, as I think I may have explained myself a little
more clearly there.
Now on to the nitty gritty...
[...]
I am presently unable to analyze everything you wrote, but I will give you
a few pointers.
It was quite a surprise to find out that the if statement only evaluates
its conditions once and not on each row. As a result, I'm not sure when it
would be useful.
It is very useful, usually about things that are above the level of
individual observations. Here's an example:
    capture assert ~mi(myvar)
    if _rc ~= 0 {
        disp as error "myvar has missings"
        exit 459
    }
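To make the contrast concrete, here is a minimal sketch, assuming the auto dataset that ships with Stata (loaded with -sysuse-); the variable name "expensive" is only illustrative. The -if- qualifier at the end of a command is tested for every observation, while the -if- command is tested once, and a bare variable name there is taken as its value in the first observation.

```stata
sysuse auto, clear

* -if- qualifier: the condition is tested for every observation
generate byte expensive = 0
replace expensive = 1 if price > 10000

* -if- command: the condition is tested once;
* "price" here means the value of price in observation 1
if price > 10000 {
    display "the first car costs more than 10,000"
}
else {
    display "the first car costs 10,000 or less"
}
```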
At what point are scalars and macros evaluated? Can you reset the value in
the middle of the run depending on other calculations? For example:
x = 0;
replace y = z if x < 10, x++
They are evaluated whenever you reference them. They are set when you set
them. But you can only set them *between* any -generate- or -replace-
operations. (-generate- and -replace- operate on variables.)
The code you wrote above is not Stata code. The "x = 0" will not work. You
need to prefix it with...
    "generate" or "replace" if x is a variable
    "scalar" if x is a scalar
    "local" if x is a local macro
(Other possibilities exist, but these are the basics.)
The second statement is fine (assuming y is a variable), up until the ",
x++". The latter is not allowed. (There is a ++ operator (in Stata 8) but
this is not where you use it.)
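A minimal sketch of that ordering, again assuming the auto dataset; the names cutoff, i, and y are made up for illustration. The scalar and the local macro are changed between the two -replace- commands, and each -replace- uses whatever value is current when it runs.

```stata
sysuse auto, clear

* set a scalar and a local macro before touching the data
scalar cutoff = 3
local i = 0

generate y = .
* the scalar is read when this command runs
replace y = mpg if rep78 < cutoff

* change both *between* commands, not inside -replace-
local i = `i' + 1
scalar cutoff = cutoff + 1

* the second -replace- sees the new value of cutoff
replace y = 2*mpg if rep78 < cutoff
display "i is now `i' and cutoff is now " cutoff
```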
> 2: Most Stata statements that operate on the data do so on the whole
> dataset at once. (Actually, there is a sequential aspect to the action that
> processes the statement, but you usually don't need to think about it.) It
> may help to remember that, for example, in your code...
> gen k = 2
> gen l = 3
>
> first, k is created and set to 2 for all observations; then l is created
> and set to 3 for all observations.
I believe that this is one of the fundamental differences (and a hard one
to get your head around) between Stata and other stats languages. The
implicit loop through the data exists on each *line* of the do file and
not around the program as a whole. Other languages work on the data a line
at a time and allow you to make as many calculations / modifications as
you like before proceeding. Please correct me if I am missing something.
You are correct here. And this is truly a fundamental difference between
Stata and the others. Once you get this, you are on your way to using
Stata effectively. It is a more holistic approach to handling the
data. (Also, it may help to remember that what you said applies to
commands entered interactively. A do file is just a way of preparing your
commands.)
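A small sketch of this one-command-at-a-time behavior, using a made-up five-observation dataset (all variable names are illustrative). Each command finishes for every observation before the next one begins. The last few lines show the sequential aspect inside a single command, which, as far as I know, proceeds from the first observation to the last.

```stata
clear
set obs 5

* a is created and filled in for all five observations
generate a = 2
* only then is b created and filled in
generate b = 3
* total can use a and b because both are already complete
generate total = a + b
assert total == 5

* within one command, observations are handled in order,
* so each row can see the value just stored in the row above
generate counter = 1 in 1
replace counter = counter[_n-1] + 1 in 2/l
assert counter == _n
```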
But there are a few situations where it either doesn't work as smoothly as
traditional programming methods, or requires a very different way of
thinking. Your task of picking the three maximal values from among several
variables is one such situation. It is actually easy to pick the one
maximal value:
egen ... rmax()
Picking two or more maximal values is trickier; Scott Merryman gave you one
possibility. Another might be to write some "traditional-looking" code
within a loop that references individual observations. That is the route of
last resort. Yet another was suggested by Nick Cox -- to reshape long and
then sort. After the sort, retain the three cases at the end of each
group. (Then, if you want, reshape wide.) This latter method is a good
example of the "different way of thinking" that is characteristic of Stata.
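The reshape-and-sort route might look like the sketch below. The data and the names id, v1-v5, which, rank, top, and biggest are invented for illustration, and rowmax() is the name later Stata releases use for the egen function called rmax() above. Missing values sort last, so in real data they would need to be dropped before keeping the last three; ties are broken arbitrarily.

```stata
clear
input id v1 v2 v3 v4 v5
1 5 9 2 7 4
2 8 1 6 3 10
end

* the single largest value per row is easy
egen biggest = rowmax(v1-v5)

* one row per id-variable pair
reshape long v, i(id) j(which)

* sort within id by value and keep the three largest
bysort id (v): keep if _n > _N - 3

* rank 1 = largest, 2 = second largest, 3 = third largest
by id: generate rank = _N - _n + 1

* back to one row per id, with top1, top2, top3
drop which
rename v top
reshape wide top, i(id) j(rank)

* sanity check: the top-ranked value is the row maximum
assert top1 == biggest
list
```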
Incidentally, I believe that none of your code examples contains a reference
to an individual observation, though you may have thought that they did. And
don't try to add one: referencing an individual observation is useful in
relatively rare situations, but is generally avoided.
Good luck.
--David
David Kantor
Institute for Policy Studies
Johns Hopkins University
410-516-5404