Statalist



RE: st: RE: if and if


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: if and if
Date   Fri, 14 Nov 2008 17:19:28 -0000

Further thoughts: 

Suppose _hypothetically_ that blocks 

if exp { 

} 

were to be interpreted observation by observation when <exp> in
principle takes on a different value for each observation, as for
example 

if varname == 2 { 

} 

Now 

1. What happens if the data change during the { }, say some or all of
the values of -varname- are changed, or -varname- is dropped? 

2. What happens if you nest two or more of these say 

if varname == 2 { 
	...
	if anothervarname > 3 { 
		...
	} 
	... 
}

Which observations are to be used in the inner block? 

3. What happens if we have two types of -if-s say 

if varname == 2 { 
	...
	... if varname == 3 
	...
} 

Which observations are to be used in the inner statement? 

In essence, the key point is that no language feature is totally
independent of any others. It's not that it would be impossible to work
out rules for these circumstances; it's just that I suspect that on
balance they would bite users -- often with highly mysterious bugs --
more than they helped them. Also, the more complicated code you wrote,
the more likely you would be to get lost on what you are doing -- not
that it's impossible now. 
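
For comparison with question 1, here is a minimal sketch of how the existing rule behaves. The data are made up for illustration and are not from the thread. The -if- command tests the expression once, on entry to the block, using the first observation, so changing the data inside the block raises no ambiguity:

```stata
* Illustrative data (assumed for this sketch)
clear
input varname y
2 0
3 0
2 0
end

* The -if- command tests varname[1] == 2 once, on entry to the block
if varname == 2 {
    * changing varname here neither re-runs nor cancels the test
    replace varname = 5
    * no -if- qualifier, so this applies to every observation
    replace y = 1
}
count if y == 1
```

Under an observation-by-observation reading, some rule would have to say whether the second -replace- sees the old or the new values of -varname-.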

Ashim in effect wants Stata to act as smartly as he can think when he is
on form -- and so do we all. The trouble is that software has to be
written so that it does not make it too easy for us to do very weird,
stupid things by accident. 

Nick 
[email protected] 

Nick Cox

Dear Ashim: 

You make the same point and you should not be surprised that my reply is
the same. Sure, you find this "natural" but you did not design Stata.
The ambiguity of 

if exp ... 

meaning one thing when the result of exp is a single true or false value
and another thing when the result is a variable's worth is not in my
view a desirable feature. What's much more important is that Bill Gould
disagreed with you in 1985. 

Those of us who grew up with Fortran, C or similar languages I suggest
find it natural that the expression in 

if exp { 

is single-valued. So, whose taste prevails? 

It may not help much now, but my prediction is that once you have been
using Stata for 17 years, you will have come to agree with Bill Gould's
decision. (It should take much less.) 
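
A minimal sketch of that single-valued reading, using made-up data and names that are not from the thread: when a decision depends on a whole variable, the variable is first reduced to one number, and the -if- command then tests that number once.

```stata
* Illustrative data: x = 1, 2, 3, 4, 5 (assumed for this sketch)
clear
set obs 5
generate x = _n

* Reduce the variable to a single value, then branch on it once
summarize x, meanonly
local xmean = r(mean)
if `xmean' > 2 {
    display "mean of x exceeds 2"
}

* Same idea for the question: does any observation have x == 1?
count if x == 1
local n1 = r(N)
if `n1' > 0 {
    display "at least one observation has x == 1"
}
```

In both cases the expression after -if- is unambiguously one true-or-false value.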

Nick
[email protected] 

Ashim Kapoor

Dear Nick,

I think it is lacking because it seems natural to me to think that

if (var1 == 2) {

replace var2 = 3
replace var3 = 4

}

In this case the program should be SMART enough to do the replacements
in the corresponding observations, and not work only from the 1st
observation, etc.

I mean, if someone does NOT know Stata and the 1st observation rule (but
does know that each var in Stata is a column vector), what would he
think on FIRST looking at this code? Surely it would seem that the
changes would be made in the corresponding observations.

It seems intuitive to me, and it has everyday use in my programming.
Maybe, for other reasons, as you point out, the people who programmed it
did not let it work that way, and they may be right.
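
For reference, a minimal sketch of how the intent of the block above is written in Stata as it stands, with the -if- qualifier on each command. The data are made up for illustration:

```stata
* Illustrative data (assumed for this sketch)
clear
input var1 var2 var3
2 0 0
1 0 0
2 0 0
end

* The -if- qualifier is evaluated observation by observation
replace var2 = 3 if var1 == 2
replace var3 = 4 if var1 == 2
list
```

Only the observations with var1 equal to 2 are changed; the others keep their original values.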

On Thu, Nov 13, 2008 at 8:01 PM, Nick Cox <[email protected]> wrote:
> This business of the two -if-s has confused many people.
>
> (Incidentally, I think -cond()- is a red herring as far as this question
> is concerned. It is without doubt very useful, but it has no bearing on
> the matter.)
>
> But what Ashim wants is, I think, not "in between" at all. He wants what
> he calls Type 2 -- the -if- command -- to do the same as what he calls
> Type 1 -- the -if- qualifier -- when both are based on tests on
> variables.
>
> However, it doesn't, and no amount of hoping, wishing, pushing or
> shoving will make it so -- apart from one borderline exception.
>
> His query raises the question of why Stata has the two constructs if
> they are really identical. But they aren't.
>
> It often happens that people used to some construct in other languages
> would like to see Stata behave that way, and vice versa, but that's the
> way it is. If you go to users' meetings (e.g. today and tomorrow in San
> Francisco), you can confront the developers and say "Why is Stata
> designed like that?". Sometimes there is a compelling argument against
> one syntax and for another; sometimes the answer is "We had to make a
> choice, and we just liked that syntax". But no syntax can be based on
> users knowing that they mean one thing in one context and another thing
> in another. All syntaxes have to be based on the language implementation
> having only one interpretation of what is meant.
>
> I have got to say, however, that the Stata documentation is not 100%
> innocent here. If you look at -help ifcmd- it gives among other details
>
> =================
> Typical use:  Example 3
>
>    program ...
>            ...
>            if x==1      local word "one"
>            else if x==2 local word "two"
>            else if x==3 local word "three"
>            else if x==4 local word "four"
>            else         local word "big"
>            ...
>    end
> ==================
>
> But this example is _only_ good programming style if -x- is a scalar.
> (Arguably not even then, as a temporary name would be preferable.) And
> that is not explained, or (to me) self-evident.
>
> Also, the fact that (again, for a variable x)
>
> if x == 1 ...
>
> will be taken as
>
> if x[1] == 1 ...
>
> is not explained in the help, although it is in [P] -if- and the FAQ
> cited earlier in this thread. I think several people would find out
> about this matter earlier and more easily if the point were made in the
> help.
>
>
> The borderline exception is thus that for a variable x
>
> if x == 1 ...
>
> is exactly equivalent to
>
> ... if x == 1
>
> if and only if you have a single-observation dataset.
>
> There are languages that support something like
>
> if x == 1 ...
>
> as equivalent to
>
> ... if x == 1
>
> -- doesn't SAS support something similar? -- but that is in a context
> that implies a loop over observations. Stata's way of supporting loops
> over observations (think of the ways that -generate- and -replace- work)
> is more usually implicit.
>
> As Ashim pointed out, Stata offers a concise alternative to
>
> replace j=2 if k==2
> replace m=2 if k==2
> replace n=2 if k==2
>
> It is
>
> foreach v of var j m n {
>        replace `v' = 2 if k == 2
> }
>
> (Ashim's own code here is illegal, by virtue of his omitting "var".)
>
> Alternatively, you can go
>
> foreach v in j m n {
>
> Short of Stata re-inventing itself as an ultra-terse language like APL
> or J, either seems to me about as concise as anyone might want. I really
> don't understand why Ashim wants a "better way". What could be better,
> or how is this lacking?
>
> Nick
> [email protected]
>
> Ashim Kapoor
>
> I realize that there are 2 kinds of if's in Stata.
>
> Type 1 : would be something like replace j = 2 if k==2
> here the replace in j would happen ONLY in the corresponding
> observation of k. THIS IS WHAT I WANT.
>
> Type 2 : the programming if, something like
>
> local j
>
> if `j'==2 {
>
> do something.
>
> }
>
> **************************************************************
>
> I guess I want to do something which is in between the above 2.
>
> I want to say the following : --
>
> replace j=2 if k==2
> replace m=2 if k==2
> replace n=2 if k==2
>
> in ONE shot.
>
> so I try : -
> ************************************** Block A
> if k==2 {
> replace j=2
> replace m=2
> replace n=2
> }
> *********************************************
> This does not work, because k==2 would mean k==2 in ALL observations,
> while I mean to say: make j / m / n = 2 in those observations where k
> is 2.
>
> How do I do this in a quick manner in Block A  ?
>
> I understand that I can always do :-
>
> foreach var of   j m n {
> replace `var'=2 if k==2
> }
>
> But is there a better way ?
>
> The reason I want this is the following  : -
>
> I want
>
> if k==2 {
> replace j=1
> replace m=5
> replace n=89
> }
>
> so the above method fails as I do not have the SAME value 2 to be put
> in each of j / m /n.
>
> Any slick way of doing this ?
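
One possible sketch for the case just described, where each variable gets a different value. It is an illustration only, not a method proposed in the thread: the data and the macro names vars, vals, nvars, v and x are assumptions. It walks two parallel lists and keeps the -if- qualifier on each -replace-:

```stata
* Illustrative data (assumed for this sketch)
clear
input k j m n
2 0 0 0
1 0 0 0
2 0 0 0
end

* Parallel lists: the i-th value goes to the i-th variable
local vars j m n
local vals 1 5 89
local nvars : word count `vars'
forvalues i = 1/`nvars' {
    local v : word `i' of `vars'
    local x : word `i' of `vals'
    replace `v' = `x' if k == 2
}
```

For only three variables, three separate -replace- commands with the qualifier are at least as clear; the loop pays off mainly when the lists are long.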

