I vote for efficiency whenever possible, but
it is not clear that inefficiency is in
fact a major issue here. Stata's still
going to look at every observation to
decide whether it is true that year == `y'.
I tried the following experiments. You
can try too. Method 1 was actually
_slower_ on my machine, but there's not
much in it. The difference could be an artefact of
something or other, but it doesn't seem
a big deal either way. Of course, a couple
of little experiments are just that.

My concern was not so much for saving a few CPU cycles as pointing out
the often-unexpected behavior of predict. In other statistical
software, it is common that a predict will only produce in-sample
values, and you have to ask for anything else. predict without an
e(sample) restriction can produce confusion if, e.g., one looks at
any statistics related to the predicted quantities. If all that is
being done is stuffing certain of those predicted quantities in another
variable and discarding the irrelevant ones, fine. But I have learned
from experience that if it is possible to make the mistake of
considering some aspect of that entire series when only part of it is
relevant, it will eventually happen. So I think this is quite a good
general rule: predict what you want to predict, and make that explicit
if necessary.
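
A minimal sketch of the difference, for anyone who wants to see it. The
auto dataset, the regression, and the variable names yhat_all and
yhat_in are chosen only for illustration and have nothing to do with
the data discussed above.

```stata
* Fit on a subsample only (domestic cars)
sysuse auto, clear
regress mpg weight if foreign == 0

* Default: predict fills in every observation with nonmissing
* regressors, including the foreign cars that were not used in the fit
predict yhat_all

* Explicit: restrict predictions to the estimation sample
predict yhat_in if e(sample)

* The two variables differ in their number of nonmissing values,
* so any summary statistic computed from them can differ too
summarize yhat_all yhat_in
```

The second predict leaves missing values outside the estimation sample,
so a later summarize or graph cannot silently pick up out-of-sample
values.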