Statalist The Stata Listserver


Re: st: problems with e(sample)


From   [email protected] (Vince Wiggins, StataCorp)
To   [email protected]
Subject   Re: st: problems with e(sample)
Date   Mon, 19 Jun 2006 10:19:39 -0500

Zurab Sajaia <[email protected]> asked why e(sample) was not retained when
he created it after a -preserve- and there was a subsequent -restore-.  Several
hypotheses were offered, and this led to other questions and answers that beg
for clarification.

What follows is strictly of interest to programmers of estimation commands and
I invite others to stop reading now.

Zurab wrote,

> I encountered a problem with disappearing e(sample) and couldn't find
> what can be wrong:
>
> I have an eclass program, which leaves some ereturn parameters
> as well as setting e(sample)
>
> program myprogram, eclass
> ...
> ereturn post, esample(`touse')
> ereturn local cmd="myprogram"
> ...
>
> count if e(sample)
> end // myprogram
>
> When I run the program -count- within the program displays correct
> number but if I write .count if e(sample) from command line, it
> returns 0. Somehow e(sample) gets cleared although other parameters,
> for example e(cmd) still contains "myprogram".

Several respondents suggested that -e(sample)- might not really be a function,
but rather a variable that is hidden from the users and thus could not survive
a -restore-.  (So many list members commented that I am going to break with
tradition and skip attributions throughout this response.)  This is true.
-e(sample)- is nothing more than a cleverly hidden variable exposed through
the -e(sample)- function, and that fact becomes visible when we use
-estimates hold- and see that -e(sample)- is then copied out into the dataset.

We don't, however, need to know that it is a variable to understand why it
does not survive a -restore-.  When we type, 

      . preserve
      . {...}
      . restore

Stata has no idea what happens in {...}.  Data could be dropped, added,
modified, or an entirely new dataset used.  And, that modified dataset is
where Zurab used -ereturn post- to specify his -e(sample)-.  When the
-restore- is performed, that -e(sample)- cannot be assumed to be valid,
whether it is stored as a variable or as some complicated set of rules.
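
As a minimal sketch of the sequence described above, the following do-file
fragment uses the -auto- dataset shipped with Stata and lets -regress- stand
in for Zurab's program.  Both choices are assumptions made for illustration
and are not from the original thread.

      sysuse auto, clear
      preserve
      keep if foreign
      regress price mpg
      * e(sample) marks the estimation sample in the reduced dataset
      count if e(sample)
      restore
      * the full dataset is back; e(sample) no longer applies to it
      count if e(sample)
      * other results, such as e(cmd) and e(b), are still there
      display "`e(cmd)'"
      matrix list e(b)

Going by Zurab's report, the second -count- should show 0 while e(cmd) and
e(b) remain available.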

Zurab and several other posters suggested good solutions for getting
-e(sample)- correctly set in such cases, and I won't rehash those.
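
For readers without the thread at hand, one common pattern is to mark the
sample before the -preserve- and to post the results only after the
-restore-, so that -esample()- refers to the dataset the user actually has
in memory.  What follows is only a sketch under assumptions, and not
necessarily what any poster proposed: the program name -mysketch- is
hypothetical, and -regress- stands in for whatever computation the real
program performs on the altered data.

      capture program drop mysketch
      program define mysketch, eclass
              version 9
              syntax varlist(min=2 numeric) [if] [in]
              * mark the sample while the original data are in memory
              marksample touse
              tempname b V
              preserve
              quietly keep if `touse'
              * placeholder computation on the altered data
              quietly regress `varlist'
              matrix `b' = e(b)
              matrix `V' = e(V)
              local N = e(N)
              restore
              * `touse' was created before -preserve-, so it is back in
              * memory and lines up with the restored observations
              ereturn post `b' `V', esample(`touse') obs(`N')
              ereturn local cmd "mysketch"
      end

      sysuse auto, clear
      mysketch price mpg if foreign
      count if e(sample)

With this ordering the final -count- should report the number of
observations used rather than 0.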

There were a few other comments that I would like to clarify.

Someone observed that -discard- was "dangerous" because it dropped any current
estimation results,

> I wrote that warning because Zurab's post reminds me of an occasion
> where I had several sets of results saved using -est store- and then
> I came to a spot where Stata recommends that I try -discard-.  I
> tried and then whoosh, a lot of stuff (more than I thought I'd
> bargained for) was gone.
>
> Most commands in Stata implement changes that are quite quickly /
> easily reversed, -discard- can potentially wipe out regression
> results that take hours to run.  You're of course right in saying
> that the help file should be read but in my case it didn't help very
> much as the -discard- help file says nothing about -est-.

This is intended behaviour.  Quoting from -discard-'s description, 

        In short, discard causes Stata to forget everything current without
        forgetting anything important, such as the data in memory.

We might quibble over the words "current" and "anything important", but
the first line of the description of -discard- states that e() will be
cleared.
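
A short do-file sketch of what that means in practice, assuming the -auto-
dataset shipped with Stata and a stored-results name, m1, chosen only for
illustration:

      sysuse auto, clear
      regress price mpg
      estimates store m1
      * m1 should be listed here
      estimates dir
      discard
      * e() has been cleared by -discard-
      ereturn list
      * going by the complaint quoted above, m1 should be gone as well
      estimates dir

The data in memory are untouched throughout; only the estimation results
are lost.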

Someone else commented,

> I hadn't realised until now that
>
> 1. 
> preserve 
> ... 
> restore 
>
> is not exactly equivalent to 
>
> 2. 
> tempfile whatever
> save `whatever'
> ... 
> use `whatever', clear 
>
> as with 1. e(class) stuff seems to return, while in 2. it doesn't.

They are actually quite close, when we ignore some of the specialized options
of -preserve-.  So long as estimation results are not cleared or replaced in
{...}, they (with the exception of e(sample)) will be retained in either 1. or
2.  In fact, the usage in 2. is encouraged in Stata, with or without the
use of -tempfile-s.  Among other things, it is how we can easily obtain
predictions for a dataset other than the one on which we estimated.
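
A minimal do-file sketch of that last use, assuming the -auto- dataset
shipped with Stata, with the foreign cars set aside as a stand-in for "some
other dataset".  The names holdout and price_hat are hypothetical.

      sysuse auto, clear
      tempfile holdout
      * set aside a second dataset to predict into
      preserve
      keep if foreign
      save `holdout'
      restore
      * estimate on the remaining observations
      keep if !foreign
      regress price mpg weight
      * switch datasets; e(b) and the other results stay in memory
      use `holdout', clear
      predict double price_hat
      summarize price price_hat

Here -predict- works from the retained coefficients on data that played no
part in the estimation, which is also why e(sample) cannot be carried over
to the new dataset.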

 
-- Vince 
   [email protected]
