Statalist



st: bug in "egen" command?


From   Rembert De Blander <[email protected]>
To   [email protected]
Subject   st: bug in "egen" command?
Date   Fri, 26 Sep 2008 04:26:33 +0200 (CEST)

The problem can be stated as follows:

Consider the panel data setting where the command

<<tsset pid time>>

was issued. Under these circumstances, the command:

<<by pid: egen double i`x' = mean(`x')>>

should produce exactly the same result as:

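<<
  * The locals x and id are assumed to be set already;
  * id holds the panel identifier (pid above).
  generate double I`x' = 0
  quietly levelsof `id', local(idlst)
  foreach lvl of local idlst {
    * mean of the variable within the current panel, stored for that panel
    quietly summarize `x' if (`id' == `lvl'), meanonly
    quietly replace I`x' = r(mean) if (`id' == `lvl')
  }
>>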
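
A minimal sketch of how the two results can be compared side by side. The
toy data, the seed, and the variable name d`x' are made up for illustration
and are not the dataset mentioned below; runiform() assumes a reasonably
recent Stata.

<<
  * Toy panel: 3 panels, 4 periods each, x stored as float
  clear
  set seed 12345
  set obs 12
  generate pid  = ceil(_n/4)
  generate time = mod(_n - 1, 4) + 1
  generate float x = runiform()
  tsset pid time
  local x  x
  local id pid

  * Version 1: egen
  by pid: egen double i`x' = mean(`x')

  * Version 2: explicit loop over panels
  generate double I`x' = 0
  quietly levelsof `id', local(idlst)
  foreach lvl of local idlst {
    quietly summarize `x' if (`id' == `lvl'), meanonly
    quietly replace I`x' = r(mean) if (`id' == `lvl')
  }

  * Absolute difference between the two versions
  generate double d`x' = abs(i`x' - I`x')
  summarize d`x'
  count if d`x' > 0
>>

On toy data like this the difference may well be zero everywhere; the point
is only the comparison recipe, which can be rerun several times to see
whether the egen result changes between runs.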

The problem, as far as I have seen, can appear when `x' is stored as a
float. Worse, the discrepancy between the two command sequences seems to
involve a "random" component, since it differs from run to run. The loop
always produces identical results, but the output of the egen command
varies. Of course these fluctuations are of the order of machine
precision. Nevertheless they are worrying, since they constitute
unexpected and, as far as I can tell, undocumented behaviour, which can
lead to substantial differences, especially in iterated procedures.
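
To illustrate what "of the order of machine precision" means here, the
following sketch shows that the same numbers added in a different order
need not give the same double, and that float storage changes a value
again. Whether egen actually sums in a different order from run to run is
an assumption, not something established in this post.

<<
  * Same three numbers, two summation orders
  display %21.18f (0.1 + 0.2) + 0.3
  display %21.18f 0.1 + (0.2 + 0.3)

  * The same constant as a double and rounded to float precision
  display %21.18f 0.1
  display %21.18f float(0.1)
>>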

The problem does not occur for every `x', but I have a dataset and a
sequence of commands that reproduce the described behaviour.

Since I am not allowed to post attachments, please mail me for more info:

Rembert_at_DeBlander.eu

