Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: Re: st: Same code, same machine, same data, different results
From
Christopher Baum <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: Re: st: Same code, same machine, same data, different results
Date
Thu, 6 Sep 2012 14:09:52 +0000
On Sep 6, 2012, at 2:33 AM, Dmitriy wrote:
>
> Do you have any m:m merges by any chance?
>
> DVM
>
> On Wed, Sep 5, 2012 at 2:10 PM, Mattia Landoni <[email protected]> wrote:
>> Dear statalisters,
>>
>> a friend of mine has a bizarre problem. She is running a regression as follows:
>>
>> xi: regress a b c i.d i.e
>>
>> and her output is different every time. Has anyone ever seen a
>> behavior like this? Below are some details.
>>
>> Environment:
>> - Stata 11
>> - Windows 32-bit
>>
>> Precise description:
>> The do-file imports several files from .csv, then merges them, then
>> runs the regression. If I run the do-file, I get certain results. If I
>> issue the same regression command again, I get again the same results,
>> as it should be. However, if I re-run the do-file from the beginning,
>> I get slightly different results and the regression even reports a
>> slightly different number of observations. (Say, 2663 vs. 2666). Every
>> time all the data are taken afresh from the same static .csv sources.
>> There is nothing random about the do-file, that I know. The xi:
>> command generates about 200 i-variables and a few, maybe 10, are
>> dropped because of collinearity. There are more than 2500
>> observations.
This is EXACTLY what happens when you do an m:m merge. (See IMEUS (Baum, 2006), 3.7.2, for why you really shouldn't even try.)
I once spent two hours with one of my (very bright) grad students who was having this kind of problem in his do-file, with the old merge command,
and we tracked it down to a non-unique merge key: in essence, what is now called an m:m merge.
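A minimal sketch of how a non-unique key can be caught before it does any damage. The variable names (id, x, y) and the toy data are hypothetical, made up only for illustration. As I understand it, an m:m merge pairs observations within a key value by their current order, and Stata's sort breaks ties randomly unless the stable option is given, which would explain why a rerun can pair rows differently. Run this as a do-file so the tempfile macros stay in scope.

```stata
* Hypothetical toy data: id is NOT unique in the master file.
clear
input id x
1 10
2 20
2 21
end
tempfile master
save `master'

clear
input id y
1 100
2 200
2 201
end
tempfile right
save `right'

use `master', clear
duplicates report id                // shows how many observations share a key value
capture isid id                     // isid fails if id does not uniquely identify observations
display "isid return code: " _rc    // nonzero here, because id==2 appears twice

* Declaring the merge type you expect makes Stata refuse the bad key
* instead of silently doing something order-dependent.
capture merge 1:1 id using `right'
display "merge 1:1 return code: " _rc   // nonzero: the key is not unique
```

If both checks pass on the real data, merge 1:1 (or m:1 / 1:m when only one side is unique) should give the same result on every run.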
I have had an exchange recently with a user on the LinkedIn Stata forum about this issue; he wanted to know whether
Stata had 'fixed' the merge command in Stata 12 so that it did m:m merges correctly. I argued that there was no clear
definition, in database terms, of what you are doing with an m:m merge, so no 'fix' would be forthcoming. He said he relied
on SAS to do it, with PROC SQL, which perhaps has some hardwired rules about how to handle the innate indeterminacy of such
an operation.
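For comparison, if what is actually wanted is every pairwise combination of observations within each key value, which is what an SQL-style join on a non-unique key produces, Stata's joinby command does that and the result does not depend on sort order. The sketch below uses hypothetical variable names (id, x, y) and toy data; it is not a claim about what the LinkedIn user's SAS code did.

```stata
* Hypothetical toy data: two observations per side share id==2.
clear
input id x
2 20
2 21
end
tempfile left
save `left'

clear
input id y
2 200
2 201
end
tempfile right
save `right'

use `left', clear
joinby id using `right'     // forms all pairwise combinations within id
sort id x y                 // (id, x, y) is unique here, so this order is deterministic
list, sepby(id)
count                       // 2 x 2 = 4 observations
```

Whether all pairwise combinations are what the analysis needs is a separate question; the point is only that joinby has a well-defined answer where m:m merge does not.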
Kit
Kit Baum | Boston College Economics & DIW Berlin | http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html