Re: st: Fraud methods in Stata
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Fraud methods in Stata
Date
Fri, 26 Sep 2008 11:30:44 -0500
I would search for publications of longstanding Stata user Stephen Evans
in this area. He has done very serious work on (possibly fraudulent)
medical data.
That said, I remain puzzled by the implication that outliers are prima
facie evidence of fraud. My own impression is that fraudulent people
wish to create datasets that look genuine and that they are thus unlikely
to add or manufacture outliers, unless those outliers serve their
purpose somehow, but that's just a guess. The main ways I can think of
in which fraudulent data can sometimes be identified are noticing
agreement that is "too good to be true" and looking at the patterns
of first and last digits in the data. Another obviously related issue is
plagiarism of published data.
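
As a rough illustration of the last-digit idea, here is a minimal
sketch in Python (not Stata, and not anyone's published method). It
assumes integer-valued measurements and that honest last digits would
be roughly uniform, which need not hold when values are legitimately
rounded. The function name, the sample readings and the 5% cutoff are
all hypothetical choices made for the example.

```python
from collections import Counter

# Upper 5% point of the chi-square distribution with 9 degrees of freedom
CHI2_CRIT_9DF_05 = 16.919


def last_digit_chi2(values):
    """Chi-square statistic comparing last-digit counts with a uniform spread."""
    digits = [abs(int(v)) % 10 for v in values]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one value")
    counts = Counter(digits)
    expected = n / 10  # each digit 0-9 equally likely under the assumption
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Hypothetical integer readings (made up for illustration only).
# With so few values the chi-square approximation is crude; in practice
# one would want expected counts of about 5 or more per digit.
readings = [120, 130, 125, 140, 110, 120, 135, 130, 120, 150,
            125, 130, 140, 120, 115, 130, 120, 125, 140, 130]

stat = last_digit_chi2(readings)
print(f"chi-square = {stat:.1f}, exceeds 5% cutoff: {stat > CHI2_CRIT_9DF_05}")
```

A large statistic only says that the last digits are unevenly spread.
Honest digit preference (rounding to 0 or 5) produces the same signal,
so this is a screening device, not evidence of fraud in itself.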
Nick
[email protected]
Williams, Rachael wrote:
I am considering methods of detecting fraud in a hypothetical clinical
trial with a large number of centres, but only a few patients per
centre.
In addition, many variables will be binary.
Would Cook's D be appropriate here?
Is it possible to calculate Mahalanobis' distance in Stata in order to
detect (possibly fraudulent) inliers, outliers and near duplicates in a
dataset?
If anyone has any ideas of other ways to detect possible fraud I would
love to hear from you too!