Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: How to get rid of outliers
From: Xixi Lin <[email protected]>
To: statalist <[email protected]>
Subject: Re: st: How to get rid of outliers
Date: Thu, 24 Oct 2013 12:31:00 -0400

Thanks, that helps a lot!
On Thu, Oct 24, 2013 at 11:55 AM, Sergiy Radyakin
<[email protected]> wrote:
> Xixi, listen to Nick's advice. But if you still want to drop them, here is how:
>
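> sysuse nlsw88, clear
> * 2.5th and 97.5th percentiles bound the central 95% of wage
> centile wage, c(2.5 97.5)
> local l=r(c_1)
> local r=r(c_2)
> * mark both cutoffs on a density plot before dropping anything
> kdensity wage, xline(`l') xline(`r')
> * keep only the observations inside the cutoffs
> keep if inrange(wage, `l', `r')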
>
> Best, Sergiy Radyakin
>
>
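The original question also asked about keeping 99% of the data. A minimal sketch of that variant follows, assuming the same nlsw88 example; it flags observations instead of dropping them, so the full dataset stays available. The variable name central99 and the local names lo and hi are hypothetical, chosen only for illustration.

```stata
* Sketch only: flag, rather than drop, observations outside the central 99% of wage.
sysuse nlsw88, clear

* 0.5th and 99.5th percentiles bound the central 99%
centile wage, centile(0.5 99.5)
local lo = r(c_1)
local hi = r(c_2)

* central99 (hypothetical name) is 1 inside the cutoffs, 0 outside, missing if wage is missing
generate byte central99 = inrange(wage, `lo', `hi') if !missing(wage)
tabulate central99, missing

* any later command can be restricted without deleting data
summarize wage if central99 == 1
```

Because nothing is deleted, results with and without the flagged observations can be compared side by side.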
> On Thu, Oct 24, 2013 at 10:45 AM, Nick Cox <[email protected]> wrote:
>> If the question is simple
>>
>> How to get rid of outliers?
>>
>> then there is a good simple long answer
>>
>> Don't (usually).
>>
>> and a good simple short answer
>>
>> Don't.
>>
>> There are of course even longer answers in many places. The thread starting at
>>
>> http://www.stata.com/statalist/archive/2007-06/msg00185.html
>>
>> throws a variety of lights on outliers and immodesty leads me to recommend
>>
>> http://www.stata.com/statalist/archive/2007-06/msg00239.html
>>
>> as particularly long-winded, and respect leads me to nominate Richard
>> Goldstein's concise remark
>>
>> http://www.stata.com/statalist/archive/2007-06/msg00240.html
>>
>> as most penetrating of all. But the whole thread is worth looking through.
>>
>> One rather long footnote to the thread is provided by
>>
>> SJ-13-3  st0313  Speaking Stata: Trimming to taste
>> (help trimmean, trimplot if installed)  N. J. Cox
>> Q3/13  SJ 13(3):640--666
>> tutorial review of trimmed means, emphasizing the scope for
>> trimming to varying degrees in describing and exploring data
>>
>> but the best Stata incantation of all is likely to be -glm-.
>>
>> More generally, modify your model so that outliers are accommodated.
>>
>> Don't modify your data because they are awkward to analyse.
>>
>> Nick
>> [email protected]
>>
>>
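As one possible reading of the -glm- suggestion, the sketch below fits a generalized linear model with a log link to the same nlsw88 data, so that high wages stay in the sample. The gamma family, the log link, the robust standard errors, and the predictors grade and ttl_exp are all assumptions made for illustration; the thread does not specify any of them.

```stata
* Sketch only: accommodate a long right tail in the model instead of trimming the data.
sysuse nlsw88, clear

* The log link models the mean of wage on its original scale;
* family, link, and predictors here are illustrative assumptions.
glm wage grade ttl_exp, family(gamma) link(log) vce(robust)

* Replay the results with exponentiated coefficients,
* read as multiplicative effects on mean wage
glm, eform
```

Whether this specification suits a given dataset is a modelling question that the usual residual and fit checks would need to answer.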
>> On 24 October 2013 15:31, Xixi Lin <[email protected]> wrote:
>>> Hi All,
>>>
>>> I know this seems like a very simple question, but I still want to ask
>>> how to keep 99% (or 95%) of the data. Is it just a matter of chopping off
>>> everything beyond 2 standard deviations? How would I code that?
>>>
>>> Thanks a lot.
>>>
>>> Best,
>>> Xixi Lin
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
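On the "2 standard deviations" part of the question: a mean plus or minus 2 SD band covers roughly 95% of the data only when the distribution is close to normal. The sketch below, assuming the nlsw88 wage variable used earlier in the thread, counts how many observations such a band actually contains, so the share can be compared with the 95% obtained from percentile cutoffs. The local names lo, hi, and inside are hypothetical.

```stata
* Sketch only: share of observations within mean +/- 2 SD of wage.
sysuse nlsw88, clear

quietly summarize wage
local lo = r(mean) - 2*r(sd)
local hi = r(mean) + 2*r(sd)

* number of observations inside the band
quietly count if inrange(wage, `lo', `hi')
local inside = r(N)

* number of nonmissing observations
quietly count if !missing(wage)
display "Share within mean +/- 2 SD: " %5.3f `inside'/r(N)
```

For a skewed variable the lower bound may fall below the smallest observed value, in which case the band removes observations from one tail only.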