Re: st: Elimination of outliers
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Date: Mon, 6 Jun 2011 13:39:20 +0100
In general, a very bad idea. Consider transforming your response or
predictors or using a non-identity link function in a generalized
linear model or some flavour of robust regression as more measured
tactics.
Nick
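The tactics named in the reply above might look roughly like this in Stata. This is only a sketch: the thread does not say which variable is the response, so treating lt as the response and at as the predictor is an assumption made purely for illustration, and ln_at and ln_lt are made-up names.

```stata
* Assumption for illustration only: lt is the response, at the predictor.
* ln_at and ln_lt are hypothetical variable names.
generate double ln_at = ln(at) if at > 0
generate double ln_lt = ln(lt) if lt > 0

* 1. Transform response and predictor, then fit an ordinary regression
regress ln_lt ln_at

* 2. Keep the response on its original scale and use a log link in a GLM
glm lt ln_at, link(log)

* 3. Robust regression or median (quantile) regression on the original scale
rreg lt at
qreg lt at
```

All three keep every observation in the data; they differ in how much weight the extreme values are allowed to carry.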
On 6 Jun 2011, at 12:46, "Achmed Aldai" <[email protected]> wrote:
Hi,
I am currently working on a do-file in which I want to eliminate
outliers, meaning the observations with the highest and lowest values
of certain variables, here for example at and lt. I have 150000
observations in total, and I want to delete the 25 highest and the
25 lowest observations for each variable. It might also be better to
do this in relative terms, that is, to drop not the highest and
lowest 25 observations but the lower and upper 1% of the
corresponding variables.
gvkey     at     lt
 1001   1120    231
 1001   1230    312
 1210     57     32
 1210     67     25
 1354    789    560
 1368    650    500
 1481   1230    900
 2930     21     30
 3201    234    213
 3201    256    220
 3210    267    320
 4510   4335   3214
I hope this is clear.
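If the extreme observations still need to be identified, for instance to inspect them or to check how sensitive results are to them, one option is to flag them rather than delete them. The following is a minimal sketch, assuming at and lt are numeric; the names pct_*, lo_*, hi_* and n25_* are invented for this example and are not from the original post.

```stata
* Sketch only: flag extreme values of at and lt instead of deleting them.
* All new variable names below are hypothetical.
foreach v of varlist at lt {
    * Relative rule: below the 1st or above the 99th percentile
    quietly summarize `v', detail
    generate byte pct_`v' = (`v' < r(p1) | `v' > r(p99)) if !missing(`v')

    * Absolute rule: the 25 lowest and the 25 highest values
    * (the unique option breaks ties arbitrarily)
    egen long lo_`v' = rank(`v'), unique
    egen long hi_`v' = rank(-`v'), unique
    generate byte n25_`v' = (lo_`v' <= 25 | hi_`v' <= 25) if !missing(`v')
}

* How many observations each rule would affect
count if pct_at == 1 | pct_lt == 1
count if n25_at == 1 | n25_lt == 1
```

With the flags in place, any later command can be run with and without an `if` condition on them, so the full data set stays intact.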