Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Confusion with Winsorizing
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Date: Wed, 15 Jan 2014 17:56:18 +0000
A more mundane problem to watch out for is ties. For example, if more
than 5% of values (say 5 + something%) equal the smallest value, the
smallest Winsorized value will occur with relative frequency 5 +
something%, not 5%.
Nick
[email protected]
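
To see the effect of ties concretely, here is a minimal sketch on made-up data. The variable names x and xW and the scalar names x_p5 and x_p95 are hypothetical and do not come from the thread. By construction 10 of the 100 values tie at the minimum, so after Winsorizing at the 5th and 95th percentiles the lower bound carries 10% of the observations rather than 5%.

```stata
* Toy data (hypothetical): observations 1-10 all equal 1, the rest are 11..100.
clear
set obs 100
generate double x = cond(_n <= 10, 1, _n)

* Save the percentiles in scalars before any other r-class command runs.
summarize x, detail
scalar x_p5  = r(p5)
scalar x_p95 = r(p95)

* Winsorize into a new variable, leaving x untouched.
generate double xW = min(max(x, x_p5), x_p95)

* Share of observations sitting at the lower Winsorized bound.
count if xW == x_p5
display "share at lower bound: " r(N) / _N
```

Running the same -count- on real data, once per event, is a cheap way to find out whether ties are inflating the tails.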
On 15 January 2014 17:44, Nick Cox <[email protected]> wrote:
> I would never overwrite existing data with Winsorized data. Doing that
> may already have messed up results irretrievably if there is some
> mistake in what you are doing.
>
> A smaller objection is that the local macros are unneeded here.
> Assuming that the abbreviation -Postt- works,
>
> clonevar PostW = Postt
> forvalues e = 1/55 {
>     summarize Postt if (Postt != 0 & Event`e' == 1), detail
>     replace PostW = r(p95) if (Event`e' == 1 & Postt > r(p95))
>     replace PostW = r(p5) if (Event`e' == 1 & Postt < r(p5))
> }
>
> That still leaves your major question. An implicit assumption here is
> that the values of 1 for -Event*- are disjoint, i.e. any value of such
> a variable being 1 rules out the same for any other such variable. We
> have no information on that from you.
>
> Nick
> [email protected]
>
>
> On 15 January 2014 17:30, Nima Darbari <[email protected]> wrote:
>> I have written the simple code below to Winsorize a variable
>> separately within each of 55 events but, perhaps because of a silly
>> mistake, it doesn't work properly.
>>
>> forvalues e = 1(1)55 {
>>     sum PostturnoverFirm if (PostturnoverFirm != 0 & Event`e' == 1), de
>>     local p95 = r(p95)
>>     local p5 = r(p5)
>>     replace PostturnoverFirm = `p95' if (Event`e' == 1 & PostturnoverFirm > `p95')
>>     replace PostturnoverFirm = `p5' if (Event`e' == 1 & PostturnoverFirm < `p5')
>> }
>>
>>
>> The line for values above the 95th percentile works correctly, but
>> the line for values below the 5th percentile replaces the figure for
>> almost all of the remaining observations. Does anyone know what's
>> wrong with this?
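
Two diagnostic checks follow from the code and replies in this thread, sketched here on hypothetical toy data (the data, the variable nEvents, and the scalar p5_e1 are illustrative assumptions, not from the posts). The first counts how many -Event*- indicators equal 1 in each observation, which tests the disjointness assumption. The second counts the zeros that the condition "< 5th percentile" would overwrite, given that -summarize- excludes zeros but -replace- does not. Whether either explains the original poster's result cannot be told from the thread; they are simply things worth checking.

```stata
* Hypothetical toy data standing in for the poster's dataset.
clear
set obs 40
generate byte Event1 = _n <= 25
generate byte Event2 = _n > 20                   // overlaps Event1 in obs 21-25
generate double Postt = cond(mod(_n, 2), 0, _n)  // odd-numbered obs are zero

* Check 1: how many Event indicators equal 1 in each observation?
* Any value above 1 means the events are not disjoint.
egen byte nEvents = anycount(Event1 Event2), values(1)
tabulate nEvents

* Check 2: the percentile is computed with zeros excluded, but the
* replace condition does not exclude them, so count the zeros that
* "Postt < 5th percentile" would overwrite for event 1.
summarize Postt if Postt != 0 & Event1 == 1, detail
scalar p5_e1 = r(p5)
count if Event1 == 1 & Postt == 0 & Postt < p5_e1
```

If the second count is large on real data, adding the same nonzero condition to the -replace- statements would keep zeros from being pulled up to the 5th percentile.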