Hi,
Isn't there a -winsor- ado (written by Nick) that can be used to deal with
outliers? In some cases Winsorizing may be preferable to throwing out the
observations.
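For comparison, here is a minimal sketch of Winsorizing by hand, using only official commands. The variable price is borrowed from the example further down the thread; price_w, w_lo and w_hi are names made up for illustration, and the 5th/95th cutoffs are an assumption, not a recommendation. (Nick's -winsor- is, I believe, available from SSC; its help file documents the actual syntax.)

```stata
* Winsorize price at its 5th and 95th percentiles (cutoffs are an assumption)
quietly summarize price, detail
scalar w_lo = r(p5)
scalar w_hi = r(p95)

* start from a copy so the original variable is left untouched
generate double price_w = price

* pull values in each tail back to the limit; missing values stay missing
replace price_w = scalar(w_lo) if price < scalar(w_lo)
replace price_w = scalar(w_hi) if price > scalar(w_hi) & !missing(price)
```

Unlike dropping, this keeps every observation and only pulls the tails in to the chosen percentiles.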
rajesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Rodrigo A. Alfaro
Sent: 07 June 2007 14:17
To: [email protected]
Subject: st: Re: RE: Re: RE: RE: IQR
It seems to be a 'common' practice when COMPUSTAT
data are used. The dataset is composed of the balance sheet
reports of US firms. It is difficult to identify in the
data mergers, splits or any other change in ownership that
implies a huge change in the composition of a firm (in terms
of assets, fixed capital, etc.), so dropping extreme values
of the change in assets allows you to 'delete' the unexplained
firms. A similar problem affects the price, where
sometimes a change in dividend policy can produce a
jump that makes sense only when the researcher knows
about the change in policy. Usually, researchers do not know
about these policies, or it is a titanic (and maybe useless)
job to try to include them in the analysis.
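As a rough sketch of that kind of trimming, assume a firm-year panel with hypothetical variables firm, year and total_assets. The 10th/90th percentile cutoffs are borrowed from the Baum, Caglayan and Ozkan passage quoted below; growth, g_p10, g_p90 and in_sample are names made up for illustration.

```stata
* Sketch only: firm, year and total_assets are assumed variable names.
* Growth is relative to the firm's previous observation, so gaps in
* the years would make it span more than one year.
bysort firm (year): generate double growth = total_assets / total_assets[_n-1] - 1

* percentile limits of growth within each year
egen double g_p10 = pctile(growth), p(10) by(year)
egen double g_p90 = pctile(growth), p(90) by(year)

* 1 if growth lies within the limits, 0 otherwise (including missing growth)
generate byte in_sample = inrange(growth, g_p10, g_p90)
```

Firms are flagged rather than dropped, so the choice can be revisited later with -if in_sample-.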
Rodrigo.
----- Original Message -----
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Sent: Thursday, June 07, 2007 6:44 AM
Subject: st: RE: Re: RE: RE: IQR
>I am shocked to find my good friend Kit Baum throwing
> away 20% of his data. No doubt this profligacy
> matches his research problem. In environmental science,
> which I know more about,
> throwing out the tails would lose all the bangs and leave
> mostly whimpers, but he is doing economics, where some
> of the extreme values may represent accountancy artefacts.
>
> On -iqr-, since half the work is done, perhaps there is
> a case for a formal update. I will contact the author,
> Larry Hamilton, whose book's various editions have
> served so many Stata users so well. (It got me started.)
>
> But -iqr-'s main function I see as reporting. Rodrigo's
> example of a -foreach- loop cycling over variables
> and -summarize- results is the way to go for selection of subsets
> of data.
>
> Nick
> [email protected]
>
> Rodrigo A. Alfaro
>
>> ///
>> Wow Nick, your translation from 'demotic' can only be compared with
>> the work of Thomas Young. Just kidding, very good job indeed!!
>> Returning to the problem, it would be nice to get a list of returned
>> scalars in your new version. For this problem, the limits beyond which
>> observations are supposed to be outliers could then be used for sample
>> selection or to create new variables.
>>
>> As an alternative to the procedure discussed so far, there is another
>> way to 'deal' with outliers (if you want to), which is cutting the
>> tails: "we trimmed firms whose total assets growth rate exceed the 90th
>> percentile or fall short of the 10th percentile of the annual
>> distribution." (page 6 of Baum, Caglayan, Ozkan (2003), Working Paper
>> 566, Boston College). For example, tdavis could use the following code
>> to 'drop' the outliers that are below the 5th or above the 95th
>> percentile of each variable:
>>
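>> foreach x of varlist price total_assets inventories {
>>     * copy each variable, then set values outside the limits to missing
>>     gen double `x'_wo = `x'
>>     sum `x', d
>>     local u = r(p95)
>>     local l = r(p5)
>>     * `x' < . keeps already-missing values out of the upper-tail count
>>     replace `x'_wo = . if `x' > `u' & `x' < .
>>     replace `x'_wo = . if `x' < `l'
>> }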
>
>> From: "Nick Cox" <[email protected]>
>> To: <[email protected]>
>> Sent: Wednesday, June 06, 2007 6:53 PM
>> Subject: st: RE: RE: IQR
>>
>>
>> >I spent a while updating -iqr- to -iqr8-.
>> >
>> > This was unnecessary, because -iqr- works
>> > fine under version control. (How many programs
>> > would run without change in other software after 16 years?)
>> > Nevertheless, few Stata users will now be accustomed to reading
>> > or writing Stata like this:
>> >
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/