Nazaria Solferino
> Hello! I'm a new Stata user and I'm not very good at
> using it yet. I hope someone can help with my
> problem. I have a large dataset with some outliers, and
> I'd like to work with my variables only within a
> restricted range (without dropping observations). I
> thought I could assign zero to all values
> outside a certain range; that is, I would generate
> newvar = oldvar and then replace newvar = 0 if outside the
> range. First, is this a statistically correct procedure?
(Please use informative titles on Statalist messages.)
No. Stata will take the new values of 0 just as literally
as the old outlying values.
One way to exclude outliers is by an -if- condition:
e.g.
regress y x1 x2 x3 if y < 10000
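A slightly fuller sketch of the same idea follows. The auto dataset shipped with Stata, the variable names and the cutoff of 10000 are illustrative assumptions only, not anything from the original question. If a recoded copy of the variable is really wanted, missing (.) is a safer code than 0, because Stata omits missing values from estimation.

```stata
* Sketch only: dataset, variables and cutoff are illustrative assumptions.
sysuse auto, clear

* Restrict the estimation sample; no observations are dropped or changed.
regress price mpg weight if price < 10000

* A recoded copy: values at or above the cutoff become missing, not 0.
generate price_in = price if price < 10000

* Observations with missing price_in are left out automatically.
regress price_in mpg weight
```

Both regressions use the same observations, and the original variable is left untouched.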
Naturally, there are other approaches to your problem
including
1. a robust technique. I've found -qreg- very good.
2. transformation.
3. -glm- with a nonlinear link (e.g. log).
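As a rough sketch, the three alternatives might look like this. The auto dataset and its variables are stand-ins, and the gamma family in the -glm- call is my assumption for a positive, right-skewed response; other families may suit your data better.

```stata
* Sketch only: dataset, variables and family choice are illustrative assumptions.
sysuse auto, clear

* 1. Median (quantile) regression, less sensitive to outlying responses.
qreg price mpg weight

* 2. Transformation: model the logarithm of a positive response.
generate ln_price = ln(price)
regress ln_price mpg weight

* 3. Generalized linear model with a log link, response on its original scale.
glm price mpg weight, family(gamma) link(log)
```

None of these requires deciding in advance which observations count as outliers.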
> Second, if it is correct, how could I do that
> in Stata without finding each interval with the -centile-
> command for each variable, instead writing a general
> program that I can apply to every variable?
That depends partly on what your project is. But
my guess is that -qreg- or -glm- might offer
a more general approach than what you propose
here.
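If you still want range-based exclusion applied to several variables at once, a loop is one way to avoid running -centile- by hand each time. This is only a sketch under stated assumptions: the auto dataset, the variable list, the 1st and 99th percentiles as limits, and the in_ prefix for the indicator names are all hypothetical choices.

```stata
* Sketch only: dataset, variable list, percentiles and names are assumptions.
sysuse auto, clear

* For each variable, flag values lying within its 1st to 99th percentile range.
foreach v of varlist price mpg weight {
    quietly _pctile `v', percentiles(1 99)
    * inrange() returns 0 for missing values, so those are flagged as outside.
    generate byte in_`v' = inrange(`v', r(r1), r(r2))
}

* Use the flags in an -if- condition; nothing is dropped or overwritten.
regress price mpg weight if in_price & in_mpg & in_weight
```

The indicators leave the original data intact, so the limits can be changed and the loop rerun without reloading the dataset.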