st: RE: normalize variables
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: normalize variables
Date: Sun, 11 Apr 2010 17:57:23 +0100
The word "normalize" here evidently means scale to a [0,1] range.
Note first that using -egen- to do this is unnecessary unless you want
to do this panelwise.
su x1, meanonly
gen normal_x1 = (x1 - r(min)) / (r(max) - r(min))
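Since the question mentions two variables (x1, x2), the same two lines can be wrapped in a loop. This is only a sketch: the toy values are made up for illustration, and the normal_ prefix simply follows the naming used in the question.

```stata
* Toy data; the values are invented for illustration only
clear
input x1 x2
10 5
20 7
30 9
end

* Scale each variable to [0,1] using the minimum and maximum of the whole dataset
foreach v of varlist x1 x2 {
    summarize `v', meanonly
    generate normal_`v' = (`v' - r(min)) / (r(max) - r(min))
}

list
```

-summarize, meanonly- leaves r(min) and r(max) behind without computing the variance, so it is the cheap way to get the range.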
If you want to do this panelwise, it does become convenient to use
-egen- as you say.
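A minimal sketch of the panelwise version, assuming the panel identifier is called id (a hypothetical name, as are year and the toy values; substitute your own):

```stata
* Toy panel data; id, year and the values are assumptions for illustration
clear
input id year x1
1 2001 10
1 2002 20
1 2003 30
2 2001 100
2 2002 150
2 2003 200
3 2001 7
3 2002 7
3 2003 7
end

* Minimum and maximum of x1 within each panel
egen min_x1 = min(x1), by(id)
egen max_x1 = max(x1), by(id)

* Scale within panel; a panel with constant x1 has zero range,
* so the division returns missing for that panel
generate normal_x1 = (x1 - min_x1) / (max_x1 - min_x1)

list, sepby(id)
```

Panel 3 is included to show one thing to watch: if x1 never varies within a panel, the scaled value is missing for every observation in that panel.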
What I don't understand is how your main question can be answered
without knowing why you want to do this and why you think that you
"must" normalize. The best answer I can offer is that your indexes will
vary depending on whether they are calculated w.r.t. the entire dataset
or individual panels, and the choice between them is a scientific or
substantive one.
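To see how much the two choices can differ, here is a small made-up example that puts both scalings side by side. The variable names id, year, overall_x1 and panel_x1 are assumptions for illustration, not anything from the original data.

```stata
* Toy panel data; all names and values are invented for illustration
clear
input id year x1
1 2001 10
1 2002 20
1 2003 30
2 2001 100
2 2002 150
2 2003 200
end

* Scaled against the entire dataset
summarize x1, meanonly
generate overall_x1 = (x1 - r(min)) / (r(max) - r(min))

* Scaled within each panel
egen min_x1 = min(x1), by(id)
egen max_x1 = max(x1), by(id)
generate panel_x1 = (x1 - min_x1) / (max_x1 - min_x1)

list id year x1 overall_x1 panel_x1, sepby(id)
```

In this example the largest value in panel 1 (30) is scaled to 1 within its own panel but to about 0.105 against the whole dataset, so an index built from one version ranks observations differently from an index built from the other.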
Nick
[email protected]
[email protected]
I am using panel data analysis and I want to generate an index but first
I must normalise the variables (x1,x2) contained in the index. I
normalised them by the following set of commands:
egen min_x1=min(x1)
egen max_x1=max(x1)
gen normal_x1=(x1-min_x1)/(max_x1-min_x1)
So, my question is whether I need to transform the commands to include
the "by(.)" option i.e.
egen min_x1=min(x1), by(.)
egen max_x1=max(x1), by(.)
gen normal_x1=(x1-min_x1)/(max_x1-min_x1)
and if so, should I include the panel or time variable?