Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: normalize variables

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: normalize variables
Date	Sun, 11 Apr 2010 17:57:23 +0100

The word "normalize" here evidently means scale to a [0,1] range. 

Note first that using -egen- to do this is unnecessary unless you want
to do this panelwise. 

su x1, meanonly 
gen normal_x1 = (x1 - r(min)) / (r(max) - r(min)) 
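The same two lines can be wrapped in a loop when several variables go into the index. What follows is only a sketch: the toy data and the choice to loop over x1 and x2 are assumptions for illustration, not part of the original question's dataset.

```stata
* Toy data for illustration only (values are made up)
clear
input x1 x2
2 10
4 20
6 30
end

* Scale each variable to [0,1] over the entire dataset
foreach v of varlist x1 x2 {
    * meanonly skips the variance calculation but still leaves r(min) and r(max)
    su `v', meanonly
    gen normal_`v' = (`v' - r(min)) / (r(max) - r(min))
}
```

Note that a variable that is constant gives a zero denominator, so the result would be missing for every observation.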

If you want to do this panelwise, it does become convenient to use
-egen- as you say. 
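A minimal sketch of the panelwise version, assuming a panel identifier called id (a hypothetical name; substitute whatever identifies panels in your data) and a made-up toy dataset:

```stata
* Toy panel data for illustration only; id is a hypothetical panel identifier
clear
input id x1
1 2
1 4
1 6
2 10
2 20
2 30
end

* Minimum and maximum of x1 within each panel
egen min_x1 = min(x1), by(id)
egen max_x1 = max(x1), by(id)

* Scale to [0,1] within each panel; missing if a panel has max == min
gen normal_x1 = (x1 - min_x1) / (max_x1 - min_x1)
list id x1 normal_x1, sepby(id)
```

Here every panel gets its own 0 and its own 1, so equal values of normal_x1 in different panels need not correspond to equal values of x1.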

What I don't understand is how your main question can be answered
without knowing why you want to do this and why you think that you
"must" normalize. The best answer I can offer is that your indexes will
vary depending on whether they are calculated w.r.t. the entire dataset or
individual panels, and the choice between them is a scientific or
substantive one. 

Nick 
[email protected] 

[email protected]

I am using panel data analysis and I want to generate an index but first
I must normalise the variables (x1,x2) contained in the index. I
normalised them by the following set of commands:

egen min_x1=min(x1)
egen max_x1=max(x1)
gen normal_x1=(x1-min_x1)/(max_x1-min_x1)

So, my question is whether I need to transform the commands to include
the "by(.)" option i.e.

egen min_x1=min(x1), by(.)
egen max_x1=max(x1), by(.)
gen normal_x1=(x1-min_x1)/(max_x1-min_x1)

and if so, should I include the panel or the time variable?

