Friedrich's answer gives a good low-level way to do this in Stata.
In programs or do-files, you will get a little more speed or efficiency
by using a convenient option of -summarize-.
su var, meanonly
gen var2 = (var - r(min)) / (r(max) - r(min))
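The name -meanonly- is misleading here: despite the name, the option
also leaves the minimum and maximum in r(min) and r(max). What it skips
is the variance calculation and the display of results.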
This answer uses the minimum and maximum across all panels, as Allison
seems to be asking for.
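As a quick check of the overall scaling, here is a minimal sketch. It assumes the auto data shipped with Stata and its variable price as a stand-in for your own data; price01 is just an illustrative name.

```stata
* Sketch only: auto.dta and price stand in for your own data
sysuse auto, clear
summarize price, meanonly
* r(min) and r(max) are still in memory after -summarize, meanonly-
generate double price01 = (price - r(min)) / (r(max) - r(min))
* the scaled variable should run from 0 to 1
summarize price01, meanonly
display "min = " r(min) "  max = " r(max)
```

Using -double- here is a choice on my part, to keep the division as precise as possible.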
Sometimes people will want to scale by the extremes in each panel. In
that circumstance
egen min = min(var), by(panel)
egen max = max(var), by(panel)
gen var2 = (var - min) / (max - min)
is one convenient (if not especially efficient) way to proceed using
official Stata commands.
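A sketch of the same idea on data you can load directly follows. It assumes the auto data, with foreign playing the part of the panel identifier; pmin, pmax and price01 are illustrative names. Note that a panel in which the variable is constant has max equal to min, so the division returns missing for that panel.

```stata
* Sketch only: foreign stands in for a panel identifier
sysuse auto, clear
egen double pmin = min(price), by(foreign)
egen double pmax = max(price), by(foreign)
* if pmax == pmin in some group, the result is missing for that group
generate double price01 = (price - pmin) / (pmax - pmin)
* each group should now run from 0 to 1
tabstat price01, by(foreign) statistics(min max)
```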
One canned solution is available from Stas Kolenikov:
_gstd01 from http://web.missouri.edu/~kolenikovs/stata
    _gstd01 -- Standardize to [0,1]
    Author: Stas Kolenikov, [email protected]
    This program is an extension to the egen command
    that standardize the specified variable into [0,1] range
    so that 0 corresponds to the minimum value, and 1
I don't know if that Russian email address still works. Stas has been
based in the US for some years now, as his mailings to this list and the
Missouri URL above do indicate. Stas' code is informative:
program define _gstd01
    version 6
    gettoken type 0 : 0
    gettoken g 0 : 0
    gettoken eqs 0 : 0
    syntax varname [if] [in], [BY(varlist)]
    marksample touse
    if "`by'"=="" {
        tempvar by
        qui g byte `by'=0 if `touse'
    }
    tempname byvar vmin vmax t
    tokenize `varlist'
    sort `touse' `by' `varlist'
    qui by `touse' `by' : g long `byvar'=1 if _n==1
    qui replace `byvar'=sum(`byvar')
    qui by `touse' `by': g double `vmin'=`varlist'[1]
    qui g double `t'=-`varlist'
    sort `touse' `by' `t'
    qui by `touse' `by': g double `vmax'=-`t'[1]
    qui g `type' `g'=(`1'-`vmin')/(`vmax'-`vmin') if `touse'
    lab var `g' "`1' standardised to [0,1]"
end
In fact, that can be slimmed down a bit. The variable `byvar' does
nothing and the double sorting to get maxima as well as minima is
unnecessary given that any missing values are segregated by
-marksample-.
*! 1.0.0 NJC 5 Jan 2009 after Stas Kolenikov
program _gstdminmax
    version 8
    gettoken type 0 : 0
    gettoken g 0 : 0
    gettoken eqs 0 : 0
    syntax varname [if] [in], [BY(varlist)]
    marksample touse
    local y `varlist'
    * within each group, sorted on the variable, the first value
    * is the minimum and the last is the maximum
    qui bysort `touse' `by' (`y') : ///
        g `type' `g'= (`y' - `y'[1])/(`y'[_N] -`y'[1]) if `touse'
    * label with the variable name (no -tokenize- here, so not local 1)
    lab var `g' "`y' standardised to [0,1]"
end
With this code in _gstdminmax.ado on your -adopath- an example of
panelwise scaling would be
egen var2 = stdminmax(var), by(panel)
Overall scaling would omit the option call:
egen var2 = stdminmax(var)
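If you want to convince yourself that the egen function agrees with a calculation by hand, a sketch like this should do. It assumes _gstdminmax.ado is already saved on your -adopath-, and it uses the auto data with foreign as a stand-in panel identifier; a and b are illustrative names. price has no missing values in that dataset, which is why the plain -bysort- comparison is safe here.

```stata
* Sketch only: requires _gstdminmax.ado on the adopath
sysuse auto, clear
egen double a = stdminmax(price), by(foreign)
* same calculation by hand: sort within group, first is min, last is max
bysort foreign (price): generate double b = (price - price[1]) / (price[_N] - price[1])
assert a == b
```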
Nick
[email protected]
Friedrich Huebler
Do the commands below yield the values you need?
. sum var
. gen var2 = (var-r(min))/(r(max)-r(min))
<[email protected]>
> I have a question about creating formulas or scripts in STATA for my
> panel data set.
> I wish to normalise my panel data using the following formula:
> (Vi-Vmin)/(Vmax-Vmin) (where Vi is the actual value of a variable,
> Vmax is the maximum value in a complete data series, and Vmin is the
> minimum). How do I generate a new list of variables using this formula?