Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
AW: AW: AW: st: sort of standardization
From
"Martin Weiss" <[email protected]>
To
<[email protected]>
Subject
AW: AW: AW: st: sort of standardization
Date
Wed, 12 May 2010 17:06:18 +0200
" if the variable is truly continuous (as in your examples),
then there is no reason, on a practical basis, to add anything"
Official Stata is committed to my version, though:
*************
clear*
set seed 1001
set obs 10000
gen x=rnormal()
gen int y=_n
tabstat x y, stat(range)
su x, mean
di in r r(max)-r(min)
su y, mean
di in r r(max)-r(min)
*************
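For the original question (dividing each variable by its range), here is a minimal sketch that uses the same definition as -tabstat, stat(range)-, namely max minus min with nothing added. The simulated data repeat the example above; the variable list and the _rng suffix are placeholders chosen for illustration, not anything prescribed in the thread.

```stata
* Sketch only: rescale each variable by its range (max - min).
* The _rng suffix is a hypothetical naming choice.
clear
set seed 1001
set obs 10000
generate x = rnormal()
generate int y = _n

foreach var of varlist x y {
    * meanonly still stores r(min) and r(max)
    quietly summarize `var', meanonly
    local range = r(max) - r(min)
    generate double `var'_rng = `var' / `range'
}

* Each rescaled variable should now have a range of about 1
tabstat x_rng y_rng, statistics(range)
```

With this definition the rescaled variables span one unit whether the original is continuous (x) or integer-valued (y).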
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Richard Goldstein
Sent: Wednesday, 12 May 2010 17:01
To: [email protected]
Cc: Martin Weiss
Subject: Re: AW: AW: st: sort of standardization
good point -- the "1" should have been "unit of measure" to encompass
everything -- if the variable is truly continuous (as in your examples),
then there is no reason, on a practical basis, to add anything
On 5/12/10 10:56 AM, Martin Weiss wrote:
>
>
>
> " I think of the range as the min
> to the max *inclusive* of each endpoint;"
>
> Gotcha! But what do non-integer values do to your conviction?
>
> *************
> clear*
> set obs 1000
> set seed 1001
> gen x=rnormal()
> su x
> di in r "Range " %3.2fc r(max)-r(min) " or " %3.2fc r(max)-r(min) +1 " ?"
> *************
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of Richard Goldstein
> Sent: Wednesday, 12 May 2010 16:51
> To: [email protected]
> Cc: Martin Weiss
> Subject: Re: AW: st: sort of standardization
>
> Martin,
>
> look at it this way -- if my min is 1 and my max is 10, then the range
> is 10 (it seems to me), not 9 -- i.e., I think of the range as the min
> to the max *inclusive* of each endpoint; StataCorp apparently disagrees ;-)
>
> Rich
>
> On 5/12/10 10:46 AM, Martin Weiss wrote:
>>
>>
>> " local range=r(max)-r(min)+1"
>>
>> Rich, what does the "+1" term do for the "range"? I took the definition in
>> my code from [R], page 204. Am I missing anything?
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On behalf of Richard Goldstein
>> Sent: Wednesday, 12 May 2010 16:40
>> To: [email protected]
>> Cc: Ginevra Biino
>> Subject: Re: st: sort of standardization
>>
>> if I understand correctly what you want, I would do the following within
>> a -foreach- loop:
>>
>> summarize variable
>> calculate the range from r(min) and r(max)
>> divide the old variable by this calculated range inside a -gen-
>>
>> e.g.,
>>
>> foreach var of varlist .... {
>>     qui su `var'
>>     local range=r(max)-r(min)+1
>>     gen `var'3=`var'/`range'
>> }
>>
>> Rich
>>
>> On 5/12/10 10:29 AM, Ginevra Biino wrote:
>>> Dear Statalist,
>>> I have to standardize many variables (in order to run PCA).
>>> Besides generating the n corresponding std(varname) vars, which I have
>>> already done, I also want to generate n new variables obtained dividing
>>> each variable by its range. Can anybody help me?
>>> Ginevra
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/