
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: AW: st: sort of standardization


From   "Joseph Coveney" <[email protected]>
To   <[email protected]>
Subject   Re: AW: st: sort of standardization
Date   Thu, 13 May 2010 23:08:09 +0900

I'm not trying to side with Rich on this, but I know of at least one other 
area where such a formula is used for a range--or at least a range-like 
concept--and is not intended to indicate cardinality:  in medical research, 
length of hospital stay is defined as 

    discharge date - admittance (admission) date + 1

and duration of an episode of a drug side effect is apparently defined in 
some quarters as

    recovery time - onset time + 1

See

http://stata.com/statalist/archive/2002-12/msg00262.html

and 

http://www.stata.com/statalist/archive/2003-05/msg00433.html
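To make the inclusive count concrete, here is a minimal sketch of the length-of-stay formula in Stata. The dates and the variable names admit, discharge and los are hypothetical, chosen only for illustration.

```stata
* Hypothetical data: one stay, admitted 10 May 2010, discharged 13 May 2010
clear
set obs 1
generate admit     = mdy(5, 10, 2010)
generate discharge = mdy(5, 13, 2010)
format admit discharge %td

* Inclusive count of calendar days: both endpoints are counted
generate los = discharge - admit + 1
list admit discharge los
```

The difference between the two dates is 3, while the inclusive count of days (10, 11, 12 and 13 May) is 4, which is what the +1 is for.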

Joseph Coveney

--------------------------------------------------------------------------------

For sure, but who (else) calls this the range? 

That's just (a version of) the number of distinct values. In some moods,
or in some circles, many of us would call it the cardinality. 

A version of, because to spell out the obvious, even with integers there
are at least three definitions that need not give the same numerical
answer:

1. Number of distinct values observed. 

2. Number of distinct values possible in principle. 

3. max - min + 1. 

Otherwise put, are we talking different terminology or different
concepts?  
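A small sketch may show how these quantities can differ on the same data. The variable x and its values are made up, and the scale of 1 to 10 is an assumption; definition 2 cannot be computed from the data at all and would be 10 under that assumption.

```stata
* Hypothetical data: integer scores on an assumed 1-10 scale
clear
input x
2
3
3
7
end

* Definition 1: number of distinct values observed
* (-tabulate- leaves the number of rows in r(r))
quietly tabulate x
local n_observed = r(r)

* Definition 3: max - min + 1, alongside the plain difference max - min
quietly summarize x, meanonly
local n_span = r(max) - r(min) + 1
local width  = r(max) - r(min)

display "distinct observed: `n_observed'"
display "max - min + 1:     `n_span'"
display "max - min:         `width'"
```

With these values the three results are 3, 6 and 5 respectively, so the definitions are not interchangeable even for integers.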

Nick 
[email protected] 

Lachenbruch, Peter

I think Rich is thinking of the number of distinct integers between 1
and 10, while the range is generally defined as the largest minus the
smallest.

Nick Cox

The word "range" is surely ambiguous, although the ambiguity does not
bite hard. I have no difficulty in saying both that the range is the
interval [1,10] and that the range is the difference 9. Does that differ
from Rich's view? 

Nick 
[email protected] 

Richard Goldstein

look at it this way -- if my min is 1 and my max is 10, then the range
is 10 (it seems to me), not 9 -- i.e., I think of the range as the min
to the max *inclusive* of each endpoint; StataCorp apparently disagrees
;-)

On 5/12/10 10:46 AM, Martin Weiss wrote:

> " local range=r(max)-r(min)+1"
> 
> Rich, what does the "+1" term do for the "range"? I took the definition in
> my code from [R], page 204. Am I missing anything?

Richard Goldstein
 
> if I understand correctly what you want, I would do the following within
> a -foreach- loop:
> 
> summarize variable
> calculate the range from r(min) and r(max)
> divide the old variable by this calculated range inside a -gen-
> 
> e.g.,
> 
> foreach var of varlist .... {
> qui su `var'
> local range=r(max)-r(min)+1
> gen `var'3=`var'/`range'
> }
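
For readers who want something they can paste and run, here is a hedged sketch of the same loop on made-up data. The variables a and b and the _r suffix are hypothetical. It takes the range as max - min, the definition questioned elsewhere in this thread; append + 1 to the local if the inclusive count is what is wanted.

```stata
* Hypothetical data standing in for the real variables
clear
input a b
1  10
4  20
10 50
end

foreach var of varlist a b {
    * -meanonly- is enough to obtain r(min) and r(max)
    quietly summarize `var', meanonly
    local range = r(max) - r(min)
    * Skip constants, for which the range is zero
    if `range' > 0 {
        generate double `var'_r = `var' / `range'
    }
}
list
```

After this scaling each new variable spans exactly one unit between its minimum and maximum, although it does not necessarily start at zero.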

> On 5/12/10 10:29 AM, Ginevra Biino wrote:

>> I have to standardize many variables (in order to run PCA).
>> Besides generating the n corresponding std(varname) vars, which I have
>> already done, I also want to generate n new variables obtained by dividing
>> each variable by its range. Can anybody help me?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



