
Re: st: How to calculate 75 percentile of other individuals on the same

From	"Quang Nguyen" <[email protected]>
To	[email protected]
Subject	Re: st: How to calculate 75 percentile of other individuals on the same
Date	Tue, 2 Oct 2007 14:29:50 -1000

Dear Nick,

Thanks so much! I highly appreciate your kind support.

Have A Wonderful Day!
Many thanks!

Quang

On 10/2/07, n j cox <[email protected]> wrote:
> Note that the general issue is also discussed at
>
> How do I create variables summarizing for each individual properties of
> the other members of a group?
> http://www.stata.com/support/faqs/data/members.html
>
> Apart from sums and means -- when we can use short-cuts based
> on some rearrangement of, or implication of,
>
> sum for everyone = sum for others + value for this individual
>
> -- this kind of problem usually requires a loop. In the FAQ
> just cited, it is shown that you can do it by looping
> over within-group identifiers, rather than the whole
> dataset.
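
A minimal sketch of the sum short-cut mentioned above, added for illustration and not part of the original post. It assumes the variables Group and X from the question and that X has no missing values; sum_all, n_all and mean_others are hypothetical names.

```stata
* Sketch only: Group and X are taken from the question;
* sum_all, n_all and mean_others are illustrative names.
* Assumes X has no missing values.
bysort Group: egen double sum_all = total(X)
by Group: generate long n_all = _N
* sum for others = sum for everyone - value for this individual
generate double mean_others = (sum_all - X) / (n_all - 1)
```

For ID 1 in the example data this gives (29 - 5) / 3 = 8, the mean of {7, 9, 8}. A group with a single member yields a missing value, because the divisor is zero. No such rearrangement exists for percentiles, which is why a loop is needed there.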
>
> However, the trade-offs are not very clear to me.
>
> -_pctile- is built in, while any call to -egen- involves
> an interpretative overhead. On the other hand, -_pctile-
> can only emit one 75th percentile at a time, and -egen-
> with -by()- can calculate several at a time by side-stepping
> -_pctile-. The precise trade-offs would probably depend on the size of
> the dataset and the number of groups.
>
> No doubt you could also speed it up using Mata or writing
> more direct code.
>
> Nick
> [email protected]
>
> Quang Nguyen asked
>
> A simplified version of my data looks as follows:
>
> ID      Group     X
> 1       a             5
> 2       a             7
> 3       a             9
> 4       a             8
> 5       b             3
> 6       b             4
> 7       b             9
>  ..........................
>
> I would like to generate a new variable whose value is the 75th percentile
> of X among the other individuals in the same group as the individual
> concerned. For example, for the first individual (ID=1), this would be the
> 75th percentile of {7, 9, 8}.
>
> and Joseph Coveney replied
>
> -findit percentile- turns up a lot to pore over.  But among the results
> is -egen <varname> = pctile(exp), p(#)-, which can take a -by- varlist.
>
> Try something like:
> bysort Group: egen p75 = pctile(X), p(75)
>
> To finish:  an observation is going to lie beneath, above or on a given
> percentile for its group, so there's a smarter (more efficient)
> algorithm, but a brute-force approach is shown below.
>
> clear *
> set more off
> set seed `=date("2007-09-29", "YMD")'
> set obs 100
> generate byte pid = _n
> generate byte group = mod(_n, 10)
> generate double response = uniform()
> *
> * Begin here
> *
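> tempvar tmpvar0 tmpvar1
> sort group
> generate double p75 = .
> generate double `tmpvar0' = .
> quietly forvalues i = 1/`=_N' {
>     * copy response for everyone except observation i
>     replace `tmpvar0' = response if _n != `i'
>     * 75th percentile of the others, computed within every group
>     by group: egen double `tmpvar1' = pctile(`tmpvar0'), p(75)
>     * keep only the value that belongs to observation i
>     replace p75 = `tmpvar1' in `i'
>     drop `tmpvar1'
>     replace `tmpvar0' = .
> }
> drop `tmpvar0'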
> list in 1/20, noobs sepby(group)
> exit
>
> Although my suggestion was centered around -egen-, which is very often a
> convenience, you can usually do things more efficiently.  For example,
> in this case, -_pctile if . . ., percentiles(75)- and then -replace p75
> = r() in . . . - would avoid the redundancy of -by . . .: egen . . .
> pctile()-, where all of the other groups' results are calculated and
> discarded each time. There are other ways to polish the suggestion, too,
> and the difference would be noticeable with large datasets and many groups.
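
A minimal sketch of the -_pctile- variant described above, added for illustration and not taken from the original post. It assumes the variables group and response from the example; p75_alt is a hypothetical name. -_pctile- leaves the first requested percentile in r(r1).

```stata
* Sketch only: group and response come from the example above;
* p75_alt is an illustrative name.
generate double p75_alt = .
quietly forvalues i = 1/`=_N' {
    * 75th percentile of the other members of observation i's group
    _pctile response if group == group[`i'] & _n != `i', percentiles(75)
    replace p75_alt = r(r1) in `i'
}
```

Each pass works on one group only, so nothing is computed for the other groups and then thrown away. The results should agree with the -egen- loop wherever both use the same percentile definition; groups with a single member have no "others" and are worth checking separately.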
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


-- 
"My father gave me the greatest gift anyone could give another person,
he believed in me." - Jim Valvano
