Re: st: exclusion of observations
Your general idea seems to be
1. personal data on earnings
2. a group variable, province
3. you want the median of everybody else's earnings,
within province.
This kind of problem is discussed in
How do I create variables summarizing for each individual properties of 
the other members of a group?
http://www.stata.com/support/faqs/data/members.html
The general recipe is
for each person in the province {
	calculate a statistic from other data for the province
	assign the result to that person
}
For sums, we could use a simple trick
egen total = total(earnings), by(province)
gen others = total - earnings
and for means, we only need one more trick
egen n = count(earnings), by(province)
gen meanothers = others / (n - 1)
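To check the arithmetic, here is a minimal sketch on made-up numbers
(the data are invented purely for illustration):
clear
input province earnings
1 10
1 20
1 60
2 5
2 15
end
* province totals and counts, then subtract each person's own value
egen total = total(earnings), by(province)
egen n = count(earnings), by(province)
gen others = total - earnings
gen meanothers = others / (n - 1)
list, sepby(province)
In province 1 the total is 90, so meanothers should be 40, 35 and 15
for the people earning 10, 20 and 60; in province 2 the total is 20,
giving 15 and 5. A province with only one person has n - 1 = 0, so
meanothers is missing there.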
But you said median.
Here is a sketch:
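* number the people within each province
bysort province: gen pid = _n
gen median = .
summarize pid, meanonly
* loop up to the size of the largest province
quietly forvalues i = 1/`r(max)' {
	* median of earnings over everyone in the province except person `i'
	egen work = median(cond(pid != `i', earnings, .)), by(province)
	replace median = work if pid == `i'
	drop work
}
drop pid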
For more on the logic, see the FAQ cited. A slightly delicate
detail is how to transfer results to observations for which they
were not calculated. What is above is one way to do it.
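As an illustration, the same loop can be tried on invented toy data.
The values below and the variable name medothers are assumptions made
for this example only:
clear
input province earnings
1 10
1 20
1 30
1 100
2 5
2 15
end
bysort province: gen pid = _n
gen medothers = .
summarize pid, meanonly
quietly forvalues i = 1/`r(max)' {
	* blank out person `i', take the province median of the rest
	egen work = median(cond(pid != `i', earnings, .)), by(province)
	replace medothers = work if pid == `i'
	drop work
}
drop pid
list, sepby(province)
In province 1 the people earning 10 and 20 should get 30, and those
earning 30 and 100 should get 20. In province 2 each person simply gets
the other person's earnings, 15 and 5.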
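Mind you, unless your provinces contain very few people, this should be
close to the ordinary within-province median, since leaving out one person
can shift the median no further than the values next to the middle of the
sorted earnings: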
egen median = median(earnings), by(province)
Nick
[email protected]
[email protected]
I need to construct a variable identifying the median of earnings by
province in South Africa.
The variable I am going to use should be constructed as the median of
the distribution of earnings excluding the person herself.
Which is the command to exclude such an observation?