
Re: st: ranking with weights

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: ranking with weights
Date	Tue, 2 Dec 2008 16:16:59 -0500

--

Cindy, the weights are not likely to be frequency weights (fweights); they are probability weights (pweights), possibly post-stratified. If they are whole numbers, then someone has rounded them. You still haven't answered the question: why do you want to rank the households? Quantities calculated in samples are estimates of population quantities. What population quantities are you trying to estimate with the ranks? If you are trying to estimate percentiles, the -pctile- command will take pweights.
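A minimal sketch of the percentile route, assuming the variables are named expenditure and weight (both names are assumptions; substitute the names in your dataset). -pctile-, -_pctile- and -xtile- all accept pweights:

```stata
* Sketch only: expenditure and weight are assumed variable names.

* Weighted percentiles, left in r(r1), r(r2), ...
_pctile expenditure [pweight = weight], percentiles(10 25 50 75 90)
return list

* Weighted decile cutpoints stored in a new variable
pctile dec_cut = expenditure [pweight = weight], nquantiles(10)

* Assign each household to its weighted decile (1-10);
* households with the same expenditure land in the same decile
xtile decile = expenditure [pweight = weight], nquantiles(10)
```

If the goal is to place households in weighted quantile groups rather than to assign an exact rank, -xtile- may be all that is needed.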


-Steve

On Dec 2, 2008, at 2:16 PM, Cindy Gao wrote:

Thanks for your reply.
The observations (analytic units) are households. Expenditure is the monthly expenditure of household. This is household survey data. The weights are frequency weights, to weight the sample to the whole country. The weights are likely to vary across for example regions, to compensate for oversampling or undersampling.
Basically I need to rank all households according to their expenditure, from lowest to highest. But I must take account of the weightings. If, for example, there are 2 households with the same expenditure, they must be ranked the same and this rank must take account of the weightings. If there were no ties (households with the same expenditure), I could achieve this by generating a variable "rank", like -g rank=sum(weight)-. The problem comes because of ties. If I could -expand- my dataset using the weights, then I could simply say -egen rank = rank(expenditure)-; the problem is that the dataset is too large for this.

---- Original Message ----
From: Steven Samuels <[email protected]>
Cindy, what are the analytic units (people? regions?)? What are the "weights"? What is "expenditure"? How is it measured? What do you mean that some regions are "less sampled" than others? It's not clear, for example, if this is a sample, and if so, of what. So, please describe the study design in detail. Last question: what is the purpose of the ranking?
On Dec 2, 2008, at 12:54 PM, Cindy Gao wrote:
I am trying to find a way to rank weighted data (since the egen function -rank- does not work with weights). A simple way would be to order the data in terms of the variable that I am interested in (monthly expenditure) and then create a new variable like -g rank1=sum(weight)-. But there is a problem. Some of my observations are "tied" as they have the same level of expenditure. Using the simple method I mention means that some observations are ranked above others even though they have the same level of expenditure. This is a problem as the weights are large, so you find that 2 observations are ranked with a big gap in between even though they have the same level of expenditure. It is an even bigger problem because the weights might be correlated with some other variables I am interested in (like region, since some regions are less sampled than others). I also tried multiplying the expenditure ranking by the weight, but this gives wrong results (for example, they do not add up to the weighted total). Can anyone help? In other words, I would like all observations with the same expenditure to have the same rank, which I assume would be some average of all the weighted observations having that same expenditure. I include a sample dataset below:
expenditure  weighting  rank  rank1  weighted_rank
10           341        1     341    341
12           1065       2.5   1406   ???
12           98         2.5   1504
15           254        4     1758
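One possible way to get tied weighted ranks without -expand-, sketched on the four sample rows above. The variable names (weight, grp_w, cum_w, wrank) are assumptions for illustration, and the sketch treats the weights as if they were frequency weights purely as a computational device. Two conventions are shown, since the thread does not settle which one is wanted: cum_w gives every household in a tie group the cumulative weight through the end of that group, and wrank gives the midrank that -egen rank = rank(expenditure)- would return on the expanded data.

```stata
* Sketch only: variable names are assumed; -clear- drops the data in
* memory, so run this in a fresh session or on a copy.
clear
input expenditure weight
10 341
12 1065
12 98
15 254
end

sort expenditure

* Total weight within each tie group (same expenditure)
by expenditure: egen double grp_w = total(weight)

* Running sum of weights, then carried to the end of each tie group
gen double cum_w = sum(weight)
by expenditure: replace cum_w = cum_w[_N]

* Midrank: weight below the group plus the mean position within it
gen double wrank = cum_w - grp_w + (grp_w + 1)/2

list expenditure weight cum_w wrank
* cum_w: 341, 1504, 1504, 1758
* wrank: 171, 923, 923, 1631.5
```

Both households with expenditure 12 receive the same value under either convention. Note that the midrank for the first row is 171, not 341, because the 341 expanded copies would occupy positions 1 to 341.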
.
