Statalist



st: RE: RE: rank ordered twoway graph


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: RE: rank ordered twoway graph
Date   Fri, 13 Jun 2008 16:39:21 +0100

Here's a program sketch. This is indicative, not definitive, and quite
untested. Although this is set up as a scatter plot, you can use
-recast()- to get other kinds of plot. 

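program shehzad 
	version 8.2 
	* syntax: shehzad othervar incomevar [if] [in] [, <twoway_options> ] 
	syntax varlist(min=2 max=2 numeric) [if] [in] [, * ] 

	quietly { 
		* mark the sample to use; bail out if it is empty 
		marksample touse 
		count if `touse' 
		if r(N) == 0 error 2000

		* first variable goes on the y axis, second is ranked 
		tokenize "`varlist'" 
		args other income  

		* rank 1 = highest income; ties get the mean of their ranks 
		tempvar rankvar 
		egen `rankvar' = rank(-`income') if `touse' 

		* label the rank axis with the variable label, else the name 
		local what : var label `income' 
		if `"`what'"' == "" local what "`income'" 
		label var `rankvar' `"rank on `what'"' 
	}

	twoway scatter `other' `rankvar' if `touse', `options' 
end 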
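
A usage sketch, equally untested. It assumes the program above has
already been defined (for example, by running it from a do-file) and
uses the auto dataset shipped with Stata purely for illustration, with
mpg standing in for the other variable and price for income: 

sysuse auto, clear 
shehzad mpg price 
shehzad mpg price, recast(spike) 
shehzad mpg price if foreign, recast(dropline) ytitle("Mileage (mpg)") 

Anything after the comma is passed through to -twoway scatter-, which
is how -recast()- and other twoway options get applied. 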
	
Nick
[email protected] 	

Shehzad Ali

Thanks, Nick. Your email nicely summarises the available solutions.
Sorry for being so insistent on a one-line solution. 

Nick Cox

Short answer: No, it is not possible. But other solutions are easy. 

For those following along, -qplot- is a program downloadable from the
Stata website given its publication in the Stata Journal. -search qplot-
for locations. 

-qplot- is best thought of as a considerable generalisation of the
official -quantile- command (with one cosmetic exception: -qplot- does
not show the tilted reference line that is a -quantile- default).
-qplot- plots ordered or sorted values (with some flexibility about how
sorted) versus cumulative fraction or rank (ditto), or indeed, via the
-xvariable()- option, versus any specified variable.

Shehzad's problem is different. Shehzad wants to plot something else
versus the ranks of (e.g.) income. This is not among the possibilities
of -qplot- at present. It might be accommodated by adding a
-yvariable()- option, but I am reluctant to do that. There is a fine
line between adding another feature that somebody might want and making
a program yet more complicated to use and understand, and indeed there
is, in this case, a risk of muddying the purpose of the program. 

As previously implied, Shehzad's solution should be at worst two lines'
worth. Apart from what you can do with -glcurve- (-findit- for
locations), there is a more direct solution: 

egen rank = rank(income) 
twoway <whatever> <somethingelse> rank 

There is much scope for modifying that simple recipe. For example, the
usual convention is that the richest person is #1, and so forth, and
that is achievable in a flash: 

egen rank = rank(-income)
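
Put together with the variable names from Shehzad's question
(health_pay and income), the recipe might look like the sketch below.
The name incrank and the label text are illustrative choices only, and
nothing here has been run on the real data. 

egen incrank = rank(-income) 
label var incrank "rank on income (1 = richest)" 
twoway spike health_pay incrank 

By default -egen, rank()- gives tied values the mean of their ranks, so
people with identical incomes will be plotted at the same x position. 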

Further, -egen, rank()- has options. Mapping to variously defined
cumulative fractions is also a step away. There is an FAQ on that: 

FAQ     . . . . Calculating percentile ranks or plotting positions
        7/02    How can I calculate percentile ranks?
                How can I calculate plotting positions?
                http://www.stata.com/support/faqs/stat/pcrank.html
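
A sketch of both points, with variable names (rank_u, n, pp) chosen
only for illustration and nothing tested on real data. The -unique-
option breaks ties arbitrarily, so that ranks run from 1 to the number
of nonmissing values, and (i - 0.5) / n is one common choice of
plotting position; see the FAQ for others. 

egen rank_u = rank(income), unique 
egen n = count(income) 
gen pp = (rank_u - 0.5) / n 
twoway spike health_pay pp 

Here the poorest person is plotted nearest 0 and the richest nearest 1;
use rank(-income) instead to reverse that. 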

Shehzad seems to be hankering after a one-line alternative to this
two-line solution. That's achievable with some programming, but as I
said earlier I don't think it exists at present. 

Shehzad Ali

Thanks, Nick. This is a very useful command to know. Now my question is:
using -qplot- I want to generate a graph of a y-variable which is rank
ordered by an x-variable (obviously different from Pen's parade). If I
use the -xvariable()- option in -qplot-, it produces an x-axis over the
entire range of the x-variable, but I am interested only in the x-rank
to sort my y-variable. Is it possible to do this using -qplot-?

Nick Cox

You can make this possible in one line by writing a wrapper for those
commands, but I am not aware of any way to do that which is already
available. -qplot- (-search- for locations) allows plots of income
versus income rank, including a kind of graph which people working on
income associate with Jan Pen (although the same idea goes back much
earlier). 

Shehzad Ali

I want to create a twoway graph (like spike or dropline) where the
x-axis is a rank variable (based on, say, income) and the y-axis is the
other variable of interest (say, health_pay). A simple command,
-twoway spike health_pay income-, spreads the x-axis over the whole
range of income, but I am interested in income rank and not income
itself.

I know one way to do it is to generate an income rank variable using
-glcurve- and then plot health_pay against the rank variable, but I was
wondering if there was a direct method.



