Re: st: visualization?
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: visualization?
Date: Fri, 30 Sep 2011 15:23:22 +0100
Do you mean Vince Viggins? Sounds like a Dickens character. We saw
Vince Wiggins at the London meeting.
We are starting with these suggestions. I'll add numbers for convenience.
1. If one of the variables is positively skewed, consider plotting
that axis on a log scale.
2. If there are a lot of data points (e.g., n > 1000), adopt a
different strategy such as using some form of partial transparency, or
sampling the data;
3. If one of the variables takes on a limited number of discrete
categories, consider using a jitter or a sunflower plot;
4. If there are three or more variables, consider using a scatterplot matrix;
5. Fitting some form of trend line is often useful;
6. Adjust the size of the plotting character to the sample size (for
bigger n, use a smaller plotting character);
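
As a rough illustration of suggestions 1, 3, 4, 5 and 6 in Stata, here is a minimal sketch. The auto dataset shipped with Stata and the choice of variables are assumptions made only for the example; with 74 observations it cannot show the large-n problem behind suggestion 2.

```stata
* Illustration only: dataset and variables are assumptions, not from the post
sysuse auto, clear

* 1. positively skewed variable (price) on a log scale,
*    with axis labels kept in the original units
scatter price weight, yscale(log) ylabel(4000 8000 16000)

* 3. variable with few distinct values (rep78): jitter the points
scatter mpg rep78, jitter(5)

* 4. three or more variables: scatterplot matrix (lower triangle only)
graph matrix price mpg weight, half

* 5. add a smooth (here lowess) on top of the scatter
twoway (scatter mpg weight) (lowess mpg weight)

* 6. smaller marker symbols for larger samples
scatter mpg weight, msize(small)
```

The amount of jitter and the tick values are arbitrary here and would need tuning for real data.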
Random comments
1. I take this as standard. I'll add a plea for consideration of any
reasonable non-linear scale, labelled in the original units!
2 and 6. Transparency is on some wishlists for Stata. With lots of
data, you go not only for smaller symbols but also for more open ones
and lighter colors.
3. I've played with sunflower plots and gone off them. But if you want
to try them, note that they are undocumented [sic] at -help twoway
sunflower-. For highly discrete or even categorical variables, I like
my -tabplot- (SSC).
4. Agree, although that does not rule out some projection from a
multivariate analysis being helpful too.
5. Yes, if "trend" means "smooth". Some special smooths were published in
SJ-10-1 gr0021_1 . . . . . . . . . . Software update for doublesm and diagsm
(help doublesm, diagsm, polarsm if installed) . . . . . . . N. J. Cox
Q1/10 SJ 10(1):164
option to carry out smoothing using restricted cubic splines
added to doublesm and diagsm
SJ-5-4 gr0021 . . . . . . . Speaking Stata: Smoothing in various directions
(help doublesm, diagsm, polarsm if installed) . . . . . . . N. J. Cox
Q4/05 SJ 5(4):574--593
discusses exploratory tools for determining the structure
of bivariate data
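
The comments on 2, 3 and 6 could be tried out along the following lines. This is only a sketch: the auto dataset, the variables, the 10% sampling fraction and the seed are all assumptions chosen for illustration, and -tabplot- must be installed from SSC first.

```stata
* Illustration only: dataset, variables, fraction and seed are assumptions
sysuse auto, clear

* 2 and 6. small, open, lighter-colored marker symbols
scatter mpg weight, msymbol(Oh) msize(small) mcolor(gs8)

* 2. alternatively, plot a random sample; preserve/restore
*    leaves the full dataset in memory afterwards
preserve
set seed 2011
sample 10
scatter mpg weight
restore

* 3. sunflower plot
sunflower mpg weight

* 3. tabplot for two discrete variables (install needed only once)
ssc install tabplot
tabplot rep78 foreign
```

With only 74 observations the sampled plot is very sparse; the point is the preserve, sample, restore pattern rather than the picture itself.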
Some possible additions:
7. About 1980, there was a sudden fashion for adding convex hulls,
which faded away quickly. I remember often doing it with a pencil on
lineprinter output. But Allan Reese has a nice implementation on SSC
as -cvxhull-. On occasion that helps a lot.
8. When you have a categorical subdivision, try both superimposing the
categories on one plot and using a -by()- option to give separate
plots. A third strategy is given in
SJ-10-4 gr0046 . . . . . . . . . . . . . . . Speaking Stata: Graphing subsets
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q4/10 SJ 10(4):670--681 (no commands)
explores graphical comparison of results for two or more
subsets where each subset is plotted in a separate panel,
with the rest of the data as a backdrop
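
The three strategies in 8 might be sketched as follows. This is not code from the article (which provides no commands); the auto dataset, the variables and the graph names g0 and g1 are assumptions used only for illustration.

```stata
* Illustration only: dataset, variables and graph names are assumptions
sysuse auto, clear

* (a) categories superimposed on one plot, distinguished by marker symbol
twoway (scatter mpg weight if foreign == 0, msymbol(Oh)) ///
       (scatter mpg weight if foreign == 1, msymbol(Th)), ///
       legend(order(1 "Domestic" 2 "Foreign"))

* (b) separate panels through by()
scatter mpg weight, by(foreign)

* (c) one panel per subset, with the rest of the data as a pale backdrop
forvalues f = 0/1 {
    twoway (scatter mpg weight if foreign != `f', mcolor(gs12)) ///
           (scatter mpg weight if foreign == `f', mcolor(navy)), ///
           legend(off) name(g`f', replace)
}
graph combine g0 g1
```

Drawing the backdrop first keeps the highlighted subset on top of the gray points in each panel.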
Nick
On Fri, Sep 30, 2011 at 2:56 PM, Stas Kolenikov <[email protected]> wrote:
> There was an interesting question on data visualization on
> Stats.StackExchange (http://stats.stackexchange.com/q/13148/5739):
> what are the efficient strategies for tweaking scatterplots depending
> on the data needs? Too much data make it clogged, too little data such
> as ordinal make it too chunky, too skewed data makes it sit in one
> corner, and there are a multitude of other things that need to be
> adjusted to make the display really informative.
>
> I would be especially curious to hear from Nick Cox and Michael
> Mitchell, I guess, as the greatest contributors to Stata graphics (and
> of course Vince V, but I don't think I've seen him on the list for a
> while).