It's not an answer to your specific question -- at all -- but I've long
given up treating these bubble plots as serious graphics, even though I
don't have a good answer to the problem they are presumably intended to
solve, except to say that I prefer quite different kinds of graphs for
trivariate problems.
First, StataCorp seems to have stopped documenting the algorithm in the
manuals.
I don't want to use what I can't explain.
Second, what is a reader expected to think about: the radius or the
area? One possible answer is that it doesn't much matter, as the reader
is not expected to decode bubble sizes quantitatively. As we often say
locally in examinations, "Discuss." There is some psychological
evidence that readers perceive neither radius nor area but a power more
like 0.7. A more general issue, although it's controversial at the
margins, is that people are relatively lousy at decoding areas from
published graphics.
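
To put rough numbers on that, here is a minimal sketch. It assumes that
the figure of 0.7 is an exponent applied to area, which is only one
reading of the evidence, and the 100-fold ratio is made up for
illustration.

```stata
* Illustration only: two values whose true ratio is 100 (hypothetical).
local ratio = 100
* If area is proportional to value, radii differ by the square root.
display "radius ratio under area scaling: " %6.2f sqrt(`ratio')
* Assumed reading: perceived size goes as area^0.7.
display "perceived ratio if area^0.7:     " %6.2f (`ratio')^0.7
```

On that reading the perceived ratio falls between the radius ratio and
the area ratio, so neither encoding is decoded as intended.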
Third, the range of values for the variable used as weight doesn't have
to be pathologically large to present an insoluble dilemma: the largest
bubbles swamp the others, or the smallest bubbles are unreadable.
Largely because of that, StataCorp's algorithm was _not_ one of drawing
proportional to size, but one with extra fudges. I have no reason to
believe that the current algorithm differs.
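
The size of the dilemma is easy to sketch with the four weights posted
in this thread. The calculation assumes strictly proportional areas,
which, as said, is not what StataCorp's algorithm did; it only shows why
fudges are tempting.

```stata
* Hypothetical check: how far apart would marker radii be if area were
* strictly proportional to the weight? Data are the four points from
* the example in this thread.
clear
input x y weight group
1 1 1 1
2 1 10 1
1 2 100 2
2 2 1000 2
end
summarize weight, meanonly
display "max/min weight:       " r(max)/r(min)
display "implied radius ratio: " %5.1f sqrt(r(max)/r(min))
```

Even this small example asks the largest marker to be tens of times
wider than the smallest.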
Nick
Friedrich Huebler
Thank you, Svend. Your solution works in this particular case because
all observations in group 2 have a larger weight than the observations
in group 1. With my real data there is no such neat relationship
between the different groups. The trial-and-error approach to getting
the right marker sizes is also not practical with my data because I
have 200 observations in 10 groups.
On Wed, Jul 30, 2008 at 9:46 AM, Svend Juul wrote:
> -scatter- supports weights. When a scatterplot is drawn, all weighted
> markers have a distinct size. This size changes when the data are
> divided into two or more groups and then drawn as overlaid
> scatterplots. I am looking for a way to retain the size of the
> original markers. The problem is best understood with an example.
>
> clear all
> input x y weight group
> 1 1 1 1
> 2 1 10 1
> 1 2 100 2
> 2 2 1000 2
> end
> scatter y x [w=weight], name(A)
> twoway (scatter y x if group==1 [w=weight]) ///
> (scatter y x if group==2 [w=weight]), name(B)
>
> Compare graphs A and B. In graph A all four markers have a different
> size. In graph B there are two pairs of markers with the same size. I
> would like to have the same marker sizes in graph B as in graph A
> while keeping the colors that identify the two different groups. How
> can I do this?
>
> ===============================================================
>
> You can use the msize() option to modify the marker size. By trial
> and error I got something close to what you want - but actually I
> cannot figure out why 0.2 and 2 are the right factors:
>
> twoway (scatter y x if group==1 [w=weight] , msize(*0.2)) ///
> (scatter y x if group==2 [w=weight] , msize(*2)), name(C)
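
One possible way to avoid per-group msize() factors, sketched here as an
assumption rather than something confirmed in this thread, is to split y
into one variable per group with -separate- and draw everything in a
single -scatter- call, so that the weights are scaled once for the whole
plot. The graph name D is arbitrary, and whether the sizes then match
graph A exactly should be checked by eye.

```stata
* Sketch only: same example data as above.
clear all
input x y weight group
1 1 1 1
2 1 10 1
1 2 100 2
2 2 1000 2
end
* One y variable per group (y1, y2); values are missing outside the group.
separate y, by(group)
* A single scatter call, so all markers share one weight scaling while
* each y variable keeps its own color.
scatter y1 y2 x [w=weight], name(D)
```

With 10 groups the same call would list y1 through y10, which avoids any
trial and error over marker sizes.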