Jake,
The solution to your problem can be found in the list archive. The
first step is to duplicate the observations so that the weights are
distributed across all groups.
http://www.stata.com/statalist/archive/2009-03/msg01397.html
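
A minimal sketch of what that duplication step might look like on the collapsed auto data from the original post. This is one reading of the suggestion, not the code from the archived message. The variable name -copy- is made up for illustration. The sketch also assumes that rows with a missing y value still count when Stata scales the weighted markers within each -scatter- plot, so it is worth comparing the result against the single-plot graph (junk1.pdf) to confirm that the circle sizes match.

```stata
* Sketch only: one possible reading of "duplicate the observations".
clear all
sysuse auto
drop if rep78==.
collapse (mean) mpg (count) N=price, by(foreign rep78)

* Copy every row (copy==1 marks the duplicate; -copy- is a hypothetical name),
* move each copy to the other group, and blank its y value. Each group then
* carries the full set of weights but draws only its own points.
expand 2, generate(copy)
replace foreign = 1 - foreign if copy==1
replace mpg = . if copy==1

* Assumption: rows with missing mpg still enter the marker-size scaling.
twoway (scatter mpg rep78 [fweight=N] if foreign==0, msymbol(Oh) mcolor(red)) ///
       (scatter mpg rep78 [fweight=N] if foreign==1, msymbol(Oh) mcolor(green)) ///
       , legend(off)
```

If the assumption holds, both -scatter- plots see the same largest weight, so the symbol areas should be comparable across foreign and domestic.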
Friedrich
On Sun, Apr 26, 2009 at 8:48 PM, Jacob Wegelin <[email protected]> wrote:
> produces a plot that is identical to my third example (not second, as
> I said).
>
> There's no need to define local macros for red and green;
> -mcolor(red)- and -mcolor(green)- work just fine.
>
> But, as shown in my third example, when we use two separate -scatter-
> statements the sizes of the plot symbols are computed separately for
> each statement, so that the areas of the resulting symbols are no
> longer comparable between the groups (foreign and domestic).
>
> This is why I'd like instead to specify the color via a variable and
> use a single -scatter- statement. This would be analogous to the
> -mlabel(rep78)- syntax in the first example.
>
> Thanks for any solution
>
> Jake Wegelin
>
>
> On Sun, Apr 26, 2009 at 7:20 PM, Jacob Wegelin <[email protected]> wrote:
>> I believe this is the same as my second example. The two separate
>> -scatter- statements cause the sizes of the plotting circles to be
>> recalculated for each -scatter- statement, so that the areas are no
>> longer proportional to the sample size N. This is why I want to
>> specify the color via a variable, analogous to the -mlabel(rep78)-
>> example.
>>
>> Jake
>>
>> On Sun, Apr 26, 2009 at 6:58 PM, Kieran McCaul <[email protected]> wrote:
>>>
>>>
>>> How about:
>>>
>>> local f0 = "red"
>>> local f1 = "green"
>>> twoway (scatter mpg rep78 [fweight=N] if foreign==0, msymbol(Oh) mcolor(`f0')) ///
>>> (scatter mpg rep78 [fweight=N] if foreign==1, msymbol(Oh) mcolor(`f1')) ///
>>> , legend(off)
>>>
>>>
>>>
>>>
>>> ______________________________________________
>>> Kieran McCaul MPH PhD
>>> WA Centre for Health & Ageing (M573)
>>> University of Western Australia
>>> Level 6, Ainslie House
>>> 48 Murray St
>>> Perth 6000
>>> Phone: (08) 9224-2701
>>> Fax: (08) 9224 8009
>>> email: [email protected]
>>> http://myprofile.cos.com/mccaul
>>> http://www.researcherid.com/rid/B-8751-2008
>>> ______________________________________________
>>> Epidemiology is so beautiful and provides such an important perspective
>>> on human life and death,
>>> but an incredible amount of rubbish is published. Richard Peto (2007)
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of Jacob Wegelin
>>> Sent: Monday, 27 April 2009 6:30 AM
>>> To: [email protected]
>>> Subject: st: -twoway scatter- different colors for different
>>> observations
>>>
>>> The following syntax specifies a marker label (-mlabel-) that can take
>>> on a different value at each plotted point (each observation),
>>> according to the value of the variable named in the -mlabel()-
>>> option.
>>>
>>> clear all
>>> sysuse auto
>>> scatter price mpg, mlabel(rep78) m(i) mlabposition(3)
>>>
>>> But I would like to define a variable that specifies that certain
>>> observations be plotted green and others red or, alternatively, that
>>> certain observations be plotted with color -none- and others with a
>>> visible color. How does one do this?
>>>
>>> To motivate this (with an artificial example constructed to mimic a
>>> real example): The following code plots average mpg by rep78, averaged
>>> separately for foreign and domestic autos. Crucially, the area of the
>>> plotting symbol is proportional to the sample size. I would like to
>>> distinguish visually between domestic and foreign autos, though. Two
>>> *separate* -scatter- statements (the second -twoway- command below)
>>> don't give the desired result, because the plotting symbols are
>>> re-scaled for each -scatter- statement. You can see this by switching
>>> rapidly between the two exported graphs, junk1.pdf and junk2.pdf. And
>>> an attempt to define a string variable -mycolor- which takes on values
>>> "red" and "green", and then to specify -mcolor(mycolor)- analogously
>>> to the -mlabel(rep78)- statement above, returns an error.
>>>
>>> clear all
>>> set more on
>>> sysuse auto
>>> drop if rep78==.
>>> sort foreign rep78
>>> collapse (mean) mpg (count) N=price, by(foreign rep78)
>>> /*
>>> A larger value for N will make the problem easier to see.
>>> */
>>> replace N=50 in 6
>>> list
>>> set scheme lean1
>>> twoway (scatter mpg rep78 [fweight=N], msymbol(Oh))
>>> graph export junk1.pdf, replace
>>> more
>>> twoway ///
>>> (scatter mpg rep78 [fweight=N] if foreign==0, msymbol(Oh) mcolor(red)) ///
>>> (scatter mpg rep78 [fweight=N] if foreign==1, msymbol(Oh) mcolor(green)) ///
>>> , legend(off)
>>> graph export junk2.pdf, replace
>>> more
>>> /* The following returns an error */
>>> gen mycolor=""
>>> replace mycolor="red" if foreign==0
>>> replace mycolor="green" if foreign==1
>>> tabulate foreign mycolor
>>> twoway (scatter mpg rep78 [fweight=N], msymbol(Oh) mcolor(mycolor))
>>>
>>>
>>> The [G] GRAPHICS manual under -marker_options- says that one could
>>> define color by specifying a list of elements, as
>>>
>>> -mcolor( red green red)-
>>>
>>> but this would be clumsy and error-prone. There must be a way to use
>>> the values of a variable, as in the -mlabel(rep78)- example?
>>>
>>> Thanks for any insights
>>>
>>> Jacob A. Wegelin
>>> Assistant Professor
>>> Department of Biostatistics
>>> Virginia Commonwealth University
>>> 730 East Broad Street Room 3006
>>> P. O. Box 980032
>>> Richmond VA 23298-0032
>>> U.S.A.
>>> E-mail: [email protected]
>>> URL: http://www.people.vcu.edu/~jwegelin