Roger Harbord wrote:
>
> I've encountered some strange behaviour in Nick Cox's program -distplot-
> published in STB-51. It won't let me plot the distribution functions of
> more than one variable AND use another variable to label the points,
> although I can do either alone.
>
> I'll use the automobile data to demonstrate so others can try to replicate
> the problem, although there's little purpose to the commands in this
> context - with my real data I'm comparing two distributions graphically and
> hoped to label the points so I can identify outliers at the same time
> (mainly for presentation purposes):
>
> -------------------------------------------
> . use "C:\Program Files\StataSE\auto.dta"
> (1978 Automobile Data)
>
> . distplot length, c(J) s([rep78])
>
> . * (worked fine)
>
> . distplot length displacement, c(JJ) s(oo)
>
> . * (also worked fine)
>
> . distplot length displacement, c(JJ) s([rep78]o)
> variable rep78 not found
> r(111);
>
> . * Just to check I've got the syntax right in the s() option :
> . graph length displacement weight, c(JJ) s([rep78]o)
>
> . * (worked fine though a mess of a graph!)
>
> . * describe my setup :
>
> . which distplot
> c:\ado\stbplus\d\distplot.ado
> *! version 1.5.0 NJC 24 March 1999 [STB-51: gr41]
>
> Anyone any idea what's going wrong here? That error message seems most
> bizarre to me - how can Stata suddenly not be able to find rep78 ??

No mystery here: although this behaviour is not
documented, it is a straightforward consequence
of the method used, as may be seen by looking at
the code.

-distplot- temporarily -preserve-s and then restructures
the data when you want a plot of more than one variable.
The rationale for doing this was, presumably, to allow
a plot of more than 20 variables. Occasionally one has
a large bundle of variables all of the same kind and
wants a spaghetti plot showing their collective pattern.

The side-effect, however, is that some variables in the
data set, such as rep78 here, may disappear from sight
while the graph is being produced, which is why the
symbol option cannot find rep78.

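To see the mechanism in isolation, here is a minimal sketch for a do-file.
It is not the actual -distplot- code: I am using -stack- purely as a
stand-in for the restructuring step (an assumption for illustration), and
-sysuse- assumes a Stata release that ships the auto data that way.

```stata
* Illustration only: -stack- stands in for whatever restructuring
* -distplot- really does (an assumption, not the published code).
sysuse auto, clear
preserve

* Restructure: only the stacked values and _stack survive
stack length displacement, into(value) clear
describe

* rep78 is no longer in memory, so any reference to it now fails
capture confirm variable rep78
display "confirm variable rep78 returned code " _rc

* The original data, rep78 included, come back here
restore
```

The nonzero return code should correspond to the "variable rep78 not found"
message in the original post.
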
A fix would be for me to add an option specifying
variables which must be carried along, or for me to
parse the -sy()- argument and automatically identify
variables which the user clearly needs for the purpose.
I'll put that on a to-think-about list.

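In the meantime, something along the following lines may serve as a manual
alternative. It is only a sketch under stated assumptions: it uses -stack-
and the newer -twoway- graphics rather than anything inside -distplot-, the
names value, lab78 and cump are made up for the example, and the cumulative
probabilities are the crude _n/_N within each variable (no special handling
of ties or missing values), which need not match what -distplot- plots.

```stata
sysuse auto, clear
preserve

* Repeat rep78 in each group so that it is carried along with the values
stack length rep78 displacement rep78, into(value lab78) clear

* Crude cumulative probabilities within each original variable
bysort _stack (value): generate double cump = _n / _N

* First variable shown as rep78 labels, second as hollow circles
twoway (scatter cump value if _stack == 1, msymbol(none) mlabel(lab78) mlabposition(0)) ///
       (scatter cump value if _stack == 2, msymbol(oh)), ///
       legend(order(1 "length" 2 "displacement"))

restore
```

The point of the design is simply that the labelling variable is named
explicitly in the restructuring step, so it is still in memory when the
graph is drawn.
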
Nick