Re: st: catplot problem
Tunga refers to a "problem" and declares his results to be "wrong".
There is no problem, except perhaps that Tunga is a little confused, or
expecting a program to behave differently from what its creator intended.
Tunga is using -catplot- from SSC (and on the side -fre- from SSC).
Remember that you are asked to say where the user-written programs you
refer to come from.
We do not have access to Tunga's data, but we can replicate his
situation like this:
clear
input tunga freq
1 4
4 3
5 6
6 10
7 16
8 9
9 1
10 2
end
expand freq
tab tunga
      tunga |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          4        7.84        7.84
          4 |          3        5.88       13.73
          5 |          6       11.76       25.49
          6 |         10       19.61       45.10
          7 |         16       31.37       76.47
          8 |          9       17.65       94.12
          9 |          1        1.96       96.08
         10 |          2        3.92      100.00
------------+-----------------------------------
      Total |         51      100.00
Now if you go
histogram tunga, discrete freq
-histogram- shows not only bars for the populated categories 1 4/10 but
also gaps with no bars, or bars of zero height, at 2 3.
If you go
catplot bar tunga
no gaps are shown, just bars for 1 4/10. This is what puzzles Tunga.
-catplot- is designed, as the name implies and the help explains, for
categorical data. As far as it is concerned, 1 4/10 are labels for
categories that it shows in the sort order of the variable concerned. It
has no consciousness of a gap at 2 3, any more than if you had data for
"aardvark" "bison" and "elephants" it would insert categories at
"cattle" and "donkeys". It is not designed as a clone of -histogram-,
which would be futile as -histogram- already exists. -catplot- is not
intended to show numerical variables on numerical scales.
There is a side issue: whether Tunga expects a graphics program to know
what is going on in observations that have been expressly excluded by
-if-. There are some subtleties there, but the behaviour of -histogram-
here does not reflect the fact that there are values of 2 and 3 elsewhere
in Tunga's dataset. The dataset above shows this, as it contains no
values of 2 or 3 at all. Rather, -histogram- draws a numeric axis over the
range of the data and then shows (visible) bars where they belong.
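A minimal sketch of that point, reusing the replicated dataset (the names tunga and freq are from that illustration, not from Tunga's own data): no observation anywhere holds a 2 or a 3, yet -histogram- still leaves gaps there, because it works on a numeric axis.

```stata
* Rebuild the replicated dataset (illustrative data, not Tunga's own)
clear
input tunga freq
1 4
4 3
5 6
6 10
7 16
8 9
9 1
10 2
end
expand freq

* No observation in the whole dataset has the value 2 or 3 ...
count if inlist(tunga, 2, 3)

* ... yet the numeric axis still leaves gaps at 2 and 3
histogram tunga, discrete freq
```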
It seems that Tunga thinks of his variable as numerical, so that there
"should be" gaps at 2 3. If so, that's a view accommodated by
-histogram-. Tunga will not get a satisfactory graph with -catplot-.
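If the numeric view is the one wanted, something along these lines may help. It is only a sketch: it assumes the replicated dataset above and a hypothetical value label named rating that mimics the two labelled end points of Tunga's variable. The valuelabel suboption asks the axis to show value labels where they are defined and plain numbers elsewhere.

```stata
* Illustrative data again (not Tunga's own)
clear
input tunga freq
1 4
4 3
5 6
6 10
7 16
8 9
9 1
10 2
end
expand freq

* Hypothetical value label: only the end points carry text
label define rating 1 "helemaal niks" 10 "ideaal"
label values tunga rating

* Numeric axis with gaps at 2 and 3, ticks at 1 to 10, labels where defined
histogram tunga, discrete freq xlabel(1/10, valuelabel)
```

With Tunga's own variable the corresponding call would presumably be
histogram q41a2vr1 if grandom == 1 & frandom == 2, discrete freq xlabel(1/10, valuelabel)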
Note that -catplot bar- here is just a wrapper for -graph bar-, so the
question is in a strong sense about the different views of -graph bar-
and -histogram-. The behaviour you are seeing is what you would get with
graph bar freq, over(tunga)
and is not idiosyncratic to -catplot-.
Nick
Tunga Kantarci
I am having a problem with catplot.
First, I consider the following command:
fre q41a2vr1 if grandom == 1 & frandom == 2
q41a2vr1 -- Het plan van ^name5 q4 age randomisation 1plan 1 rating 3/4
----------------------------------------------------------------------
                         |      Freq.    Percent      Valid       Cum.
-------------------------+--------------------------------------------
Valid   1  helemaal niks |          4       2.78       7.69       7.69
        4                |          3       2.08       5.77      13.46
        5                |          6       4.17      11.54      25.00
        6                |         10       6.94      19.23      44.23
        7                |         16      11.11      30.77      75.00
        8                |          9       6.25      17.31      92.31
        9                |          1       0.69       1.92      94.23
        10 ideaal        |          3       2.08       5.77     100.00
_______________________________________________
Second, I draw the histogram for the same subsample:
hist q41a2vr1 if grandom == 1 & frandom == 2, discrete freq
Also note that for q41a2vr1 I have 10 values from "1 helemaal niks" to "10
ideaal":
fre q41a2vr1
q41a2vr1 -- Het plan van ^name5 q4 age randomisation 1plan 1 rating 3/4
----------------------------------------------------------------------
                         |      Freq.    Percent      Valid       Cum.
-------------------------+--------------------------------------------
Valid   1  helemaal niks |         19       0.95       4.53       4.53
        2                |          6       0.30       1.43       5.97
        3                |         11       0.55       2.63       8.59
        4                |         29       1.45       6.92      15.51
        5                |         53       2.64      12.65      28.16
        6                |         85       4.24      20.29      48.45
        7                |        107       5.34      25.54      73.99
        8                |         73       3.64      17.42      91.41
        9                |         18       0.90       4.30      95.70
        10 ideaal        |         18       0.90       4.30     100.00
        Total            |        419      20.91     100.00
Missing .                |       1585      79.09
Total                    |       2004     100.00
----------------------------------------------------------------------
As the first table shows, and as the absence of bars in the histogram
confirms, there are no observations for "2" and "3" in this subsample.
Now, I run the following command:
catplot bar q41a2vr1 if grandom == 1 & frandom == 2
(Alternatively, I run the following command and then look at the frandom == 2
part of the graph, which amounts to the same thing:
catplot bar frandom q41a2vr1 if grandom == 1)
The problem arises here: catplot does not leave blanks for the values "2" and
"3" that the q41a2vr1 variable can take.
And although it starts correctly with the "1" that q41a2vr1 takes, it takes the
value of
"4" in the histogram and puts it for "2" in the catplot,
"5" in the histogram and puts it for "3" in the catplot,
"6" in the histogram and puts it for "4" in the catplot,
"7" in the histogram and puts it for "5" in the catplot,
"8" in the histogram and puts it for "6" in the catplot,
"9" in the histogram and puts it for "7" in the catplot,
"10" in the histogram and puts it for "8" in the catplot.
Hence what catplot shows is wrong.
It slides what is on the right in the histogram to the left in the catplot.
How can I solve this problem?