st: RE: RE: Histogram, by(var, total)
I should point out that the treatment of missing values deserves some
consideration.
The code below includes observations with missing values on the -by()-
variable in the total category, but excludes them from the categories
shown separately.
If you wanted to exclude observations with missing values from the
total, you could specify
expand 2 if !missing(foreign)
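For concreteness, here is a sketch of the whole sequence with that change. The auto data have no missing values on foreign, so the -replace- in the third line is purely an assumption made for illustration: it blanks out a few values so that there is something to exclude.

sysuse auto, clear
preserve
* illustration only: create some missing values on the -by()- variable
replace foreign = . in 1/5
* the added copies start at this observation number
local which = _N + 1
* copy only the observations that are not missing on foreign
expand 2 if !missing(foreign)
replace foreign = -1 in `which'/L
label def origin -1 "Total", add
histogram mpg, by(foreign)
restore

Because -expand- appends the copies at the end of the dataset, `which' still points at the first copy even though fewer observations are added.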
Nick
[email protected]
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: 07 June 2009 18:15
To: [email protected]
Subject: st: RE: Histogram, by(var, total)
I don't know a direct way to do this, but some trickery produces the
same result. It is best explained by example.
sysuse auto, clear
preserve
local which = _N + 1
expand 2
replace foreign = -1 in `which'/L
label def origin -1 "Total", add
histogram mpg, by(foreign)
restore
Key points:
1. -expand 2- doubles the dataset. The second half, which is a copy of
the first half, is used for the "Total" category.
2. If the -by()- variable is integer with value labels, the extra
observations should be assigned an integer value for the -by()- variable
that is _lower_ than any other observed, so that the extra category is
shown first. You then need to define an appropriate value label. (In
this case, I know that the other values are 0 and 1.)
3. You do _not_ then specify the -total- suboption, as you are using
your own subterfuge to replicate it.
4. -preserve- and -restore- are optional, but note that without them
this is a major change to the dataset in memory.
Note that Stata in no sense "knows" that the extra category is a total
category, but that shouldn't matter.
Now what would be done if the -by()- variable were string? At first
sight, we have a problem here because "Total" would not necessarily sort
first in a set of alphanumeric categories. We could use some label like
"All observations" but what then if we have "Aardvarks" as a category?
Here is a better trick (not to rule out the possibility of an even
better trick):
sysuse auto, clear
preserve
decode foreign, gen(Foreign)
local which = _N + 1
expand 2
replace Foreign = " Total " in `which'/L
histogram mpg, by(Foreign)
restore
The -decode- is just to produce an example with an appropriate string
variable. In practice it will exist already. Notice the two small parts
to the trick:
(a) Putting a space before the "Total" makes it more likely to sort to
the beginning of any set of categories. The space " " is a character
too.
(b) Putting a space afterwards ensures that the "Total" is still centred
on the graph (if you care about that).
But we need not worry too much about the string case. If you can't get
the order you want, map the strings to integers with value labels.
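One possible sketch of that mapping uses -encode-, which assigns the integers 1, 2, ... in alphabetical order of the strings, so that 0 is free for the total. The names group and grouplbl are made up for this example, and the -decode- is again only there to manufacture a string variable.

sysuse auto, clear
preserve
decode foreign, gen(Foreign)
* map the strings to integers 1, 2, ... with value label grouplbl
encode Foreign, gen(group) label(grouplbl)
local which = _N + 1
expand 2
* 0 is lower than any value -encode- assigns
replace group = 0 in `which'/L
label def grouplbl 0 "Total", add
histogram mpg, by(group)
restore

If alphabetical order is not the order you want for the other categories, define the value label yourself before calling -encode-, rather than letting -encode- create it.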
Naturally, nothing here is distinctive to histograms.
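As one illustration, the same steps should work with -scatter- in place of -histogram-. The choice of weight as the x variable is just for the example.

sysuse auto, clear
preserve
local which = _N + 1
expand 2
replace foreign = -1 in `which'/L
label def origin -1 "Total", add
scatter mpg weight, by(foreign)
restore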
Nick
[email protected]
*From:* Thoma, Marie E.
I would like to use the "histogram yvar, by(xvar, total)" command to
produce a histogram of the total and stratified variable. However, in
Stata, it places the "total" graph as the last graph and I would like to
have it as the first graph (before the stratified graphs).
Does anyone know how to change this either using this command or another
way to accomplish this same layout?