First, to repeat a point made gently by both
Scott Merryman and myself, and explained
prominently in the Statalist FAQ, please
do _not_ use anything other than plain text
to send posts to the list. No HTML, etc.
My original reply to your question was
longer and more detailed, but Scott's was
more direct and nearer what you want.
(1) Cf. Scott's solution, tweaked to show
percents.
Try
. graph bar acexist, over(year) yla(0 .2 "20" .4 "40" .6 "60" .8 "80")
      ytitle(percent with audit committee)
This works because the mean of a 0/1 variable is the proportion of 1s,
and the command is short-hand for
. graph bar (mean) acexist, over(year) yla(0 .2 "20" .4 "40" .6 "60" .8 "80")
      ytitle(percent with audit committee)
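Another way to get percents on the axis, offered here only as a sketch and not as anything from Scott's post, is to rescale the 0/1 variable first, so that the axis needs no relabelling. The name acpct is made up for illustration; year and acexist are as in this thread.

```stata
* rescale the 0/1 indicator so that its mean is a percent
* (acpct is a hypothetical name; missing values stay missing)
generate acpct = 100 * acexist

* the mean of acpct within each year is the percent with acexist == 1
graph bar (mean) acpct, over(year) ytitle("percent with audit committee")
```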
(2) Cf. my solution
I made up a dataset with exactly the frequencies
given in your post.
I guess the reason my code produces the wrong results
for you is that you have missing values you didn't
tell us about and which my code assumed not to exist.
More careful code is
egen pc = sum(acexist), by(year)
egen total = sum(acexist < .), by(year)
replace pc = 100 * pc / total
but a more direct approach would be
egen PC = mean(100 * acexist), by(year)
and that doesn't need any tweaking to
ensure the right answer if missings are
present.
After that it is
graph bar PC, over(year) ...
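The guess about missing values can be checked directly. What follows is a minimal sketch, assuming the variable names used in this thread; pc_check is a name made up here. As a point of arithmetic from the posted numbers only, the reported 12.09302 for 2000 equals 100 * 26 / 215, which would fit 215 observations in that year, not the 55 nonmissing ones.

```stata
* how many observations have acexist missing?
count if missing(acexist)

* cross-tabulate again, this time showing missing values as a column
tabulate year acexist, missing

* percent over nonmissing values only (pc_check is a hypothetical name)
egen pc_check = mean(100 * acexist), by(year)
tabdisp year, cell(pc_check)

* arithmetic on the posted numbers: 26 ones out of 215 observations
display 100 * 26 / 215
```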
Nick
Katarina Sikavica
Dear Nick Cox and others:
I have just tried what you suggested in your e-mail below. Since I am using Stata 8, I typed:
egen pc = sum(acexist), by (year)
egen total = sum(1), by(year)
replace pc = 100 * pc / total
I have also tried:
bysort year: gen pc = sum(acexist)
bysort year: replace pc = 100 * pc[_N] / _N
.... but unfortunately both give the same wrong results:
tabdisp year, cell(pc) shows:
year         pc
2000   12.09302
2001   28.50467
2002   62.61682
2003   71.02804
2004    74.4186
am I doing something wrong??? help!!
Nick Cox
Katarina's data are like this:
. tab year acexist
           |        acexist
      year |         0          1 |     Total
-----------+----------------------+----------
      2000 |        29         26 |        55
      2001 |        24         61 |        85
      2002 |        69        134 |       203
      2003 |        56        152 |       208
      2004 |        50        160 |       210
-----------+----------------------+----------
     Total |       228        533 |       761
-graph bar- won't deliver the reduction she wants, at least not without
some preparation. The reason is a little technical. -graph bar- is based
mainly on a temporary reduction of the data using -collapse-, and
-collapse- doesn't offer that reduction. (It is nearer the territory of
-contract-, but that is a different story.)
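To give a flavour of what the -contract- route might look like, here is a minimal sketch, assuming the variables year and acexist as in this thread. The names total and pc are chosen here for illustration, and -preserve- and -restore- are used so that the original data come back afterwards.

```stata
preserve

* reduce to one observation per year-acexist combination, with the
* count in _freq; nomiss drops observations with missing values
contract year acexist, nomiss

* each percent is a cell frequency divided by its year total, times 100
egen total = total(_freq), by(year)
generate pc = 100 * _freq / total

list year acexist _freq pc, sepby(year)

restore
```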
There are various solutions to the problem. A first solution is to
generate your own percent variable and then plot that directly. Each
percent is, we recall, a numerator divided by a total, multiplied by
100.
One easy way to get the total is using -egen, total()-. In Stata 8 and
earlier, the function here was -egen, sum()-, not -egen, total()-.
. egen pc = total(acexist), by(year)
. egen total = total(1), by(year)
(what's 1 + 1 + 1 + ... + 1? the total number of observations)
. replace pc = 100 * pc / total
Stata diehards would scoff at this as namby-pamby and do it with -by:-.
. bysort year: gen pc = sum(acexist)
. by year: replace pc = 100 * pc[_N] / _N
Either way, we can check that we are on the right lines by
. tabdisp year, cell(pc)
----------------------
     year |         pc
----------+-----------
     2000 |   47.27273
     2001 |   71.76471
     2002 |   66.00985
     2003 |   73.07692
     2004 |   76.19048
----------------------
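Anyone wanting to reproduce this check without Katarina's data could build a stand-in dataset from the frequencies tabulated above. This is a sketch only: freq is a name made up here, and the result has no missing values, which may not be true of the real data.

```stata
clear
input year acexist freq
2000 0  29
2000 1  26
2001 0  24
2001 1  61
2002 0  69
2002 1 134
2003 0  56
2003 1 152
2004 0  50
2004 1 160
end

* turn each row into freq identical observations
expand freq

* percent with acexist == 1 within each year
egen pc = mean(100 * acexist), by(year)
tabdisp year, cell(pc)
```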
Then the graph is a line away:
. graph bar (mean) pc, over(year) ytitle(percent with audit committee)
      yla(, ang(h))
Or
. twoway bar pc year, ytitle(percent with audit committee)
      yla(, ang(h)) barw(0.5)
Another solution employs a user-written program -catplot- from SSC. You
can install that by
. ssc install catplot
-catplot- is just a wrapper for -graph bar- (or -graph hbar- or -graph
dot-). It merely grinds through some reductions not quite trivial
otherwise and then fires up a -graph- command.
You can get a graph in one line with -catplot- without any prior
calculation, although in practice I get there through a sequence of
small experiments:
. catplot bar acexist year, percent(year) stack asyvars yla(, ang(h))
      yti(percent without and with audit committee)
      legend(order(1 "without" 2 "with"))
A graph I like more follows a reversal of coding:
. gen acexist2 = 1 - acexist
. catplot bar acexist2 year, percent(year) stack asyvars yla(, ang(h))
      yti(percent with audit committee) legend(off) bar(2, bcolor(none))
The original announcement of -catplot- contains some related comment.
http://www.stata.com/statalist/archive/2003-02/msg00608.html
Nick
Katarina Sikavica (edited, mainly to ASCII from HTML)
I have just started with Stata graphics and have the following problem
with -graph bar-:
I have a dataset that contains data on the existence of audit committees
-acexist-. In total there are 761 companies, 533 of them having an
audit committee, 228 not. I would like to draw a -graph bar- that shows
the increase in audit committee incidence over -year-. Drawing a -graph
bar- on the increase in the number of audit committees works fine;
however, as the data from 2000 and 2001 are of poor quality I would like
to have percentages of audit committee incidence over the years
2000-2004 (that is: 47.27% (2000); 71.76% (2001); 66.01% (2002); 73.08%
(2003); 76.19% (2004)). Neither of the following commands leads to the
desired results:
. graph bar (sum) acexist, over (year) percent
. graph bar (sum) acexist, over (year) asyvar percentages