I'm not sure what you mean by "I'm not clear on the implications,"
but consider this example:
. sysuse auto, clear
. tab rep78
     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00
Think of the Percent column as the probability that a car drawn at
random from your sample has the given repair record.
Now add a second variable, car type (foreign):
The row and column proportions are like conditional probabilities,
while the cell proportions are like joint probabilities. In the table
below, each cell lists the row, column, and cell percentage, in that
order. We see that the probability of picking a car with repair
record 1, given that you are picking from domestic cars, is 4.17%
(column percentage), but the probability of picking a domestic car,
given that you are picking from cars with repair record 1, is 100%
(row percentage). The cell percentage is the chance that you pick a
domestic car with repair record 1 from the whole sample (2.90%), i.e.
P(domestic & repair record 1).
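As a rough check on the three figures quoted above, here is a minimal do-file sketch that rebuilds them from raw counts instead of reading them off -tabulate-. The local macro names (n_all, n_dom, n_rep1, n_both) are my own labels, and the sketch assumes the auto data as shipped with Stata, where foreign == 0 marks domestic cars and rep78 is missing for a few cars.

```stata
sysuse auto, clear

* Cars with a nonmissing repair record (the sample -tab- uses)
quietly count if !missing(rep78)
local n_all = r(N)

* Domestic cars (foreign == 0) among those
quietly count if foreign == 0 & !missing(rep78)
local n_dom = r(N)

* Cars with repair record 1
quietly count if rep78 == 1
local n_rep1 = r(N)

* Domestic cars with repair record 1
quietly count if rep78 == 1 & foreign == 0
local n_both = r(N)

* Column proportion: condition on car type
display "P(rep 1 | domestic) = " %6.4f `n_both'/`n_dom'
* Row proportion: condition on repair record
display "P(domestic | rep 1) = " %6.4f `n_both'/`n_rep1'
* Cell proportion: joint probability over the whole sample
display "P(domestic & rep 1) = " %6.4f `n_both'/`n_all'
```

The same numerator appears in all three lines; only the denominator changes. Equivalently, a conditional proportion is the cell proportion divided by the matching marginal: 2.90 / 69.57 gives the 4.17% column figure, and 2.90 / 2.90 gives the 100% row figure.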
If you have survey data, you will use the -svy- analog (-svy:
tabulate-), and you might make inferences about whether (for example)
a high-school dropout is more likely to be poor than someone who
finished high school (comparing the row proportions in col 1 for a tab
of hsgrad versus poor) or whether a poor person is more likely to be a
high-school grad than a non-poor person (comparing the col proportions
in row 2 for a tab of hsgrad versus poor). The row and column
proportions (to say nothing of cell proportions) measure distinct and
interesting concepts, and which one you report depends on which
question you are asking.
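For the survey case, a sketch along the following lines may help. I don't have a dataset with hsgrad and poor, so this uses race and diabetes from the nhanes2b example data as stand-ins; it assumes an internet connection for -webuse- and a Stata version that supports the -svy:- prefix. As far as I know, -svy: tabulate- reports confidence intervals for only one of row, column, or cell at a time, which is why each is requested in a separate call.

```stata
* Stand-in data: nhanes2b comes with its survey design already declared
webuse nhanes2b, clear

* Show the stored design settings
svyset

* Row proportions: P(diabetes status | race), with confidence intervals
svy: tabulate race diabetes, row ci

* Column proportions: P(race | diabetes status)
svy: tabulate race diabetes, column ci

* Cell proportions: P(race & diabetes status)
svy: tabulate race diabetes, cell ci
```

The three calls will generally give different estimates and different intervals, because each one answers a different question. With your own data you would first declare the design with -svyset- and then substitute your own row and column variables.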
. tab rep78 for, row col cell nofr
+-------------------+
| Key               |
|-------------------|
|  row percentage   |
| column percentage |
|  cell percentage  |
+-------------------+

    Repair |
    Record |       Car type
      1978 |  Domestic    Foreign |     Total
-----------+----------------------+----------
         1 |    100.00       0.00 |    100.00
           |      4.17       0.00 |      2.90
           |      2.90       0.00 |      2.90
-----------+----------------------+----------
         2 |    100.00       0.00 |    100.00
           |     16.67       0.00 |     11.59
           |     11.59       0.00 |     11.59
-----------+----------------------+----------
         3 |     90.00      10.00 |    100.00
           |     56.25      14.29 |     43.48
           |     39.13       4.35 |     43.48
-----------+----------------------+----------
         4 |     50.00      50.00 |    100.00
           |     18.75      42.86 |     26.09
           |     13.04      13.04 |     26.09
-----------+----------------------+----------
         5 |     18.18      81.82 |    100.00
           |      4.17      42.86 |     15.94
           |      2.90      13.04 |     15.94
-----------+----------------------+----------
     Total |     69.57      30.43 |    100.00
           |    100.00     100.00 |    100.00
           |     69.57      30.43 |    100.00

HTH--Austin
On 3/1/06, Lauren Maxim wrote:
> Can someone explain conceptually the difference between calculating cell vs.
> row proportions, in a two-way tabulation? I generated quite different
> values and confidence intervals, and I'm not clear on the implications or
> which best addresses my question of interest.