Hello, Statalist,
In brief, how does one test a difference in difference of proportions?
My question is re-stated briefly at the end with reference to the
variables I present. A formula and/or reference would be appreciated
if no command exists.
-prtest- and -prtesti- do not work (easily) for my data, even for a
simple test of differences. My data are grouped by state and year: for
each of N states, f is the number of persons in state i with a
condition in year y, and pop is the population of state i in year y.
A "treatment" is applied; t=0 marks the pre-treatment period and t=1
the post-treatment period. One can consider south=1 to be the treated
group and south=0 to be the non-treated group.
For example, some observations may look like this:
state  year      pop      f  t  south
    1  1990  1200000  10000  0      0
  ...
   50  1990  3000000    900  0      1
  ...
    1  2000  1500000  21000  1      0
  ...
   50  2000  3900000   2900  1      1
To test the difference in proportions between the two regions
(south==0 and south==1) within, for example, the pre-treatment period,
I use:
/* cell totals; stored as double so large population sums keep full precision */
egen double f_north_0 = total(f) if south==0 & t==0
egen double pop_north_0 = total(pop) if south==0 & t==0
egen double f_south_0 = total(f) if south==1 & t==0
egen double pop_south_0 = total(pop) if south==1 & t==0
gen double phat_n_0 = f_north_0/pop_north_0 /* proportion in north pre-treatment */
gen double phat_s_0 = f_south_0/pop_south_0 /* proportion in south pre-treatment */
gen double sp_n_0 = sqrt(phat_n_0*(1 - phat_n_0)/pop_north_0) /* standard error for phat_n_0 */
gen double sp_s_0 = sqrt(phat_s_0*(1 - phat_s_0)/pop_south_0) /* standard error for phat_s_0 */
/* copy each cell total to every observation */
egen double fn_0 = mean(f_north_0)
egen double fs_0 = mean(f_south_0)
egen double pn_0 = mean(pop_north_0)
egen double ps_0 = mean(pop_south_0)
gen double phat_0 = (fn_0 + fs_0)/(pn_0 + ps_0) /* pooled proportion, pre-treatment */
gen double qhat_0 = 1 - phat_0
gen double sp_0 = sqrt(phat_0*qhat_0*(1/pn_0 + 1/ps_0)) /* standard error of difference of proportions */
gen double z_0 = (fs_0/ps_0 - fn_0/pn_0)/sp_0 /* z statistic, south minus north */
(At this point I suppose I could use -prtesti- by summarizing the
relevant variables and then typing the results into the -prtesti-
command. In any case, I think that neither -prtest- nor -prtesti-
will help me with testing a difference in differences.)
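The retyping step can probably be avoided. The following is a minimal
sketch, assuming the variables fn_0, pn_0, fs_0 and ps_0 created above
are in memory (they are constant across observations, so the first
observation is enough). The local macro names are only illustrative.
South is listed first so that the sign matches z_0.

```stata
* Sketch: feed the pre-treatment cell totals to -prtesti- without retyping.
* Assumes fn_0, pn_0, fs_0, ps_0 exist as generated above.
local n_s = ps_0[1]   // population, south, pre-treatment
local f_s = fs_0[1]   // count with condition, south, pre-treatment
local n_n = pn_0[1]   // population, north, pre-treatment
local f_n = fn_0[1]   // count with condition, north, pre-treatment

* With the count option, the arguments are #obs1 #succ1 #obs2 #succ2.
prtesti `n_s' `f_s' `n_n' `f_n', count
```

The z statistic reported by -prtesti- should agree with z_0, since both
use the pooled proportion under the null hypothesis.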
This, it would seem, allows me to test the difference in proportions
in the pre-treatment period. Similarly, if I generate similar values
for the post-treatment period, I can test the difference in
proportions in the post-treatment period.
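/* same steps as above, for the post-treatment period (t==1) */
egen double f_north_1 = total(f) if south==0 & t==1
egen double pop_north_1 = total(pop) if south==0 & t==1
egen double f_south_1 = total(f) if south==1 & t==1
egen double pop_south_1 = total(pop) if south==1 & t==1
gen double phat_n_1 = f_north_1/pop_north_1 /* proportion in north post-treatment */
gen double phat_s_1 = f_south_1/pop_south_1 /* proportion in south post-treatment */
gen double sp_n_1 = sqrt(phat_n_1*(1 - phat_n_1)/pop_north_1) /* standard error for phat_n_1 */
gen double sp_s_1 = sqrt(phat_s_1*(1 - phat_s_1)/pop_south_1) /* standard error for phat_s_1 */
egen double fn_1 = mean(f_north_1)
egen double fs_1 = mean(f_south_1)
egen double pn_1 = mean(pop_north_1)
egen double ps_1 = mean(pop_south_1)
gen double phat_1 = (fn_1 + fs_1)/(pn_1 + ps_1) /* pooled proportion, post-treatment */
gen double qhat_1 = 1 - phat_1
gen double sp_1 = sqrt(phat_1*qhat_1*(1/pn_1 + 1/ps_1)) /* standard error of difference of proportions */
gen double z_1 = (fs_1/ps_1 - fn_1/pn_1)/sp_1 /* z statistic, south minus north */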
To restate the question: how can I test whether
(phat_s_1 - phat_s_0) - (phat_n_1 - phat_n_0)
differs from zero, given that each phat_* is a proportion?
My uninformed guess is that the test statistic might be
((phat_s_1 - phat_s_0) - (phat_n_1 - phat_n_0)) / s,
where s is some weighted combination of sp_0 and sp_1.
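One candidate for s can be sketched as follows. This is only a sketch
under a strong assumption: that the four region-by-period cells are
independent binomial samples, so that the variances of the four
proportions simply add (an unpooled standard error). It ignores the
fact that the same states appear in both periods, and any state-level
clustering, so it is a starting point rather than a settled answer.
The names phat, v, did, se_did and z_did are illustrative; the code
assumes the data described above (f, pop, south, t) are in memory.

```stata
* Sketch: z statistic for a difference in differences of proportions,
* ASSUMING the four south-by-t cells are independent binomial samples.
preserve
collapse (sum) f pop, by(south t)   // one row per region-by-period cell
assert _N == 4                      // expect exactly four cells
sort south t
* Row order after sorting:
*   1 = north pre, 2 = north post, 3 = south pre, 4 = south post
gen double phat = f/pop                 // cell proportion
gen double v = phat*(1 - phat)/pop      // binomial variance of phat
scalar did = (phat[4] - phat[3]) - (phat[2] - phat[1])
scalar se_did = sqrt(v[1] + v[2] + v[3] + v[4])
scalar z_did = did/se_did
restore

display "DiD = " did "   SE = " se_did "   z = " z_did
display "two-sided p = " 2*normal(-abs(z_did))
```

The scalars survive the -restore-, so the original data are left
untouched.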
Many thanks,
Misha