Title: Treatment assignment times in didregress and xtdidregress
Authors: Pei-Chun Lai, StataCorp; Chris Cheng, StataCorp
Difference-in-differences (DID) models are generally used to estimate the effect of a policy by comparing outcomes before and after its implementation while accounting for group and time effects. After fitting DID models using didregress or xtdidregress, Stata provides estat ptrends to test whether the linear trends are parallel between the treatment and control groups in pretreatment periods, estat trendplots to generate diagnostic plots for assessing the parallel-trends assumption, and estat granger to test whether treatment effects can be observed prior to the treatment.
These tests and diagnostic plots cannot be produced when a single treatment assignment time cannot be determined because the treatment group contains multiple treatment timings. Stata checks whether observations in the treatment group have different starting times for treatment assignment. If the treatment assignment times differ, the postestimation commands exit with the following error messages:
. estat ptrends
treatment assignment times vary; not allowed with estat ptrend
r(459);
. estat trendplots
treatment assignment times vary; not allowed with estat trendplots
r(459);
. estat granger
treatment assignment times vary; not allowed with estat granger
r(459);
Please note that for the successful execution of these postestimation commands, it is essential to have a “fixed” starting time for treatment assignment. In other words, all observations in the treatment group should have the same treatment assignment time.
How could we discover the assignment time for each observation in the treatment group? For example, let’s load a dataset of hospitals implementing a new procedure:
. webuse hospdd
(Artificial hospital admission procedure data)
. codebook month procedure
month                                   Month
procedure                               Admission procedure
. tabulate hospital month if procedure==1

  Hospital |                    Month
        ID |      April        May       June       July |     Total
-----------+---------------------------------------------+----------
         1 |         23         23         23         23 |        92
         2 |         21         21         21         21 |        84
         3 |         19         19         19         19 |        76
         4 |         25         25         25         25 |       100
         5 |         25         25         25         25 |       100
         6 |         25         25         25         25 |       100
         7 |         29         29         29         29 |       116
         8 |         22         22         22         22 |        88
         9 |         20         20         20         20 |        80
        10 |         21         21         21         21 |        84
        11 |         22         22         22         22 |        88
        12 |         20         20         20         20 |        80
        13 |         23         23         23         23 |        92
        14 |         19         19         19         19 |        76
        15 |         19         19         19         19 |        76
        16 |         21         21         21         21 |        84
        17 |         18         18         18         18 |        72
        18 |         11         11         11         11 |        44
-----------+---------------------------------------------+----------
     Total |        383        383        383        383 |     1,532
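. didregress (satis) (procedure), group(hospital) time(month)

             |   Control  Treatment
-------------+---------------------
Group        |
    hospital |        28         18
-------------+---------------------
Time         |
     Minimum |         1          4
     Maximum |         1          4

              |               Robust
        satis | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
ATET          |
    procedure |
(New vs Old)  |   .8479879   .0321121    26.41   0.000     .7833108     .912665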
In this dataset, the treatment is the new admission procedure. The result of the tabulate command above indicates that all hospitals in the treatment group implemented the new procedure in April (month=4). The first table from the didregress command’s result shows that the minimum and maximum time of treatment assignment is month=4. Under these conditions, we can successfully conduct the parallel-trends and Granger causality tests, as well as generate diagnostic plots.
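As a complement to the tabulation, the first treated month can also be computed for each hospital directly. The following is a minimal sketch, not an official procedure; the variable name first_treat is hypothetical and is not part of the dataset.

```stata
webuse hospdd, clear

* First month in which each hospital is observed under the new procedure.
* Hospitals that are never treated receive a missing value.
bysort hospital: egen first_treat = min(cond(procedure == 1, month, .))

* A single distinct nonmissing value indicates one treatment assignment time.
tabulate first_treat

* With a single assignment time, the postestimation commands should run.
didregress (satis) (procedure), group(hospital) time(month)
estat ptrends
estat trendplots
estat granger
```

If tabulate reports only one value of first_treat, the minimum and maximum treatment times shown by didregress should coincide.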
However, suppose we manipulate the dataset so that the treatment assignment times differ within the treatment group, with hospital 1 implementing the new procedure in May (month=5) while the others begin in April (month=4):
. replace procedure=0 if hospital==1 & month==4
(23 real changes made)
. tabulate hospital month if procedure==1

  Hospital |                    Month
        ID |      April        May       June       July |     Total
-----------+---------------------------------------------+----------
         1 |          0         23         23         23 |        69
         2 |         21         21         21         21 |        84
         3 |         19         19         19         19 |        76
         4 |         25         25         25         25 |       100
         5 |         25         25         25         25 |       100
         6 |         25         25         25         25 |       100
         7 |         29         29         29         29 |       116
         8 |         22         22         22         22 |        88
         9 |         20         20         20         20 |        80
        10 |         21         21         21         21 |        84
        11 |         22         22         22         22 |        88
        12 |         20         20         20         20 |        80
        13 |         23         23         23         23 |        92
        14 |         19         19         19         19 |        76
        15 |         19         19         19         19 |        76
        16 |         21         21         21         21 |        84
        17 |         18         18         18         18 |        72
        18 |         11         11         11         11 |        44
-----------+---------------------------------------------+----------
     Total |        360        383        383        383 |     1,509
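. didregress (satis) (procedure), group(hospital) time(month)

             |   Control  Treatment
-------------+---------------------
Group        |
    hospital |        28         18
-------------+---------------------
Time         |
     Minimum |         1          4
     Maximum |         1          5

              |               Robust
        satis | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
ATET          |
    procedure |
(New vs Old)  |   .8330176   .0374345    22.25   0.000     .7576206    .9084145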
The Treatment column of the first table in the didregress results shows that the minimum treatment time is month=4 (April) and the maximum treatment time is month=5 (May). In this scenario, we will not be able to run estat ptrends, estat trendplots, or estat granger afterward because the treatment assignment time varies among the hospitals in the treatment group.
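The staggered case can be reproduced and inspected along the following lines. This is a sketch under stated assumptions: first_treat, the local macro earliest, and the scalar rc_ptrends are hypothetical names introduced here for illustration. It lists the treated hospitals that start later than the earliest treated month and then records the return code from estat ptrends instead of stopping on the error.

```stata
webuse hospdd, clear

* Reproduce the manipulation: hospital 1 starts in May instead of April
replace procedure = 0 if hospital == 1 & month == 4

* First treated month per hospital (missing for never-treated hospitals)
bysort hospital: egen first_treat = min(cond(procedure == 1, month, .))

* Show treated hospitals whose first treated month is later than the earliest
quietly summarize first_treat
local earliest = r(min)
tabulate hospital first_treat if first_treat > `earliest' & !missing(first_treat)

quietly didregress (satis) (procedure), group(hospital) time(month)

* capture keeps the do-file running and stores the return code in _rc;
* noisily still displays the error message
capture noisily estat ptrends
scalar rc_ptrends = _rc
display "estat ptrends return code: " rc_ptrends
```

Given the error messages shown earlier, the return code here should be 459, and the tabulation should single out hospital 1.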
Please note that the explanations provided above also apply to xtdidregress, which is used for handling panel/longitudinal data.
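For illustration, the same check might look as follows with xtdidregress. This is a sketch that assumes hospital serves as the panel identifier, declared with xtset; it uses the unmodified dataset, in which all treated hospitals start in April.

```stata
webuse hospdd, clear

* Assumption: hospital is the panel identifier for these data
xtset hospital

* Same outcome, treatment, group, and time specification as with didregress
xtdidregress (satis) (procedure), group(hospital) time(month)

* The first table of the output again reports the minimum and maximum
* treatment times; when they are equal, the postestimation commands
* should run without the r(459) error
estat ptrends
estat granger
```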