In the spotlight: Heterogeneous DID: A new way to approach an old problem
The first reference to difference in differences (DID) is John Snow's work in the 1850s. He was trying to demonstrate that contaminated water caused cholera and to refute alternative explanations. To do so, he needed to establish the existence of a causal mechanism while abstracting from any confounding effects. This is at the heart of causal inference, and it reflects an old human desire: we like to know whether an action we undertake has a causal consequence that is not confounded by other factors.
DID controls for unobserved group and time effects. What is left, after controlling for both of these effects, is the effect of a treatment on the group subject to the intervention, an average treatment effect on the treated (ATET). In Stata 17, we provided DID estimators for repeated cross-sectional data (didregress) and for longitudinal data (xtdidregress). We also provided diagnostics for the assumptions usually required for the approach to be valid: a test of parallel, or common, trends (estat ptrends) and a test of anticipation (estat granger). Both of them have graphical and tabular outputs.
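As a minimal sketch of that workflow, the homogeneous-ATET estimator and the two diagnostics named above could be run on the akc dataset that is introduced later in this article. The choice of covariate (best) simply mirrors the later example and is an assumption here, not a recommendation.

```stata
* Sketch only: homogeneous-ATET DID on the article's example data
webuse akc, clear
xtset breed year

* Outcome and covariates first, treatment second; group() and time()
* name the group and time effects being controlled for
xtdidregress (registered best) (movie), group(breed) time(year)

* Diagnostics discussed in the text
estat ptrends     // test of parallel pretreatment trends
estat granger     // test of anticipation of the treatment
```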
Yet there is another assumption underlying the validity of the estimators in didregress and xtdidregress that needs to be accounted for. Both of the estimators assume that the ATET does not change over time and that, if individuals are treated at different points in time, their ATETs are the same. In other words, the ATETs are assumed to be homogeneous across time and treatment cohorts.
In Stata 18, you can check whether homogeneity is to be trusted. After didregress or xtdidregress, you would type
. estat bdecomp
This gives you the Bacon decomposition. It decomposes the ATET into a weighted average of two-period two-group treatment effects, which lets us know if there is timing or group variation. It also points out “bad” comparisons that occur when running didregress or xtdidregress. These occur when an already treated group acts as a control group.
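A short sketch of where estat bdecomp sits in a session, again using the akc data from the example later in this article. The model is fit without covariates here only to keep the sketch simple; that is a choice made for illustration.

```stata
* Sketch only: fit the homogeneous model, then decompose its ATET
webuse akc, clear
xtset breed year
xtdidregress (registered) (movie), group(breed) time(year)

* Bacon decomposition of the ATET reported by xtdidregress
estat bdecomp
```

Because breeds in these data are first featured in different years, this is the kind of staggered setting in which comparisons against already treated groups can arise.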
So what do you do if there is ATET heterogeneity or if you do not want to assume homogeneity? Use the new hdidregress and xthdidregress commands to fit models for which the ATET changes over time or over treatment cohort. You will get ATET estimates that vary over treatment cohort and time, and you may graph these effects.
Let's see it work
Suppose we are interested in how the number of registrations of a dog breed with the American Kennel Club (AKC), registered, is affected by dogs being the protagonists in a movie, movie. We conjecture that the number of registrations increases if the dog breed appears as the protagonist in a movie. We also conjecture that registrations increase if the dog has won the Best in Show award from the Westminster Kennel Club, best, in the 10 years before 2034. We use simulated future data, but there is some past evidence of the effect of movies on dog breed registrations. See, for example, Ghirlanda, Acerbi, and Herzog (2014).
There are 141 dog breeds in our sample, which covers the years 2031 to 2040. At the beginning of the sample, none of the breeds are featured in a movie. This changes in 2034, when 4 breeds are featured in movies. The next increase comes in 2036, when the number of featured breeds rises to 7. In 2037, there is a substantial increase, to 22 featured breeds. The table below illustrates this.
. webuse akc, clear
(Fictional dog breed and AKC registration data)

. tabulate year movie

           |      Was a movie
           |      protagonist
      Year |         0          1 |     Total
-----------+----------------------+----------
      2031 |       141          0 |       141
      2032 |       141          0 |       141
      2033 |       141          0 |       141
      2034 |       137          4 |       141
      2035 |       137          4 |       141
      2036 |       134          7 |       141
      2037 |       119         22 |       141
      2038 |       119         22 |       141
      2039 |       119         22 |       141
      2040 |       119         22 |       141
-----------+----------------------+----------
     Total |     1,307        103 |     1,410
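The tabulation shows how many breeds are treated in each year, not when each breed was first treated. One way to recover the first treatment year per breed is sketched below. The variable names first_year and tag are hypothetical helpers, and the sketch assumes that treatment is absorbing (once a breed has movie equal to 1, it stays 1), which the nondecreasing counts above are consistent with but do not prove.

```stata
* Sketch only: derive each breed's first treatment year from the raw data
webuse akc, clear

* first_year (hypothetical name): earliest year with movie == 1,
* missing for breeds that are never featured
bysort breed: egen first_year = min(cond(movie == 1, year, .))

* tag (hypothetical name): marks one observation per breed
egen byte tag = tag(breed)

* Count breeds, not observations, by first treatment year
tabulate first_year if tag, missing
```

Under the absorbing-treatment assumption, the counts above imply 4 breeds first treated in 2034, 7 - 4 = 3 in 2036, and 22 - 7 = 15 in 2037, with 119 never treated.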
There are three treatment cohorts (2034, 2036, and 2037). We conjecture that treatment effects differ and evolve over time. To fit a regression adjustment model after using xtset, we type
. xtset breed year
. xthdidregress ra (registered best) (movie), group(breed)
In the first set of parentheses, we define the outcome, registered, and any covariates that affect the outcome directly. In the second set of parentheses, we define the observation-level treatment variable, movie. After the comma, we need to define the group variable in group(); this is a required option. The group variable defines at which level the treatment occurs and also identifies the clustering variable, which in this case is breed. We obtain
. xthdidregress ra (registered best) (movie), group(breed)
note: variable _did_cohort, containing cohort indicators formed by treatment
      variable movie and group variable breed, was added to the dataset.

Computing ATET for each cohort and time:
Cohort 2034 (9): ......... done
Cohort 2036 (9): ......... done
Cohort 2037 (9): ......... done

Treatment and time information

Time variable: year
Time interval: 2031 to 2040
Control:       _did_cohort = 0
Treatment:     _did_cohort > 0

-----------------------------------
                   |  _did_cohort
-------------------+---------------
 Number of cohorts |            4
-------------------+---------------
     Number of obs |
     Never treated |         1190
              2034 |           40
              2036 |           30
              2037 |          150
-----------------------------------

------------------------------------------------------------------------------
             |                Robust
      Cohort |       ATET  std. err.        z   P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
2034         |
        year |
        2032 |  -254.8927   266.1024    -0.96   0.338    -776.4439    266.6584
        2033 |  -257.5329   217.9389    -1.18   0.237    -684.6852    169.6194
        2034 |   701.1318   127.0935     5.52   0.000     452.0331    950.2304
        2035 |   1099.044   282.0704     3.90   0.000      546.196    1651.892
        2036 |   1367.632   225.8702     6.05   0.000     924.9343    1810.329
        2037 |   2008.294   237.2396     8.47   0.000     1543.313    2473.275
        2038 |   2472.624   278.2949     8.88   0.000     1927.176    3018.072
        2039 |   2689.615   504.3324     5.33   0.000     1701.142    3678.088
        2040 |    3110.97    568.916     5.47   0.000     1995.915    4226.025
-------------+----------------------------------------------------------------
2036         |
        year |
        2032 |   216.0259   122.9107     1.76   0.079    -24.87472    456.9265
        2033 |  -172.5154   372.0776    -0.46   0.643    -901.7741    556.7433
        2034 |  -218.0495   504.5267    -0.43   0.666    -1206.904    770.8045
        2035 |    621.033   156.1306     3.98   0.000     315.0227    927.0434
        2036 |   999.0781   180.1055     5.55   0.000     646.0779    1352.078
        2037 |   1003.333   250.5916     4.00   0.000     512.1829    1494.484
        2038 |   1556.669   451.6914     3.45   0.001     671.3697    2441.967
        2039 |   2590.674   662.6979     3.91   0.000      1291.81    3889.538
        2040 |   2225.712   486.9917     4.57   0.000     1271.225    3180.198
-------------+----------------------------------------------------------------
2037         |
        year |
        2032 |   -114.582   160.0972    -0.72   0.474    -428.3668    199.2028
        2033 |  -127.9856   183.3941    -0.70   0.485    -487.4315    231.4603
        2034 |   33.40901   168.0312     0.20   0.842    -295.9262    362.7442
        2035 |   130.3495   166.2261     0.78   0.433    -195.4477    456.1468
        2036 |  -10.48288   167.5059    -0.06   0.950    -338.7884    317.8226
        2037 |   1717.016   268.5592     6.39   0.000      1190.65    2243.383
        2038 |   2086.798   278.0215     7.51   0.000     1541.886     2631.71
        2039 |   2473.611    268.186     9.22   0.000     1947.976    2999.246
        2040 |   2835.117   378.6699     7.49   0.000     2092.938    3577.296
------------------------------------------------------------------------------
Notice the note below the command. A variable with the name _did_cohort has been generated. Using the group variable and the observation-level treatment, xthdidregress generated treatment-time cohorts. The new variable creates treatment groups based on the time when a group was first treated. For instance, if a Boxer and a Rottweiler are featured in movies in 2034, they are grouped in the 2034 cohort. The variable also contains a category for a control group. In this case, the control group is formed by the breeds that are not featured in a movie. Cohorts are important inputs for estimation and postestimation commands.
Next appears a table that gives you a sense of the treatment groups and times. You see the time variable, year, and its range, 2031 to 2040. Then we see what defines a treated or a control group. The table below that provides group-level information about the cohort–time groups. The first row tells you the number of cohorts. Following the number of cohorts is a tabulation showing how many observations are in each cohort. For instance, 1,190 observations are never treated in our data. The table gives you a sense of the amount of information available in each cohort and might hint at the variability of cohort-level estimates.
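The observation counts can be reconciled with the year-by-movie tabulation shown earlier. The sketch below refits the model so that _did_cohort exists, tabulates it, and then repeats the arithmetic by hand: each breed contributes 10 yearly observations, and the breed counts per cohort are inferred from the tabulation under the assumption that a breed stays treated once it is featured.

```stata
* Sketch only: inspect the cohort variable created by xthdidregress
webuse akc, clear
xtset breed year
xthdidregress ra (registered best) (movie), group(breed)

* 0 marks the never-treated group; other values are first treatment years
tabulate _did_cohort

* Breeds per cohort (inferred from the tabulation) times 10 years each
display 4*10          // 2034 cohort
display (7 - 4)*10    // 2036 cohort
display (22 - 7)*10   // 2037 cohort
display 119*10        // never treated
```

The four products are 40, 30, 150, and 1,190, which sum to the 1,410 observations in the data.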
It is difficult to see the trends in ATETs just by looking at all the ATET estimates. We can use estat atetplot to visualize the time profile of the ATETs for each cohort.
. estat atetplot
After fitting the model, we can use estat aggregation to aggregate the ATETs within cohort, time, and exposure to treatment. This command provides a summary of different aspects of ATETs. For example, we use estat aggregation, cohort to summarize the ATETs within cohort. We also specify option graph to obtain a graph of aggregations in addition to the tabular output.
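. estat aggregation, cohort graph

ATET over cohort                                         Number of obs = 1,410

                                (Std. err. adjusted for 141 clusters in breed)
------------------------------------------------------------------------------
             |                Robust
      Cohort |       ATET  std. err.        z   P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        2034 |    1921.33   187.2787    10.26   0.000     1554.271    2288.389
        2036 |   1675.093   130.4929    12.84   0.000     1419.332    1930.855
        2037 |   2278.136   166.5283    13.68   0.000     1951.746    2604.525
------------------------------------------------------------------------------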
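To see what the cohort aggregation summarizes, the reported values can be reproduced from the cohort-by-year estimates shown earlier. In this output, each cohort's aggregate matches the simple average of that cohort's ATETs from its first treatment year onward. This is an observation about the displayed numbers, not a statement of the documented formula.

```stata
* Sketch only: averages of the post-treatment ATETs reported for each cohort

* 2034 cohort, years 2034 to 2040 (compare with 1921.33)
display (701.1318 + 1099.044 + 1367.632 + 2008.294 + 2472.624 + 2689.615 + 3110.97)/7

* 2036 cohort, years 2036 to 2040 (compare with 1675.093)
display (999.0781 + 1003.333 + 1556.669 + 2590.674 + 2225.712)/5

* 2037 cohort, years 2037 to 2040 (compare with 2278.136)
display (1717.016 + 2086.798 + 2473.611 + 2835.117)/4
```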
If we want to summarize ATETs within time, we specify option time with estat aggregation.
. estat aggregation, time graph

ATET over time                                           Number of obs = 1,410

                                (Std. err. adjusted for 141 clusters in breed)
------------------------------------------------------------------------------
             |                Robust
        Time |       ATET  std. err.        z   P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        2034 |   701.1318   127.0935     5.52   0.000     452.0331    950.2304
        2035 |   1099.044   282.0704     3.90   0.000      546.196    1651.892
        2036 |    1209.68   170.2043     7.11   0.000     876.0858    1543.275
        2037 |   1672.655   202.1854     8.27   0.000     1276.379    2068.932
        2038 |   2084.658   214.5072     9.72   0.000     1664.232    2505.084
        2039 |   2528.847   225.8763    11.20   0.000     2086.138    2971.557
        2040 |   2802.171   291.8412     9.60   0.000     2230.173     3374.17
------------------------------------------------------------------------------
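The time aggregation can be checked the same way. In 2034 and 2035, only the 2034 cohort is treated, so the values equal that cohort's ATETs. From 2036 on, the reported values match an average of the treated cohorts' ATETs weighted by the number of breeds in each cohort (4, 3, and 15, from the observation counts divided by 10 years). The weights are inferred from the output, not taken from the documentation.

```stata
* Sketch only: breed-weighted averages across the cohorts treated in a year

* 2036: cohorts 2034 (4 breeds) and 2036 (3 breeds); compare with 1209.68
display (4*1367.632 + 3*999.0781)/(4 + 3)

* 2037: all three cohorts are treated; compare with 1672.655
display (4*2008.294 + 3*1003.333 + 15*1717.016)/(4 + 3 + 15)
```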
Finally, if we want to summarize ATETs over different lengths of exposure to treatment, we specify option dynamic.
. estat aggregation, dynamic graph

Duration of exposure ATET                                Number of obs = 1,410

                                (Std. err. adjusted for 141 clusters in breed)
------------------------------------------------------------------------------
             |                Robust
    Exposure |       ATET  std. err.        z   P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          -5 |   -114.582   160.0972    -0.72   0.474    -428.3668    199.2028
          -4 |  -70.65034   156.3185    -0.45   0.651     -377.029    235.7283
          -3 |  -.9117242   153.0999    -0.01   0.995     -300.982    299.1585
          -2 |   12.79653   144.8216     0.09   0.930    -271.0486    296.6417
          -1 |   30.71473   132.8508     0.23   0.817     -229.668    291.0975
           0 |   1434.409   206.3277     6.95   0.000     1030.014    1838.804
           1 |   1759.461   224.0229     7.85   0.000     1320.385    2198.538
           2 |   2147.486    221.903     9.68   0.000     1712.564    2582.408
           3 |   2651.452   284.8928     9.31   0.000     2093.073    3209.832
           4 |   2366.805   267.4253     8.85   0.000     1842.661    2890.949
           5 |   2689.615   504.3324     5.33   0.000     1701.142    3678.088
           6 |    3110.97    568.916     5.47   0.000     1995.915    4226.025
------------------------------------------------------------------------------
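Exposure here lines up with year minus the cohort's first treatment year, so exposure 0 combines each cohort's ATET in its own first treated year, and exposure -5 or 6 can come from only one cohort. The sketch below reproduces three rows from the cohort-by-year estimates, again using breed counts (4, 3, 15) as weights. That weighting is inferred from the output rather than taken from the documentation.

```stata
* Sketch only: rebuild selected exposure rows from the cohort-by-year ATETs

* Exposure 0: 2034 cohort in 2034, 2036 cohort in 2036, 2037 cohort in 2037
* (compare with 1434.409)
display (4*701.1318 + 3*999.0781 + 15*1717.016)/(4 + 3 + 15)

* Exposure 4: only the 2034 cohort (in 2038) and the 2036 cohort (in 2040)
* are observed that long after treatment (compare with 2366.805)
display (4*2472.624 + 3*2225.712)/(4 + 3)

* Exposure -1: the year before each cohort's first treated year
* (compare with 30.71473)
display (4*(-257.5329) + 3*621.033 + 15*(-10.48288))/(4 + 3 + 15)
```

Rows at the extremes rest on a single cohort, which is one reason their standard errors can be larger than those near exposure 0.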
Parting words
In Stata, we can fit models with homogeneous or heterogeneous ATETs, check model assumptions, and explore results in tabular and graphical forms. To learn more, see [CAUSAL] hdidregress and [CAUSAL] xthdidregress.
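As a starting point for experimenting, two variations on the example are sketched below: the repeated cross-section command, which needs time() because there is no xtset declaration to read the time variable from, and the two-way fixed-effects estimator of xthdidregress. Treating the akc panel as repeated cross-sections is done here only to show the syntax; whether either variant suits a given application is a modeling decision that this sketch does not address.

```stata
* Sketch only: variations on the model fit in this article
webuse akc, clear

* Repeated cross-sections: time() is specified explicitly
hdidregress ra (registered best) (movie), group(breed) time(year)

* Panel data with the two-way fixed-effects estimator instead of ra
xtset breed year
xthdidregress twfe (registered best) (movie), group(breed)

* The same postestimation tools apply after either fit
estat aggregation, cohort
```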
Reference
Ghirlanda, S., A. Acerbi, and H. Herzog. 2014. Dog movie stars and dog breed popularity: A case study in media influence on choice. PLOS ONE 9: e106565. https://doi.org/10.1371/journal.pone.0106565.
— by Enrique Pinzón
Director of Econometrics