contdid

An R package for difference-in-differences with a continuous treatment.

Installation

You can install the development version of contdid from GitHub with:

# install.packages("devtools")
devtools::install_github("bcallaway11/contdid")

library(contdid)

Package Status: Alpha Version

This is an alpha version of the contdid package. The core features are implemented and functional, but the package remains under active development. The API may change, and additional functionality is planned.

We welcome feedback and encourage users to report bugs or other issues via the GitHub Issues page.

✅ Currently Supported

🚫 Not Yet Supported

Conceptual Setup

Below, we give several examples of how to estimate causal effect parameters using the contdid package.

At a high level, the interface is basically the same as for the did package and for packages that rely on the ptetools backend, with only a few pieces of additional information being required. First, the name of the continuous treatment variable should be passed through the dname argument.

The cont_did function expects the continuous treatment variable to behave in certain ways:

  1. It needs to be time-invariant.

  2. It should be set to its time-invariant value in pre-treatment periods. This is just a convention of the package, but, in particular, you should not have the treatment variable coded as being 0 in pre-treatment periods.

  3. For units that don’t participate in the treatment in any time period, the treatment variable just needs to be time-invariant. In some applications, e.g., the continuous treatment variable may be defined for units that don’t actually participate in the treatment. In other applications, it may not be defined for units that do not participate in the treatment. The function behaves the same way in either case.

Next, the other important parameters are target_parameter, aggregation, and treatment_type:

Examples

With a continuous treatment, the underlying building blocks are treatment effects that are local to a particular timing group g in a particular time period t that experienced a particular value of the treatment d. These treatment affects are relatively high-dimensional, and most applications are likely to involve aggregating/combining these underlying parameters. We focus on aggregations that (i) average across timing-groups and time periods to given average treatment effect parameters as a function of the dose d or (ii) averages across doses and partially across timing group and time periods in order to give event studies.

For the results below, we will simulate some data, where the continuous treatment D has no effect on the outcome.

# Simulate data
set.seed(1234)
df <- simulate_contdid_data(
  n = 5000,
  num_time_periods = 4,
  num_groups = 4,
  dose_linear_effect = 0,
  dose_quadratic_effect = 0
)
head(df)
#>   id G          D time_period         Y
#> 1  1 2 0.08593221           1 0.3579583
#> 2  1 2 0.08593221           2 5.2354694
#> 3  1 2 0.08593221           3 3.2717079
#> 4  1 2 0.08593221           4 4.3988042
#> 5  2 4 0.17217781           1 5.9743351
#> 6  2 4 0.17217781           2 5.8463051

Case 1: Dose Aggregation

The following code can be used to estimate the ATT(d) and ACRT(d) parameters for the continuous treatment D using the cont_did function. The aggregation argument is set to “dose” and the target_parameter argument is set to “level” for ATT(d) and “slope” for ACRT(d).

cd_res <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df,
  gname = "G",
  target_parameter = "slope",
  aggregation = "dose",
  treatment_type = "continuous",
  control_group = "notyettreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3,
)

summary(cd_res)
#> 
#> Overall ATT:  
#>      ATT    Std. Error     [ 95%  Conf. Int.] 
#>  -0.0265        0.0301    -0.0855      0.0325 
#> 
#> 
#> Overall ACRT:  
#>    ACRT    Std. Error     [ 95%  Conf. Int.]  
#>  0.1341        0.0485      0.039      0.2293 *
#> ---
#> Signif. codes: `*' confidence band does not cover 0
ggcont_did(cd_res, type = "att")

ggcont_did(cd_res, type = "acrt")

Case 2: Event Study Aggregations

Next, we consider event study aggregations. The first is an event study aggregation for ATT. The second is an event study aggregation for ACRT.

Event study aggregation for ATT:

Notice that the target parameter is set level to target ATT, and the aggregation argument is set to eventstudy.

cd_res_es_level <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df,
  gname = "G",
  target_parameter = "level",
  aggregation = "eventstudy",
  treatment_type = "continuous",
  control_group = "notyettreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3,
)

summary(cd_res_es_level)
#> 
#> Overall ATT:  
#>      ATT    Std. Error     [ 95%  Conf. Int.] 
#>  -0.0243        0.0289    -0.0808      0.0323 
#> 
#> 
#> Dynamic Effects:
#>  Event Time Estimate Std. Error   [95%  Conf. Band] 
#>          -2  -0.0222     0.0504 -0.1488      0.1044 
#>          -1   0.0116     0.0271 -0.0565      0.0798 
#>           0  -0.0039     0.0299 -0.0790      0.0713 
#>           1  -0.0160     0.0397 -0.1157      0.0837 
#>           2  -0.0839     0.0419 -0.1891      0.0212 
#> ---
#> Signif. codes: `*' confidence band does not cover 0
ggcont_did(cd_res_es_level)

Event study aggregation for ACRT:

Relative to the previous code, notice that the target parameter is set slope to target ACRT.

cd_res_es_slope <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df,
  gname = "G",
  target_parameter = "slope",
  aggregation = "eventstudy",
  treatment_type = "continuous",
  control_group = "notyettreated",
  biters = 100,
  cband = TRUE,
  num_knots = 1,
  degree = 3,
)

summary(cd_res_es_slope)
#> 
#> Overall ACRT:  
#>     ATT    Std. Error     [ 95%  Conf. Int.]  
#>  0.1341        0.0584     0.0197      0.2485 *
#> 
#> 
#> Dynamic Effects:
#>  Event Time Estimate Std. Error   [95%  Conf. Band]  
#>          -2  -0.0701     0.0808 -0.2710      0.1308  
#>          -1  -0.2212     0.0892 -0.4431      0.0007  
#>           0   0.1592     0.0586  0.0136      0.3048 *
#>           1   0.0551     0.0828 -0.1509      0.2610  
#>           2  -0.5405     0.1154 -0.8275     -0.2535 *
#> ---
#> Signif. codes: `*' confidence band does not cover 0
ggcont_did(cd_res_es_slope)

Case 3: Data-Driven Nonparametric Model for Treatment Effects

In most applications, it is hard to know the correct functional form for the treatment effects as a function of the dose. In Callaway, Goodman-Bacon, and Sant’Anna (2025), the approach we emphasize comes from Chen, Christensen, and Kankanala (2025), and the contdid package uses their npiv package to implement this approach. Code-wise, the only thing to change is to set the argument dose_est_method="cck". [Note that we currently only support this option for the case with two periods and no staggered adoption. With more periods, you can average the pre- and post-treatment periods to reduce it to a two period case and then run the code below; in fact, this is what we did in the application in our paper.]

# simulate data with only two periods
# add quadratic effect to see how well we can detect it
# (note code will not "know" that the effect is quadratic)
df2 <- simulate_contdid_data(
  n = 5000,
  num_time_periods = 2,
  num_groups = 2,
  dose_linear_effect = 0,
  dose_quadratic_effect = 1
)
df2$D[df2$G == 0] <- 0
head(df2)
#>   id G        D time_period           Y
#> 1  1 2 0.890987           1  1.84272906
#> 2  1 2 0.890987           2  5.30890209
#> 3  2 0 0.000000           1  0.59423237
#> 4  2 0 0.000000           2  2.81443324
#> 5  3 0 0.000000           1  1.77438193
#> 6  3 0 0.000000           2 -0.01032246

cd_res_cck <- cont_did(
  yname = "Y",
  tname = "time_period",
  idname = "id",
  dname = "D",
  data = df2,
  gname = "G",
  target_parameter = "level",
  aggregation = "dose",
  treatment_type = "continuous",
  dose_est_method = "cck",
  control_group = "notyettreated",
  biters = 100,
  cband = TRUE,
)

summary(cd_res_cck)
#> 
#> Overall ATT:  
#>     ATT    Std. Error     [ 95%  Conf. Int.]  
#>  0.3399         0.037     0.2673      0.4125 *
#> 
#> 
#> Overall ACRT:  
#>    ACRT    Std. Error     [ 95%  Conf. Int.]  
#>  0.6595        0.1853     0.2964      1.0226 *
#> ---
#> Signif. codes: `*' confidence band does not cover 0
ggcont_did(cd_res_cck) +
  stat_function(
    fun = function(x) x^2,
    aes(color = "Truth"),
    linetype = "dashed",
    size = 1
  ) +
  scale_color_manual(values = c("Truth" = "red")) +
  labs(color = "")

References

Callaway, Brantly, Andrew Goodman-Bacon, and Pedro H. C. Sant’Anna. 2025. “Difference-in-Differences with a Continuous Treatment.”
Chen, Xiaohong, Timothy Christensen, and Sid Kankanala. 2025. “Adaptive Estimation and Uniform Confidence Bands for Nonparametric Structural Functions and Elasticities.” Review of Economic Studies 92 (1): 162–96.