11.3 Interrupted Time Series
Interrupted time series (ITS) is a special case of RD design, where the break or level change is in the time variable. We can use time-series modeling techniques to model the data at the discontinuity. Time series discontinuities are some of the most believable RD designs in practice, so we’ll cover this special case here. ITS relies on a break in time to operationalize the statistical estimate of the pre-treatment trend line. Let’s unpack this a little more.
Suppose we have an intervention or some kind of treatment that varies by time. We can estimate the pre-intervention trend and project it into the post-treatment period to serve as the counterfactual (control group), then compare that projection with the actual post-treatment data (treatment group). That's the core idea behind ITS.
Just as with DID estimation, we need clear pre- and post-treatment periods. That is, we need to know when treatment was implemented. In the example depicted in Figure 11.4, a promotional campaign offers a 20% discount on downloading a streaming game starting June 1, 2019. The pre-treatment period is the 50 days ending May 31, and we use it to estimate the counterfactual trend for the post-treatment period, the 50 days starting June 1. Since this is a time-series example, the outcome is daily total profit in millions.
FIGURE 11.4 Interrupted time series design for downloads over time.
Since we’re estimating the trend, rather than relying on the actual trend, we need to make some decisions about how to model it. First, we need to think about how the intervention will affect the outcome, or how a promotional campaign will affect profit. Is the effect gradual or an immediate step change? In this example, let’s assume that we have both an immediate step change as the price immediately changes and a gradual change as the information about the promotion diffuses through the population.
In this section, we’ll go over a number of different ways to model an ITS design. First, we’ll examine a simple regression model; then, we’ll apply time-series modeling methods to estimate LATE. Before we apply time-series methods, we’ll provide a brief introduction to major concepts and modeling techniques in time series.
11.3.1 Simple Regression Analysis
The easiest way to model an interrupted time series is with a regression that includes dummy variables. Specifically, we can model ITS in the same way as we did with the DID design. The following is the regression equation that we will model:

$$\text{profit}_t = \beta_0 + \beta_1\,\text{time}_t + \beta_2\,\text{treatment}_t + \beta_3\,\text{timetx}_t + \varepsilon_t$$
The outcome variable is profit, so we'll use an OLS regression. Note that we could fit this with other models: if the outcome were a count, such as the number of daily downloads, we'd use a Poisson regression, and if it were binary, we'd use a logistic regression.
The time variable is the time elapsed since the start of the study, in units of the series frequency (days, in this case). The treatment variable is a dummy indicating the pre- versus post-intervention period. The time-since-treatment variable (timetx) is the number of days elapsed since treatment began, which is zero in the pre-period. The estimated treatment effect (ATT) is approximately 270, a statistically significant level change (Table 11.2). We calculate this table in Chapter 16, Listing 16.5.
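As a rough sketch of what this looks like in R (hypothetical data frame and column names; the book's full version is in Listing 16.5), the segmented regression can be fit with `lm()`:

```r
# its_df is a hypothetical daily data frame with one row per day:
#   profit    - daily total profit in millions (outcome)
#   time      - days elapsed since the start of the study (1, 2, ..., 100)
#   treatment - 0 before June 1, 2019; 1 on or after that date
#   timetx    - days elapsed since the promotion began (0 in the pre-period)
its_ols <- lm(profit ~ time + treatment + timetx, data = its_df)
summary(its_ols)  # the `treatment` coefficient is the level change at the cut point

# For a count outcome (e.g., daily downloads), swap in a Poisson GLM:
# glm(downloads ~ time + treatment + timetx, family = poisson, data = its_df)
```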
Table 11.2 Summary of the ITS OLS Regression Results
| | Estimate | Std. error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | –85.3292 | 10.4486 | –8.167 | 3.82e-14 *** |
| treatment | 270.1814 | 14.5952 | 18.512 | < 2e-16 *** |
| time | 4.7806 | 0.1814 | 26.35 | < 2e-16 *** |
| timetx | 1.2482 | 0.2528 | 4.937 | 1.69e-06 *** |
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 51.59 on 196 degrees of freedom
Multiple R-squared: 0.9864, Adjusted R-squared: 0.9862
F-statistic: 4737 on 3 and 196 DF, p-value: < 2.2e-16
In Figure 11.4, we can see nonlinearity and seasonality, so in this case a simple regression model will not produce the best fit. When we plot the regression model overlaid on our data, we can see that it fits the data poorly. (Try this for yourself in the R section, Chapter 16, Listing 16.5.) We need a better model that accounts for some of the nonlinearity and seasonality so that it produces better estimates.
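A quick way to see the poor fit yourself is to overlay the fitted values on the observed series; a minimal sketch using the same hypothetical `its_df` and `its_ols` objects as above:

```r
# Plot the observed daily profit and overlay the OLS fitted values
plot(its_df$time, its_df$profit, type = "l",
     xlab = "Days since start of study", ylab = "Profit (millions)")
lines(its_df$time, fitted(its_ols), col = "red", lty = 2)
legend("topleft", legend = c("Observed", "OLS fit"),
       col = c("black", "red"), lty = c(1, 2))
```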
A regression model is not well suited for ITS when the following conditions are present:
Seasonality: Periodic variation that occurs at fixed intervals
Time-varying confounders: Selection bias or confounding from other covariates that change over time
Overdispersion: Greater variability in the data than the model accounts for
Autocorrelation: Correlation between the process and its own values in prior periods
In the next section, we’ll explore time-series modeling techniques that statistically correct for some of these issues.
11.3.2 Time-Series Modeling
Many types of user behavior (such as sales or downloads) occur over periods of time. As we saw in the DID design, we can operationalize time to understand causal effects.
In this section, we’ll discuss the time-series modeling approaches for an ITS design. To get there, we’ll discuss some basic principles of time series that will help improve our ITS model. Note that we’ll discuss time-series concepts only in relation to quasi-experimental design techniques. If you are interested in using time-series modeling, it might be helpful to review the time-series concepts in Shumway and Stoffer’s (2006) Time Series Analysis and Its Applications: With R Examples.
In the next section, we’ll go over two core concepts in time-series analysis: autocorrelation and stationarity.
11.3.2.1 Autocorrelation and Stationarity
When one of your variables is time, that is, when you are working with time-series data, you'll deal with a few unique issues that might not arise with other types of independent variables. One of those issues is autocorrelation.
Autocorrelation
Autocorrelation is serial correlation between the values of a process at different times, expressed as a function of the time lag. As discussed in Chapter 6, correlation measures how two variables vary linearly with one another. In plain terms, autocorrelation is how a time series varies linearly with itself at different points in time, based on a lag of one or more periods. This means that prior periods help explain the current data, which is a problem because it breaks common statistical assumptions. Statisticians love to assume that data is random, independent, and identically distributed (the i.i.d. assumption), where independence means that observing one value tells you nothing about the other observations. Autocorrelation is pernicious because it violates the independence assumption.
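A common first diagnostic is to plot the autocorrelation function of the series and to test the regression residuals for serial correlation; a minimal sketch with the same hypothetical objects as above:

```r
# Autocorrelation function of daily profit: large spikes at lags 1, 2, ...
# indicate that past values help explain the current value
acf(its_df$profit)

# Ljung-Box test on the OLS residuals: a small p-value suggests the
# residuals are serially correlated rather than independent
Box.test(residuals(its_ols), lag = 7, type = "Ljung-Box")
```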
There are two ways to correct for autocorrelation in an ITS design: (1) use robust standard errors on the OLS estimates or (2) use a time-series model. The OLS results from Table 11.2 remain approximately the same with robust standard errors. In this example, there are more problems than just autocorrelation, so we'll use a time-series model to fit the data.
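For the first option, autocorrelation-robust (Newey-West) standard errors can be computed with the `sandwich` and `lmtest` packages; a hedged sketch, again assuming the `its_ols` fit from above:

```r
library(sandwich)  # NeweyWest() autocorrelation-consistent covariance estimator
library(lmtest)    # coeftest() to re-test coefficients with an alternative covariance matrix

# Point estimates are unchanged; only the standard errors (and p-values) are adjusted
coeftest(its_ols, vcov = NeweyWest(its_ols))
```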
Stationarity
The other core concept to understand when modeling time-series data is stationarity. Stationarity means that the mean and variance of the series are constant over time. A stationary process may vary in the short run, but in the long run it reverts to its historical mean and variance, whereas a nonstationary process does not. Stationarity is a nice property to see in time-series data, though it's not always present.
Stationarity is especially important for interrupted time series. The reason is rather obvious: It's much easier to identify a discontinuity in a stationary process than in a nonstationary one. Identifying a discontinuity in a nonstationary process generally requires a much longer pre-treatment period than post-treatment period, which may or may not be available. Even then, this approach can lead to errors, since theoretically the RD design is still only defined around the cut point.
Consider the DID modeling example from Chapter 10. In that case, the process was stationary and the effect size was large, so the effect was quite clear to the human eye. Conversely, with nonstationary processes, a small effect is often obscured by noise and by an upward or downward trend that is larger than the effect itself. A nonstationary series takes a more experienced hand to model correctly, and even then the results may not be believable.
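One common (though not definitive) check for stationarity is the augmented Dickey-Fuller test, and first-differencing is a standard remedy when a series looks nonstationary. A minimal sketch, assuming the `tseries` package and the hypothetical `its_df` from above:

```r
library(tseries)  # adf.test()

# Null hypothesis: the series has a unit root (is nonstationary);
# a small p-value is evidence in favor of stationarity
adf.test(its_df$profit)

# First-differencing removes a linear trend and often makes a series stationary
adf.test(diff(its_df$profit))
```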
11.3.2.2 ARMA/ARIMA Model
A common approach to dealing with autocorrelation in an ITS design is to use a time-series model like an ARMA/ARIMA model. The ARMA/ARIMA model is a model with two core components: an autoregression (AR) model and a moving average (MA) model.
An autoregression model is a linear regression in which we predict the series from its own lagged values: an AR(1) model uses a one-period lag, and an AR(2) model uses two period lags. A moving average model instead models the series as a function of past forecast errors: an MA(1) model uses the previous period's error term, and an MA(2) model uses the error terms from the previous two periods. The difference between ARMA and ARIMA is the 'I', the integrated component, which means the data values are replaced with the differences between consecutive observations (differencing).
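To build intuition for these pieces, we can simulate them directly with base R's `arima.sim()`; a minimal sketch (not the book's code):

```r
set.seed(42)

# AR(1): each value is a linear function of the previous value plus noise
ar1_series <- arima.sim(model = list(ar = 0.6), n = 200)

# MA(1): each value depends on the previous period's error term plus noise
ma1_series <- arima.sim(model = list(ma = 0.5), n = 200)

# The "I" in ARIMA: differencing replaces each value with its change from the
# previous value, which turns a random walk into a stationary series
random_walk <- cumsum(rnorm(200))
differenced <- diff(random_walk)

# Recover the AR(1) coefficient from the simulated series
arima(ar1_series, order = c(1, 0, 0))
```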
Returning to our sales promotion, we’ll apply an ARIMA model to our ITS example. Now, this may not solve all the underlying problems in our data set, but it may help us better fit our data and find the effects of the intervention.
Table 11.3 shows the result of the ARIMA fit with treatment modeled as both a step and a gradual change. In Chapter 16, Listing 16.6, we implement this ARIMA model.
Table 11.3 Interrupted Time Series with ARIMA Model Fit
| Coefficients | Estimate | Std. error | Pr(>\|t\|) |
|---|---|---|---|
| ar1 | –0.1018 | 0.3265 | 0.755 |
| ma1 | 0.2905 | 0.3112 | 0.351 |
| sar1 | 0.4878 | 0.0907 | 0.000*** |
| sma1 | 0.2804 | 0.0997 | 0.000*** |
| Treatment | 2.0008 | 1.3723 | 0.145 |
| Gradual treatment | –0.267 | 1.3841 | 0.847 |
From this model, we can see that there was no effect of the treatment: neither the step change nor the gradual change is statistically significant. The model has autoregressive (ar1), moving average (ma1), and seasonal (sar1, sma1) components. Once we account for the autocorrelation and seasonality, there is no detectable effect of the intervention.
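A hedged sketch of how such a model can be fit with base R's `arima()`, using a step regressor for the immediate change and a ramp regressor for the gradual change (hypothetical object names and seasonal period; the book's actual implementation is Listing 16.6):

```r
# 100 days of daily profit; the promotion starts on day 51 (hypothetical setup)
n_days    <- 100
promo_day <- 51
day       <- seq_len(n_days)

step <- as.numeric(day >= promo_day)   # immediate level shift at the intervention
ramp <- pmax(0, day - promo_day + 1)   # gradual change that grows after the intervention

# Treat the daily series as having weekly seasonality (frequency = 7)
y <- ts(its_df$profit, frequency = 7)

# ARIMA(1,0,1) with seasonal (1,0,1) components: ar1, ma1, sar1, sma1
its_arima <- arima(y,
                   order    = c(1, 0, 1),
                   seasonal = list(order = c(1, 0, 1)),
                   xreg     = cbind(step, ramp))
its_arima  # the `step` and `ramp` coefficients are the step and gradual treatment effects
```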
Modeling any counterfactual is extremely difficult. We'll return to this idea many times in upcoming chapters. Small modeling decisions can lead to large changes in effect sizes and other estimates. Many models are extremely sensitive to small changes because real data is highly complex, with annual, weekly, and daily seasonality, cyclic trends (like the business cycle), noncyclic trends (product growth or decline), outliers, and more.
As in the RD example, as data analysts we should try a variety of models and check all potential confounders as robustness checks. We presented an example of this work in the previous section, so we'll skip the confounder validation methods here. However, it's extremely important not to omit these steps in practice.
In the next section, we’ll cover a very useful business tool—seasonality decomposition. When we want to extract the trendline from time-series data, we can use seasonality decomposition to help remove noise and cyclic patterns.
RD in Broader Context
In this chapter, we learned that the RD design can be a powerful tool for finding localized causal effects. RD is useful for a few reasons. First, we are exploiting as-good-as-random variation in the forcing variable, which means that we do not have to implement an experiment ourselves. Second, RD lends itself to creative application, as there are many types of RD designs and the basic requirement is only a break or level change in treatment at the cut point. Unlike many other causal inference methods, RD can be readily probed, and potentially invalidated, with good graphing. Graphing an RD can show us whether selection is occurring at the cut point, and it can even reveal the confounding variables that are preventing causal inference.
One of the strongest use cases for RD is the ITS design. ITS has the nice properties of an RD design along with the benefits of using time as the forcing variable. ITS, DID, and other designs with temporal variables can be improved with a better understanding of time-series modeling.
This chapter has offered another method to derive causal insights from observational data. As described in Chapter 3, causal insights are easily actionable and prescriptive, which make them more valuable than other types of insights for altering user behavior.