11.4 Seasonality Decomposition
Seasonality involves known patterns that repeat over fixed time intervals. Seasonal patterns can occur on different time scales, such as daily, weekly, monthly, or yearly.
Seasonality can bias results in the ITS design in two ways. First, high variation or a seasonal peak during the treatment period (such as a treatment window that spans Black Friday through Christmas for retailers) can bias results. Second, autocorrelation can occur on top of seasonality, with the mean or variance tied to prior periods.
In this section, we will not cover how to remove seasonality bias in ITS models, given the statistical rigor required and the narrow applicability. Building a model that removes the effect of seasonality can be achieved with an ARIMA model or with Fourier terms; both methods are outside the scope of this text. Please refer to Bhaskaran et al. (2013) for more information on how to obtain seasonality-adjusted estimates.
Now we’ll cover regular seasonality decomposition, which is useful in a variety of business contexts. In most cases, simple seasonality decomposition, combined with an understanding of the business’s seasonal cycle, is sufficient to assess the validity of these designs. We’ll offer only a gentle introduction to seasonality decomposition; you can refer to reference texts for more technical rigor. In Chapter 16, Listing 16.7, we cover seasonality decomposition in R.
Seasonality decomposition is an extremely important technique in practice because almost all purchasing data (i.e., behavioral data) varies by hour, day, week, month, or some other time unit. To examine general time trends, it’s often very useful to apply seasonality decomposition. For instance, with our snowmobile website, we may see more traffic and sales on weekends or before or during the Christmas season. We want to understand both how sales vary throughout the year and how the general trend looks over time.
Seasonality should not be confused with a general “cyclic” pattern, which we’ll extract separately. Seasonality differs from a cyclic pattern in that it is fixed and occurs over a known period. For instance, the business cycle may drive multiyear ups and downs in your data that are not tied to a fixed calendar period. There may also be nonrepeating cycles, irregular data, errors, and general random noise. Seasonality decomposition is one way of trying to model and isolate the individual components of the data. These modeling techniques are relatively simple compared to the true complexity of the data; they provide one way to look at and think about it. We will discuss their limitations at the end of this section.
In this section, we’ll discuss how to extract the seasonal component from time-series data. If we’re trying to understand causal relationships in time-series data, seasonality is always a potential confounder.
Seasonality decomposition has four components: (1) the seasonal component, (2) the cyclic component, (3) the trend, and (4) the error. Think back to sales of snowmobiles. The seasonal component would be the difference in monthly sales due to peaks in buying before Christmas. The cyclic component could be the business cycle. The trend component would be the underlying long-run movement of sales. The error component includes the irregular or outlier purchases made on certain days.
There are two main types of seasonality decomposition: the additive and the multiplicative model. The multiplicative model assumes the height and width of fluctuations are proportional to the level, or average value, of the series, while the additive model assumes the height and width of these changes stay constant over time. Generally, the multiplicative model is more appropriate for user data.
Additive model:
Y = Seasonal + Cyclic + Trend + Error
Multiplicative model:
Y = Seasonal * Cyclic * Trend * Error
For our data, we’ll use the multiplicative model.
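In R, classical decomposition in either form can be run with the built-in decompose() function. The short sketch below uses an invented monthly series purely for illustration; note that decompose() estimates trend, seasonal, and random components, folding any cyclic component into the trend.

# An invented monthly series with a seasonal pattern and an upward drift.
monthly_sales <- ts(rep(c(10, 12, 14, 13, 11, 9, 8, 9, 11, 13, 16, 22), 3) *
                      seq(1, 2, length.out = 36),
                    frequency = 12)

# Classical decomposition; the type argument selects the additive or
# multiplicative form.
add_fit  <- decompose(monthly_sales, type = "additive")
mult_fit <- decompose(monthly_sales, type = "multiplicative")
plot(mult_fit)  # panels: observed, trend, seasonal, random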
Here we’ll explain how the multiplicative seasonal decomposition is approximately calculated:
Normalize by Mean
We divide all values in the series by the series mean. If the mean is zero, then we do not divide the values.
Moving Averages
A core element of time-series models is the moving average: the average of values from a fixed number of prior periods, used to smooth the series or predict future values. Suppose we had the following eight values: (2, 6, 5, 7, 1, 0, 8, 2). A three-period trailing moving average would be (NA, NA, NA, 4.33, 6, 4.33, 2.67, 3, 3.33): at each position we sum the prior three values and divide by 3, so the first three positions have no estimate and the final value is the prediction for a ninth period.
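As a quick check of the arithmetic, here is a small R sketch that reproduces the trailing three-period moving average above (the name ma3 is just a label for this example).

x <- c(2, 6, 5, 7, 1, 0, 8, 2)

# Trailing three-period moving average: the mean of the prior three values.
# The ninth entry is the one-step-ahead prediction for a hypothetical ninth period.
ma3 <- sapply(1:(length(x) + 1), function(i) {
  if (i < 4) NA else mean(x[(i - 3):(i - 1)])
})
round(ma3, 2)
#> [1]   NA   NA   NA 4.33 6.00 4.33 2.67 3.00 3.33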
Calculating the Trend
We calculate the trend from the moving average. We fit an OLS model with the moving average as the outcome and a time index as the predictor; the fitted values form the trend line.
Calculating the Cycle
We calculate the cycle from the moving average divided by the trend.
Calculating Seasonality
Seasonality is the true outcome (Y) divided by the moving average. This ratio contains both the seasonal component and the error. To extract just the seasonal component, we average the ratio across all observations that share the same seasonal period. For a yearly seasonal pattern, we might average all historical December months to find the seasonal effect for December.
Randomness or Error
The error is what remains of the ratio of the true outcomes to the moving averages once the seasonal component is removed. We divide that ratio by the seasonal component, and the remaining value is the error.
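To make the arithmetic concrete, here is a rough R sketch of the steps above under some simplifying assumptions: the quarterly values are invented, the moving-average window is set to the seasonal period, and the names (y_norm, ma, and so on) are just labels for this example rather than a standard library implementation.

# Invented quarterly series: four observations per year over four years.
y <- c(120, 95, 80, 160,
       130, 100, 85, 175,
       140, 108, 90, 190,
       150, 115, 96, 205)
period <- 4
t_index <- seq_along(y)

# 1. Normalize by the series mean (skipped if the mean is zero).
y_norm <- if (mean(y) != 0) y / mean(y) else y

# 2. Trailing moving average over the prior four values (one seasonal period).
ma <- sapply(t_index, function(i) {
  if (i <= period) NA else mean(y_norm[(i - period):(i - 1)])
})

# 3. Trend: OLS with the moving average as the outcome and time as the
#    predictor; the fitted values form the trend line.
trend_fit <- lm(ma ~ t_index)
trend <- predict(trend_fit, newdata = data.frame(t_index = t_index))

# 4. Cycle: the moving average divided by the trend.
cycle <- ma / trend

# 5. Seasonality: average the ratio of the observed values to the moving
#    average within each seasonal position (here, each quarter).
ratio <- y_norm / ma
season_idx <- ((t_index - 1) %% period) + 1
seasonal <- ave(ratio, season_idx, FUN = function(v) mean(v, na.rm = TRUE))

# 6. Error: what remains after dividing the ratio by the seasonal component.
error <- ratio / seasonal

round(data.frame(y_norm, ma, trend, cycle, seasonal, error), 3)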
Modern algorithms for seasonality decomposition are more complicated than this, but the presentation here gives you a general idea of how seasonality decomposition is calculated.
Let’s take the sales data from the ITS example and break it into four quarters. Figure 11.5 shows the seasonality decomposition for our trend in the ITS example. Many seasonal decomposition algorithms are based on the concepts described here. The R function we use applies a LOESS method for this seasonal decomposition, which we discuss in a little more detail in Chapter 16, Section 16.2.3.
FIGURE 11.5 Seasonality decomposition.
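For reference, R’s stl() function is one such LOESS-based decomposition. It is additive, so a common approach for a multiplicative series is to decompose the logged values; the quarterly numbers below are invented stand-ins rather than the actual ITS sales data.

# A hypothetical quarterly sales series (values invented for illustration).
quarterly_sales <- ts(c(120, 95, 80, 160,
                        130, 100, 85, 175,
                        140, 108, 90, 190,
                        150, 115, 96, 205),
                      start = c(2018, 1), frequency = 4)

# stl() performs a LOESS-based seasonal decomposition; decomposing the logged
# series approximates a multiplicative decomposition.
fit <- stl(log(quarterly_sales), s.window = "periodic")
plot(fit)  # panels: data, seasonal, trend, remainder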
This chapter introduced concepts in time series such as stationarity, autocorrelation, and seasonality decomposition. In business contexts, seasonality decomposition is extremely useful as there are often strong seasonal trends that make inference difficult.