Generalized Additive Models
The generalized additive model (GAM) is a type of nonparametric regression. Techniques such as linear regression are parametric, which means they incorporate certain assumptions about the data. When an analyst uses a parametric technique with data that does not conform to its assumptions, the result of the analysis may be a weak or biased model. Nonparametric regression relaxes assumptions of linearity, enabling the analyst to detect patterns that parametric techniques may miss.
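To make the contrast concrete, the following minimal sketch fits a straight line and a simple nonparametric smoother to the same curved data. The sine-shaped data, the function names, and the bandwidth of 0.3 are all assumptions chosen for illustration; the smoother is a Gaussian-weighted local average (the Nadaraya-Watson estimator), which is only one of many nonparametric techniques.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def kernel_estimate(xs, ys, x0, bandwidth=0.3):
    """Nadaraya-Watson estimate at x0: a Gaussian-weighted average of ys."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def mean_squared_error(predictions, ys):
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)


# Illustrative data (an assumption): a sine curve plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 2 * math.pi) for _ in range(200)]
ys = [math.sin(x) + rng.gauss(0, 0.2) for x in xs]

intercept, slope = fit_line(xs, ys)
linear_predictions = [intercept + slope * x for x in xs]
smooth_predictions = [kernel_estimate(xs, ys, x) for x in xs]

linear_mse = mean_squared_error(linear_predictions, ys)
smooth_mse = mean_squared_error(smooth_predictions, ys)
print(f"linear fit MSE:    {linear_mse:.3f}")
print(f"kernel smooth MSE: {smooth_mse:.3f}")
```

Both errors are measured on the training data, so the comparison shows only that the straight line cannot follow the curve; it says nothing about how either model would perform on new data.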
There are a number of different nonparametric techniques, but many of them perform poorly when there are many potential predictors: they tend to require large samples and may lack stability. Certain methods, such as kernel methods and smoothing splines, are also very difficult to interpret.
The additive model, first proposed in the early 1980s, is a more general form of the linear regression model, which you express as y = b + a1x1 + a2x2 + ... + anxn. In an additive model, you replace each linear term of this equation with a more flexible function of the corresponding predictor, typically a smooth curve estimated from the data. In a generalized additive model, a link function relates the expected value of the response to the sum of these functions, so that the response can follow any distribution in the exponential family.
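Written side by side, the three models differ only in how the predictors enter the equation. The symbols f_j (the function applied to predictor x_j) and g (the link function) are notation introduced here for illustration; b and a_j follow the linear equation above.

```latex
\begin{align*}
\text{Linear model:} \quad
  & y = b + a_1 x_1 + a_2 x_2 + \dots + a_n x_n \\
\text{Additive model:} \quad
  & y = b + f_1(x_1) + f_2(x_2) + \dots + f_n(x_n) \\
\text{Generalized additive model:} \quad
  & g\bigl(\mathrm{E}[y]\bigr) = b + f_1(x_1) + f_2(x_2) + \dots + f_n(x_n)
\end{align*}
```

Setting each f_j(x_j) = a_j x_j recovers the linear model, and taking g to be the identity reduces the generalized additive model to the additive model.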
The principal advantage of GAM is its ability to model highly complex nonlinear relationships when the number of potential predictors is large. The main disadvantages of GAM are its computational complexity and, as with other nonparametric methods, a high propensity for overfitting.
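The following sketch shows one way to fit an additive model with two predictors, using backfitting: each function is re-estimated in turn by smoothing the residuals left over by the other. It is a simplified illustration under stated assumptions, not a production implementation. It assumes a continuous response with an identity link (so it covers the additive model rather than the full generalized case), it uses a Gaussian kernel smoother in place of the splines that statistical packages typically use, and the data, function names, and bandwidth are hypothetical.

```python
import math
import random


def smooth(xs, targets, bandwidth):
    """Gaussian-weighted local average of targets, evaluated at every x in xs."""
    fitted = []
    for x0 in xs:
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        fitted.append(sum(w * t for w, t in zip(weights, targets)) / sum(weights))
    return fitted


def backfit(columns, ys, bandwidth=0.3, n_iter=10):
    """Fit y = alpha + f_1(x_1) + ... + f_p(x_p) by backfitting.

    Returns alpha and, for each predictor, the fitted function values
    at the training points.
    """
    n = len(ys)
    alpha = sum(ys) / n
    funcs = [[0.0] * n for _ in columns]
    for _ in range(n_iter):
        for j, xs in enumerate(columns):
            # Partial residual: the response minus the intercept and
            # minus every component except the one being updated.
            partial = [
                ys[i] - alpha
                - sum(funcs[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            smoothed = smooth(xs, partial, bandwidth)
            # Center each component so the intercept stays identifiable.
            mean = sum(smoothed) / n
            funcs[j] = [s - mean for s in smoothed]
    return alpha, funcs


# Illustrative data (an assumption): one wavy effect and one quadratic effect.
rng = random.Random(1)
n = 150
x1 = [rng.uniform(-2, 2) for _ in range(n)]
x2 = [rng.uniform(-2, 2) for _ in range(n)]
ys = [math.sin(2 * a) + b ** 2 + rng.gauss(0, 0.2) for a, b in zip(x1, x2)]

alpha, funcs = backfit([x1, x2], ys)
fitted = [alpha + funcs[0][i] + funcs[1][i] for i in range(n)]

mean_y = sum(ys) / n
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot
print(f"in-sample R-squared: {r_squared:.3f}")
```

Both drawbacks are visible even at this scale. Every pass re-smooths each predictor against all of the observations, so the cost grows quickly with the sample size, and shrinking `bandwidth` lets each function chase the noise in the training data, which raises the in-sample fit without improving predictions on new data.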
SAS, Statistica, and Stata all support GAM. There are 17 different packages in open source R that support GAM; in Python, GAM implementations are available in the statsmodels and pyGAM packages.