Support Vector Machines
Support vector machines (SVMs) evolved in the 1990s from pattern recognition research at Bell Labs. They work for both classification and regression, and are especially useful with high-dimensional data, that is, when the number of potential predictors is very large.
The SVM algorithm depends on kernels: functions that, in effect, map the input data into a high-dimensional space. Kernel functions can be linear or nonlinear. After mapping the input data, the SVM algorithm constructs one or more hyperplanes that separate the data into homogeneous subgroups. With a nonlinear kernel, a flat hyperplane in the mapped space corresponds to a curved boundary in the original space.
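To make the idea of a kernel concrete, here is a minimal sketch that uses only the Python standard library. It assumes a degree-2 polynomial kernel and a four-point XOR-style dataset, neither of which comes from the text. The helper names (`phi`, `dot`, `poly2_kernel`) are illustrative, and the hyperplane weights are picked by hand rather than learned by an SVM solver.

```python
import math

def phi(x):
    """Explicit feature map for the 2-D homogeneous polynomial kernel of degree 2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """k(x, z) = (x . z)^2, computed without ever building phi(x) or phi(z)."""
    return dot(x, z) ** 2

# XOR-style data (an assumption for illustration): no straight line in the
# original 2-D space separates the two classes.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# The kernel value equals the inner product of the mapped points.
kernel_matches = all(
    math.isclose(poly2_kernel(x, z), dot(phi(x), phi(z)), abs_tol=1e-9)
    for x in points
    for z in points
)

# In the mapped 3-D space a hyperplane w . phi(x) + b = 0 separates the classes.
# These weights are hand-picked, not fitted by an SVM.
w, b = (0.0, 1.0, 0.0), 0.0
predictions = [1 if dot(w, phi(x)) + b > 0 else -1 for x in points]

print("kernel equals mapped inner product:", kernel_matches)
print("predictions match labels:", predictions == labels)
```

The point of the sketch is that the kernel returns the same number as the inner product in the mapped space, so an algorithm that needs only inner products never has to compute the mapping explicitly.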
Because it is robust with high-dimensional data, SVM is well suited to applications such as handwriting recognition, text categorization, and image tagging. In medical science, researchers have successfully applied SVM to the detection of tumors in breast images and the classification of complex proteins.
Commercial software packages that support SVM include Alpine Data Labs’ Alpine, IBM SPSS Modeler, Oracle Data Mining, SAS Enterprise Miner, and Statistica Data Miner. Open source options include Apache Spark MLlib, JKernelMachines, LIBSVM, and Vowpal Wabbit. For R users, there are a number of packages, including kernlab, SVMMaj, gcdnet, obliqueRF, MVpower, svcR, and rasclass; for Python users, some SVM capabilities are included in scikit-learn and PyML.
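As a rough illustration of one of these tools, the sketch below fits scikit-learn's `SVC` classifier with a linear kernel and with a nonlinear (RBF) kernel. It assumes scikit-learn is installed. The synthetic concentric-circles dataset, the train/test split, and the default hyperparameters are choices made for this example only, not recommendations.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data: one ring of points inside another (illustrative only).
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A linear kernel can only draw a straight boundary in the original space.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)

# An RBF kernel lets the separating hyperplane live in a mapped space.
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)

print(f"linear kernel test accuracy: {linear_acc:.2f}")
print(f"RBF kernel test accuracy:    {rbf_acc:.2f}")
```

On data like this, the nonlinear kernel should score noticeably higher, because no straight line separates an inner ring from an outer one. For regression, scikit-learn offers `SVR` with the same `kernel` argument.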