- What Is Data Mining?
- What Data Mining Is Not
- The Most Common Data Mining Applications
- What Kinds of Patterns Can Data Mining Discover?
- Popular Data Mining Tools
- The Dark Side of Data Mining: Privacy Concerns
- Summary
- References
The Most Common Data Mining Applications
Data mining has become a popular tool for addressing many complex business problems and opportunities. It has proven successful in many areas, some of which are briefly discussed in the following sections. It is difficult to find an industry or a problem area for which a significant number of data mining applications have not already been reported in the literature. The goal of many of these applications is to solve complex problems or to explore emerging opportunities in order to create a sustainable competitive advantage.
Marketing and Customer Relationship Management
Customer relationship management (CRM) is an extension of traditional marketing. The goal of CRM is to create one-on-one relationships with customers by developing an intimate understanding of their needs and wants. As businesses build relationships with their customers over time through a variety of interactions (e.g., product inquiries, sales, service requests, warranty calls, product reviews, social media connections), they accumulate tremendous amounts of data. When combined with demographic and socioeconomic attributes, this information-rich data can be used to identify the most likely responders to or buyers of new products and services (i.e., customer profiling); to understand the root causes of customer attrition in order to improve customer retention (i.e., churn analysis); to discover time-variant associations between products and services to maximize sales and customer value; and to find the most profitable customers and their preferential needs to strengthen relationships and maximize sales.
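As a rough illustration of the first step in churn analysis, the sketch below compares attrition rates across customer segments. The records, the attribute names (plan, support_calls, churned), and the helper churn_rate_by are hypothetical and invented for this example; a real churn study would go on to fit a predictive model rather than stop at a descriptive summary.

```python
from collections import defaultdict

def churn_rate_by(customers, attribute):
    """Return the share of churned customers for each value of `attribute`."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for customer in customers:
        key = customer[attribute]
        totals[key] += 1
        churned[key] += int(customer["churned"])
    return {key: churned[key] / totals[key] for key in totals}

# Hypothetical customer records (not taken from any real data set).
customers = [
    {"plan": "monthly", "support_calls": "3+", "churned": True},
    {"plan": "monthly", "support_calls": "0-2", "churned": True},
    {"plan": "monthly", "support_calls": "0-2", "churned": False},
    {"plan": "annual", "support_calls": "3+", "churned": True},
    {"plan": "annual", "support_calls": "0-2", "churned": False},
    {"plan": "annual", "support_calls": "0-2", "churned": False},
]

by_plan = churn_rate_by(customers, "plan")
by_calls = churn_rate_by(customers, "support_calls")
print(by_plan)
print(by_calls)
```

In this toy data, customers with three or more support calls leave far more often than the rest, which is the kind of signal an analyst would then investigate as a possible root cause of attrition.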
Banking and Finance
Data mining can help banks and other financial institutions address a variety of problems and opportunities. It can be used to streamline and automate the processing of loan applications by accurately predicting and identifying the most probable defaulters; to detect fraudulent credit card and online-banking transactions; to find new ways to maximize customer value by selling products and services that customers are most likely to buy; and to optimize cash return by accurately forecasting the cash flow at banking entities (e.g., ATMs, bank branches).
Retailing and Logistics
In retailing, data mining can be used to accurately predict sales volumes at specific retail locations in order to determine correct inventory levels; to identify sales relationships between different products (with market-basket analysis) to improve store layout and optimize sales promotions; to forecast consumption levels for different product types (based on seasonal and environmental conditions); to optimize logistics and hence maximize sales; and to discover interesting patterns in the movement of products in a supply chain (especially products with a limited shelf life because they are prone to expiration, perishability, and contamination) by analyzing sensor and RFID data.
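Market-basket analysis can be illustrated with a small sketch that scores product pairs by support, confidence, and lift. The transactions, the min_support threshold, and the function name pair_rules are assumptions made for this example; production tools typically use algorithms such as Apriori or FP-growth to handle larger itemsets and larger data volumes.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.3):
    """Score every product pair that appears in at least `min_support` of baskets."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets containing both items
        if support < min_support:
            continue
        confidence = count / item_counts[a]       # estimate of P(b in basket | a in basket)
        lift = confidence / (item_counts[b] / n)  # values above 1 suggest a positive association
        rules.append({"if": a, "then": b, "support": support,
                      "confidence": confidence, "lift": lift})
    return sorted(rules, key=lambda r: r["lift"], reverse=True)

# Hypothetical point-of-sale baskets.
transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["beer", "chips"],
    ["beer", "chips", "bread"],
]

rules = pair_rules(transactions, min_support=0.4)
for rule in rules:
    print(rule)
```

Pairs with high lift are candidates for adjacent shelf placement or joint promotions, which is how such an analysis would feed the store-layout decisions described above.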
Manufacturing
Manufacturers can use data mining to predict machinery failures before they occur by analyzing sensor data (enabling condition-based maintenance); to identify anomalies and commonalities in production systems to optimize manufacturing capacity; and to discover novel patterns to identify and improve product quality.
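One simple way to picture condition-based maintenance is to flag sensor readings that depart sharply from recent behavior. The sketch below does this with a rolling z-score; the vibration values, the window size, the threshold, and the function name flag_anomalies are all hypothetical, and real systems generally rely on richer predictive models than this.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings more than `threshold` standard deviations
    away from the mean of the preceding `window` readings."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Hypothetical vibration readings from a single machine.
vibration = [0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.51, 0.95, 0.50, 0.49]

alerts = flag_anomalies(vibration)
print(alerts)
```

A flagged index would prompt an inspection before the machine fails, rather than waiting for a fixed maintenance schedule.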
Brokerages and Securities Trading
Brokers and traders use data mining to predict when and how much certain stock and/or bond prices will change; to forecast the range and direction of market fluctuations; to assess the effect of particular issues and events on overall market movements; and to identify and prevent fraudulent activities in securities trading.
Insurance
The insurance industry uses data mining techniques to forecast claim amounts for property and medical coverage for better business planning; to determine optimal rate plans based on the analysis of claims and customer data; to predict which customers are most likely to buy new policies with special features; and to identify and prevent incorrect claims payments and fraudulent activities.
Computer Hardware and Software
Data mining can be used to predict disk drive failures well before they actually occur; to identify and filter unwanted web content and email messages; to detect and prevent computer network security breaches; and to identify potentially insecure software products.
Government and Defense
Data mining has a number of government and military applications. It can be used to forecast the cost of moving military personnel and equipment; to predict an adversary’s moves and hence develop more successful strategies for military engagements; to predict resource consumption for better planning and budgeting; and to identify classes of unique experiences, strategies, and lessons learned from military operations for better knowledge sharing throughout the organization.
Travel and Lodging
Data mining has a variety of uses in the travel industry. It can be used to predict sales of different services (e.g., seat types on airplanes, room types at hotels and resorts, car types at rental car companies) in order to optimally price services to maximize revenues as a function of time-varying transactions (commonly referred to as yield management); to forecast demand at different locations to better allocate limited organizational resources; to identify the most profitable customers and provide them with personalized services to maintain their repeat business; and to retain valuable employees by identifying and acting on the root causes of attrition.
Health and Health Care
Data mining has a number of health care applications. It can be used to help individuals and groups pursue healthier lifestyles (by analyzing data collected with wearable health-monitoring devices); to identify people without health insurance and the factors underlying this undesired phenomenon; to identify novel cost–benefit relationships between different treatments to develop more effective strategies; to forecast the level and the time of demand at different service locations to optimally allocate organizational resources; and to understand the underlying reasons for customer and employee attrition.
Medicine
The use of data mining in medicine is an invaluable complement to traditional medical research, which is mainly clinical and biological in nature. Data mining analyses can be used to identify novel patterns to improve the survivability of patients with cancer; to predict the success rates of organ transplantation patients to develop better donor–organ matching policies; to identify the functions of different genes in the human chromosome (known as genomics); and to discover the relationships between symptoms and illnesses (as well as illnesses and successful treatments) to help medical professionals make informed decisions in a timely manner.
Entertainment
Data mining is successfully used in the entertainment industry to analyze viewer data to decide what programs to show during prime time and how to maximize returns by knowing where to insert advertisements; to predict the financial success of movies before they are produced to make investment decisions and to optimize returns; to forecast the demand at different locations and different times to better schedule entertainment events and to optimally allocate resources; and to develop optimal pricing policies to maximize revenues.
Homeland Security and Law Enforcement
Data mining has a number of homeland security and law enforcement applications. It is often used to identify patterns of terrorist behaviors; to discover crime patterns (e.g., locations, timings, criminal behaviors, and other related attributes) to help solve criminal cases in a timely manner; to predict and eliminate potential biological and chemical attacks on the nation’s critical infrastructure by analyzing special-purpose sensor data; and to identify and stop malicious attacks on critical information infrastructures (often called information warfare).
Sports
Data mining has been used to improve the performance of National Basketball Association (NBA) teams in the United States. Major League Baseball teams use predictive analytics and data mining to optimally utilize limited resources for a winning season. (In fact, Moneyball is a popular movie about the use of analytics in baseball.) Most professional sports today employ data crunchers and use data mining to increase their chances of winning.
Data mining applications are not limited to professional sports. For example, Delen et al. (2012) developed models to predict NCAA Bowl Game outcomes, using a wide range of variables about the two opposing teams’ previous game statistics. Wright (2012) used a variety of predictors to examine the NCAA men’s basketball championship bracket (a.k.a. March Madness). In short, data mining can be used to predict the outcomes of sporting events, to identify ways to increase the odds of winning against a specific opponent, and to make the most of the available resources (e.g., financial, managerial, athletic) so that a team can produce the best possible outcomes.