- A Short Story..."The Contest"
- In the Beginning...
- The Thesis
- The Business Context for Unstructured Information Mining
- Capturing Business Objectives and Domain Expertise
- A Common Analytical Methodology
- Applications for Mining the Talk
- The Transformation Process
- Summary
Capturing Business Objectives and Domain Expertise
In our journey of discovery, we have seen one mistake made repeatedly: static business models and static data models are used to model inherently dynamic business processes, particularly at the point of interaction. For example, virtually every customer relationship management system we have come across has a manual classification scheme (or taxonomy) that the service agent is meant to use to classify the nature of the customer interaction. This approach has two major flaws. First, as soon as the classification scheme is published, it is out of date, because interactions with your customers are unpredictable and continually changing. Second, even if the classification scheme were representative of your customer interactions, it is unreasonable to expect any number of service agents to classify their interactions with customers consistently and with high quality. As a result, such classification data is very often completely useless or, more dangerously, misleading. The same problem arises throughout the business ecosystem, wherever unstructured information exists.
An adaptable data model is critical when incorporating unstructured information mining. One commonly encountered problem is that analysis typically leads to more questions. In business intelligence or data mining, if the data model is not designed to handle the new question, the data model must be modified and the data manipulated and reloaded, a difficult and cumbersome process that often takes months. This problem is compounded with unstructured information because of its very nature. It is important to be able to add new taxonomies, classifications, or extracted data, or to enhance existing ones, as the information leads you through the discovery process.
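One way to picture an adaptable data model is to keep the raw text fixed and treat each taxonomy as an overlay that can be added or replaced at any time. The following is a minimal sketch under that assumption, not a description of any particular product. The `Corpus` class, the `by_topic` and `by_tone` classifiers, and the sample documents are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Corpus:
    """Hypothetical store: raw text is loaded once; taxonomies are overlays."""
    documents: Dict[int, str]
    # taxonomy name -> {document id -> category label}
    taxonomies: Dict[str, Dict[int, str]] = field(default_factory=dict)

    def add_taxonomy(self, name: str, classify: Callable[[str], str]) -> None:
        """Add or replace a taxonomy without modifying the stored documents."""
        self.taxonomies[name] = {
            doc_id: classify(text) for doc_id, text in self.documents.items()
        }

    def count_by(self, name: str) -> Dict[str, int]:
        """Count documents per category within one taxonomy."""
        counts: Dict[str, int] = {}
        for label in self.taxonomies[name].values():
            counts[label] = counts.get(label, 0) + 1
        return counts


def by_topic(text: str) -> str:
    """Illustrative keyword rule for a first taxonomy."""
    lowered = text.lower()
    if "password" in lowered:
        return "password"
    if "invoice" in lowered:
        return "billing"
    return "other"


def by_tone(text: str) -> str:
    """Illustrative rule for a taxonomy added later, when a new question arises."""
    return "urgent" if "urgent" in text.lower() else "routine"


corpus = Corpus(documents={
    1: "Customer cannot reset password",
    2: "Invoice shows the wrong amount",
    3: "Urgent: password reset email never arrives",
})
corpus.add_taxonomy("topic", by_topic)
# A follow-up question appears during analysis: add a second view, no reload.
corpus.add_taxonomy("tone", by_tone)

print(corpus.count_by("topic"))
print(corpus.count_by("tone"))
```

Because the documents are never rewritten, a new question costs one call to `add_taxonomy` rather than a change to the underlying schema followed by a reload.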
Additionally, it is important to combine the right mix of algorithmic assistance with domain expertise. We have found that most people naively want to push a button and magically receive the answer. However, we have come to the conclusion that, particularly with unstructured information, the analytical process cannot be fully automated. Even structured information typically needs an analyst in the loop to interpret the results. For unstructured information, an analyst is required in the loop to guide the process, drawing on algorithmic assistance, a useful set of metrics to assist in the interpretation, and the analyst's own domain expertise. The key is to make efficient use of this expensive and scarce resource, not to eliminate it entirely.
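The division of labor described above can be sketched in a few lines: an algorithm proposes categories, a simple metric summarizes them, and the analyst applies domain knowledge to correct the proposal. This is only an illustrative sketch. The grouping rule (most frequent term), the stopword list, the function names, and the sample tickets are all assumptions made for the example, not the method used by the authors.

```python
from collections import Counter, defaultdict

# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "is", "in", "not", "for", "was"}


def propose_categories(texts):
    """Algorithmic step: file each text under its most corpus-frequent term."""
    tokenized = [
        [w for w in t.lower().split() if w not in STOPWORDS] for t in texts
    ]
    # Document frequency: in how many texts does each term occur?
    freq = Counter(w for words in tokenized for w in set(words))
    groups = defaultdict(list)
    for text, words in zip(texts, tokenized):
        # Ties are broken alphabetically so the proposal is deterministic.
        key = min(words, key=lambda w: (-freq[w], w)) if words else "miscellaneous"
        groups[key].append(text)
    return dict(groups)


def size_report(groups):
    """Metric step: share of all texts that falls in each category."""
    total = sum(len(members) for members in groups.values())
    return {name: len(members) / total for name, members in groups.items()}


def apply_analyst_merge(groups, sources, target):
    """Analyst step: fold the categories in `sources` into `target`."""
    merged = {k: list(v) for k, v in groups.items() if k not in sources}
    merged.setdefault(target, [])
    for source in sources:
        merged[target].extend(groups[source])
    return merged


TICKETS = [
    "printer is jammed",
    "printer not printing",
    "refund for order",
    "order was charged twice",
    "paper jam in tray",
]

proposed = propose_categories(TICKETS)
# The analyst knows that a paper jam is a printer problem, so merges it.
reviewed = apply_analyst_merge(proposed, ["jam"], "printer")
print(size_report(reviewed))
```

The algorithm does the bulk sorting, but it takes the analyst's knowledge to see that the stray "jam" category belongs with "printer". The analyst's time is spent on that one judgment rather than on reading every ticket.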