This chapter is from the book

Enhancing Customer Support Through Hybrid AI: LLMs Meet Clustering and Topic Modeling

Customer support is evolving, and businesses seek more sophisticated and powerful solutions to handle the vast amount of textual data generated in interactions. A hybrid approach, blending the capabilities of LLMs and traditional machine learning techniques, emerges as a robust strategy. We’ll explore a few of these machine learning techniques often utilized in support organizations to make sense of the large amounts of data to help optimize the business.

Clustering and Customer Support

Clustering is an unsupervised learning approach that groups a set of samples based on their similarity without using any predefined labels or categories. Clustering aims to discover the natural structure or patterns of the data, as well as to reduce its complexity and dimensionality. Clustering can be used for various purposes, such as data exploration, summarization, organization, retrieval, and visualization. There are several different clustering methods:

  • Hierarchical clustering: This method builds a hierarchy of clusters, where each cluster is either a subcluster or a supercluster of another cluster. Hierarchical clustering can be either agglomerative or divisive. Agglomerative clustering starts with each sample as a singleton cluster and then merges the most similar clusters until a single cluster remains. Divisive clustering starts with all samples in one cluster and then splits the most dissimilar clusters until each cluster contains only one sample.

  • Partitioning clustering: This method divides the data points into a predefined number of non-overlapping clusters, where each point belongs to exactly one cluster. Partitioning clustering can be either distance-based or centroid-based. Distance-based clustering assigns each data point to the cluster with the closest or most similar representative, such as its nearest neighbor. Centroid-based clustering assigns each data point to the cluster whose center, or mean, is closest. K-means clustering, one of the most popular partitioning algorithms, is centroid-based: it groups samples into k clusters based on their attributes or features. It starts with k randomly selected centroids, which serve as the starting points for the clusters. It then assigns each point to the cluster whose centroid has the least squared Euclidean distance to it and recomputes each centroid as the mean of the points assigned to it. Because each point belongs to exactly one cluster (a hard assignment), these two steps repeat until either the centroids have stabilized or the defined number of iterations has been reached.

  • Density-based clustering: This method identifies clusters based on the density or the concentration of the data points in the feature space: dense regions that are separated by low-density regions form the clusters, which can help reveal unforeseen patterns. Density-based clustering can handle outliers, noise, and arbitrary shapes of clusters. One of the popular algorithms for density-based clustering is DBSCAN (density-based spatial clustering of applications with noise). DBSCAN defines a cluster as a set of densely connected core points; a point is a core point if it has at least a minimum number of points within a given radius or neighborhood.
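
The K-means loop described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: the function names, the toy two-dimensional points, and the choice to leave an empty cluster's centroid where it was are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iterations=100, seed=0):
    """Group `points` (a list of tuples) into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k randomly selected samples as the initial centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Assumption for this sketch: an empty cluster keeps its centroid.
                new_centroids.append(centroids[c])
        # Stop once the centroids have stabilized.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Toy data: two visually obvious groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = k_means(points, k=2)
print(labels)
```

On this toy data the first three points end up sharing one label and the last three the other; which group is numbered 0 depends on the random starting centroids.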

Clustering is a powerful technique for identifying patterns and insights from large and complex data sets. It can be used to segment customers, optimize services, categorize issues based on their similarities or differences, and provide personalized and efficient solutions. In the field of customer service and support, clustering has been a popular approach for solving some problems, such as:

  • Customer segmentation: Clustering can help discover different groups of customers based on their demographics, preferences, needs, behaviors, or characteristics, such as age, gender, location, income, spending habits, loyalty, satisfaction, or feedback. This can help tailor the marketing strategies, product recommendations, pricing policies, or communication channels for each segment and improve customer retention and acquisition.

  • Service optimization: Clustering can help optimize the service delivery and support processes based on the complexity, urgency, or frequency of customer requests, issues, or inquiries, such as order status, product information, technical support, billing, or feedback. This can help allocate the appropriate resources, staff, or channels for each service type and improve service efficiency and quality.

  • Support case categorization: Clustering can help resolve customer issues faster and more effectively by grouping similar or related issues based on their causes, symptoms, or solutions, such as product defects, software bugs, network failures, or user errors. When AI technology is used to cluster similar cases together, these groupings can offer new insights that are not obvious when looking at cases individually or by product. An example might be multiple unrelated services experiencing login or profile creation issues. Viewed on their own, these cases could be hard to relate to one another or to trace to a root cause, but after clustering them together, it might be more obvious that this is a problem with shared code providing identity services to multiple workloads. This clustering can help diagnose root causes, find the best solutions, and prevent future occurrences of the issues, which in turn can increase customer satisfaction and enhance retention.
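
The login example above can be made concrete with a small agglomerative clustering sketch. Everything here is illustrative: the case texts are invented, the similarity measure (cosine similarity over raw word counts) and the single-link merge rule are just one simple choice, and stopping the merges at a similarity threshold instead of continuing until one cluster remains is an assumption made to obtain flat groups.

```python
import math
from collections import Counter
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def agglomerative_clusters(texts, threshold):
    """Merge the most similar clusters until no pair reaches `threshold`."""
    vectors = [Counter(text.lower().split()) for text in texts]
    # Start with every case as its own singleton cluster.
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        # Single link: similarity of the closest pair of members.
        return max(cosine_similarity(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[i], clusters[j]) < threshold:
            break  # The remaining clusters are too dissimilar to merge.
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)

# Hypothetical support cases filed against three unrelated services and billing.
cases = [
    "photo app login fails after password reset",
    "billing portal login fails with profile error",
    "music service login fails during profile creation",
    "invoice shows wrong tax amount",
    "charged wrong amount on invoice",
]
groups = agglomerative_clusters(cases, threshold=0.25)
print(groups)  # [[0, 1, 2], [3, 4]]
```

The three login cases land in one group even though they name different products, which is the kind of cross-product pattern the paragraph above describes. Real case text would usually need stop-word removal and a better representation than raw word counts.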

Topic Modeling and Customer Support

Topic modeling is a technique for extracting hidden topics or concepts from a collection of text documents, such as customer reviews, feedback, complaints, or inquiries. Topic modeling can help discover the main themes or patterns of customer needs, preferences, opinions, or issues and provide valuable insights for customer support improvement, product development, marketing strategy, or sentiment analysis.

There are several different topic modeling methods. These algorithms differ in their assumptions, mathematical models, and implementations, but they all share the same basic idea: finding a low-dimensional representation of the documents and the words in terms of topics and probabilities. The output of a topic modeling algorithm is usually a matrix that shows the relationship between documents and topics, and another matrix that shows the relationship between topics and words. These matrices can be used to infer the topics of new documents, find similar documents, visualize the topics, and extract insights from the text data. These methods include:

  • Latent Dirichlet Allocation (LDA): This is one of the most popular topic modeling methods. LDA is an unsupervised learning algorithm that describes a set of observations as a mixture of distinct categories. These categories are themselves a probability distribution over the features. LDA is most commonly used to discover a user-specified number of topics shared by a collection of documents within a text corpus. Each observation is a document, the features are the presence or occurrence count of each word, and the categories are the topics. LDA uses a generative process to assign topic probabilities to each document and word probabilities to each topic, based on the observed word frequencies in the documents. LDA can be applied to large, diverse text corpora and produce interpretable and coherent topics.

  • Non-negative Matrix Factorization (NMF): NMF is a linear algebra method that decomposes a matrix of word-document frequencies into two lower-dimensional non-negative matrices, one representing the word-topic associations and the other representing the topic-document associations. NMF imposes a non-negativity constraint on the matrices, which ensures that the topics and the documents have additive and meaningful components. NMF can be faster and more robust than LDA and can handle sparse and noisy data.

  • Hierarchical Dirichlet Process (HDP): HDP is a Bayesian nonparametric model that extends LDA by allowing the number of topics to be automatically inferred from the data rather than fixed in advance. HDP uses a hierarchical structure of Dirichlet processes to generate a potentially infinite number of topics and assigns them to the documents based on their relevance and specificity. HDP can adapt to the complexity and diversity of the text data and can avoid overfitting or underfitting the topics.
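
To make the two output matrices concrete, here is a small sketch of NMF on a toy word-document count matrix. It uses the classic multiplicative update rules for the squared-error objective, written in plain Python so that it is self-contained; the vocabulary, the counts, the helper names, and the small `eps` added to avoid division by zero are all assumptions for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Sum of squared differences between v and the product of w and h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for v_row, wh_row in zip(v, wh) for x, y in zip(v_row, wh_row))

def nmf(v, n_topics, iterations=200, seed=0, eps=1e-9):
    """Factor v (words x documents) into w (words x topics) and h (topics x documents)."""
    rng = random.Random(seed)
    n_words, n_docs = len(v), len(v[0])
    # Random positive starting values keep every later update non-negative.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_words)]
    h = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(n_topics)]
    for _ in range(iterations):
        # Update h: h * (w^T v) / (w^T w h), element by element.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_docs)]
             for k in range(n_topics)]
        # Update w: w * (v h^T) / (w h h^T), element by element.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_words)]
    return w, h

# Toy counts: rows are words, columns are four hypothetical support documents.
vocabulary = ["login", "password", "invoice", "refund"]
v = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
]
w, h = nmf(v, n_topics=2)
print("reconstruction error:", round(squared_error(v, w, h), 4))
```

Each row of `w` shows how strongly a word is associated with each topic, and each column of `h` shows the topic mix of a document. In practice a library implementation would normally be used instead of hand-written updates.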

Topic modeling is a valuable technique in the customer service and support field for extracting insights from large volumes of textual data, such as customer reviews, feedback, and support cases. Here’s how topic modeling is leveraged in this domain:

  • Automated support case categorization: Customer support teams often deal with a variety of issues and requests. Topic modeling can be leveraged to automatically categorize support tickets into different topics or categories based on their content. This helps in routing tickets to appropriate product support teams and improves response time and efficiency. Moreover, topic modeling can help automate some processes in the customer support workflow. For example, it can point customers to the self-help knowledge base, diagnostics, or websites with the accurate topic category prediction. This can enhance the customer experience, reduce customer effort, and increase operational efficiency.

  • Identifying emerging issues: Topic modeling can help uncover emerging trends or issues in customer feedback and support cases. It provides actionable insights that let companies proactively address top issues before they escalate.

  • Improving search and retrieval: For a large knowledge base of support or self-help articles, topic modeling helps organize and index the articles by topic. This improves the search and retrieval process for support agents or engineers and for customers looking for solutions.

  • Customer feedback analysis: Topic modeling can help analyze and summarize customer feedback from multiple channels and platforms. This can help identify the most common and important topics, issues, compliments, complaints, and suggestions that customers express. This can also help products and companies measure and track key performance indicators related to customer support, customer satisfaction, and loyalty. For instance, it can help measure the volume of support cases in different categories, identify resolution time, and assess customer satisfaction for each topic. Furthermore, product teams can prioritize and address customer complaints and grievances more effectively.

  • Content creation and knowledge management: Topic modeling aids in content creation for FAQs, manuals, and support articles. It helps identify the most discussed topics, allowing companies to create relevant and helpful content that addresses common customer queries.
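
As a rough sketch of the automated categorization and routing idea in the list above, the following scores a new ticket against per-topic word weights and routes it to a queue. The topic names, weights, queue names, and confidence threshold are all hypothetical; the weights stand in for the topic-word matrix a fitted topic model would supply, and the scoring is a deliberately simple substitute for real topic inference.

```python
from collections import Counter

# Hypothetical topic-word weights, standing in for a fitted model's topic-word matrix.
TOPIC_WORDS = {
    "authentication": {"login": 0.5, "password": 0.3, "reset": 0.2},
    "billing": {"invoice": 0.5, "refund": 0.3, "charge": 0.2},
}

# Hypothetical mapping from topic to the support queue that handles it.
TOPIC_TO_QUEUE = {"authentication": "identity-team", "billing": "payments-team"}

def topic_scores(text):
    """Return a normalized score per topic for the words found in `text`."""
    counts = Counter(text.lower().split())
    raw = {
        topic: sum(weight * counts[word] for word, weight in words.items())
        for topic, words in TOPIC_WORDS.items()
    }
    total = sum(raw.values())
    return {topic: score / total for topic, score in raw.items()} if total else raw

def route(text, min_confidence=0.6, fallback="general-queue"):
    """Pick the queue for the best-scoring topic, or a fallback when unsure."""
    scores = topic_scores(text)
    best = max(scores, key=scores.get)
    if scores[best] < min_confidence:
        return fallback
    return TOPIC_TO_QUEUE[best]

print(route("cannot login after password reset"))  # identity-team
print(route("where is my order"))                  # general-queue
```

The fallback queue matters in practice: a ticket that matches no topic well is better handled by a person than routed on a weak guess.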

In essence, topic modeling enhances the efficiency and effectiveness of customer service and support operations by providing automated tools for organizing, analyzing, and extracting insights from large volumes of textual customer data.

Hybrid AI Opportunity

Traditional machine learning methods like topic modeling and clustering have their own limitations and challenges. One of the main drawbacks is that they rely on statistical methods that do not account for the semantic and contextual nuances of natural language. For example, topic modeling may fail to distinguish between different meanings or senses of the same word, such as apple as a company versus apple as a fruit, or may group together occurrences of a word that look identical but mean different things, such as bass as a type of fish versus bass as low-frequency sound in music. Moreover, topic modeling may produce topics that are too broad, too narrow, or not coherent, depending on the choice of parameters and algorithms. In contrast, large language models, such as GPT and Gemini, have demonstrated remarkable proficiency in understanding context, generating human-like responses, and extracting intricate patterns from textual data. In customer support, LLMs can be employed for tasks like sentiment analysis, intent recognition, and even generating responses to common queries.

While LLMs excel in understanding context and generating text, traditional machine learning methods like clustering and topic modeling offer strengths in structuring and organizing information. Clustering can group similar customer queries or issues, facilitating efficient handling by support agents. Topic modeling, on the other hand, extracts underlying themes from a vast dataset, aiding in understanding prevalent customer concerns. Moreover, when computational resources and budget are limited, it is easier and cheaper to leverage traditional machine learning methods like topic modeling and clustering.

In the dynamic landscape of customer support, a hybrid approach, integrating the capabilities of LLMs with the structuring prowess of traditional methods, proves to be a holistic solution. By combining LLMs with topic modeling, more accurate, robust, and interpretable models can be utilized for customer feedback analysis. For instance, language models can help generate more natural and fluent texts from topics and can also help capture the semantic and contextual information that topic modeling may miss. Furthermore, LLMs can help generate new and novel topics that may not be present in the existing data or suggest relevant and personalized content based on the topics of interest of each customer, while topic modeling and clustering can bring more interpretability and flexibility. This hybrid solution addresses the complexities of customer interactions, providing businesses with a powerful tool for improving customer satisfaction and support efficiency.
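
One way such a hybrid pipeline might be arranged is sketched below: an inexpensive clustering or topic modeling step groups the cases first, and the LLM is then asked only to name each group from its key terms and a few examples. This is an assumption about how the pieces could fit together, not a prescribed design. The `llm` argument is any callable that takes a prompt string and returns text, and `fake_llm` is a stand-in used so the sketch runs without calling a real model.

```python
from collections import Counter

STOPWORDS = frozenset({"the", "a", "to", "on", "after", "with", "my", "is", "for"})

def top_terms(texts, n=3):
    """Most frequent non-stopword terms across the texts of one cluster."""
    counts = Counter(
        word for text in texts for word in text.lower().split() if word not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(n)]

def build_prompt(texts, max_examples=2):
    """Summarize a cluster compactly so only a short prompt reaches the LLM."""
    terms = ", ".join(top_terms(texts))
    examples = "\n".join(f"- {text}" for text in texts[:max_examples])
    return (
        f"Key terms: {terms}\n"
        f"Example cases:\n{examples}\n"
        "Give a short label for this group of support cases."
    )

def label_clusters(clusters, llm):
    """clusters: dict of cluster id -> list of case texts; llm: callable prompt -> str."""
    return {cluster_id: llm(build_prompt(texts)) for cluster_id, texts in clusters.items()}

def fake_llm(prompt):
    # Stand-in for a real model call: builds a label from the first key term.
    first_term = prompt.splitlines()[0].split(": ", 1)[1].split(",")[0]
    return f"{first_term} issues"

# Hypothetical output of an earlier clustering step.
clusters = {
    0: ["login fails after password reset", "login page error on billing portal"],
    1: ["invoice shows wrong amount", "refund missing for invoice"],
}
labels = label_clusters(clusters, fake_llm)
print(labels)  # {0: 'login issues', 1: 'invoice issues'}
```

Because the LLM sees one short prompt per cluster instead of every case, the expensive model is called far less often, which fits the cost argument made earlier for keeping traditional methods in the loop.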
