Behind the Scenes: Leveraging SLU to Improve Customer Service

In this age of information, the most important asset that enterprises rely on is data. With rapid improvements in data analysis and visualization techniques, it has become the norm for enterprises to leverage the power of data for streamlining and improving business processes.

However, what we don’t often realize is that contact centers can prove to be one of the most important sources of data for enterprises. The thousands of hours of call recordings are a storehouse of information for consumer attitudes, complaints, and feedback that enterprises can use to gain valuable insights. 

But how do you go about it? The answer lies in the burgeoning field of speech analytics.

Gartner says “Audio mining/speech analytics embrace keyword, phonetic or transcription technologies to extract insights from prerecorded voice streams. This insight can then be used to classify calls, trigger alerts/workflows, and drive operational and employee performance across the enterprise.”

Intelligent solutions like Skit’s Digital Voice Agent can not only handle customer service calls but will also be able to perform advanced speech analytics in the very near future.

Using advanced Spoken Language Understanding (SLU) algorithms, the recorded speech in contact centers can be analyzed to extract crucial insights that can help enterprises streamline their performance.  

Read on to know more about the three ways in which speech analytics with SLU can help your enterprise.

Provide personalized services

With a continuous focus on innovation, Skit.ai has added the revolutionary “idiolect” layer to existing cutting-edge capabilities. In the world of linguistics, “idiolect” simply means the unique speech style of an individual that differentiates them from other speakers.

The state-of-the-art technology in the idiolect layer will enable VASR to perform advanced speech recognition and analytics to uncover more information about the speaker, such as gender, age, language, and accent, and build a unique speaker profile.

Moreover, the application of certain SLU algorithms can help gain further insight into the customer’s attitude and state of mind:

  • Sentiment Analysis: These algorithms can detect whether the customer’s attitude is positive, negative, or neutral during the call.
  • Emotion Detection: Such algorithms can help determine the emotions of a customer and their state of mind during the call.

With the combined help of unique customer profiles and SLU-enabled analysis of the customer’s speech, it becomes easier to deliver personalized services to the customer, depending on their characteristics and current state of mind.

Research by Epsilon has indicated that 80% of consumers are more likely to make a purchase from a brand that provides personalized experiences.

With hyper-personalized customer service experiences, you can keep your customers satisfied and reduce customer retention costs in the future.

Gather consumer insights

With multiple agents handling multiple customers in a day, it is not possible for agents to always correctly determine what consumers want or expect. Moreover, customers themselves might often be confused as to what they expect from a brand and what improvements they want in the service or product they receive. As Steve Jobs once famously said:

“It’s not the customer’s job to know what they want!”

However, it is crucial for any enterprise to determine the needs of their consumers to provide better services. With COVID changing consumer behavior and expectations, analyzing consumer insights can prove crucial to the path ahead.

“Businesses need to understand how this new world affects all of their touchpoints with the customer if they are to actively reinvent their own future and not be at the mercy of external events.” (PwC)

The advancements in research in Spoken Language Understanding have made it possible to use different techniques to derive important information from analyzing customer service calls. Some algorithms that can be used to derive such insights are:

  • Topic Modeling: This is a technique in SLU with which customer calls can be analyzed to create a list of natural topics that frequently occur in service calls and can help companies realize what services/products frequently need troubleshooting and have scope for improvement.
  • Text Summarization: The duration of calls might often be extremely long. With summarization algorithms, it can become easier to create summaries of calls that can be easily read through/analyzed for consumer insights.
  • Aspect Mining: It refers to a class of SLU algorithms that discovers different aspects or features in data, and along with sentiment analysis, can be used to determine the different sentiments associated with those features. For example, in a customer call, the customer may express a positive sentiment when it comes to pricing but a negative opinion on customer service quality. 
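The pricing-versus-service example above can be sketched in a few lines. This is a minimal illustration only, assuming small hand-written keyword lists (`ASPECT_KEYWORDS`, `POSITIVE_WORDS`, and `NEGATIVE_WORDS` are hypothetical and not part of any Skit product); a production system would typically learn aspects and sentiment from data rather than rely on fixed word lists.

```python
import re

# Hypothetical keyword lists, for illustration only.
ASPECT_KEYWORDS = {
    "pricing": {"price", "pricing", "cost", "fee", "fees"},
    "customer service": {"service", "agent", "support", "staff"},
}
POSITIVE_WORDS = {"good", "great", "fair", "reasonable", "helpful", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "slow", "rude", "unhelpful"}


def aspect_sentiments(transcript: str) -> dict[str, str]:
    """Label each aspect mentioned in a transcript as positive, negative, or neutral."""
    scores: dict[str, int] = {}
    # Treat each sentence, or each clause separated by "but", as one opinion unit.
    for clause in re.split(r"[.!?]|\bbut\b", transcript.lower()):
        words = set(re.findall(r"[a-z']+", clause))
        polarity = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                scores[aspect] = scores.get(aspect, 0) + polarity
    return {
        aspect: "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for aspect, score in scores.items()
    }


call = "The pricing is really fair, but the support agent was rude and slow."
print(aspect_sentiments(call))
# {'pricing': 'positive', 'customer service': 'negative'}
```

Splitting on "but" is a deliberate simplification: it keeps the positive remark about pricing from leaking into the negative remark about service within the same sentence.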

With easy access to consumer insights with SLU, enterprises can easily leverage them to make crucial decisions on how to improve business processes and products in a way that makes their customers happier.

Improve automated quality assurance

By harnessing the power of SLU, it not only becomes possible for Voice AI platforms to provide quality service but also to ensure that call quality is maintained at all times in contact centers, whether the call is handled by a service agent or a virtual agent.

Traditional QA teams depend on the right data to correctly analyze service quality, and with contact centers handling an immense number of calls, the process is bound to become time-consuming and even inefficient.

The use of SLU and speech analytics algorithms can provide structured insights by analyzing calls, which makes it easier for QA teams to act on those insights and streamline contact center processes for improved KPIs.

As brands continue to explore innovative ways of connecting with customers, they need to plug AI technologies into their business processes to glean consumer insights that can drive better customer experiences.

Indeed, the future is undoubtedly bright for Voice AI platforms that can truly harness the power of Spoken Language Understanding. Even as we talk about these improvements, researchers are working to improve SLU and develop newer techniques that can have an even greater impact on Voice AI systems.

Behind the Scenes: Leveraging SLU to Enhance Customer Experiences

Voice-first platforms are here to stay and, without doubt, they will play an important role in accelerating the adoption of technology across personal and commercial spheres. Users are growing increasingly comfortable with voice-first platforms as they are much more hassle-free than traditional written modes of communication, and this is reflected in consumer behaviour across industries.

Data from OC&C Strategy Consultants shows that voice-shopping is expected to jump to $40 billion by 2022 from $2 billion in 2018, suggesting that voice-first platforms might be the next disruptive force in the retail industry.

Voice-activated virtual assistants like Siri or Cortana have become an integral part of our daily lives and enterprises have started implementing Voice AI platforms for enhancing business processes. 

There have been several recent breakthroughs in the field of Spoken Language Understanding (SLU) and this has enabled the rise of SLU-enabled Voice AI platforms that are capable of holding seamless human-like conversations.  



One of the industry sectors that illustrates a supremely successful use case for intelligent virtual assistants is the field of customer service.

“…businesses across industries are also aware of this on-going shift in the technology and customer behavior. In fact, many have already begun their voice journey and are transforming the way how customers interact with their brand.” (Trantor Inc)

With an increasing base of digital consumers worldwide, contact centers have been reeling under the pressure of ensuring good customer service while efficiently handling immense call loads. This has led to the adoption of Voice AI platforms for contact center automation, and advances in SLU have allowed such voice assistants to turn into quality customer service agents.

Here is how SLU-enabled voice AI platforms deliver superior customer experiences:

Increased Ease of Usage 

Since the beginning of human history, voice has been the primary mode of communication for people and has been around for much longer than written communication systems. It is no surprise that humans tend to be “voice-activated” naturally and find it much easier to interact with technology through voice commands.

Now, with rapid progress in AI-enabled speech-to-text and text-to-speech services, seamless voice-driven customer experience is a reality. As the ecosystem around voice enabled technology matures, customers are starting to rely more on voice. (PwC)

SLU has helped the development of such voice-first platforms, allowing your customers to easily connect with virtual assistant platforms in your contact center. 

Such platforms do not require your customers to trudge through the interminable IVR options, enabling them to easily express their concerns or queries in simple spoken language statements, leaving your customers happy about their experience with your brand.

Understanding the Customer

Spoken Language Understanding has taken voice AI to a new level with the ability to simulate a near-human understanding of speech by such platforms. SLU-enabled platforms don’t merely react to a fixed set of commands but rather use various techniques and algorithms to arrive at the true purpose of the customer in making the call.

  • Intent Recognition: No matter how customers frame their queries and statements, intent recognition algorithms are able to decipher the customer’s intent in one go using keywords or action words, without requiring multiple clarifications from customers.
  • Named Entity Recognition: These algorithms extract important information from the customer’s speech to recognize important names, places or times that the customer talks about.

All these SLU techniques have enabled voicebots to easily achieve human-like “understanding” capability that allows them to easily converse with the customer, eliminating the machine-like qualities from a conversation. 
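As a rough illustration of the two techniques listed above, the sketch below matches keywords to pick an intent and uses regular expressions to pull out times and days. The intent names, trigger phrases, and patterns are all hypothetical stand-ins; real SLU systems generally use trained models rather than hand-written rules.

```python
import re

# Hypothetical intents and trigger phrases, chosen only for illustration.
INTENT_KEYWORDS = {
    "block_card": ["block", "stolen", "lost my card"],
    "check_balance": ["balance", "how much money"],
    "book_appointment": ["book", "appointment", "schedule"],
}

# Simple patterns standing in for a trained named entity recognizer.
ENTITY_PATTERNS = {
    "time": r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b",
    "day": r"\b(?:monday|tuesday|wednesday|thursday|friday|saturday|sunday|today|tomorrow)\b",
}


def recognize_intent(utterance: str) -> str:
    """Return the intent whose trigger phrases appear most often, or 'unknown'."""
    text = utterance.lower()
    scores = {
        intent: sum(phrase in text for phrase in phrases)
        for intent, phrases in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_entities(utterance: str) -> dict[str, list[str]]:
    """Return every time and day expression found in the utterance."""
    text = utterance.lower()
    found = {label: re.findall(pattern, text) for label, pattern in ENTITY_PATTERNS.items()}
    return {label: matches for label, matches in found.items() if matches}


query = "Can I book an appointment for tomorrow at 4 pm?"
print(recognize_intent(query))   # book_appointment
print(extract_entities(query))   # {'time': ['4 pm'], 'day': ['tomorrow']}
```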

Innovation in voice technology is reshaping consumer behaviour, and brands need to pursue creative approaches to accelerate the adoption of Voice AI to align with customer expectations and maintain a competitive edge.

Quick Query Resolution

In a world where each second matters, time is of the essence, for you and your customers. If they spend precious time on hold with your contact center while agents are busy, customers will understandably get frustrated with their experience and shift their loyalties to competitors.

SLU enables voice AI platforms, which act as virtual agents, to easily access the required information from databases and respond quickly to customer queries. Intelligent solutions like Skit’s Digital Voice Agent have been shown to result in a 50% reduction in average handling time in contact centers.

Zendesk Research Survey discovered that 69% of respondents associated good customer service experience with a quick resolution of their issue. 

This will result in improved customer satisfaction and increased customer retention rates, translating into increased revenues and goodwill for your enterprise.

Consistent Service Experiences

Every time customers engage with your brand, they develop an opinion about the brand. To ensure that the impression your brand gives to consumers is excellent, consistency is key. No matter when and where your customers approach the contact center, their service experience needs to be consistent.

According to Forbes, 71% of customers desire a consistent experience across any channel, but only 29% receive it. About 76% receive conflicting answers to the same questions from different agents, which leads to loss of customer confidence.

Advances in SLU have enabled voice AI platforms to maintain a uniform dialog flow across the board in all customer interactions. From a standard welcome greeting to the last goodbye, everything progresses in a pre-planned flow which gives customers a sense of stability and familiarity. Every time they get in touch with your contact center, they know exactly what to do and how to do it. 

Customers will come to trust your brand as the reliable option and will increasingly engage with your enterprise and not your less-consistent competitors.

There are, thus, several ways in which SLU has enabled voice bots to deliver superior customer experiences that keep your customers pleased and induce loyalty in them. 

Keeping customers happy not only helps enterprises increase customer retention, but also helps reduce customer acquisition costs by increased word of mouth marketing and recommendations from loyal customers.

Voice AI To Resonate With and Retain Customers

Customers dislike long wait hours for query resolution and chatbots aren’t suitable for emergency requests. To ensure better services, Voice AI-led solutions work best.

In 2020, the University of Texas at Austin conducted an interesting experiment wherein 200 participants were invited to reconnect with an old friend through either a phone call or email. Despite admitting that a phone call would be more effective, some participants chose email to feel less awkward. And expectedly, those who connected through a phone call were able to form a stronger bond with their friend. It is the overall interaction experience that counts, be it in personal or professional settings.

What matters is whether you were able to communicate, whether the other party understood your feelings, whether any misunderstandings were cleared up, and whether both parties will be able to reestablish a connection in the future. In the customer experience journey too, brands have chosen to connect with users over multiple points. There is text messaging, email, social media support, chatbots and the customer care centres/call centres.

Depending on the type of query, each customer is redirected to the specific touch-point. For instance, a customer seeking a bank account statement can simply get it through their net banking application while another customer looking for term insurance policies can get information through a chatbot. But for queries that require detailed insights, say reporting or KYC-related changes, customers are redirected to voice-based customer service executives.

Voice is powerful and unique to human beings. Speech goes back to human beginnings, almost a million years ago. The Linguistic Society of America estimates that writing was invented around 3200 B.C. It is the voice that gave rise to text, words and other forms of written communication. Because interacting through voice comes naturally to humans, it is self-taught. It is also easier to communicate thoughts through voice than through any other medium, simply because speaking is up to seven times faster than typing. This means that one can have a longer conversation using voice.

Wait times are long and Interactive Voice Response (IVR) may not be helpful for emergency requests. Imagine your credit card getting stolen. You call up the bank’s customer care, but it takes you two minutes just to get to the appropriate node. It takes another five minutes to reach a customer care representative. And the ordeal still isn’t over, because the customer care executive puts you on hold to verify details. Total time elapsed: 12 minutes. By the time the query is resolved, your card has probably been swiped at half a dozen places.

An immediate solution is critical to protecting the brand reputation of companies. Ignoring customer grievances can often cost a company its clients. A study by Qualtrics XM Institute in the US found that 53% of consumers have cut spending after a single bad experience with a company.

Customers also complained about the lack of a responsive mechanism for grievance resolution. The answer is obvious. Customers prefer real, voice-based interactions because these resolve queries more quickly. And the practical solution is Voice AI. Built on the strong backbone of AI and Spoken Language Understanding (SLU), Voice AI uses human-like mechanisms to receive requests, interpret them, and provide solutions.

Natural Language Processing (NLP) is the technology that the system uses to learn, understand and provide content in human languages. Unlike other solutions in this space, Voice AI is evolutionary. It can adapt to different commands and languages as it learns ‘on the job’ like us humans. Since NLP is at its core, Voice AI first hears the customer speak, converts the speech to text, filters out the noise, and then processes it with its neural networks. Following this, the system works out the context of the conversation using AI. Based on this, a response is created and then communicated to the user in a human-like voice. For instance, an individual who has a chequebook reissue request will have a different state of mind than someone who has lost their debit card. Voice AI systems will differentiate between these two grievances and offer immediate support.
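The sequence of steps described above (hear, convert to text, filter noise, work out the context, respond in a voice) can be sketched as a simple chain of functions. Every function here is a hypothetical placeholder: the "audio" is just encoded text so that the example runs without a speech model, and the intent rules and replies are invented for the chequebook and lost-card scenario.

```python
FILLERS = {"um", "uh", "er", "hmm"}

# Hypothetical replies, for illustration only.
RESPONSES = {
    "block_card": "I am blocking your card right now to keep it safe.",
    "reissue_chequebook": "Sure, I have placed a request for a new chequebook.",
    "unknown": "Could you tell me a little more about what you need?",
}


def speech_to_text(audio: bytes) -> str:
    # Placeholder: a real system would run a speech recognition model here.
    return audio.decode("utf-8")


def filter_noise(text: str) -> str:
    # Stand-in for noise filtering: lowercase, strip punctuation, drop filler words.
    words = [word.strip(",.!?") for word in text.lower().split()]
    return " ".join(word for word in words if word and word not in FILLERS)


def understand(text: str) -> dict:
    # Toy context detection: a lost card is urgent, a chequebook reissue is not.
    if "lost" in text and "card" in text:
        return {"intent": "block_card", "urgent": True}
    if "chequebook" in text or "cheque book" in text:
        return {"intent": "reissue_chequebook", "urgent": False}
    return {"intent": "unknown", "urgent": False}


def text_to_speech(text: str) -> bytes:
    # Placeholder: a real system would synthesize audio here.
    return text.encode("utf-8")


def handle_call(audio: bytes) -> bytes:
    """Run one customer utterance through the full pipeline."""
    text = filter_noise(speech_to_text(audio))
    context = understand(text)
    return text_to_speech(RESPONSES[context["intent"]])


print(handle_call(b"Um, I lost my debit card!").decode("utf-8"))
# I am blocking your card right now to keep it safe.
```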

How customer expectations evolved

Whenever a customer contacts a company, they expect an instant response. Instead, an insurance customer only looking to renew their car policy is inundated with information about new product offerings. The same goes for bank/NBFC customers, where needless personal loans are pushed by the systems during such calls.

The pandemic added to customers’ woes. A Harvard Business Review study showed that there was an over 10% spike in ‘difficult’ calls in just two weeks between March 11 and 26. Customers wanted urgent resolution for travel issues, insurance claims, and payment extensions. Here, having a Voice AI solution can not only improve productivity and efficiency, but also strengthen customer trust during crisis periods.

Market Potential

The Voice AI market is at a nascent stage across the globe. It forms a part of the conversational AI segment that includes voice assistants and chatbots. With customers more accustomed to conversing with voice-devices at home, the same has translated to preferences in a business setting as well.

Gartner estimated conversational AI platforms would have $2.5 billion revenue in 2020, with a 75% year-on-year (YoY) growth. This is built on the premise that speaking is the most natural form of communication, a preference that is only set to deepen further.

Data from research platform Allied Market Research showed that the conversational AI space could potentially touch $32.62 billion by 2030, registering 20% YoY growth between 2021-30. For transaction-heavy sectors like healthcare, Voice AI could help solve existing bottlenecks.

In insurance, for example, a Voice AI could guide a customer to immediately file motor claim requests. Gauging the customer’s reactions, a human-like AI system could help calm their nerves and send help accordingly. Additional requests like towing services and highway pickups could also be provided. Since Voice AI is a system that adapts by interacting with customers constantly, the sooner it is deployed, the better will be the user experience.

Global Power

As part of an endeavour to reach all customer touchpoints, brands have globally deployed text-based solutions. But the varying internet penetration and variation in literacy levels could prove to be bottlenecks in customer experiences.

Using text for communication would not be effective in this case, so voice AI for customer services works best here. When it comes to sectoral requirements too, voice could help reduce the turnaround time for financial requests. Chat is able to process a lot of these queries too, but eventually customers prefer the medium of talking for final resolution. The high wait times at contact centres of all financial institutions are proof.

Picture this. An insurance customer on an international trip meets with an accident and has to undergo an emergency procedure. But the hospital states that the authorities will need to verify the policy terms or speak to an insurance company official before conducting the surgery.

Here, waiting for an IVR response would simply delay the process, while chatbots may have to reroute the query to seek confirmation. On the other hand, a Voice AI would be able to disclose the policy details after authenticating the customer’s KYC details.

Identifying customers based on Know-Your-Customer and personal contact details is the next phase of growth for Voice AI systems. Once a customer’s voice is recorded for a couple of transactions, this would be used for all future conversations to ascertain and authenticate that it is indeed the registered user who is contacting the company. This would be useful for leisure services as well. For instance, seeking a special child seat at a restaurant on arrival often leads to chaos. Sending in written requests seldom works. Here, having a Voice AI that can decipher the messages and relay them back to the restaurant for timely service can be effective.
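One common way to approach the voice-based authentication described above is to compare numeric "voiceprint" vectors. The sketch below assumes such vectors have already been extracted from a few recorded calls by some speaker model, which is not shown; the numbers, the `enroll` and `is_registered_user` helpers, and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(samples: list[list[float]]) -> list[float]:
    """Average the voiceprints from a few recorded calls into one reference profile."""
    return [sum(values) / len(values) for values in zip(*samples)]


def is_registered_user(profile: list[float], new_sample: list[float], threshold: float = 0.9) -> bool:
    """Accept the caller only if the new voiceprint is close enough to the profile."""
    return cosine_similarity(profile, new_sample) >= threshold


# Made-up voiceprints from two earlier transactions by the same customer.
profile = enroll([[0.9, 0.1, 0.4], [1.0, 0.2, 0.5]])

print(is_registered_user(profile, [0.92, 0.14, 0.47]))  # True: similar voiceprint
print(is_registered_user(profile, [0.1, 0.9, 0.2]))     # False: very different voiceprint
```

In practice the threshold trades off convenience against security: a lower value rejects fewer genuine callers but admits more impostors.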

Future of Businesses

There was a phase where automated messaging was touted as the most preferred form of communication. This changed when voice assistants started seeping into the system. Globally, urban consumers have gotten used to voice assistants at home through connected devices and smart speakers. In fact, the number of Indians using voice queries daily on Google is nearly twice the global average. Since voice is a popular choice for customers’ personal use, this automatically translates into similar trends for businesses, too. A Deloitte study said that voice-led technology will proliferate across the globe and that 30% of sales will happen via voice by 2030. Through voice-led interactions, sales will not only be more intelligent, but companies will also be able to refer to these calls to investigate user complaints.

The premise is clear. Voice is intuitive, easy to use, and has a quicker turnaround time. For customer-facing companies, it is a technology that can no longer be ignored. What’s better? Employees stay happy too. Goodbye to calls from irate customers, abusive user messages, and long working hours during busy seasons. Voice AI could become their complementary solution and improve their quality of work as well.

In a world where emotional intelligence and personalised interventions hold more value than automated responses, Voice AI will spearhead the change. The ones who adapt quicker and deploy voice will be the real winners in the long run.




Voice vs. Text: A Fundamental Difference in Approach

While chatbot vendors are now trying to offer an embedded solution that contains text and voice, these models cannot be clubbed into one platform.

By 2022, close to 200 million jobs would be lost globally due to the Covid crisis. A lot of those unemployed will need to make changes in their monthly payments like home loan tenure, convert their credit bills into EMIs, and remove value-added services. Such customers connecting with a company during an emotionally volatile state may not just be looking for a solution, but could also be seeking a sympathetic ear.

In such a situation, a vulnerable customer will prefer speaking to someone to defer payments rather than typing a series of requests for each liability.

For instance, converting a credit bill into an EMI is one command that is executed after typing out a few details. Then comes home loan tenure increase, which requires another set of instructions. Here, it is faster and smoother for the information to be captured via voice.

Let’s take another example. A customer who lost a parent to Covid can file a death claim online. But speaking to a service executive who could empathetically listen to their concerns could soothe nerves during distress. Not only can the voice-led channel help minimise claim delays by specifying the exact documents needed, but the customer can also understand the formalities over a single call.

We have read how the current customer service models are missing out on the primacy of voice. There is a perception in the market that having a single solution for text and voice will help bridge the gap. But simply building a voice solution over existing text solutions may hamper the user experience.

In customer service, voice is designed to understand the nuance and gravity of a request. This is true especially for emergency situations where customers may have neither the time nor the mind space to sit and type requests like finding a network hospital or reporting an unauthorised transaction through a bank account. Trivializing voice and offering it as a ‘good-to-have’ solution by chat providers is counter-intuitive because voice is a specialized solution that encompasses the layers chat requires, plus catches peculiar behaviours like tone and pauses in speech. The demand for Voice AI has grown exponentially in the past few years.

According to a report by Statista, the number of digital voice assistants is likely to reach 8.4 billion units by 2024. So, it makes sense that companies want to adapt to this growing trend.

Voice is convenient, especially because humans speak and perceive things differently over speech than text. For instance, an indecisive food-delivery customer who keeps changing his/her order may find it easier to finalise an order over voice rather than typing and selecting products. Having a voice conversation also enables them to make a faster decision on what food to order.

While the thrust is on ‘omnichannel’ presence by brands, deploying voice effectively could help resolve a lot of customer complaints across product and service categories. Being present across customer touchpoints is good, but resolving queries constructively and consistently on a single voice-led platform is better.

How is voice different from text?

Chatbots follow a flow wherein the text input is fed into the spoken language understanding engine. This engine understands the input/query and decides on the next course of action. Based on the context of the conversation, the response is prepared in a text format and relayed back to the user.

Voice AI, on the other hand, has two additional engines specifically for handling speech. One is an automatic speech recognition (speech-to-text) engine, and the other is a text-to-speech engine.

The last part of this process is the dialogue manager, which acts as the orchestrator of the entire conversation. This is the block that manages the flow of data among the above three blocks and the flow of the conversation. And all these processes happen within milliseconds over the cloud, so it is device agnostic.
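To make the orchestrator role concrete, here is a minimal sketch of a dialogue manager that passes data between three pluggable blocks and tracks the flow of the conversation. It assumes the three blocks are a speech recognizer, an understanding engine, and a speech synthesizer; the class, the toy stand-ins, and the table-booking flow are hypothetical and only illustrate the idea.

```python
from typing import Callable


class DialogueManager:
    """Moves data between the three blocks and keeps track of the conversation."""

    def __init__(
        self,
        recognize: Callable[[bytes], str],
        understand: Callable[[str], str],
        synthesize: Callable[[str], bytes],
    ) -> None:
        self.recognize = recognize
        self.understand = understand
        self.synthesize = synthesize
        self.history: list[tuple[str, str]] = []  # (customer text, agent reply)

    def next_reply(self, intent: str) -> str:
        # The reply depends on the current intent and on what was said before.
        if intent == "book_table":
            return "For how many people?"
        if intent == "give_party_size" and self.history:
            return "Your table is booked."
        return "Sorry, could you rephrase that?"

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.recognize(audio)      # speech in, text out
        intent = self.understand(text)    # text in, intent out
        reply = self.next_reply(intent)   # decide how the conversation continues
        self.history.append((text, reply))
        return self.synthesize(reply)     # text in, speech out


def toy_understand(text: str) -> str:
    # Toy stand-in for the understanding engine.
    lowered = text.lower()
    if "book" in lowered:
        return "book_table"
    if "people" in lowered:
        return "give_party_size"
    return "unknown"


# Toy recognizer and synthesizer: the "audio" is just encoded text.
manager = DialogueManager(
    recognize=lambda audio: audio.decode("utf-8"),
    understand=toy_understand,
    synthesize=lambda text: text.encode("utf-8"),
)

print(manager.handle_turn(b"I want to book a table"))  # b'For how many people?'
print(manager.handle_turn(b"Four people"))             # b'Your table is booked.'
```

Passing the blocks in as callables keeps the orchestrator independent of any particular speech engine, so each block could be swapped without changing the conversation logic.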

The end goals of voice and text are also fundamentally different. Text is intended to resolve basic customer requests and redirect complicated questions to customer service personnel. Voice, by contrast, can handle detailed, multi-part requests in a single conversation. For instance, a customer looking to book a restaurant table is able to ask multiple questions in one go through voice. These could be the waiting times at certain points of the day, the chef’s menu, and specific details about the dishes (ingredients, spice levels, vegan alternatives, etc.).

An added benefit comes from newer concepts like paralinguistics being used in the Voice AI ecosystem. This involves communication other than spoken words, including tone, pitch, pauses, and gestures. For sales teams of customer-facing brands, this offers a tremendous opportunity to gauge a customer’s interest in the product and their intent to buy.

Once Voice AI determines who is more inclined to buy a product/service, additional time can be spent explaining the offering and convincing the customer. This essentially means that cross-selling products will be far easier and more effective if these Voice AI solutions are deployed. Some sectors that could take advantage of this concept are hospitality chains, restaurants, and financial institutions selling retail products like credit cards and quick personal loans.

It is often noticed that customers need to be nudged to reveal information, a process that can be done effortlessly over voice. Say, a newly launched shoe brand wants deeper feedback on the products. Using the customer database, a caller could be contacted using Voice AI to seek a detailed response on the pros and cons of the shoes. A customer may like the product quality but may have found its pricing to be steep while another customer may be looking for newer colour options.

Customers seldom fill long review forms that are sent post-purchase, hence bringing voice into this equation helps in better assessment. Based on the collective feedback, companies will also be able to tweak their product offerings accordingly, leading to improvement in sales. Customers, too, feel satisfied that their opinions have been taken into consideration.

A clubbed solution isn’t effective

Voice is an ideal turf for AI to learn, evolve, and constantly upskill by taking due note of user sentiments and emotions. And the best part? The user doesn’t need to be able to write a language fluently. Voice AI provides the unmatched ability to interact through casual conversations.

Critical user feedback, including anger, can’t be spotted immediately on text. This is essential for companies involved in product development, where continuous feedback generation is the key to success. As stated earlier, chatbots rely on key terms such as bad, poor, or terrible to deduce that the experience is unsatisfactory. Voice, on the other hand, listens attentively to different users to understand their sentiments.
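The limitation of key-term matching is easy to see in a small sketch. The term list below uses the three words mentioned above; the function name and the sample messages are made up for illustration.

```python
NEGATIVE_TERMS = {"bad", "poor", "terrible"}


def is_unsatisfied(message: str) -> bool:
    """Flag a message only if it contains one of the known negative terms."""
    words = {word.strip(".,!?").lower() for word in message.split()}
    return bool(words & NEGATIVE_TERMS)


print(is_unsatisfied("The delivery was terrible."))                     # True
print(is_unsatisfied("I have been waiting three weeks for a refund."))  # False
```

The second message is clearly a complaint, yet it slips through because none of the key terms appear in it. In a voice call, the same frustration would typically also show up in tone and pacing.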

Vendors offering ‘text+voice’ combo products do not understand the performance requirements of Voice AI systems. Low latency or quick processing of data to offer the right answers is crucial. Right now, there seems to be a rush among brands to implement AI for customer service. But the key here is to operationalize a solution that is accurate and solves a given problem. The thing to remember is that context and slang change with geographies. They are different in different markets. This means each Voice AI system needs to be modified to suit the audiences in that location. This is where the expertise of market providers, such as Skit, comes in handy.

The emerging dynamics of voice

Voice works best for context-led conversations where tone and inflection can convey a response without using actual words. And as the technology develops, its use-cases have also been evolving. In areas like sales and product testing, Voice AI could be used to pitch the product better and sound more persuasive.

Customers are also more likely to interact for a longer duration with a Voice AI system that understands their specific needs. These conversations are also useful for training the internal systems and for conducting quality checks at a later stage. For example, a fintech company developing a buy-now-pay-later (BNPL) product could use an advanced Voice AI system to capture the purchasing patterns of a customer. Since it is responsive, the customer can also cross-question Voice AI on the relevance of the terms and conditions of the BNPL feature and default penalties.

And if the Voice AI notices that target customers are enquiring repeatedly about penalties, this can be relayed back to the brand so that its messaging can be tweaked to include the terms upfront.


Voice solutions are getting richer. While a lot of vendor solutions already exist in the market, specialized products are few and far between. A Voice AI product that is constantly tested for different use-cases across sectors is what will be suitable for commercial use. In markets like the US, where financial frauds cost brands millions of dollars in revenue as well as reputation, Voice AI could come in handy by adding a layer of voice-led authentication.

Here, deploying voice to recognize and identify customer details will help prevent such risks. This is because the AI can identify regular pauses and also spot any nervous tones indicating the presence of fraudsters on the call.

Psychological concepts like entrainment could complement the existing services and improve customer interactions. For instance, an angry customer could be pacified through Voice AI speaking in a calmer tone. Similarly, a customer will be understood even if they switch to a different language midway through the call.



