Outbound IVR Robocaller Vs AI-Powered Digital Voice Agents 

More than one in four Americans (28%) have at least one debt in collections, which underscores the significance of debt collection services. As more consumers depend on credit for purchases ranging from homes and vehicles to household appliances, and sometimes everyday living expenses, debt collection services are playing an increasingly significant role in the availability and recovery of credit.

Although the TCPA and FDCPA largely discourage the use of outbound IVR robocallers for debt collection, this blog briefly discusses how they have been used for that purpose.

Over the last couple of decades, it was perhaps wise to deploy Outbound IVRs, Voice Blasters, or Robocallers. The technology helped companies send pre-recorded phone messages to hundreds of consumers at once.

It helped companies reduce calling errors and call costs, and improve productivity. But with rapid advancements in technology, especially Voice AI, the competitive landscape has shifted in favor of intelligent voice conversations.

In this blog, we delve into the core of the issue to explain why Intelligent Voice Agents are the way to deliver superior business performance and customer experience.

Explore how Voice AI solutions are Transforming Debt Collections

Understanding IVRs and why they fail to deliver real value

Typically, an Outbound IVR (Interactive Voice Response) is used to proactively reach out to a large number of customers in a personalized manner using different interaction channels, such as voice messages. The most common use cases are feedback, promotions, announcements, reminders, etc. 

A Robocaller or outbound IVR essentially has two components: a dialer and either a text-to-speech engine (advanced outbound IVRs) or a recorded voice message (Robocallers). Businesses can upload thousands of contacts to the dialer and configure parameters such as the number and timing of retry attempts and the time of the call. The dialer calls these contacts and plays a voice message for consumers to listen to. At the end of the call, the consumer can use keypad-based number input to replay the message or choose from a few other options.
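
The dialer parameters described above can be pictured as a small configuration object. This is only an illustrative sketch: the names `DialerConfig`, `max_attempts`, `retry_gap_minutes`, and the calling window values are assumptions made for this example, not the settings of any particular dialer product.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class DialerConfig:
    """Hypothetical campaign settings for an outbound dialer."""
    max_attempts: int = 3              # total call attempts per contact
    retry_gap_minutes: int = 120       # wait between two attempts
    window_start: time = time(9, 0)    # earliest time of day to call
    window_end: time = time(18, 0)     # latest time of day to call


def next_attempt(cfg: DialerConfig, attempts_made: int,
                 last_attempt: datetime) -> Optional[datetime]:
    """Return when to retry a contact, or None if attempts are exhausted."""
    if attempts_made >= cfg.max_attempts:
        return None
    candidate = last_attempt + timedelta(minutes=cfg.retry_gap_minutes)
    if candidate.time() > cfg.window_end:
        # Too late today: move to the start of the next day's window.
        next_day = candidate.date() + timedelta(days=1)
        return datetime.combine(next_day, cfg.window_start)
    if candidate.time() < cfg.window_start:
        # Too early: wait until the window opens on that day.
        return datetime.combine(candidate.date(), cfg.window_start)
    return candidate
```

Note that nothing in this configuration reacts to what the consumer says; the dialer only decides when to play the message again.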

In the 1990s, this technology was a game-changer and led to huge improvements in efficiency. Today, however, it is ineffective and unnecessary, to say the least.

Even the best outbound IVRs suffer from persistent challenges, as enumerated below:

  1. Unidirectional Communication: IVRs are capable of only unidirectional communication with a limited DTMF (Keypad-based) feedback mechanism.
  2. Low Engagement: IVRs have extremely low engagement rates owing to their non-conversational unidirectional communication.
  3. Right-Party Contact: IVRs cannot capture conversational inputs or run verification to confirm right-party contact. Today, you cannot pass debt-related information to the wrong contact, even inadvertently.
  4. Inability to capture important dispositions: Robocallers or outbound IVRs can’t capture meaningful dispositions that can be used downstream, such as:
    • Willingness to pay, and the expected date and mode of payment
    • Refusal to pay and the associated reasons
    • Debt dispute and reasons
    • Willingness to pay partially and interest in payment arrangements
    • Call-back date and time for busy customers
  5. Lack of insights for segmentation: They cannot segment the pool of consumers based on disposition to help debt collection companies make meaningful strategic decisions.
  6. Inability to reach consumers at their preferred time: Since a Robocaller cannot capture dispositions for busy consumers, it cannot intelligently call back or arrange a call-back from human agents.
  7. Payment assistance and goal completion: They cannot help or guide a willing consumer to make the payment during the call.
  8. Human-Agent Dependence: For a large chunk of calls, human agents are still needed to reach a meaningful end result.
  9. Compliance adherence: Since every call campaign is triggered manually, compliance is left to the operator running the campaigns.
  10. Customer Experience: Being extremely impersonal, they fail miserably at contributing to CX.
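
To make the idea of a "disposition" concrete, here is a minimal sketch of the kind of record a conversational system could store per call and then use for segmentation. The class names, fields, and disposition labels are assumptions chosen to mirror the list above; they do not describe any specific product's data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Disposition(Enum):
    WILL_PAY = "will_pay"
    PARTIAL_PAYMENT = "partial_payment"
    REFUSED = "refused"
    DISPUTED = "disputed"
    CALL_BACK = "call_back"


@dataclass
class CallOutcome:
    account_id: str
    disposition: Disposition
    detail: Optional[str] = None  # e.g. reason, promised date, call-back time


def segment(outcomes: List[CallOutcome]) -> Dict[Disposition, List[str]]:
    """Group account ids by the disposition captured on the call."""
    groups: Dict[Disposition, List[str]] = defaultdict(list)
    for outcome in outcomes:
        groups[outcome.disposition].append(outcome.account_id)
    return dict(groups)
```

A keypad-only IVR has no way to fill in most of these fields, which is why the segmentation step is out of its reach.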

IVRs, even at their best, do not contribute to CX or major productivity gains, whereas a bad IVR experience can prove very costly. The State of IVR in 2018 noted that 83% of customers would avoid a company after a poor experience with an IVR. 

The more pressing problem still remains:

“How do we automate the mundane, repetitive, non-value-adding tasks that human agents are doing?”

For a long time, we did not have an answer, or at least not a commercially viable technology solution. Today we do: the Intelligent Voice AI Agent.

Explore how AI-enabled Voice AI Agents are the Perfect Solution to Meet Compliance Requirements

Understanding Digital Voice Agents

Digital Voice Agents are AI-powered virtual agents that allow customers to hold meaningful, contextual conversations without having to punch 1, 2, 3, 4 on their keypad. They can converse with your consumers just like your human agents do. They are capable of understanding, interpreting, and analyzing the conversational voice input of an individual and responding in everyday language.

A Virtual Voice Agent goes beyond understanding words, and determines what the consumer is saying based on underlying semantics, without relying on specific keywords. Using machine learning, a Virtual Voice Agent is continuously improving itself and the customer experience. Read more about Digital Voice AI agent here.

Unlike Siri and Alexa, which are designed to handle everyday context-less tasks such as setting an alarm or playing songs, AI-powered digital voice agents are trained specifically to handle complex problems and to understand what a customer may want in all probable scenarios, making them highly effective at resolving customer problems and requests.

A Comparative look: Digital Voice Agent Vs Outbound IVR

4 Core Benefits: Why Top Collection Agencies are Deploying Digital Voice Agents 

For any company, AI-enabled Digital Voice Agents are a quantum leap from aging outbound IVRs. There is no comparison. Because Digital Voice Agents are AI-enabled, they keep improving with time. One can imagine the competitive edge companies can build by starting early. Here are the core business benefits of deploying Digital Voice Agents over IVRs:

  1. Reducing Cost and Improving Speed of Collections: Digital Voice Agents can make or handle hundreds of concurrent calls at scale and economically, all within just an hour. Not only that, voice agents, being machines, are very punctual: they reach out to debtors who request a callback, or make reattempts, right on time, when the probability of connecting with the contact is highest. All this is done within the prescribed compliance framework.
  2. Superior Recovery and Collection Efforts: Better collection and recovery demand persistent effort. When nudged at the right time, a debtor who is willing but unable to pay now might pay a few months down the line. Thus, what matters is how persistently collection agencies can reach out to the segment of debtors most disposed to pay. It’s a piece of cake for a Digital Voice Agent to schedule follow-up calls spread over weeks or months while honoring the regulatory guidelines, and so ensure better recovery rates. With timely and adequate calls going out to customers, and 24×7 support, the right voice-tech solution checks all the boxes to improve collections and recovery.
  3. Minimize Errors, Ensure Compliance and Security: With a myriad of ever-changing regulations that differ from state to state, it is challenging for agents to keep abreast and be flawless. Training and development are costly, but Digital Voice Agents are easy to update and help ensure consistent compliance. IVRs play a limited role, as unidirectional communications have a low impact.
  4. Human-Agent Bandwidth Prioritization: The beauty of deploying Augmented Voice Intelligence is that it can call all the customers and filter out the complex cases that need human agent intervention. In the present system, agents call the entire list, be it a simple case or a complex one, and do not create the desired value in the process. For the dispositions where human intervention is required, the Voice Agent can segment the portfolio so that the relevant human agents can be assigned the downstream tasks. This prioritization of bandwidth unlocks massive value for collection companies.
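
The bandwidth prioritization in point 4 amounts to a routing rule applied after each automated call. The sketch below assumes a simple rule (disputes, refusals, and hardship cases go to people; everything else stays automated). The disposition labels and the `NEEDS_HUMAN` set are illustrative assumptions, and a real deployment would define its own rules.

```python
from typing import Dict, List, Tuple

# Assumed rule: these dispositions need human judgment; the rest stay automated.
NEEDS_HUMAN = {"disputed", "refused", "hardship"}


def route(cases: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Split (account_id, disposition) pairs into human and automated queues."""
    queues: Dict[str, List[str]] = {"human": [], "automated": []}
    for account_id, disposition in cases:
        target = "human" if disposition in NEEDS_HUMAN else "automated"
        queues[target].append(account_id)
    return queues
```

With a rule like this, human agents only see the accounts in the `human` queue instead of dialing the entire list.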

For more information and free consultation, let’s connect over a quick call; Book Now!

Also, for more information visit our Collections Page.

Digital Voice Agents: What, Why and How

Systems that can handle mundane tasks have existed for several years. But in the recent past, we have seen an uptick in conversational assistants such as Siri, Alexa, Google Home, and Samsung Bixby. These systems handle human conversations and respond in a human-like manner. In fact, they have become an integral part of our daily lives.

The speech and voice recognition market is expected to grow from USD 8.3 billion in 2021 to USD 22.0 billion by 2026, at a CAGR of 21.6% during the forecast period.
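
As a quick sanity check on the growth figure, the compound annual growth rate can be recomputed from the two endpoint values quoted above. The helper name `cagr` is ours; the formula is the standard one.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.2 means 20% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# USD 8.3 billion in 2021 growing to USD 22.0 billion by 2026 (five years).
rate = cagr(8.3, 22.0, 2026 - 2021)
print(f"{rate:.1%}")
```

From the rounded endpoints this works out to roughly 21.5% a year, which is in line with the quoted 21.6% once rounding of the endpoint values is allowed for.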

When it comes to CX, always-on customers expect more from customer service. They need personalized and faster resolutions. They can no longer wait for minutes on end to connect with an agent or navigate through complex IVR menus.

Solutions like voice bots are disrupting customer service as they promise the same level of customer experience as a human agent. Advanced AI-powered Digital Voice Agents can help CX leaders elevate their customer experience while reducing costs, thereby solving two of the biggest challenges faced by them on a daily basis. The solution is scalable and more efficient than other channels like email and IVR.

What is a Digital Voice Agent?

A Digital Voice Agent is a conversational robot (commonly known as a voice bot) that can interact with a user and take a certain set of actions in order to meet an end goal. It is very similar to the voice assistants we use on a daily basis, such as Apple Siri, Google Assistant, and Alexa.

But what’s the difference? 

Voice assistants are designed to handle one or two turns of the conversation to meet generic day-to-day goals.

Example of a single turn conversation

Digital Voice Agents, on the other hand, are designed to solve specific problems which require much more than two turns of conversation, just the way we humans solve queries by first asking multiple questions to understand the context and all the required information to solve any problem.

For example, a lost credit card is blocked by asking a series of standard questions: the first couple of questions verify the caller, the next set confirms which credit card is to be blocked, and these are followed by an action in which the customer is sent a new credit card. Typically, this is a 6-7 turn conversation that generic voice assistants are not designed to handle. Specialized voice bots need to be trained to handle such tasks.
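
One common way to picture such a multi-turn flow is slot filling: the bot keeps asking until every required piece of information has been collected. The sketch below is a toy illustration of that idea for the lost-card example. The slot names and question wording are assumptions, and it is not a description of how any particular voice bot is implemented.

```python
from typing import Dict, Optional

# Assumed slots for the lost-card flow, each with the question that fills it.
SLOTS = [
    ("full_name", "Can you tell me your full name?"),
    ("date_of_birth", "What is your date of birth?"),
    ("card_last4", "What are the last four digits of the card?"),
    ("confirm_block", "Shall I block this card now?"),
    ("send_replacement", "Would you like a replacement card sent to you?"),
]


def next_prompt(collected: Dict[str, str]) -> Optional[str]:
    """Return the question for the first unfilled slot, or None when done."""
    for slot, question in SLOTS:
        if slot not in collected:
            return question
    return None
```

Each call to `next_prompt` corresponds to one turn, so five slots already give a conversation well beyond the one or two turns a generic assistant is built for.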

So, how does Skit’s Digital Voice Agent work?

Fundamentally, there are at least four components (engines) to any voice bot:

ASR (Automatic Speech Recognition): This converts the voice into text transcription. This is alternatively called Speech-to-text or STT Engine.

SLU (Spoken Language Understanding): This is the brain of the voice bot. It extracts intents and entities (data points) from the text produced by the ASR and then comes up with the best possible action. That action can be a voice reply, sending a document or a text message, transferring the call, raising a ticket, and so on.

TTS (Text to Speech): The block that translates the text into voice for generating a reply. 

Dialogue Manager (Orchestrator): The block that manages the flow of data among the above three blocks and the flow of the conversation.

All these processes happen in real-time and within milliseconds. This is only one turn of the conversation and this process gets repeated for subsequent turns.
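
The four blocks described above can be wired together as in the following toy sketch of a single turn. Everything here is a stand-in: the "audio" is plain UTF-8 text, the SLU uses keyword checks purely for brevity (a real SLU would rely on trained models rather than keywords), and all function names and replies are assumptions for illustration.

```python
from typing import Dict, Tuple


def asr(audio: bytes) -> str:
    """Stand-in for speech recognition: here the 'audio' is just UTF-8 text."""
    return audio.decode("utf-8").lower()


def slu(text: str) -> Tuple[str, Dict[str, str]]:
    """Toy intent and entity extraction using keyword checks."""
    if "balance" in text:
        return "check_balance", {}
    if "block" in text and "card" in text:
        return "block_card", {"product": "card"}
    return "unknown", {}


def tts(text: str) -> bytes:
    """Stand-in for speech synthesis: returns bytes instead of real audio."""
    return text.encode("utf-8")


REPLIES = {
    "check_balance": "Sure, let me fetch your balance.",
    "block_card": "I can help you block your card.",
    "unknown": "Sorry, could you say that again?",
}


def handle_turn(audio_in: bytes) -> bytes:
    """Dialogue manager for one turn: ASR -> SLU -> pick reply -> TTS."""
    text = asr(audio_in)
    intent, _entities = slu(text)
    return tts(REPLIES[intent])
```

Calling `handle_turn` once per caller utterance reproduces the repeated per-turn loop described above.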

All these processes are performed in the cloud after the voice packets are received from a user. So it doesn’t really matter which device the caller is using, whether it’s a smartphone or a feature phone or a wired telephone. Skit’s Digital Voice Agents leverage all these layers to seamlessly plug into contact centers and augment the work of human agents.

How are Digital Voice Agents different from Chatbots?

Technically, an AI-powered voice bot has two extra engines that a chatbot doesn’t need. Since chatbots do not deal with voice, the two voice-related engines (ASR and TTS) are not required. The text input is fed directly to the NLU, which extracts the intents and entities; the response is then synthesized in text format and relayed back to the user.

Furthermore, voice queries on a call bring with them certain challenges, such as noisy backgrounds, different accents and dialects of the same language, disfluencies and each speaker's unique way of adding filler words and pauses, and barge-in by one person while the other is speaking, all of which directly impact accuracy.

For the same reason, voice bots are much more difficult to build. Everything has to happen in real time, within milliseconds, and there is little to no room for error; otherwise the communication experience suffers.

What sets voice bots apart is that they’re faster. Voice is the quickest and most natural form of human communication, faster than typing or navigating drop-down menus with a mouse. It continues to be one of the most sought-after channels among end customers seeking support.

What are the common applications of Digital Voice Agents and how do they add value?

The key to improving customer service is not just automating cognitively routine communications, but augmenting human agents and freeing up their time. This creates great self-service options, increases customer satisfaction and makes your employees more productive.

At a broad level, a Digital Voice Agent can be used whenever businesses want to communicate with their customers en masse. To keep it simple, there are two types of business communications:

Inbound communication

This is when a customer tries to call a business to get their queries resolved. For example, to register a complaint, to activate or deactivate a service etc.

Companies have contact centres to resolve the customer queries where human agents are trained to resolve the customer complaints coming from various channels such as calls, emails, social media etc.

How does a Digital Voice Agent add value here?

Automate mundane support queries: It can automate the simple repetitive queries end-to-end such as knowing the account balance in case of banking, the status of the order in case of e-commerce etc. Your human agents can now move to solve more complex queries. So your average service levels will drastically improve as your customers will be served without any waiting time.

Reduce average handling time: For more complex queries, Digital Voice Agents help reduce the average handling time of the human agent by taking care of basic tasks, such as verifying the caller and collecting basic information (for example, the order number) that the human agent needs in order to solve the query. After performing these preliminary checks, the call can be transferred to the human agent along with the context of the query and the data collected so far.

Outbound communication

This is when a business tries to reach out to customers for a variety of reasons such as lead qualification calling, welcome calling, reminder calls, renewal calls.

How does a Digital Voice Agent add value here?

Lead Qualification: Since the Digital Voice Agent is a scalable machine, it can reach out to thousands of prospects concurrently, as soon as a prospect has shown interest in the product or service, to gauge that interest and then transfer the call to a live agent in real time to convert the customer. It can mark semi-qualified leads and send them to nurturing workflows. Your human agents are given only the more qualified leads to work on, and hence human agent productivity increases multifold.

Reminder calling: The Digital Voice Agent can place automated calls to your existing customers based on pre-defined triggers, such as the nth day of the month or a payment not being received by a given day of the month. It eliminates the need for human agents for such simple tasks. It can capture the customer's propensity to pay or renew, the date by which it will be done, and the reason for non-payment, and it can handle objections and FAQs.

“About 75% of companies plan to invest in automation technologies such as AI and process automation in the next few years. AI, chatbots, voice bots and automated self-service technologies free up call centre employees from routine tier-1 support requests and repetitive tasks, so they can focus on more complex issues.” (Source: Deloitte)
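
The pre-defined triggers mentioned above can be expressed as a small decision function. This is a minimal sketch under assumed rules: the parameter names `reminder_day` and `due_day`, and the choice to keep calling after the due day until a payment is recorded, are illustrative assumptions rather than a prescribed policy.

```python
from datetime import date
from typing import Optional


def should_call(today: date, reminder_day: int, due_day: int,
                paid_on: Optional[date]) -> bool:
    """Decide whether to place a reminder call today.

    Triggers (both assumed for illustration):
    - today is the configured reminder day of the month
    - the due day has passed this month and no payment has been received
    """
    already_paid = (paid_on is not None
                    and (paid_on.year, paid_on.month) == (today.year, today.month))
    if already_paid:
        return False
    return today.day == reminder_day or today.day > due_day
```

In practice such a check would run once a day per customer, with the resulting call list handed to the voice agent.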

Broadly, various kinds of voice bots are among the most popular automation solutions, and are quickly becoming a must-have for any contact centre. Skit’s Digital Voice Agents take it up a notch by being able to forge seamless human-AI partnerships for contact center modernization and optimization.

What are Digital Voice Agents good at compared to humans?

On-demand Scalability: Humans cannot be replicated on-demand. When we want to add a number of agents in the contact center, it takes its own sweet time of hiring, onboarding, and training. And it has to be repeated for every single agent we hire.

Digital Voice Agents can be scaled up and down as and when required with marginal cost.

Economical & Reliable: Employing human resources for repetitive, mundane tasks is costlier. Hiring, training, and retraining are expensive, and the expense is compounded by a higher churn rate. It also has to be incurred for every human resource we employ. Bots, on the other hand, need to be built and trained only once, and the benefit of incremental learning and retraining is huge and available across the board.

We all know that machines are exceptional at performing repetitive tasks with high efficiency and high reliability. If a Digital Voice Agent is asked by a customer not to call during office hours or to call at specific times in future, it can do so without fail. Humans are not so good at it.

Available 24×7: Machines don’t get tired or complain. Sad but true: they don’t have a family to go home to or a need to sleep. So you can be available to your customers round the clock.

Looking up information in a knowledge base: Digital Voice Agents can easily fetch information from a knowledge base to answer a wide range of support queries.

Consistent learning and training at scale: Apart from using Artificial Intelligence for answering questions, Skit’s Digital Voice Agents also leverage different machine learning models and past conversations to automatically improve the quality of answers.

Voice AI for Banking: Streamline Outbound Calling

The recent pandemic has reshaped consumer banking behaviours in many ways and has skyrocketed digital transformation in the banking sector. With social distancing becoming the new normal, most consumers prefer utilizing digital banking services over visiting the branch, even for important tasks. This in turn has spurred the evolution of agile business models backed by technologies like Artificial Intelligence (AI), Big Data, Blockchain etc. These technologies are also critical for cost reduction, an increasing priority for banks due to weak investment returns and market uncertainty.

As COVID-19 accelerates digital adoption across banks, CX will act as a major differentiator to help leapfrog competition by engaging customers with tailored and intelligent value propositions based on deep customer insights. In order to do so, banks need to transform their technology capabilities across the complex landscape of their technical assets, to deliver unique and highly personalized experiences at the right time, at scale. 

With the spike in the usage of digital banking, banks have also seen an influx of inbound calls. More customers are picking up their phones to get queries resolved. A similar trend is being seen in the number of outbound calls made by banks for repayment reminders, Know-Your-Customer (KYC), and account registration.

Streamlining Outbound Calling 

Technologies such as Voice AI are empowering banks to automate inbound contact centres. This has enabled them to reduce average call waiting times, improve customer satisfaction scores and free agent bandwidth. While streamlining inbound calls is extremely critical for CX, equal attention needs to be given to streamlining outbound calls. 

Banks make thousands of calls each day to customers for various reasons. These calls can be for welcoming new customers, reminding them about a due payment, lead qualification, and more. By engaging with customers at the right time, banks strengthen their existing relationship with the customer which directly helps them in creating trust and building loyalty.

However, since all the calls are made manually by agents, banks are unable to meet the required goals. They’re in dire need of optimizing the process and making it more efficient. To provide customers with a consistent experience, they need to leverage new-age technologies like Voice AI.

Voice bots powered by Voice AI can converse with customers in a natural, multi-turn conversational style. The experience is very human-like. Voice bots can trigger outbound calls to engage with customers 24×7 in a scalable manner. You can completely customize the calls according to parameters such as frequency, specific trigger events, and more.

Lead Qualification

Banks receive millions of leads every month through various sources, including the website, social media, partnerships and advertisements. Usually, agents call up each lead to understand the customer’s requirements better and gauge their interest level. However, a major problem is that a huge chunk of these leads are junk, and agents end up spending their valuable time speaking to the wrong users rather than prioritizing the interested ones. This has a major impact on the number of conversions.

However, voice bots can greatly help solve this problem for banks. Since the problem is with the qualification process, it can be completely handled by the voice bot without any human intervention. By seamlessly integrating with the CRM, the voice bot can fetch the customer’s phone number and trigger an outbound call. During the call, the voice bot asks the user the questions required to qualify them for a product. In case the customer has any questions, the voice bot can also resolve them. If the customer is interested, the voice bot can directly transfer the call to an agent or schedule a convenient time for a callback. In case the call is missed, the voice bot can also make periodic follow-up calls.

Based on the data collected by the voice bot, agents can prioritize their calls. This way they end up reaching the interested users first, significantly increasing the chances of conversion.

Let’s understand this with an example. Assume a user applies for a credit card online. They enter a few basic details such as name, monthly salary, age, contact details, and more. Once the details are submitted, they are passed to the voice bot. The bot fetches the contact details and triggers an outbound call. It asks the user multiple questions, including the credit limit they are looking for, whether they have an existing account with the bank, and more. All this information is automatically updated in the CRM. Agents can then go through these users and filter out the interested ones who are suitable for calling.
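
The prioritization step in this example can be sketched as a simple scoring pass over the data the bot has written to the CRM. The `Lead` fields and the point weights below are illustrative assumptions, not a real qualification model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lead:
    name: str
    answered_call: bool
    has_existing_account: bool
    interested: bool


def score(lead: Lead) -> int:
    """Assumed scoring: the weights are illustrative, not from a real model."""
    if not lead.answered_call:
        return 0
    points = 1
    if lead.interested:
        points += 2
    if lead.has_existing_account:
        points += 1
    return points


def prioritize(leads: List[Lead]) -> List[str]:
    """Return lead names ordered from most to least promising."""
    ranked = sorted(leads, key=score, reverse=True)
    return [lead.name for lead in ranked]
```

Agents would then work the list from the top, so the most promising applicants are called first.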

Customer Activation

Converting a potential lead into a customer is not enough for banks. To generate revenue out of them, they need to ensure that they’re using their different products and services. For this, they need to focus on customer activation. They need to employ different strategies to help customers move faster in their life cycle. But onboarding thousands of customers every day requires a lot of resources and time. For banks to provide their users with a personalized onboarding experience and engage with them at regular intervals affordably, they’ll need to leverage the power of technology and automation.

When a retail customer opens a savings account, they get access not only to the account but also to other services such as net banking, a debit card, phone banking and more. However, most customers don’t end up using these services. This is why banks need to onboard them and send periodic reminders to nudge them to use the product. A few banks do have dedicated in-house or outsourced teams who handle this. However, the process is not scalable and is extremely difficult to follow for all customers.

So, how can banks solve this? 

To onboard customers and engage with them across the customer journey, banks can leverage voice bots. Firstly, the bot can call each customer and onboard them by taking them through each service, answering FAQs, and resolving any questions they have. By educating customers, it removes the initial friction they might have in trying a particular service. Further, a voice bot can call the customer after a certain period to understand their experience and suggest different services. This helps banks deliver personalized engagement consistently across the customer lifecycle.

To further improve customer activation, banks can – 

Map the customer journey – Banks can map out all the important stages to ensure they engage with customers at the right time. For example, for a credit card user, different stages can be –

  • Credit Card Activation
  • First transaction
  • Reward Redemption

Customer Segmentation – To deliver personalised and effective customer communication, banks need to segment their customers. Without this, banks can end up spamming users with notifications each day, making for a very poor experience.
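
Journey mapping and segmentation can be combined in a very small way: work out the next stage each customer has not yet completed, and only contact them about that stage. The sketch below uses the three credit card stages listed above; the function name and the use of a set of completed stages are assumptions for illustration.

```python
from typing import Optional, Set

# Stages from the credit card example, in order.
STAGES = ["Credit Card Activation", "First transaction", "Reward Redemption"]


def next_stage(completed: Set[str]) -> Optional[str]:
    """Return the earliest journey stage the customer has not completed."""
    for stage in STAGES:
        if stage not in completed:
            return stage
    return None
```

Customers sharing the same `next_stage` value form a natural segment, so each one receives a single relevant nudge rather than every notification.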

Improving Propensity 

Most banks use product propensity to increase customer’s lifetime value and reduce attrition. For example, if there’s a customer X who’s been using the bank’s credit card services for multiple years, the bank can upsell a home loan to them at a special interest rate. Hence, by leveraging rich customer insights and segmentation, banks can with minimal effort upsell and cross-sell related products. This acts as an important lever for growth by directly contributing to the total revenue.

However, we cannot ignore the fact that even with data analytics and machine learning models, the number of customers who actually end up buying a product or showing interest is low. This is a huge problem for agents, who are usually the ones calling these customers and who end up wasting a lot of their valuable time. This is also one of the reasons why banks haven’t set up dedicated teams.

One effective way to solve this is by doing a pre-qualification through a voice bot. Voice bots can call the customer and share the offer details. The bot can collect the interest level of the customer, get additional details required to process the offer and also answer common questions. By doing this pre-qualification, agents end up only speaking to customers who’re interested in the offer.

This not only saves agent bandwidth but also increases agent productivity and reduces operational costs.

Friendly Payment Reminders 

Banks continually invest in resources and implement strategies to improve their payment collection rate. This is because even a marginal drop has a negative impact on their business and increases collection costs.

While often underutilized, the simplest way to ensure customers pay in a timely manner is by triggering reminders a few days before the repayment (be it credit cards or loans). This can be through different channels including calls, text messages and emails. By doing this customers can make repayments on time and avoid unwanted hassle and late payment charges. 

Banks can further increase the effectiveness of their reminders by using a voice bot. Unlike a recorded message, a voice bot allows banks to send personalized reminders, collect information and even help customers make payments in real time. For example, if a user wants to make a repayment, the voice bot can send a payment link over WhatsApp or text message. The voice bot can also help customers enable automatic payments or change the payment type.

By enhancing the repayment experience, banks can significantly improve the collection rate and reduce collection costs. 

What’s Next? 

Banks have taken many rapid decisions to meet changing customer needs, be it ramping up security, expanding digital banking capabilities, or launching products that fit customers’ needs. This is why they were so quick to adapt to the changes brought about by the pandemic. However, they need to continually innovate and launch new initiatives that focus on customers’ needs and their banking experience.

About Skit

Skit is an Augmented Voice Intelligence Platform, helping businesses modernize their contact centers and customer experience by automating and improving voice communications at scale. By enabling preemptive, intelligent problem solving and seamless live interactions, we have automated over 15 million calls for global enterprises across industries. We help our customers streamline their contact center operations, reduce costs, and also enhance customer experience and engagement.

Connect with us if you’re interested in learning more about the platform and how it can modernize and transform your contact center.

Voice AI To Resonate With and Retain Customers

Customers dislike long wait times for query resolution, and chatbots aren’t suitable for emergency requests. To ensure better service, Voice AI-led solutions work best.

In 2020, the University of Texas at Austin conducted an interesting experiment wherein 200 participants were invited to reconnect with an old friend through either a phone call or email. Despite admitting that a phone call would be more effective, some participants chose email to feel less awkward. And expectedly, those who connected through a phone call were able to form a stronger bond with their friend. It is the overall interaction experience that counts, be it in personal or professional settings.

What counts is whether you were able to communicate, whether the other party understood your feelings, whether any misunderstandings were cleared up, and whether both parties will be able to reestablish a connection in the future. In the customer experience journey too, brands have chosen to connect with users over multiple touch-points: text messaging, email, social media support, chatbots, and customer care centres or call centres.

Depending on the type of query, each customer is redirected to the specific touch-point. For instance, a customer seeking a bank account statement can simply get it through their net banking application while another customer looking for term insurance policies can get information through a chatbot. But for queries that require detailed insights, say reporting or KYC-related changes, customers are redirected to voice-based customer service executives.

Voice is powerful and unique to human beings. Speech goes back to human beginnings, possibly as far as a million years, whereas the Linguistic Society of America estimates that writing was invented around 3200 B.C. It is voice that gave rise to text, words and other forms of written communication. Because interacting through voice comes naturally to humans, it is self-taught. It is also easier to communicate thoughts through voice than through any other medium, simply because speaking is up to seven times faster than typing. This means that one can convey more in the same amount of time using voice.

Wait times are long, and Interactive Voice Response (IVR) may not be helpful for emergency requests. Imagine your credit card getting stolen. You call the bank’s customer care, but it takes you two minutes just to get to the appropriate node, and another five minutes to reach a customer care representative. And the ordeal still isn’t over, because the customer care executive puts you on hold to verify details. Total time elapsed: 12 minutes. By the time the query is resolved, your card has probably been swiped at half a dozen places.

An immediate solution is critical to protecting the brand reputation of companies. Ignoring customer grievances can often cost a company its clients. A study by Qualtrics XM Institute in the US found that 53% of consumers have cut spending after a single bad experience with a company.

Customers also complained that grievance resolution lacked a responsive mechanism. The answer is obvious: customers prefer real, voice-based interactions because they resolve queries more quickly. And the practical solution is Voice AI. Built on the strong backbone of AI and Spoken Language Understanding (SLU), Voice AI uses human-like mechanisms to receive requests, interpret them, and provide solutions.

Natural language processing (NLP) is the technology that the system uses to learn, understand and provide content in human languages. Unlike other solutions in this space, Voice AI is evolutionary. It can adapt to different commands and languages as it learns ‘on the job’, like us humans. Since NLP is at its core, Voice AI first hears the customer speak, converts the speech to text, filters out the noise, and then processes it with its neural networks. Following this, the system finds out the context of the conversation using AI. Based on this, a response is created and then communicated to the user by a human-like voice. For instance, an individual who has a chequebook reissue request will have a different state of mind than someone who lost his debit card. Voice AI systems will differentiate between these two grievances and offer immediate support.
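The flow described above can be sketched as a chain of small stages. What follows is a minimal Python illustration rather than any vendor's implementation: the function names, the filler-word list, the urgency labels and the keyword rules standing in for neural networks are all hypothetical, and the speech-to-text step is stubbed with a plain string.

```python
# Hypothetical sketch of one conversational turn. All names and rules below are
# assumptions for illustration; a real system would use trained models.

FILLERS = {"um", "uh", "er", "hmm"}

def speech_to_text(audio: str) -> str:
    """Stand-in for a speech-to-text engine: the 'audio' is already a string."""
    return audio.lower()

def filter_noise(transcript: str) -> str:
    """Strip punctuation and drop filler words that carry no meaning."""
    words = [w.strip(".,!?") for w in transcript.split()]
    return " ".join(w for w in words if w and w not in FILLERS)

def detect_context(text: str) -> tuple:
    """Keyword rules standing in for the neural-network step: (intent, urgency)."""
    words = set(text.split())
    if words & {"lost", "stolen"}:
        return ("block_card", "urgent")
    if "chequebook" in words:
        return ("reissue_chequebook", "routine")
    return ("unknown", "routine")

def build_response(intent: str, urgency: str) -> str:
    """Create the reply text; urgent requests get an immediate action."""
    if urgency == "urgent":
        return "I am blocking your card right now."
    if intent == "reissue_chequebook":
        return "I have placed a request for a new chequebook."
    return "Could you tell me a little more about your request?"

def handle_utterance(audio: str) -> str:
    """Run the stages in order; a real system would send the reply to text-to-speech."""
    text = filter_noise(speech_to_text(audio))
    intent, urgency = detect_context(text)
    return build_response(intent, urgency)

print(handle_utterance("Um, I lost my debit card!"))
print(handle_utterance("I need a new chequebook, please."))
```

The point of the sketch is the ordering of the stages and the fact that the lost-card request is treated as urgent while the chequebook request is routine.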

How customer expectations evolved

Whenever a customer contacts a company, they expect an instant response. Instead, an insurance customer only looking to renew their car policy is inundated with information about new product offerings. The same goes for bank/NBFC customers, who are pushed needless personal loans by the systems during such calls.

The pandemic added to customers’ woes. A Harvard Business Review study showed that there was an over 10% spike in ‘difficult’ calls in just two weeks, between March 11 and 26. Customers wanted urgent resolution for travel issues, insurance claims, and payment extensions. Here, a Voice AI solution can not only improve productivity and efficiency, but also improve customer trust during crisis periods.

Market Potential

The Voice AI market is at a nascent stage across the globe. It forms a part of the conversational AI segment that includes voice assistants and chatbots. With customers more accustomed to conversing with voice-devices at home, the same has translated to preferences in a business setting as well.

Gartner estimated conversational AI platforms would have $2.5 billion revenue in 2020, with a 75% year-on-year (YoY) growth. This is built on the premise that speaking is the most natural form of communication that is only set to deepen further.

Data from research platform Allied Market Research showed that the conversational AI space could potentially touch $32.62 billion by 2030, registering 20% YoY growth between 2021 and 2030. For transaction-heavy sectors like healthcare, Voice AI could help solve existing bottlenecks.

In insurance, for example, a Voice AI could guide a customer to immediately file motor claim requests. Gauging the customer’s reactions, a human-like AI system could help calm their nerves and send help accordingly. Additional requests like towing services and highway pickups could also be provided. Since Voice AI is a system that adapts by interacting with customers constantly, the sooner it is deployed, the better will be the user experience.

Global Power

As part of an endeavour to reach all customer touchpoints, brands have globally deployed text-based solutions. But the varying internet penetration and variation in literacy levels could prove to be bottlenecks in customer experiences.

Using text for communication would not be effective in this case, so voice AI for customer services works best here. When it comes to sectoral requirements too, voice could help reduce the turnaround time for financial requests. Chat is able to process a lot of these queries too, but eventually customers prefer the medium of talking for final resolution. The high wait times at contact centres of all financial institutions are proof.

Picture this. An insurance customer on an international trip meets with an accident and has to undergo an emergency procedure. But the hospital states that the authorities will need to verify the policy terms or speak to an insurance company official before conducting the surgery.

Here, waiting for an IVR response would simply delay the process, while chatbots may have to reroute the query to seek confirmation. On the other hand, a Voice AI would be able to disclose the policy details after authenticating the customer’s KYC details.

Identifying customers based on Know-Your-Customer and personal contact details is the next phase of growth for Voice AI systems. Once a customer’s voice is recorded for a couple of transactions, this would be used for all future conversations to ascertain and authenticate that it is indeed the registered user who is contacting the company. This would be useful for leisure services as well. For instance, seeking a special child seat at a restaurant on arrival often leads to chaos. Sending in written requests seldom works. Here, having a Voice AI that can decipher the messages and relay them back to the restaurant for timely service can be effective.
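The voice-based authentication idea mentioned in this section (record a customer's voice over a couple of transactions, then check later callers against it) can be sketched roughly as below. This is an illustration under stated assumptions, not a description of any deployed system: it assumes each call has already been reduced to a numeric voice embedding by some model, and the names `enrol` and `is_registered_user` and the 0.9 threshold are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def enrol(samples):
    """Average the embeddings from the first few calls into one voiceprint."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def is_registered_user(voiceprint, sample, threshold=0.9):
    """Accept the caller only if the new sample is close to the stored voiceprint.

    The threshold is an assumed value; a real system would tune it on data.
    """
    return cosine(voiceprint, sample) >= threshold

# Toy two-dimensional embeddings; real embeddings have hundreds of dimensions.
voiceprint = enrol([[1.0, 0.0], [0.8, 0.2]])
print(is_registered_user(voiceprint, [1.0, 0.05]))  # similar voice
print(is_registered_user(voiceprint, [0.0, 1.0]))   # very different voice
```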

Future of Businesses

There was a phase where automated messaging was touted as the most preferred form of communication. This changed when voice assistants started seeping into the system. Globally, urban consumers have become used to voice assistants at home through connected devices and smart speakers. In fact, the number of Indians using voice queries daily on Google is nearly twice the global average. Since voice is a popular choice for customers’ personal use, this automatically translates into similar trends for businesses, too. A Deloitte study said that there will be a proliferation of voice-led technology across the globe and that 30% of sales will happen via voice by 2030. Through voice-led interactions, sales will not only be more intelligent, but companies will also be able to refer to these calls to investigate user complaints.

The premise is clear. Voice is intuitive, easy to use, and has a quicker turnaround time. For customer-facing companies, it is a technology that can no longer be ignored. What’s better? Employees stay happy too. Goodbye to calls from irate customers, abusive user messages, and long working hours during busy seasons. Voice AI could become their complementary solution and improve their quality of work as well.

In a world where emotional intelligence and personalised interventions hold more value than automated responses, Voice AI will spearhead the change. The ones who adapt quicker and deploy voice will be the real winners in the long run.




Voice vs. Text : A Fundamental Difference in Approach

While chatbot vendors are now trying to offer an embedded solution that contains text and voice, these models cannot be clubbed into one platform.

By 2022, close to 200 million jobs would be lost globally due to the Covid crisis. A lot of those unemployed will need to make changes in their monthly payments like home loan tenure, convert their credit bills into EMIs, and remove value-added services. Such customers connecting with a company during an emotionally volatile state may not just be looking for a solution, but could also be seeking a sympathetic ear.

In such a situation, a vulnerable customer will prefer speaking to someone to defer payments rather than typing a series of requests for each liability.

For instance, converting a credit bill into an EMI is one command that is executed after typing out a few details. Then comes home loan tenure increase, which requires another set of instructions. Here, it is faster and smoother for the information to be captured via voice.

Let’s take another example. A customer who lost a parent to Covid can file a death claim online. But speaking to a service executive who could empathetically listen to their concerns could soothe nerves during distress. Not only can the voice-led channel help minimise claim delays by specifying the exact documents needed, but the customer can also understand the formalities over a single call.

We have seen how the current customer service models are missing out on the primacy of voice. There is a perception in the market that having a single solution for text and voice will help bridge the gap. But simply building a voice solution over existing text solutions may hamper the user experience.

In customer service, voice is designed to understand the nuance and gravity of a request. This is true especially for emergency situations where customers may have neither the time nor the mind space to sit and type requests, such as finding a network hospital or reporting an unauthorised transaction through a bank account. Trivializing voice and offering it as a ‘good-to-have’ solution, as chat providers do, is counter-intuitive because voice is a specialized solution that encompasses the layers chat requires, plus catches peculiar behaviours like tone and pauses in speech. The demand for Voice AI has grown exponentially in the past few years.

According to a report by Statista, the number of digital voice assistants is likely to reach 8.4 billion units by 2024. So, it makes sense that companies want to adapt to this growing trend.

Voice is convenient, especially because humans speak and perceive things differently over speech than text. For instance, an indecisive food-delivery customer who keeps changing his/her order may find it easier to finalise an order over voice rather than typing and selecting products. Having a voice conversation also enables them to make a faster decision on what food to order.

While the thrust is on ‘omnichannel’ presence by brands, deploying voice effectively could help resolve a lot of customer complaints across product and service categories. Being present across customer touchpoints is good, but resolving queries constructively and consistently on a single voice-led platform is better.

How is voice different from text?

Chatbots follow a flow wherein the text input is fed into the spoken language understanding engine. This engine understands the input/query and decides on the next course of action. Based on the context of the conversation, the response is prepared in a text format and relayed back to the user.

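Voice AI, on the other hand, has two additional engines specifically for handling speech. One is an automatic speech recognition (speech-to-text) engine, which converts what the caller says into text for the spoken language understanding engine. The other is a text-to-speech engine, which converts the prepared response back into speech.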

The last part of this process is the dialogue manager, which acts as the orchestrator of the entire conversation. This is the block that manages the flow of data among the above three blocks and the flow of the conversation. And all these processes happen within milliseconds over the cloud, so it is device agnostic.
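As a rough sketch of how a dialogue manager might sit on top of the other blocks, the Python below routes each turn through a recognition stub, an understanding stub and a synthesis stub while keeping track of the conversation so far. Everything here is a hypothetical illustration: the class and method names, the two intents, the canned replies and the rule of handing over to an agent after two unrecognised turns are assumptions, not details from the text.

```python
from dataclasses import dataclass, field

# Canned replies per intent (assumed for illustration).
REPLIES = {
    "check_balance": "Your balance request is being processed.",
    "send_statement": "Your statement is on its way.",
}

@dataclass
class DialogueManager:
    """Hypothetical orchestrator that moves data between the three blocks."""
    history: list = field(default_factory=list)

    def recognise(self, audio: str) -> str:
        """Speech recognition block (stub: the 'audio' is already text)."""
        return audio.strip().lower()

    def understand(self, text: str) -> str:
        """Language understanding block (stub: simple keyword lookup)."""
        if "statement" in text:
            return "send_statement"
        if "balance" in text:
            return "check_balance"
        return "fallback"

    def synthesise(self, reply: str) -> str:
        """Speech synthesis block (stub: wraps the text in a marker)."""
        return f"<speech>{reply}</speech>"

    def turn(self, audio: str) -> str:
        """Handle one turn and decide how the conversation should proceed."""
        intent = self.understand(self.recognise(audio))
        self.history.append(intent)
        if intent in REPLIES:
            reply = REPLIES[intent]
        elif self.history[-2:] == ["fallback", "fallback"]:
            reply = "Let me connect you to an agent."
        else:
            reply = "Sorry, could you rephrase that?"
        return self.synthesise(reply)

dm = DialogueManager()
print(dm.turn("What is my BALANCE?"))
print(dm.turn("mumble"))
print(dm.turn("mumble again"))
```

Keeping the history inside the manager is what lets it control the flow of the conversation, here by changing its behaviour on the second unrecognised turn.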

The end goals of voice and text are also fundamentally different. Text is intended to resolve basic customer requests and redirect complicated questions to customer service personnel. Voice, by contrast, can handle layered requests in a single exchange. For instance, a customer looking to book a restaurant table is able to ask multiple questions in one go through voice. These could be the waiting times at certain points of the day, the chef’s menu, and specific details about the dishes (ingredients, spicy, vegan alternatives, etc.). An added layer of benefit is the use of newer concepts like paralinguistics in the Voice AI ecosystem. This involves communication other than spoken words, including tone, pitch, pauses, and gestures. For sales teams of customer-facing brands, this offers a tremendous opportunity to gauge a customer’s interest in the product and their buying intent.

Once Voice AI determines who is more inclined to buy a product/service, additional time can be spent explaining the offer and convincing the customer. This essentially means that cross-selling products will be far easier and more effective if these Voice AI solutions are deployed. Some sectors that could take advantage of this concept are hospitality chains, restaurants, and financial institutions selling retail products like credit cards and quick personal loans.

It is often noticed that customers need to be nudged to reveal information, a process that can be done effortlessly over voice. Say a newly launched shoe brand wants deeper feedback on its products. Using the customer database, a customer could be contacted using Voice AI to seek a detailed response on the pros and cons of the shoes. One customer may like the product quality but find the pricing steep, while another may be looking for newer colour options.

Customers seldom fill long review forms that are sent post-purchase, hence bringing voice into this equation helps in better assessment. Based on the collective feedback, companies will also be able to tweak their product offerings accordingly, leading to improvement in sales. Customers, too, feel satisfied that their opinions have been taken into consideration.

A clubbed solution isn’t effective

Voice is an ideal turf for AI to learn, evolve, and constantly upskill by taking due note of user sentiments and emotions. And the best part? The user doesn’t need to be able to write a language fluently. Voice AI provides the unmatched ability to interact through casual conversations.

Critical user feedback, including anger, can’t be spotted immediately on text. This is essential for companies involved in product development, where continuous feedback generation is the key to success. As stated earlier, chatbots rely on key terms such as bad, poor, or terrible to deduce that the experience is unsatisfactory. Voice, on the other hand, listens attentively to different users to understand their sentiments.
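To make the limitation of keyword matching concrete, here is a small Python sketch of the approach described for chatbots. Only the three terms (bad, poor, terrible) come from the text; the function name and the sample messages are made up for illustration.

```python
# Terms named in the text as typical chatbot triggers.
NEGATIVE_TERMS = {"bad", "poor", "terrible"}

def keyword_sentiment(message: str) -> str:
    """Flag a message as negative only if it contains a listed term."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    return "negative" if words & NEGATIVE_TERMS else "neutral"

print(keyword_sentiment("The service was terrible!"))
print(keyword_sentiment("Great, this is the third time I am asking."))
```

The second message is clearly from a frustrated customer, yet it contains none of the listed terms, so a keyword rule scores it as neutral. Tone and pauses in speech are the kind of signal that could catch it.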

Vendors offering ‘text+voice’ combo products do not understand the performance requirements of Voice AI systems. Low latency or quick processing of data to offer the right answers is crucial. Right now, there seems to be a rush among brands to implement AI for customer service. But the key here is to operationalize a solution that is accurate and solves a given problem. The thing to remember is that context and slang change with geographies. They are different in different markets. This means each Voice AI system needs to be modified to suit the audiences in that location. This is where the expertise of market providers, such as Skit, comes in handy.

The emerging dynamics of voice

Voice works best for context-led conversations where tone and inflection can convey a response without using actual words. And as the technology develops, its use-cases have also been evolving. In areas like sales and product testing, Voice AI could be used to pitch the product better and sound more persuasive.

Customers are also more likely to interact for a longer duration with a Voice AI system that understands their specific needs. These conversations are also useful for training the internal systems and for conducting quality checks at a later stage. For example, a fintech company developing a buy-now-pay-later (BNPL) product could use an advanced Voice AI system to capture the purchasing patterns of a customer. Since it is responsive, the customer can also cross-question Voice AI on the relevance of the terms and conditions of the BNPL feature and default penalties.

And if the Voice AI notices that target customers are enquiring repeatedly about penalties, this can be relayed back to the brand so that its messaging can be tweaked to include the terms upfront.

Voice solutions are getting richer. While a lot of vendor solutions already exist in the market, specialized products are few and far between. A Voice AI product that is constantly tested for different use-cases across sectors is what will be suitable for commercial use. In markets like the US, where financial fraud costs brands millions of dollars in revenue as well as reputation, Voice AI could come in handy by adding a layer of voice-led authentication.

Here, deploying voice to recognize and identify customer details will help prevent such risks. This is because the AI can identify regular pauses and also spot any nervous tones indicating the presence of fraudsters on the call.

Psychological concepts like entrainment could complement existing services and improve customer interactions. For instance, an angry customer could be pacified by Voice AI speaking in a calmer tone. Similarly, a customer will be understood even if he/she switches to a different language midway through the call.



